首页猿问如何根据通过熊猫从LAMMPS输出...

如何根据通过熊猫从LAMMPS输出文件导入的键合数据将原子细分为三组？

Python

慕田峪9158850 2022-09-27 09:55:37

我是编程和分子动力学模拟的新手。我正在使用LAMMPS来模拟物理气相沉积（PVD）过程，并确定不同时间步长中原子之间的相互作用。在我执行分子动力学模拟后，LAMMPS为我提供了一个输出键文件，其中包含每个单个原子的记录（作为原子ID），它们的类型（向特定元素的编号），以及与这些特定原子键合的其他原子的信息。典型的债券文件如下所示。我的目标是根据原子的类型（如组1：氧-氢-氢）对原子进行三组分类，方法是考虑它们从键输出文件中的键合信息，并计算每个时间步长的组数。我使用熊猫，并为每个时间步长创建了一个数据帧。df = pd.read_table(directory, comment="#", delim_whitespace= True, header=None, usecols=[0,1,2,3,4,5,6] )headers= ["ID","Type","NofB","bondID_1","bondID_2","bondID_3","bondID_4"]df.columns = headersdf.fillna(0,inplace=True)df = df.astype(int)timestep = int(input("Number of Timesteps: ")) #To display desired number of timesteps.total_atom_number = 53924 #Total number of atoms in the simulation.t= 0 #code starts from 0th timestep.firstTime = []while(t <= timestep): open('file.txt', 'w').close() #In while loop = displays every timestep individually, Out of the while loop = displays results cumulatively. i = 0 df_tablo =(df[total_atom_number*t:total_atom_number*(t+1)]) #Creates a new dataframe that inlucdes only t'th timestep. df_tablo.reset_index(inplace=True) print(df_tablo)请参阅此示例，该示例说明了我对 3 个原子进行分组的算法。键合列显示与其行中的原子键合在一起的不同原子（按原子 ID）。例如，通过使用该算法，我们可以对[1，2，5]和[1，2，6]进行分组，但不能对[1，2，1]进行分组，因为原子不能与自身建立键。此外，我们可以在分组后将这些原子ID（第一列）转换为它们的原子类型（第二列），例如[1，3，7]到[1，1，3]。通过遵循上面提到的键，1）我可以成功地将原子相对于它们的ID分组，2）将它们转换为它们的原子类型，3）分别计算每个时间步中存在的组数。第一个 while 循环（上图）计算每个时间步的组，而第二个 while 循环（下图）将每行中的原子（等于存在的每个原子 ID）与数据帧中不同行的相应绑定原子进行分组。

查看完整描述

1 回答

繁星coding

TA贡献1797条经验获得超4个赞

我不确定我是否理解了逻辑，看看这是否有帮助。

对于100000三重奏，需要41秒。

loc，get_loc是非常广泛的操作，所以把你的表放在字典里，而不是验证一切都是唯一的，把它放在一个集合中

import pandas as pd

import random

from collections import defaultdict as dd

from collections import Counter

import time

# create 100000 unique trios of numbers

ids = list(range(50000))

trios_set = set()

while len(trios_set)<100000:

trio = random.sample(ids,3)

trios_set.add(frozenset(trio))

ids_dict = dd(list) # a dictionery where id is the key and value is all the id who are partner with it in a list

for s in trios_set:

for id in s:

for other_id in s:

if id!= other_id:

ids_dict[id].append(other_id)

ids_dict = dict(ids_dict)

for_df = []

type_list = ["a","b","c","d","e","f","g","h","i","j","k","l","m","n"]

for id in ids_dict:

massage = {}

massage["id"] = id

other_id_index = 1

for other_id in ids_dict[id]:

massage["id_"+str(other_id_index)] = other_id

other_id_index+=1

massage["type"] = random.choice(type_list)

for_df.append(massage)

df = pd.DataFrame(for_df) # a table with id column and all ids who are with it in trios in id_1 id_2.. and type column with a letter

#------------------------------------------------------------------

#till here we built the input table

start_time = time.time() #till here we build the input table, now check the time for 100000 atoms

type_dict = {}

from_df = dd(set)

for i,r in df.iterrows(): #move the dataframe to a dict of id as key and value as list of ids who connected to it

for col in df:

if "id_"in col and str(r[col])!="nan":

from_df[r["id"]].add(r[col])

type_dict[r["id"]] = r["type"] #save the type of id in a dictionery

from_df = dict(from_df)

out_trio_set = set()

for id in from_df:

for other_id in from_df[id]:

if other_id!= id and str(other_id)!="nan":

for third_id in from_df[other_id]:

current_trio = frozenset([id, other_id,third_id])

if len(current_trio)==3:

out_trio_set.add(current_trio)

type_conter = Counter()

for trio in out_trio_set:

type_list = []

for id in trio:

type_list.append(type_dict[id])

type_list = sorted(type_list)

atom_type = "".join(type_list)

type_conter[atom_type] +=1

out_df = pd.DataFrame(type_conter, index = [1]) # in here put index as timestamp

out_df.to_excel(r"D:\atom.xlsx")

print("--- %s seconds ---" % (time.time() - start_time))

反对回复 2022-09-27

1 回答
0 关注
81 浏览

关注

添加回答

0/150

提交

取消

热搜

最近搜索清空

如何根据通过熊猫从LAMMPS输出文件导入的键合数据将原子细分为三组？

如何根据通过熊猫从LAMMPS输出文件导入的键合数据将原子细分为三组？

1 回答

添加回答