I am trying to compare one table against two other tables, each holding millions of records. I want to compare against them separately and create a separate column named "status" in the main table, so that I can update it myself.

Step 1: create a new column named "status" next to the email column in the main table.
Step 2: update that column when comparing against tables tab1 and tab2.

    list_of_tables=['tab1','tab2']
    for tab in list_of_tables:
        cursor.execute("select main.*,if({}.email is not null ,'MATCH','NONMATCH') stataus from main left join {} on main.email={}.email".format(tab,tab,tab))
        data_2 = cursor.fetchall()
        print data_2
        data3=list(data_2)
        data_3=pd.DataFrame(data3)
        upload(ftp,data_3,FILEPATH)

    def upload(ftp,data_3,FILEPATH):
        data_4=data_3.to_csv(Out_file,index=False,header=None)

main:

    email
    abc@gamil.com
    xyz@email.com
    ijk@gmail.com
    ghi@gmail.com
    pqr@gmail.com
    yup@gmail.com

tab1:

    email
    ijk@gmail.com
    yup@gmail.com

tab2:

    email
    xyz@email.com
    pqr@gmail.com

Required result:

    email            valid
    abc@gamil.com    non-match
    xyz@email.com    match
    ijk@gmail.com    match
    ghi@gmail.com    non-match
    pqr@gmail.com    match
    yup@gmail.com    match

But I am getting this instead:

    abc@gamil.com    non-match
    xyz@email.com    non-match
    ijk@gmail.com    match
    ghi@gmail.com    non-match
    pqr@gmail.com    non-match
    yup@gmail.com    match
    abc@gamil.com    nonmatch
    xyz@email.com    match
    ijk@gmail.com    nonmatch
    ghi@gmail.com    nonmatch
    pqr@gmail.com    nonmatch
    yup@gmail.com    nonmatch
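The loop in the question runs one query per lookup table, so each pass only knows about that one table and produces its own full set of rows. One possible way to get a single status per email is to check all lookup tables in one query. The following is only a minimal sketch under stated assumptions: it uses an in-memory `sqlite3` database as a stand-in for the real database, so it uses `CASE WHEN` rather than MySQL's `IF()`; the function name `match_status` is hypothetical; and the sample rows are the ones from the question.

```python
import sqlite3


def match_status(conn, tables):
    """Return (email, status) rows; status is 'match' if the email
    appears in at least one of the given lookup tables."""
    # Table names cannot be bound as query parameters, so they are
    # interpolated here. Only pass trusted, hard-coded names.
    exists_checks = " OR ".join(
        "EXISTS (SELECT 1 FROM {t} WHERE {t}.email = main.email)".format(t=t)
        for t in tables
    )
    query = (
        "SELECT main.email, "
        "CASE WHEN {checks} THEN 'match' ELSE 'non-match' END AS status "
        "FROM main ORDER BY main.rowid"  # rowid ordering is SQLite-specific
    ).format(checks=exists_checks)
    return conn.execute(query).fetchall()


# Sample data from the question, loaded into an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE main (email TEXT);
    CREATE TABLE tab1 (email TEXT);
    CREATE TABLE tab2 (email TEXT);
    """
)
sample = {
    "main": ["abc@gamil.com", "xyz@email.com", "ijk@gmail.com",
             "ghi@gmail.com", "pqr@gmail.com", "yup@gmail.com"],
    "tab1": ["ijk@gmail.com", "yup@gmail.com"],
    "tab2": ["xyz@email.com", "pqr@gmail.com"],
}
for table, emails in sample.items():
    conn.executemany(
        "INSERT INTO {} VALUES (?)".format(table), [(e,) for e in emails]
    )

rows = match_status(conn, ["tab1", "tab2"])
for email, status in rows:
    print(email, status)
```

Because the check uses `EXISTS` rather than a join, a duplicate email in `tab1` or `tab2` does not duplicate rows from `main`. With millions of records, an index on the `email` column of each lookup table would likely matter, though that depends on the actual database.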