The following dataset represents purchase behavior:

user_id, product_code, bought_date, time_spent, store_id, product_type, refurbished, unique_visit_id
001, e.12, 20120102, 104, 101, computer, yes, 1010
002, e.24, 20120201, 100, 101, infant-dress, no, 2001
003, s.32, 20130302, 230, 101, shoes, no, 2121
004, y.23, 20130404, 212, 103, computer, yes, 2422
005, s.43, 20130803, 104, 101, laptop, yes, 2342
001, a.12, 20120102, 104, 101, computer, yes, 1011
002, b.24, 20120201, 100, 101, infant-dress, no, 2001
003, c.32, 20130302, 230, 101, shoes, no, 2122
004, e.23, 20130404, 212, 103, computer, yes, 2424
005, f.43, 20130803, 104, 101, laptop, yes, 2340
001, g.12, 20120102, 104, 101, computer, yes, 1013
002, h.24, 20120201, 100, 101, infant-dress, no, 2031
003, l.32, 20130302, 230, 101, shoes, no, 2000
004, m.23, 20130404, 212, 103, computer, yes, 1422
005, d.43, 20130803, 104, 101, laptop, yes, 1142
001, d.12, 20120102, 104, 101, desk, yes, 1110
002, f.24, 20120201, 100, 101, glass, no, 1111
003, n.32, 20130302, 230, 101, liquid, no, 2021
004, t.23, 20130404, 212, 103, liquid, yes, 22
005, u.43, 20130803, 104, 101, dress, yes, 2942
001, d.12, 20120102, 104, 101, desk, yes, 1910
002, f.24, 20120201, 100, 101, glass, no, 2901
003, n.32, 20130302, 230, 101, liquid, no, 2921
004, t.23, 20130404, 212, 103, liquid, yes, 2922
005, u.43, 20130803, 104, 101, dress, yes, 2942
001, kk.12, 20120103, 105, 101, desk, yes, 410
003, n.32, 20130303, 230, 101, liquid, no, 2621

The end goal is to assign a product type to each user with the following steps. First, I group by user_id and product_type and get the number of visits (count) the user made to each product_type. If the counts are equal within a group (user_id, product_id), the product type the user visited most recently is chosen and assigned to the user. If the visit dates are also equal, we break the tie by looking at the refurbished value (yes > no).

    visit_counts = merged_visits_df.groupby(['user_id','product_type'], as_index=False).agg({'unique_visits_id': 'nunique'})

The line above gives the visit counts; I am trying to work out the rest of the process.
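One possible way to finish the remaining steps is sketched below. It is not a definitive implementation. It assumes the column names used in the groupby snippet (user_id, product_type, bought_date, refurbished, unique_visits_id), that bought_date is stored as YYYYMMDD so that larger values are more recent, and that the product type with the highest visit count wins before any tie-break applies. The function name assign_product_type and the sample rows are made up for illustration.

```python
import pandas as pd


def assign_product_type(visits: pd.DataFrame) -> pd.DataFrame:
    """Return one row per user_id with the product_type chosen by the rules above."""
    ranked = visits.assign(
        # Distinct visits per (user_id, product_type), repeated on every row of the group.
        visit_count=visits.groupby(["user_id", "product_type"])["unique_visits_id"]
        .transform("nunique"),
        # Numeric rank for the final tie-break: yes (1) beats no (0).
        refurb_rank=visits["refurbished"].eq("yes").astype(int),
    )
    # Within each user: highest count first, then most recent date, then refurbished == yes.
    ranked = ranked.sort_values(
        ["user_id", "visit_count", "bought_date", "refurb_rank"],
        ascending=[True, False, False, False],
    )
    # The first row per user now carries the winning product_type.
    winners = ranked.drop_duplicates("user_id")
    return winners[["user_id", "product_type"]].reset_index(drop=True)


# Illustrative rows only (not taken from the dataset in the question).
sample = pd.DataFrame(
    {
        "user_id": ["001", "001", "001", "002", "002", "003", "003"],
        "product_type": ["computer", "computer", "shoes", "desk", "glass", "liquid", "dress"],
        "bought_date": [20120102, 20120105, 20130101, 20120102, 20120201, 20130404, 20130404],
        "refurbished": ["yes", "no", "no", "no", "no", "no", "yes"],
        "unique_visits_id": [1, 2, 3, 10, 11, 20, 21],
    }
)
result = assign_product_type(sample)
print(result)
```

In the sample, user 001 is decided by count alone, user 002 by the most recent date, and user 003 by the refurbished flag. Sorting the raw rows instead of the aggregated counts keeps the date and refurbished value of each individual visit available for the tie-breaks.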