I am trying to get the 15 most relevant items for each user, but every function I have tried takes a very long time. (I killed it after more than 6 hours...) I have 418 unique users and 3718 unique items. The U2tfifd dict also has 418 entries, and tfidf_feature_names contains 32645 words. The shape of my interactions_full_df is (40733, 3).

I tried:

    def index_tfidf_users(user_id):
        return [users for users in U2tfifd[user_id].flatten().tolist()]

    def get_relevant_items(user_id):
        return sorted(zip(tfidf_feature_names, index_tfidf_users(user_id)), key=lambda x: -x[1])[:15]

    def get_tfidf_token(user_id):
        return [words for words, values in get_relevant_items(user_id)]

Then:

    interactions_full_df["tags"] = interactions_full_df["user_id"].apply(lambda x: get_tfidf_token(x))

Or:

    def get_tfidf_token(user_id):
        tags = []
        v = sorted(zip(tfidf_feature_names, U2tfifd[user_id].flatten().tolist()), key=lambda x: -x[1])[:15]
        for words, values in v:
            tags.append(words)
        return tags

Or:

    def get_tfidf_token(user_id):
        v = sorted(zip(tfidf_feature_names, U2tfifd[user_id].flatten().tolist()), key=lambda x: -x[1])[:15]
        # unpack the (word, score) pairs so that only the words are returned
        tags = [words for words, values in v]
        return tags

U2tfifd is a dict with key = user_id and value = array.
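One likely cause of the slowness is that `apply` re-sorts all 32645 words for each of the 40733 rows, even though there are only 418 distinct users. A minimal sketch of one way around that is to compute the tags once per user and then look them up per row. It assumes the values in `U2tfifd` are dense NumPy arrays (or anything `np.asarray` can convert); the helper name `top_k_tokens` and the small synthetic data are made up for illustration and are not part of the original code.

```python
import numpy as np
import pandas as pd

def top_k_tokens(weights, feature_names, k=15):
    """Return the k feature names with the largest weights, highest first."""
    w = np.asarray(weights).ravel()
    k = min(k, w.size)
    # argpartition selects the k largest entries without sorting everything
    top = np.argpartition(w, -k)[-k:]
    # sort only those k entries by descending weight
    top = top[np.argsort(-w[top])]
    return [feature_names[i] for i in top]

# Small synthetic stand-ins for the real data (hypothetical values)
rng = np.random.default_rng(0)
tfidf_feature_names = [f"word{i}" for i in range(1000)]
U2tfifd = {uid: rng.random((1, 1000)) for uid in range(5)}
interactions_full_df = pd.DataFrame({"user_id": rng.integers(0, 5, size=50)})

# Compute the tags once per unique user instead of once per row
user_tags = {
    uid: top_k_tokens(vec, tfidf_feature_names)
    for uid, vec in U2tfifd.items()
}

# Look the precomputed list up for every row
interactions_full_df["tags"] = interactions_full_df["user_id"].map(user_tags)
```

With the real data this would sort 418 times rather than 40733 times. Whether that alone explains a run of more than 6 hours is not certain, so it may be worth timing `top_k_tokens` on a single user first.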