3 Answers
Contributor with 1796 experience points, 10+ upvotes
Just use Python's json library:
import json
j1 = """{
"ajax": {
"params": {
"asinMetadataKeys": "adId",
"featureId": "SimilaritiesCarousel",
"reftagPrefix": "pd_sbs_60",
"widgetTemplateClass": "PI::Similarities::ViewTemplates::Carousel::Desktop",
"imageHeight": 160,
"faceoutTemplateClass": "PI::P13N::ViewTemplates::Product::Desktop::CarouselFaceout",
"auiDeviceType": "desktop",
"imageWidth": 160,
"schemaVersion": 2,
"productDetailsTemplateClass": "PI::P13N::ViewTemplates::ProductDetails::Desktop::Base",
"forceFreshWin": 0,
"productDataFlavor": "Faceout",
"relatedRequestID": "H21WNBAW5EGZX90ND4PN",
"maxLineCount": 6
},
"id_list": ["B01M8QSY16:", "B017XBDBI6:", "B01GL5MYCE:", "B0751DHYXC:", "B01AHWOH54:", "B01M7XYENW:", "B01N7FKKXV:", "B07C1NLKS5:", "B00R25QZDC:", "B01AJB1VFW:", "B079K773M7:", "B07DX3W41P:", "B01GL5606A:", "B07654YLSB:", "B01GFL6MZE:", "B00WLI5E3M:", "B01CTE28DG:", "B01BELELVC:", "B00ZY7H91M:", "B077TPG2WK:", "B01G503MC6:", "B01LYZFC4V:", "B00ID9UQYK:", "B07C3T52LB:", "B07DX39RNS:", "B076551MZP:", "B0761RWKPQ:", "B00T8FD9YM:", "B07653JBYS:", "B07G316H74:", "B01FSEBC9K:", "B014QKBVH0:", "B01BVA2I4S:", "B01CVOZNAE:", "B07D19JDH9:", "B018ACDMJK:", "B00V0H83YW:", "B07C432PK3:", "B07B9P4T4V:", "B076H4WWLK:", "B077G3Y86F:", "B077Z7XLJF:", "B01NCFB2BB:", "B01M4I7FMC:", "B01BEVFJCM:", "B01FSEBC8G:", "B07DXCTKB6:", "B01NBHYAR0:", "B07DGWJ887:", "B00SLP58SU:", "B01N55H5AE:", "B013AZCPLS:", "B076PC3NYV:", "B01BVA2JHE:", "B07FF38J8C:", "B07DHGTS81:", "B00R25QZHS:"],
"url": "/gp/p13n-shared/faceout-partial",
"id_param_name": "asins"
},
"baseAsin": "B01GL56060",
"name": "desktop-dp-sims_session-similarities",
"set_size": 57
}"""
d1 = json.loads(j1)
id_list = [elem.replace(":", "") for elem in d1["ajax"]['id_list']]
id_list
Output:
['B01M8QSY16',
'B017XBDBI6',
...
'B00R25QZHS']
I had to remove the "linkGetParameters : ..." line because it does not appear to be valid JSON.
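If it is unclear which line breaks the parse, json.loads reports the position itself. The following is a minimal sketch, assuming a made-up malformed input (the string `bad` and its unquoted key are hypothetical stand-ins, not the asker's actual data):

```python
import json

# Hypothetical malformed input: the key on line 3 is not quoted,
# so the text is not valid JSON.
bad = '{\n"name": "x",\nlinkGetParameters : 1\n}'

try:
    json.loads(bad)
    error_line = None
except json.JSONDecodeError as exc:
    # lineno, colno and msg are attributes of json.JSONDecodeError.
    error_line = exc.lineno
    print(f"Invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}")
```

Once the offending line is known, it can be removed or repaired before calling json.loads again.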
Contributor with 1829 experience points, 7+ upvotes
Since you can't use the JSON library, you can try this expression instead (tested on Python 3):
import re
result = [item.strip('":') for item in re.search(r'"id_list": \[(.*)\],', jsonstr).group(1).split(", ")]
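(where jsonstr is the string containing all of the original JSON text).
To make it easier to follow: the code above uses re.search (rather than re.findall, as you suggested) to locate and select the whole id_list line, group to narrow the selection to the contents of the brackets, split to turn that string into a list, and strip to trim the unneeded characters from each list item, leaving you with a list of IDs like the one you specified in your question.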
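The same one-liner can be unrolled into separate steps, which makes each call easier to inspect. This is only a sketch: the `jsonstr` below is a shortened, hypothetical stand-in for the full JSON text, and the intermediate names (`match`, `inner`, `items`) are introduced here for illustration.

```python
import re

# Hypothetical, shortened stand-in for the original JSON text.
jsonstr = '''{
"ajax": {
"id_list": ["B01M8QSY16:", "B017XBDBI6:", "B00R25QZHS:"],
"url": "/gp/p13n-shared/faceout-partial"
}
}'''

# 1. re.search finds the line that holds the "id_list" array.
match = re.search(r'"id_list": \[(.*)\],', jsonstr)
if match is None:
    raise ValueError("no id_list array found")

# 2. group(1) keeps only the text between the square brackets.
inner = match.group(1)

# 3. split(", ") breaks that text into the individual quoted items.
items = inner.split(", ")

# 4. strip('":') trims the quotes and the trailing colon from each item.
result = [item.strip('":') for item in items]
print(result)
```

Note that this relies on the array sitting on a single line and on items being separated by exactly ", ", as in the question's data; a differently formatted input would need a different pattern.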