For a single cell:
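import re
import numpy as np
# Strip currency symbols and separators: "$", "-", "," and "|".
# Note: removing "-" also drops the sign of negative amounts.
f_clean = lambda x: re.sub(r'[$\-,|]', '', x)
# Convert to float; an empty string becomes NaN.
f_float = lambda x: float(x) if x != '' else np.nan
print(f_float(f_clean('$10,000.00')))
# 10000.0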
For an entire column:
# Here's a solution I use for large datasets:
import re
import numpy as np

def lookup_money(s):
    """
    Parse a pandas Series of currency strings to floats via a lookup dict.
    In large datasets the same currency strings are often repeated. Rather
    than re-parse each occurrence, we parse every unique value once and
    map the results back onto the full Series.
    (Should be roughly 10x faster than parsing without the lookup dict,
    depending on how often values repeat.)
    """
    f_clean = lambda x: re.sub(r'[$\-,|]', '', x)
    f_float = lambda x: float(x) if x != '' else np.nan
    # Build the lookup from each unique string to its parsed float.
    currencies = {curr: f_float(f_clean(curr)) for curr in s.unique()}
    return s.map(currencies)
# Apply to a DataFrame column like this (the function takes the whole Series,
# so call it directly rather than through .apply):
# df['curr'] = lookup_money(df['curr'])
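
To illustrate the lookup idea independently of pandas, here is a minimal pure-Python sketch. The names `parse_money` and `lookup_money_list` are hypothetical and not part of the answer above, and the sketch assumes every input is a plain string:

```python
import math
import re

def parse_money(text):
    """Strip '$', '-', ',' and '|', then convert to float; empty -> NaN."""
    cleaned = re.sub(r'[$\-,|]', '', text)
    return float(cleaned) if cleaned != '' else math.nan

def lookup_money_list(values):
    """Parse each distinct string once, then map results back in order."""
    lookup = {v: parse_money(v) for v in set(values)}
    return [lookup[v] for v in values]

prices = ['$10,000.00', '$5.50', '$10,000.00', '-', '$5.50']
parsed = lookup_money_list(prices)
print(parsed)
# [10000.0, 5.5, 10000.0, nan, 5.5]
```

Here only three distinct strings are parsed for five inputs, so the saving grows with the amount of repetition in the data.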