2 Answers
Contributor: 1868 experience points, more than 4 likes
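You can pack the dictionary's values into a single string, return that string, and then split it back into a dictionary in the caller.
It might look something like this:
return images + " " + labels + " " + new_value
Then, in your other function:
l = map_func(image, label).split(" ")
d['images'] = l[0]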
d[
...
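To make the join-and-split idea above concrete, here is a minimal plain-Python sketch. It is only an illustration under stated assumptions: the helper names `pack` and `unpack`, the tab separator, and the dictionary keys other than 'images' are hypothetical and do not come from the answer. The approach only works when every value is a string that does not contain the separator, so the sketch checks for that.

```python
from typing import Dict

# Assumption: a tab is less likely than a space to appear inside a value.
SEPARATOR = "\t"


def pack(images: str, labels: str, new_value: str) -> str:
    """Join the three string fields into one string (hypothetical helper)."""
    fields = (images, labels, new_value)
    if any(SEPARATOR in field for field in fields):
        # Splitting would produce the wrong number of fields otherwise.
        raise ValueError("a field contains the separator; cannot pack safely")
    return SEPARATOR.join(fields)


def unpack(packed: str) -> Dict[str, str]:
    """Split the packed string back into a dictionary (hypothetical keys)."""
    images, labels, new_value = packed.split(SEPARATOR)
    return {"images": images, "labels": labels, "new_value": new_value}


packed = pack("img_001.png", "cat", "0.5")
restored = unpack(packed)
print(restored)
```

Note that every value comes back as a string, so numeric fields have to be converted again after unpacking.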
Contributor: 1863 experience points, more than 2 likes
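I ran into this problem too (I wanted to preprocess text data with non-TF functions while keeping everything under the umbrella of TensorFlow's Dataset object). In fact, the double-map() workaround is not needed. When processing each example, just wrap the Python function in tf.py_function.
Here is the complete example code; it was also tested on Colab (the first two lines install the dependencies).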
!pip install tensorflow-gpu==2.0.0b1
!pip install tensorflow-datasets==1.0.2
from typing import Dict
import tensorflow as tf
import tensorflow_datasets as tfds
# Get a textual dataset using the 'tensorflow_datasets' library
dataset_builder = tfds.text.IMDBReviews()
dataset_builder.download_and_prepare()
# Do not randomly shuffle examples for demonstration purposes
ds = dataset_builder.as_dataset(shuffle_files=False)
training_ds = ds[tfds.Split.TRAIN]
print(training_ds)
# <_OptionsDataset shapes: {text: (), label: ()}, types: {text: tf.string,
# label: tf.int64}>
# Print the first training example
for example in training_ds.take(1):
    print(example['text'])
# tf.Tensor(b"As a lifelong fan of Dickens, I have ... realised.",
# shape=(), dtype=string)
# some global configuration or object which we want to access in the
# processing function
we_want_upper_case = True
def process_string(t: tf.Tensor) -> str:
    # This function must have been called as tf.py_function which means
    # it's always eagerly executed and we can access the .numpy() content
    string_content = t.numpy().decode('utf-8')
    # Now we can do what we want in Python, i.e. upper-case or lower-case
    # depending on the external parameter.
    # Note that 'we_want_upper_case' is a variable defined in the outer scope
    # of the function! We cannot pass non-Tensor objects as parameters here.
    if we_want_upper_case:
        return string_content.upper()
    else:
        return string_content.lower()
def process_example(example: Dict[str, tf.Tensor]) -> Dict[str, tf.Tensor]:
    # I'm using typing (Dict, etc.) just for clarity, it's not necessary
    result = {}
    # First, simply copy all the tensor values
    for key in example:
        result[key] = tf.identity(example[key])
    # Now let's process the 'text' Tensor.
    # Call the 'process_string' function as 'tf.py_function'. Make sure the
    # output type matches the 'Tout' parameter (string and tf.string).
    # The inputs must be in a list: here we pass the string-typed Tensor 'text'.
    result['text'] = tf.py_function(func=process_string,
                                    inp=[example['text']],
                                    Tout=tf.string)
    return result
# We can call the 'map' function which consumes and produces dictionaries
training_ds = training_ds.map(lambda x: process_example(x))
for example in training_ds.take(1):
    print(example['text'])
# tf.Tensor(b"AS A LIFELONG FAN OF DICKENS, I HAVE ... REALISED.",
# shape=(), dtype=string)
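The example reads 'we_want_upper_case' from the outer scope because non-Tensor objects cannot be passed as parameters to the wrapped function. One alternative to a module-level global is a closure that captures the setting. The sketch below is a plain-Python stand-in, not part of the answer: 'make_string_processor' is a hypothetical helper, and it takes raw bytes where the TensorFlow version would receive a Tensor and call t.numpy(), so that it runs without TensorFlow installed.

```python
from typing import Callable


def make_string_processor(upper_case: bool) -> Callable[[bytes], str]:
    """Build a processing function that captures `upper_case` in a closure."""

    def process_bytes(raw: bytes) -> str:
        # In the TensorFlow version, `raw` would be obtained via t.numpy().
        text = raw.decode("utf-8")
        return text.upper() if upper_case else text.lower()

    return process_bytes


# Two processors with different settings, and no shared global state.
to_upper = make_string_processor(upper_case=True)
to_lower = make_string_processor(upper_case=False)

upper_result = to_upper(b"As a lifelong fan of Dickens")
lower_result = to_lower(b"As a lifelong fan of Dickens")
print(upper_result)
print(lower_result)
```

In the Dataset pipeline, the returned function would presumably take the place of 'process_string' as the 'func' argument of tf.py_function, with its body adapted to read the Tensor's content.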