
Tensorflow throws a distributed_function error


MM们 2022-07-19 15:18:55
I'm new to ML and TensorFlow and am trying to train and use a standard text-generation model. When I go to train the model, I get this error:

Train for 155 steps
Epoch 1/5
  2/155 [..............................] - ETA: 4:49 - loss: 2.5786
---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
<ipython-input-133-d70c02ff4270> in <module>()
----> 1 model.fit(dataset, epochs=epochs, callbacks=[checkpoint_callback])

11 frames
/usr/local/lib/python3.6/dist-packages/six.py in raise_from(value, from_value)

InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument:  indices[58,87] = 63 is not in [0, 63)
     [[node sequential_12/embedding_12/embedding_lookup (defined at <ipython-input-131-d70c02ff4270>:1) ]]
     [[VariableShape/_24]]
  (1) Invalid argument:  indices[58,87] = 63 is not in [0, 63)
     [[node sequential_12/embedding_12/embedding_lookup (defined at <ipython-input-131-d70c02ff4270>:1) ]]
0 successful operations.
0 derived errors ignored. [Op:__inference_distributed_function_95797]

Errors may have originated from an input operation.
Input Source operations connected to node sequential_12/embedding_12/embedding_lookup:
 sequential_12/embedding_12/embedding_lookup/92192 (defined at /usr/lib/python3.6/contextlib.py:81)

Function call stack:
distributed_function -> distributed_function

The data:

data['title'] = [['Sentence'], ['Sentence2'], ...]

Data preparation:

tokenizer = keras.preprocessing.text.Tokenizer(num_words=209, lower=False, char_level=True)
tokenizer.fit_on_texts(df['title'])
df['encoded_with_keras'] = tokenizer.texts_to_sequences(df['title'])

dataset = df['encoded_with_keras']
dataset = tf.keras.preprocessing.sequence.pad_sequences(dataset, padding='post')
dataset = dataset.flatten()
dataset = tf.data.Dataset.from_tensor_slices(dataset)
sequences = dataset.batch(seq_len+1, drop_remainder=True)

4 Answers

萧十郎


Try changing batch_size to something like 32, 16, or 8. Apparently there is a TensorFlow bug on RTX 2060/70/80 cards that makes them run out of memory.
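With the tf.data pipeline from the question, the training batch size is whatever you pass to the final .batch() call, so reducing it could look roughly like this (a sketch; split_input_target is a hypothetical helper for a text-generation setup, and 16 is just an arbitrary smaller value):

# split each (seq_len+1)-chunk into an (input, target) pair
def split_input_target(chunk):
    return chunk[:-1], chunk[1:]

# rebuild the training set with a smaller batch size
train_data = sequences.map(split_input_target)
train_data = train_data.shuffle(10000).batch(16, drop_remainder=True)
model.fit(train_data, epochs=5, callbacks=[checkpoint_callback])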



慕容708150


In a similar situation, the following snippet helped:


import tensorflow as tf

# allocate GPU memory on demand instead of reserving it all up front
physical_devices = tf.config.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(physical_devices[0], enable=True)
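Memory growth has to be configured before the GPUs are initialized, and if there are several devices it is worth enabling it on all of them. A slightly more defensive variant (a sketch, not part of the original answer):

import tensorflow as tf

# enable on-demand memory allocation for every detected GPU
for gpu in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)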


慕勒3428872


I solved this problem by adding validation_data to the fit() call. Note that validation_data expects an (inputs, targets) tuple:

model.fit(X, y, validation_data=(X_val, y_val))


狐的传说


I think most of the answers here miss the crux of the problem. The TensorFlow model is trying to perform an embedding lookup for an index that does not exist in the defined Embedding layer. Most answers point to VRAM issues, but this message very likely stems from a simple lookup problem.
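In the question above, indices[58,87] = 63 is not in [0, 63) means the Embedding layer was built with input_dim=63 while the tokenized data contains the index 63. Sizing the layer from the tokenizer's vocabulary avoids the out-of-range lookup (a minimal sketch, assuming the tokenizer from the question; output_dim=64 is arbitrary):

import tensorflow as tf

# index 0 is reserved by the Tokenizer for padding, so the lookup
# table needs len(word_index) + 1 rows
vocab_size = len(tokenizer.word_index) + 1
embedding_layer = tf.keras.layers.Embedding(input_dim=vocab_size, output_dim=64)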


To fix this, you can define your own dictionary to encode the labels and, for every unknown label, return 0 or -1, reserving one category for unknowns.


Some example code for solving this kind of problem (inspired by this post; it seems to work on test data):


A custom dictionary-handling class:


import pandas as pd
from typing import Dict

class EmbeddingMapping:
    """
    An instance of this class should be defined
    for each categorical variable you want to use.
    """
    def __init__(self, series: pd.Series) -> None:
        # get a list of unique values
        values = series.unique().tolist()

        # dictionary mapping; index 0 is reserved for unknown categories
        self.embedding_dict: Dict[str, int] = {value: int_value + 1 for int_value, value in enumerate(values)}
        self.num_values: int = len(values) + 1  # +1 for unknown categories

    def get_mapping(self, value: str) -> int:
        # return the mapped value if it was seen in training
        if value in self.embedding_dict:
            return self.embedding_dict[value]
        else:
            # unseen categories map to the reserved index 0
            return 0

Building the mappings:


# build mappings
res_dict_train: Dict[str, pd.Series] = {}
for var in categorical_features:
    embd_train = EmbeddingMapping(X_train_categorical[var])
    temp_series_train = X_train_categorical[var].apply(embd_train.get_mapping)
    res_dict_train[var] = temp_series_train

X_train_categorical = X_train_categorical.assign(**res_dict_train)
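To handle unseen categories at test time, apply the same train-time mappings to the test frame as well. A sketch (it assumes the fitted EmbeddingMapping objects are also kept in a hypothetical dict called mappings, since the loop above stores only the transformed series):

# apply the mappings fitted on the training data to the test data;
# categories never seen in training fall back to the reserved index 0
res_dict_test = {var: X_test_categorical[var].apply(mappings[var].get_mapping)
                 for var in categorical_features}
X_test_categorical = X_test_categorical.assign(**res_dict_test)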

A model combining categorical and numerical features:


import numpy as np
from tensorflow.keras import models
from tensorflow.keras.layers import Input, Embedding, Reshape, Dense, concatenate

# Keras
# Categorical vars
models_lst = []
inputs = []
for cat_feature in categorical_features:
    print('---------------------------------------')
    print(f'Info for categorical feature {cat_feature}')
    input_i = Input(shape=(1,), dtype='int32')
    inputs.append(input_i)
    num_categories = EmbeddingMapping(X_train_categorical[cat_feature]).num_values
    print(f"Number of categories: {num_categories}")
    embedding_size = min(np.ceil(num_categories / 2), 50)  # rule of thumb
    embedding_size = int(embedding_size)
    print(f'Embedding size: {embedding_size}')
    model_i = Embedding(input_dim=num_categories, output_dim=embedding_size, input_length=1, name=f'embedding_{cat_feature}')(input_i)
    model_i2 = Reshape(target_shape=(embedding_size,))(model_i)
    models_lst.append(model_i2)

# layer for numerical features
input_numerical = Input(shape=(len(numerical_features),), dtype='float32')
numerical_model = Reshape(target_shape=(len(numerical_features),))(input_numerical)
models_lst.append(numerical_model)
inputs.append(input_numerical)

concatenated = concatenate(models_lst, axis=-1)
mymodel = Dense(50, activation="relu")(concatenated)
mymodel2 = Dense(15, activation="relu")(mymodel)
mymodel3 = Dense(1, activation='sigmoid')(mymodel2)

final_model = models.Model(inputs, mymodel3)

final_model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['acc', 'binary_accuracy'])

# train_input_list: one array per categorical input, plus the numerical array
final_model.fit(x=train_input_list, y=y_train, validation_split=0.2, epochs=1, batch_size=128)

To explain the code: it creates one embedding layer per categorical feature, and whenever the embedding lookup would otherwise fail, the value is mapped to the reserved unknown index. If you have a custom data object such as a pandas DataFrame, you can separate your numerical and categorical features and apply the model this way, or use only the categorical part with the mapping above. Another approach is Scikit-Learn's OrdinalEncoder (its unknown-category handling was added as of scikit-learn 0.24), but I find this approach simpler and easier to maintain.
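For comparison, a minimal sketch of the OrdinalEncoder alternative; handle_unknown='use_encoded_value' is what maps unseen categories to a reserved value (shift the result by +1 if it feeds an Embedding layer, since lookup indices must be non-negative):

from sklearn.preprocessing import OrdinalEncoder

# encode categories as integers; unseen test-time categories become -1
enc = OrdinalEncoder(handle_unknown='use_encoded_value', unknown_value=-1)
X_train_encoded = enc.fit_transform(X_train_categorical[categorical_features])
X_test_encoded = enc.transform(X_test_categorical[categorical_features])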

