
Keras: How to display attention weights in an LSTM model

Python

蓝山帝景 2021-06-06 12:34:37

I built a text classification model using an LSTM with an attention layer. The model trains and performs well, but I cannot work out how to display the attention weights, i.e. the importance assigned to each word of a review (the input text). The code used for the model is:

from keras import backend as K
from keras import initializers, regularizers, constraints
from keras.layers import Layer

def dot_product(x, kernel):
    # The TensorFlow backend needs the kernel expanded to 2D before K.dot
    if K.backend() == 'tensorflow':
        return K.squeeze(K.dot(x, K.expand_dims(kernel)), axis=-1)
    else:
        return K.dot(x, kernel)

class AttentionWithContext(Layer):
    """
    Attention operation, with a context/query vector, for temporal data.
    "Hierarchical Attention Networks for Document Classification"
    by using a context vector to assist the attention
    # Input shape
        3D tensor with shape: (samples, steps, features).
    # Output shape
        2D tensor with shape: (samples, features).
    How to use:
    Just put it on top of an RNN Layer (GRU/LSTM/SimpleRNN) with return_sequences=True.
    The dimensions are inferred based on the output shape of the RNN.
    Note: The layer has been tested with Keras 2.0.6
    Example:
        model.add(LSTM(64, return_sequences=True))
        model.add(AttentionWithContext())
        # next add a Dense layer (for classification/regression) or whatever
    """

    def __init__(self,
                 W_regularizer=None, u_regularizer=None, b_regularizer=None,
                 W_constraint=None, u_constraint=None, b_constraint=None,
                 bias=True, **kwargs):
        self.supports_masking = True
        self.init = initializers.get('glorot_uniform')
        self.W_regularizer = regularizers.get(W_regularizer)
        self.u_regularizer = regularizers.get(u_regularizer)
        self.b_regularizer = regularizers.get(b_regularizer)
        self.W_constraint = constraints.get(W_constraint)
        self.u_constraint = constraints.get(u_constraint)
        self.b_constraint = constraints.get(b_constraint)
        self.bias = bias
        super(AttentionWithContext, self).__init__(**kwargs)


3 Answers

慕慕森

Contributed 1856 experience points, earned over 17 likes

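After reading the answers above, I finally understood how to extract the attention layer's weights. Overall, the ideas from @李翔 and @Okorimi Manoury are both correct. The code snippet from @Okorimi Manoury comes from the following link: Textual attention visualization.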

Now let me explain the process step by step:

(1). You need a trained model: load it and extract the weights of the attention layer. To find the right layer, inspect the model architecture with model.summary(). Then you can use:

layer_weights = model.layers[3].get_weights()  # assuming the attention layer is model.layers[3] (indices start at 0)

layer_weights is a list. For example, for the word-level attention of HAN it has three elements: W, b and u. In other words, layer_weights[0] = W, layer_weights[1] = b and layer_weights[2] = u.
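For reference, these three parameters enter the word-level attention of the Hierarchical Attention Networks paper roughly as written below. The symbols h_t (the output of the previous layer at step t), alpha_t (the attention weight of step t) and s (the attended summary vector) are notation chosen here for illustration; the first two lines correspond to the Eq. (5) and Eq. (6) cited in the code comments of this answer.

```latex
\begin{aligned}
u_{t} &= \tanh\left(W h_{t} + b\right) \\
\alpha_{t} &= \frac{\exp\left(u_{t}^{\top} u\right)}{\sum_{t'} \exp\left(u_{t'}^{\top} u\right)} \\
s &= \sum_{t} \alpha_{t} h_{t}
\end{aligned}
```

The quantity to visualize is alpha_t: one non-negative value per input step, summing to 1 over the steps of a sample.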

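(2). You also need the output of the layer just before the attention layer. In this example that is model.layers[2]. You can obtain it with the following code: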

from keras.models import Model
new_model = Model(inputs=model.input, outputs=model.layers[2].output)  # sub-model ending just before the attention layer
output_before_att = new_model.predict(x_test_sample)  # extract layer output

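(3). In detail: suppose the input is a passage of 100 words, each a 300-dimensional vector (input shape [100, 300]), and the layer before attention outputs 128 features. Then output_before_att holds a [100, 128] matrix per sample (predict adds a leading batch dimension, so a single sample gives [1, 100, 128]). Correspondingly, layer_weights[0] (W) is [128, 128], while layer_weights[1] (b) and layer_weights[2] (u) each hold 128 values. Then we need the following code: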

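import numpy as np
eij = np.tanh(np.dot(output_before_att, layer_weights[0]) + layer_weights[1])  # Eq. (5) in the paper
eij = np.dot(eij, layer_weights[2])  # Eq. (6): score each step against the context vector u
eij = eij.reshape((eij.shape[0], eij.shape[1]))  # drop any trailing singleton axis -> (samples, steps)
ai = np.exp(eij)  # Eq. (6): numerator
weights = ai / np.sum(ai, axis=1, keepdims=True)  # Eq. (6): normalize within each sample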

Here weights holds 100 values, one attention weight (importance) for each of the 100 input words. After that, you can visualize the attention weights.
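The three steps can be collected into one function and followed by a simple text rendering of the result. This is only a minimal sketch under stated assumptions: attention_weights and show_weights are hypothetical helper names, the arrays are random stand-ins for the real layer output and trained parameters, and the example sentence is made up. It normalizes each sample separately and subtracts the maximum score before exponentiating to avoid overflow. It does not handle masking, so padded positions would still receive some weight.

```python
import numpy as np


def attention_weights(h, W, b, u):
    """Recompute per-step attention weights outside the model.

    h: (samples, steps, features) output of the layer before attention
    W: (features, features); b and u: `features` values each
    """
    uit = np.tanh(np.dot(h, W) + b)                       # (samples, steps, features)
    scores = np.dot(uit, np.reshape(u, -1))               # (samples, steps)
    scores = scores - scores.max(axis=1, keepdims=True)   # keep exp() from overflowing
    a = np.exp(scores)
    return a / a.sum(axis=1, keepdims=True)               # softmax over the steps


def show_weights(words, weights, width=20):
    """Return one text line per word with a bar proportional to its weight."""
    top = max(weights)
    lines = []
    for word, w in zip(words, weights):
        bar = "#" * int(round(width * w / top))
        lines.append(f"{word:<12} {w:.3f} {bar}")
    return lines


# Random stand-ins for output_before_att and layer_weights (illustration only).
rng = np.random.default_rng(0)
steps, features = 5, 8
h = rng.normal(size=(1, steps, features))
W = rng.normal(size=(features, features))
b = rng.normal(size=features)
u = rng.normal(size=features)

weights = attention_weights(h, W, b, u)
words = ["this", "movie", "was", "really", "great"]
for line in show_weights(words, weights[0]):
    print(line)
```

With a real model, h would be output_before_att and W, b, u would be the three entries of layer_weights; the words would come from mapping the input token ids back to the vocabulary.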

I hope this explanation helps.

2021-06-09

拉风的咖菲猫

Contributed 1995 experience points, earned over 2 likes

You can use the get_weights() method of the custom layer to get a list of all its weights. You can find more information here.

You need to make the following change to the code where the model is created:

model1.add(TimeDistributed(Dense(200)))

atn = AttentionWithContext()

model1.add(atn)

Then, after training, simply use:

atn.get_weights()[index]

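to extract the weight matrix W as a NumPy array (I think index should be set to 0, but you will have to check this yourself). You can then use pyplot's imshow/matshow methods to display the matrix.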

2021-06-09
