
How do I extract a feature vector from a single image in PyTorch?

Python

ibeautiful 2023-06-20 13:23:41

I'm trying to learn more about computer vision models and explore how they work. To better understand how to interpret feature vectors, I'm trying to use PyTorch to extract one. Below is the code I've pieced together from various places.

import torch
import torch.nn as nn
import torchvision.models as models
import torchvision.transforms as transforms
from torch.autograd import Variable
from PIL import Image

img=Image.open("Documents/01235.png")

# Load the pretrained model
model = models.resnet18(pretrained=True)

# Use the model object to select the desired layer
layer = model._modules.get('avgpool')

# Set model to evaluation mode
model.eval()

transforms = torchvision.transforms.Compose([
    torchvision.transforms.Resize(256),
    torchvision.transforms.CenterCrop(224),
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def get_vector(image_name):
    # Load the image with Pillow library
    img = Image.open("Documents/Documents/Driven Data Competitions/Hateful Memes Identification/data/01235.png")
    # Create a PyTorch Variable with the transformed image
    t_img = transforms(img)
    # Create a vector of zeros that will hold our feature vector
    # The 'avgpool' layer has an output size of 512
    my_embedding = torch.zeros(512)
    # Define a function that will copy the output of a layer
    def copy_data(m, i, o):
        my_embedding.copy_(o.data)
    # Attach that function to our selected layer
    h = layer.register_forward_hook(copy_data)
    # Run the model on our transformed image
    model(t_img)
    # Detach our copy function from the layer
    h.remove()
    # Return the feature vector
    return my_embedding

pic_vector = get_vector(img)

When I do this, I get the following error:

RuntimeError: Expected 4-dimensional input for 4-dimensional weight [64, 3, 7, 7], but got 3-dimensional input of size [3, 224, 224] instead

I'm sure this is a basic mistake, but I can't figure out how to fix it. I was under the impression that the "ToTensor" transform would make my data 4-d, but either it isn't working as expected or I'm misunderstanding it. I'd appreciate any help, or any resources I can use to learn more about this!


3 Answers

萧十郎

Contributed 1815 experience points, received over 13 upvotes

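All the default nn.Modules in PyTorch expect an additional batch dimension. If the input to a module has shape (B, ...), then the output will be (B, ...) as well (though the later dimensions may change depending on the layer). This behavior allows efficient inference on a batch of B inputs at once. To make your code conform, you can unsqueeze an extra dimension of size one onto the front of the t_img tensor before sending it into your model, which turns it into a (1, ...) tensor. If you want to copy the result into your one-dimensional my_embedding tensor, you will also need to flatten the output of layer before storing it.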
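A minimal sketch of the two shape fixes described above. It uses a small stand-in network rather than ResNet18 so that it runs without downloading weights; the layer sizes and the name `tiny` are illustrative assumptions, not part of the original code.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a CNN: one conv layer plus global average pooling.
tiny = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=7, stride=2, padding=3),
    nn.AdaptiveAvgPool2d((1, 1)),
)
tiny.eval()

t_img = torch.rand(3, 224, 224)   # same layout ToTensor() gives: (C, H, W), no batch axis

batched = t_img.unsqueeze(0)      # add a leading batch axis -> (1, C, H, W)
with torch.no_grad():
    out = tiny(batched)           # pooled output keeps the batch axis: (1, 8, 1, 1)

embedding = torch.zeros(8)
embedding.copy_(out.flatten())    # flatten to (8,) so it fits the 1-D buffer

print(batched.shape, out.shape, embedding.shape)
```

The same pattern applies to ResNet18, where the pooled output is (1, 512, 1, 1) and the buffer holds 512 values.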

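A couple of other things:

You should run inference inside a torch.no_grad() context to avoid computing gradients, since you won't need them. (Note that model.eval() only changes the behavior of certain layers such as dropout and batch normalization. It does not disable construction of the computation graph, but torch.no_grad() does.)
I assume this is just a copy-paste issue, but transforms is the name of an imported module as well as of a global variable.
o.data just returns a tensor that shares o's data but is detached from the autograd graph. In the old Variable interface (around PyTorch 0.3.1 and earlier) this used to be necessary, but the Variable interface was deprecated back in PyTorch 0.4.0 and .data no longer does anything useful here; using it now only causes confusion. Unfortunately, many tutorials are still written using this old and unnecessary interface.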
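The difference between model.eval() and torch.no_grad() noted above can be checked directly. A short sketch, assuming a throwaway linear layer in place of a real model (the name `net` is illustrative):

```python
import torch
import torch.nn as nn

net = nn.Linear(4, 2)   # hypothetical stand-in for a real model
net.eval()              # only switches layer behaviour (dropout, batch norm)
x = torch.rand(1, 4)

out_eval = net(x)       # autograd still records this forward pass
with torch.no_grad():
    out_no_grad = net(x)  # no graph is built inside this context

print(out_eval.requires_grad)     # True
print(out_no_grad.requires_grad)  # False

# .detach() is the current way to get a graph-free tensor;
# it shares storage with the original rather than copying it.
detached = out_eval.detach()
print(detached.requires_grad)     # False
```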

The updated code is as follows:

import torch
import torchvision
import torchvision.models as models
from PIL import Image

img = Image.open("Documents/01235.png")

# Load the pretrained model
model = models.resnet18(pretrained=True)

# Use the model object to select the desired layer
layer = model._modules.get('avgpool')

# Set model to evaluation mode
model.eval()

transforms = torchvision.transforms.Compose([
    torchvision.transforms.Resize(256),
    torchvision.transforms.CenterCrop(224),
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def get_vector(image):
    # Create a PyTorch tensor with the transformed image
    t_img = transforms(image)
    # Create a vector of zeros that will hold our feature vector
    # The 'avgpool' layer has an output size of 512
    my_embedding = torch.zeros(512)

    # Define a function that will copy the output of a layer
    def copy_data(m, i, o):
        my_embedding.copy_(o.flatten())                 # <-- flatten

    # Attach that function to our selected layer
    h = layer.register_forward_hook(copy_data)
    # Run the model on our transformed image
    with torch.no_grad():                               # <-- no_grad context
        model(t_img.unsqueeze(0))                       # <-- unsqueeze
    # Detach our copy function from the layer
    h.remove()
    # Return the feature vector
    return my_embedding

pic_vector = get_vector(img)
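The preallocated 512-element buffer above fits only one image at a time. A possible variant lets the hook store whatever the layer produces, so the same code handles a batch of any size. This is a sketch under the assumption that a small stand-in network replaces ResNet18; the names `net`, `captured`, and `save_output` are illustrative.

```python
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d((1, 1)),   # plays the role of ResNet's 'avgpool'
    nn.Flatten(),
    nn.Linear(16, 10),
)
net.eval()

captured = {}

def save_output(module, inputs, output):
    # Keep one row per image: (B, 16, 1, 1) -> (B, 16)
    captured["features"] = output.flatten(start_dim=1)

handle = net[2].register_forward_hook(save_output)
with torch.no_grad():
    net(torch.rand(4, 3, 32, 32))   # batch of 4 random "images"
handle.remove()                     # detach the hook once the features are captured

print(captured["features"].shape)   # torch.Size([4, 16])
```

With ResNet18 the hook would be registered on model.avgpool instead of net[2], giving one 512-element row per image.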

Answered 2023-06-20

qq_笑_17

Contributed 1818 experience points, received over 7 upvotes

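You can use create_feature_extractor from torchvision.models.feature_extraction to extract the features of the layers you want from the model.

The node name of the last hidden layer in ResNet18 is flatten, which is basically the avgpool output flattened to 1D. You can extract any layers you want by adding them to the return_nodes dictionary below.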

from torchvision.io import read_image
from torchvision.models import resnet18, ResNet18_Weights
from torchvision.models.feature_extraction import create_feature_extractor

# Step 1: Initialize the model with the best available weights
weights = ResNet18_Weights.DEFAULT
model = resnet18(weights=weights)
model.eval()

# Step 2: Initialize the inference transforms
preprocess = weights.transforms()

# Step 3: Create the feature extractor with the required nodes
return_nodes = {'flatten': 'flatten'}
feature_extractor = create_feature_extractor(model, return_nodes=return_nodes)

# Step 4: Load the image(s) and apply inference preprocessing transforms
image = "?"  # placeholder: replace with the path to your image file
image = read_image(image).unsqueeze(0)
model_input = preprocess(image)

# Step 5: Extract the features
features = feature_extractor(model_input)
flatten_fts = features["flatten"].squeeze()
print(flatten_fts.shape)

Answered 2023-06-20

潇潇雨雨

Contributed 1833 experience points, received over 4 upvotes

Instead of

model(t_img)

call

model(t_img[None])

This adds an extra leading dimension, so the image will have shape [1, 3, 224, 224] and the model will accept it.
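Indexing with None is shorthand for unsqueeze(0). A quick check, using a random tensor as a stand-in for the transformed image:

```python
import torch

t_img = torch.rand(3, 224, 224)   # stand-in for a transformed image

a = t_img[None]                   # index with None to insert a leading axis
b = t_img.unsqueeze(0)            # the explicit equivalent

print(a.shape)                    # torch.Size([1, 3, 224, 224])
print(torch.equal(a, b))          # True
```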

Answered 2023-06-20
