4 answers
Contributed 1826 experience points · received more than 6 likes
How about this?
new_dict = df.set_index('image').stack().groupby('image').apply(list).to_dict()
print(new_dict)
{'bookstore_video0_40.jpg': [763, 899, 806, 940, 'pedestrian',
                             1026, 754, 1075, 797, 'pedestrian',
                             868, 770, 927, 822, 'biker',
                             413, 1010, 433, 1040, 'pedestrian'],
 'bookstore_video0_80.jpg': [866, 278, 917, 328, 'pedestrian',
                             761, 825, 820, 865, 'biker']}
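The flat list above no longer shows where one box ends and the next begins. If it is more convenient to keep one sub-list per box, the following is a minimal sketch, assuming the same column names as the sample data used elsewhere in this thread; the names `cols` and `boxes_by_image` are mine and not part of the original answer.

```python
import pandas as pd

# Sample data with the same columns as in the question (shortened image names)
df = pd.DataFrame(dict(
    image='40.jpg 40.jpg 40.jpg 40.jpg 80.jpg 80.jpg'.split(),
    xmin=[763, 1026, 868, 413, 866, 761],
    ymin=[899, 754, 770, 1010, 278, 825],
    xmax=[806, 1075, 927, 433, 917, 820],
    ymax=[940, 797, 822, 1040, 328, 865],
    label='pedestrian pedestrian biker pedestrian pedestrian biker'.split(),
))

cols = ['xmin', 'ymin', 'xmax', 'ymax', 'label']

# One entry per image; each entry is a list of [xmin, ymin, xmax, ymax, label] rows
boxes_by_image = {
    img: group[cols].values.tolist()
    for img, group in df.groupby('image')
}

for img, boxes in boxes_by_image.items():
    print(img, boxes)
```

Each value can then be looped over box by box without having to slice the flat list in steps of five.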
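Contributed 1836 experience points · received more than 13 likes
Here is a working example based on yours, except that it does not read the actual XML files. Many thanks. I suspect your answer will be useful to others, since this is a problem people in machine vision run into when doing things like cropping 4K images that have already been annotated.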
import os
from xml.etree import ElementTree as ET

import pandas as pd

df = pd.DataFrame(dict(
    image = '40.jpg 40.jpg 40.jpg 40.jpg 80.jpg 80.jpg'.split(),
    xmin = [763, 1026, 868, 413, 866, 761],
    ymin = [899, 754, 770, 1010, 278, 825],
    xmax = [806, 1075, 927, 433, 917, 820],
    ymax = [940, 797, 822, 1040, 328, 865],
    label = 'pedestrian pedestrian biker pedestrian pedestrian biker'.split(),
))

# tree.write() fails if the output folder is missing, so create it first
out_dir = './mergedxml'
os.makedirs(out_dir, exist_ok=True)

for img in df['image'].unique():
    # All boxes for this image, re-indexed from 0
    img_df = df[df['image'] == img].drop(columns='image').reset_index(drop=True)
    print(img, '\n', img_df)

    # One <annotation> root per image. Width and height are placeholders,
    # because the image size is not available in the DataFrame.
    annotation = ET.Element('annotation')
    ET.SubElement(annotation, 'folder').text = str(img)
    ET.SubElement(annotation, 'filename').text = str(img)
    ET.SubElement(annotation, 'segmented').text = '0'
    size = ET.SubElement(annotation, 'size')
    ET.SubElement(size, 'width').text = '0'
    ET.SubElement(size, 'height').text = '0'
    ET.SubElement(size, 'depth').text = '3'

    # One <object> element per bounding box
    for box in range(len(img_df)):
        xmin = img_df.loc[box, 'xmin']
        ymin = img_df.loc[box, 'ymin']
        xmax = img_df.loc[box, 'xmax']
        ymax = img_df.loc[box, 'ymax']
        label = img_df.loc[box, 'label']
        print(xmin, ymin, xmax, ymax)

        ob = ET.SubElement(annotation, 'object')
        ET.SubElement(ob, 'name').text = str(label)
        ET.SubElement(ob, 'pose').text = 'Unspecified'
        ET.SubElement(ob, 'truncated').text = '0'
        ET.SubElement(ob, 'difficult').text = '0'
        bbox = ET.SubElement(ob, 'bndbox')
        ET.SubElement(bbox, 'xmin').text = str(xmin)
        ET.SubElement(bbox, 'ymin').text = str(ymin)
        ET.SubElement(bbox, 'xmax').text = str(xmax)
        ET.SubElement(bbox, 'ymax').text = str(ymax)

    # Once the inner loop is done, save one .xml file per image.
    # The image extension is dropped, so 40.jpg is written to ./mergedxml/40.xml
    file_name = os.path.splitext(str(img))[0]
    tree = ET.ElementTree(annotation)
    tree.write(os.path.join(out_dir, file_name + '.xml'), encoding='utf8')
    print('.xml file saved here!')
Contributed 2012 experience points · received more than 12 likes
Maybe you need to use dict and tuple/list on the groupby:
images_dict = dict(tuple(df.groupby('image')))
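To make the one-liner above concrete: iterating over a `groupby` yields (key, sub-DataFrame) pairs, so `dict(tuple(...))` gives a mapping from image name to the rows for that image. The sketch below assumes the same sample data used elsewhere in this thread; the loop variable names are mine.

```python
import pandas as pd

# Sample data with the same columns as in the question (shortened image names)
df = pd.DataFrame(dict(
    image='40.jpg 40.jpg 40.jpg 40.jpg 80.jpg 80.jpg'.split(),
    xmin=[763, 1026, 868, 413, 866, 761],
    ymin=[899, 754, 770, 1010, 278, 825],
    xmax=[806, 1075, 927, 433, 917, 820],
    ymax=[940, 797, 822, 1040, 328, 865],
    label='pedestrian pedestrian biker pedestrian pedestrian biker'.split(),
))

# Each key is an image name, each value is the DataFrame of its boxes
images_dict = dict(tuple(df.groupby('image')))

for image_name, image_df in images_dict.items():
    print(image_name, 'has', len(image_df), 'boxes')
    # itertuples gives attribute access to the columns of each row
    for row in image_df.itertuples(index=False):
        print(' ', row.label, row.xmin, row.ymin, row.xmax, row.ymax)
```

Note that each sub-DataFrame still contains the `image` column and keeps the original row index.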
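Contributed 1801 experience points · received more than 16 likes
I would have posted this as a comment rather than an answer, but the link is too long:
"I wrote a VOC writer. I just need to be able to pass the data in a way that lets me iterate over it. I have a different dataset where I do something similar, but there the data is already in an easy-to-use form. For my project I spend a lot of time editing, cleaning, converting, etc. the data. Not fun for me 😁" – Robi Sen
How does your VOC writer work? Is it similar to the one I linked to (i.e. it uses OOP and has a class method for adding bbox data to an XML writer instance, and then another method for saving that instance to an XML file)? My comment was poorly formatted, so here is a better example of what I mean: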
import pandas as pd

df = pd.DataFrame(dict(
    image = '40.jpg 40.jpg 40.jpg 40.jpg 80.jpg 80.jpg'.split(),
    xmin = [763, 1026, 868, 413, 866, 761],
    ymin = [899, 754, 770, 1010, 278, 825],
    xmax = [806, 1075, 927, 433, 917, 820],
    ymax = [940, 797, 822, 1040, 328, 865],
    label = 'pedestrian pedestrian biker pedestrian pedestrian biker'.split(),
))

for img in df['image'].unique():
    # All boxes for this image, re-indexed from 0
    img_df = df[df['image'] == img].drop(columns='image').reset_index(drop=True)
    boxes = range(img_df.shape[0])
    print(img, '\n', img_df)

    # Ideally your custom VOC writer can be initialized here
    # with something like:
    # v_writer = VocWriter(f'path/{img[:-4]}.xml')
    print('New custom VOC XML Writer instance inited here!')

    for box in boxes:
        xmin = img_df.loc[box, 'xmin']
        ymin = img_df.loc[box, 'ymin']
        xmax = img_df.loc[box, 'xmax']
        ymax = img_df.loc[box, 'ymax']
        label = img_df.loc[box, 'label']
        print(xmin, ymin, xmax, ymax)

        # Inside this loop,
        # you can add each box to your VocWriter object
        # with something like:
        # v_writer.addObject(label, xmin, ymin, xmax, ymax)
        print('New bbox object added to writer instance here!')

    # Once you exit that inner loop,
    # you can save your data to your .xml file
    # with something like:
    # v_writer.save(f'path/{img[:-4]}.xml')
    print(f'path/{img[:-4]}.xml file saved here!')
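For readers who do not have a writer class yet, the commented-out calls above could be backed by something like the following. This is only a sketch: the `VocWriter` class, its constructor arguments and the `to_string` helper are hypothetical, built on the standard library's `xml.etree.ElementTree`, and are not the asker's writer or any particular library's API. The width and height defaults are placeholders.

```python
import xml.etree.ElementTree as ET


class VocWriter:
    """Hypothetical minimal Pascal VOC style writer (illustration only)."""

    def __init__(self, filename, width=0, height=0, depth=3):
        # Root element plus the image-level fields
        self.root = ET.Element('annotation')
        ET.SubElement(self.root, 'filename').text = str(filename)
        size = ET.SubElement(self.root, 'size')
        ET.SubElement(size, 'width').text = str(width)
        ET.SubElement(size, 'height').text = str(height)
        ET.SubElement(size, 'depth').text = str(depth)

    def addObject(self, label, xmin, ymin, xmax, ymax):
        # One <object> element per bounding box
        obj = ET.SubElement(self.root, 'object')
        ET.SubElement(obj, 'name').text = str(label)
        bndbox = ET.SubElement(obj, 'bndbox')
        for tag, value in zip(('xmin', 'ymin', 'xmax', 'ymax'),
                              (xmin, ymin, xmax, ymax)):
            ET.SubElement(bndbox, tag).text = str(value)

    def to_string(self):
        # Serialize to a str without touching the file system
        return ET.tostring(self.root, encoding='unicode')

    def save(self, path):
        # Write the annotation to an .xml file
        ET.ElementTree(self.root).write(path, encoding='utf-8',
                                        xml_declaration=True)


v_writer = VocWriter('40.jpg')
v_writer.addObject('pedestrian', 763, 899, 806, 940)
v_writer.addObject('biker', 868, 770, 927, 822)
xml_text = v_writer.to_string()
print(xml_text)
# v_writer.save('40.xml') would write the same content to disk
```

Keeping `addObject` and `save` as separate methods matches the loop structure above: one writer per image, one `addObject` call per box, and a single `save` after the inner loop.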