How can I extract key-value pairs with a regular expression?
Here is an example:
<table xmlns:util="org.dspace.app.xmlui.utils.XSLUtils" xmlns:oreatom="http://www.openarchives.org/ore/atom/" xmlns:ore="http://www.openarchives.org/ore/terms/" xmlns:atom="http://www.w3.org/2005/Atom" class="ds-includeSet-table detailtable">
<tbody>
<tr xmlns:confman="org.dspace.core.ConfigurationManager" xmlns:xlink="http://www.w3.org/TR/xlink/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:dri="http://di.tamu.edu/DRI/1.0/" xmlns:i18n="http://apache.org/cocoon/i18n/2.1" class="ds-table-row odd ">
<td>dc.contributor.author</td>
<td>Roberts, John M. K.</td>
<td></td>
</tr>
<tr xmlns:confman="org.dspace.core.ConfigurationManager" xmlns:xlink="http://www.w3.org/TR/xlink/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:dri="http://di.tamu.edu/DRI/1.0/" xmlns:i18n="http://apache.org/cocoon/i18n/2.1" class="ds-table-row even ">
<td>dc.contributor.author</td>
<td>Anderson, Denis L.</td>
<td></td>
</tr>
<tr xmlns:confman="org.dspace.core.ConfigurationManager" xmlns:xlink="http://www.w3.org/TR/xlink/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:dri="http://di.tamu.edu/DRI/1.0/" xmlns:i18n="http://apache.org/cocoon/i18n/2.1" class="ds-table-row odd ">
<td>dc.contributor.author</td>
<td>Tay, Wee Tek</td>
<td></td>
</tr>
<tr xmlns:confman="org.dspace.core.ConfigurationManager" xmlns:xlink="http://www.w3.org/TR/xlink/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:dri="http://di.tamu.edu/DRI/1.0/" xmlns:i18n="http://apache.org/cocoon/i18n/2.1" class="ds-table-row even ">
<td>dc.coverage.spatial</td>
<td>Papua New Guinea</td>
<td></td>
</tr>
<tr xmlns:confman="org.dspace.core.ConfigurationManager" xmlns:xlink="http://www.w3.org/TR/xlink/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:dri="http://di.tamu.edu/DRI/1.0/" xmlns:i18n="http://apache.org/cocoon/i18n/2.1" class="ds-table-row odd ">
<td>dc.coverage.spatial</td>
<td>Solomon Islands</td>
<td></td>
</tr>
<tr xmlns:confman="org.dspace.core.ConfigurationManager" xmlns:xlink="http://www.w3.org/TR/xlink/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:dri="http://di.tamu.edu/DRI/1.0/" xmlns:i18n="http://apache.org/cocoon/i18n/2.1" class="ds-table-row even ">
<td>dc.coverage.spatial</td>
<td>Papua</td>
<td></td>
</tr>
<tr xmlns:confman="org.dspace.core.ConfigurationManager" xmlns:xlink="http://www.w3.org/TR/xlink/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:dri="http://di.tamu.edu/DRI/1.0/" xmlns:i18n="http://apache.org/cocoon/i18n/2.1" class="ds-table-row odd ">
<td>dc.coverage.spatial</td>
<td>Indonesia</td>
<td></td>
</tr>
<tr xmlns:confman="org.dspace.core.ConfigurationManager" xmlns:xlink="http://www.w3.org/TR/xlink/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:dri="http://di.tamu.edu/DRI/1.0/" xmlns:i18n="http://apache.org/cocoon/i18n/2.1" class="ds-table-row even ">
<td>dc.date.accessioned</td>
<td>2015-03-31T16:29:05Z</td>
<td></td>
</tr>
<tr xmlns:confman="org.dspace.core.ConfigurationManager" xmlns:xlink="http://www.w3.org/TR/xlink/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:dri="http://di.tamu.edu/DRI/1.0/" xmlns:i18n="http://apache.org/cocoon/i18n/2.1" class="ds-table-row odd ">
<td>dc.date.available</td>
<td>2015-03-31T16:29:05Z</td>
<td></td>
</tr>
<tr xmlns:confman="org.dspace.core.ConfigurationManager" xmlns:xlink="http://www.w3.org/TR/xlink/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:dri="http://di.tamu.edu/DRI/1.0/" xmlns:i18n="http://apache.org/cocoon/i18n/2.1" class="ds-table-row even ">
<td>dc.date.issued</td>
<td>2015-04-03</td>
<td></td>
</tr>
<tr xmlns:confman="org.dspace.core.ConfigurationManager" xmlns:xlink="http://www.w3.org/TR/xlink/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:dri="http://di.tamu.edu/DRI/1.0/" xmlns:i18n="http://apache.org/cocoon/i18n/2.1" class="ds-table-row odd ">
<td>dc.identifier</td>
<td>doi:10.5061/dryad.vt536</td>
<td></td>
</tr>
<tr xmlns:confman="org.dspace.core.ConfigurationManager" xmlns:xlink="http://www.w3.org/TR/xlink/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:dri="http://di.tamu.edu/DRI/1.0/" xmlns:i18n="http://apache.org/cocoon/i18n/2.1" class="ds-table-row even ">
<td>dc.identifier.citation</td>
<td>Roberts JMK, Anderson DL, Tay WT (2015) Multiple host-shifts by the emerging honeybee parasite, Varroa jacobsoni. Molecular Ecology 24(10): 2379-2391.</td>
<td></td>
</tr>
<tr xmlns:confman="org.dspace.core.ConfigurationManager" xmlns:xlink="http://www.w3.org/TR/xlink/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:dri="http://di.tamu.edu/DRI/1.0/" xmlns:i18n="http://apache.org/cocoon/i18n/2.1" class="ds-table-row odd ">
<td>dc.identifier.uri</td>
<td>http://hdl.handle.net/10255/dryad.83323</td>
<td></td>
</tr>
<tr xmlns:confman="org.dspace.core.ConfigurationManager" xmlns:xlink="http://www.w3.org/TR/xlink/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:dri="http://di.tamu.edu/DRI/1.0/" xmlns:i18n="http://apache.org/cocoon/i18n/2.1" class="ds-table-row even ">
<td>dc.description</td>
<td>Host shifts are a key mechanism of parasite evolution and responsible for the emergence of many economically important pathogens. Varroa destructor has been a major factor in global honeybee (Apis mellifera) declines since shifting hosts from the Asian honeybee (Apis cerana) > 50 years ago. Until recently, only two haplotypes of V. destructor (Korea and Japan) had successfully host shifted to A. mellifera. In 2008, the sister species V. jacobsoni was found for the first time parasitizing A. mellifera in Papua New Guinea (PNG). This recent host shift presents a serious threat to world apiculture but also provides the opportunity to examine host shifting in this system. We used 12 microsatellites to compare genetic variation of V. jacobsoni on A. mellifera in PNG with mites on A. cerana in both PNG and surrounding regions. We identified two distinct lineages of V. jacobsoni reproducing on A. mellifera in PNG. Our analysis indicated independent host shift events have occurred through small numbers of mites shifting from local A. cerana populations. Additional lineages were found in the neighbouring Papua and Solomon Islands that had partially host shifted to A. mellifera, that is producing immature offspring on drone brood only. These mites were likely in transition to full colonization of A. mellifera. Significant population structure between mites on the different hosts suggested host shifted V. jacobsoni populations may not still reproduce on A. cerana, although limited gene flow may exist. Our studies provide further insight into parasite host shift evolution and help characterize this new Varroa mite threat to A. mellifera worldwide.</td>
<td></td>
</tr>
<tr xmlns:confman="org.dspace.core.ConfigurationManager" xmlns:xlink="http://www.w3.org/TR/xlink/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:dri="http://di.tamu.edu/DRI/1.0/" xmlns:i18n="http://apache.org/cocoon/i18n/2.1" class="ds-table-row odd ">
<td>dc.relation.haspart</td>
<td>doi:10.5061/dryad.vt536/1</td>
<td></td>
</tr>
<tr xmlns:confman="org.dspace.core.ConfigurationManager" xmlns:xlink="http://www.w3.org/TR/xlink/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:dri="http://di.tamu.edu/DRI/1.0/" xmlns:i18n="http://apache.org/cocoon/i18n/2.1" class="ds-table-row even ">
<td>dc.relation.isreferencedby</td>
<td>doi:10.1111/mec.13185</td>
<td></td>
</tr>
<tr xmlns:confman="org.dspace.core.ConfigurationManager" xmlns:xlink="http://www.w3.org/TR/xlink/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:dri="http://di.tamu.edu/DRI/1.0/" xmlns:i18n="http://apache.org/cocoon/i18n/2.1" class="ds-table-row odd ">
<td>dc.relation.isreferencedby</td>
<td>PMID:25846956</td>
<td></td>
</tr>
<tr xmlns:confman="org.dspace.core.ConfigurationManager" xmlns:xlink="http://www.w3.org/TR/xlink/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:dri="http://di.tamu.edu/DRI/1.0/" xmlns:i18n="http://apache.org/cocoon/i18n/2.1" class="ds-table-row even ">
<td>dc.subject</td>
<td>host-shifts</td>
<td></td>
</tr>
<tr xmlns:confman="org.dspace.core.ConfigurationManager" xmlns:xlink="http://www.w3.org/TR/xlink/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:dri="http://di.tamu.edu/DRI/1.0/" xmlns:i18n="http://apache.org/cocoon/i18n/2.1" class="ds-table-row odd ">
<td>dc.subject</td>
<td>population genetics</td>
<td></td>
</tr>
<tr xmlns:confman="org.dspace.core.ConfigurationManager" xmlns:xlink="http://www.w3.org/TR/xlink/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:dri="http://di.tamu.edu/DRI/1.0/" xmlns:i18n="http://apache.org/cocoon/i18n/2.1" class="ds-table-row even ">
<td>dc.subject</td>
<td>invasion biology</td>
<td></td>
</tr>
<tr xmlns:confman="org.dspace.core.ConfigurationManager" xmlns:xlink="http://www.w3.org/TR/xlink/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:dri="http://di.tamu.edu/DRI/1.0/" xmlns:i18n="http://apache.org/cocoon/i18n/2.1" class="ds-table-row odd ">
<td>dc.subject</td>
<td>apiculture</td>
<td></td>
</tr>
<tr xmlns:confman="org.dspace.core.ConfigurationManager" xmlns:xlink="http://www.w3.org/TR/xlink/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:dri="http://di.tamu.edu/DRI/1.0/" xmlns:i18n="http://apache.org/cocoon/i18n/2.1" class="ds-table-row even ">
<td>dc.subject</td>
<td>parasites</td>
<td></td>
</tr>
<tr xmlns:confman="org.dspace.core.ConfigurationManager" xmlns:xlink="http://www.w3.org/TR/xlink/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:dri="http://di.tamu.edu/DRI/1.0/" xmlns:i18n="http://apache.org/cocoon/i18n/2.1" class="ds-table-row odd ">
<td>dc.title</td>
<td>Data from: Multiple host-shifts by the emerging honeybee parasite, Varroa jacobsoni</td>
<td></td>
</tr>
<tr xmlns:confman="org.dspace.core.ConfigurationManager" xmlns:xlink="http://www.w3.org/TR/xlink/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:dri="http://di.tamu.edu/DRI/1.0/" xmlns:i18n="http://apache.org/cocoon/i18n/2.1" class="ds-table-row even ">
<td>dc.type</td>
<td>Article</td>
<td></td>
</tr>
<tr xmlns:confman="org.dspace.core.ConfigurationManager" xmlns:xlink="http://www.w3.org/TR/xlink/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:dri="http://di.tamu.edu/DRI/1.0/" xmlns:i18n="http://apache.org/cocoon/i18n/2.1" class="ds-table-row odd ">
<td>dwc.ScientificName</td>
<td>Varroa jacobsoni</td>
<td></td>
</tr>
<tr xmlns:confman="org.dspace.core.ConfigurationManager" xmlns:xlink="http://www.w3.org/TR/xlink/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:dri="http://di.tamu.edu/DRI/1.0/" xmlns:i18n="http://apache.org/cocoon/i18n/2.1" class="ds-table-row even ">
<td>dwc.ScientificName</td>
<td>Varroa destructor</td>
<td></td>
</tr>
<tr xmlns:confman="org.dspace.core.ConfigurationManager" xmlns:xlink="http://www.w3.org/TR/xlink/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:dri="http://di.tamu.edu/DRI/1.0/" xmlns:i18n="http://apache.org/cocoon/i18n/2.1" class="ds-table-row odd ">
<td>dwc.ScientificName</td>
<td>Apis mellifera</td>
<td></td>
</tr>
<tr xmlns:confman="org.dspace.core.ConfigurationManager" xmlns:xlink="http://www.w3.org/TR/xlink/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:dri="http://di.tamu.edu/DRI/1.0/" xmlns:i18n="http://apache.org/cocoon/i18n/2.1" class="ds-table-row even ">
<td>dwc.ScientificName</td>
<td>Apis cerana</td>
<td></td>
</tr>
<tr xmlns:confman="org.dspace.core.ConfigurationManager" xmlns:xlink="http://www.w3.org/TR/xlink/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:dri="http://di.tamu.edu/DRI/1.0/" xmlns:i18n="http://apache.org/cocoon/i18n/2.1" class="ds-table-row odd ">
<td>prism.publicationName</td>
<td>Molecular Ecology<img src="/themes/Mirage/images/authority_control/invisible.gif" class="ds-authority-confidence cf-accepted " title="This controlled term has been confirmed by the user."> </td>
<td></td>
</tr>
<tr xmlns:confman="org.dspace.core.ConfigurationManager" xmlns:xlink="http://www.w3.org/TR/xlink/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:dri="http://di.tamu.edu/DRI/1.0/" xmlns:i18n="http://apache.org/cocoon/i18n/2.1" class="ds-table-row even ">
<td>dryad.dansTransferDate</td>
<td>2018-04-25T15:36:58.524+0000</td>
<td></td>
</tr>
<tr xmlns:confman="org.dspace.core.ConfigurationManager" xmlns:xlink="http://www.w3.org/TR/xlink/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:dri="http://di.tamu.edu/DRI/1.0/" xmlns:i18n="http://apache.org/cocoon/i18n/2.1" class="ds-table-row odd ">
<td>dryad.dansEditIRI</td>
<td>https://easy.dans.knaw.nl/sword2/container/3e576bf7-26e1-404c-9cc8-bc8bd53c9591</td>
<td></td>
</tr>
<tr xmlns:confman="org.dspace.core.ConfigurationManager" xmlns:xlink="http://www.w3.org/TR/xlink/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:mets="http://www.loc.gov/METS/" xmlns:dri="http://di.tamu.edu/DRI/1.0/" xmlns:i18n="http://apache.org/cocoon/i18n/2.1" class="ds-table-row even ">
<td>dryad.dansArchiveDate</td>
<td>2018-04-25T18:10:28.981+0000</td>
<td></td>
</tr>
</tbody>
</table>
I want to extract the contents of the <td></td> tags in this table. The rule is as follows, taking this pair as an example:
<td>dryad.dansEditIRI</td>
<td>https://easy.dans.knaw.nl/sword2/container/3e576bf7-26e1-404c-9cc8-bc8bd53c9591</td>
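The first <td> is the key and the second <td> is the value. How can I extract these with a regular expression? Or is there some other convenient way to get key-value pairs like this?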
3 answers
慕容3067478
Contributed 1773 experience points, received 3+ likes
Use BeautifulSoup:
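from bs4 import BeautifulSoup
# s holds the HTML table from the question (elided here)
s = """
<table>...</table>
"""
soup = BeautifulSoup(s, "lxml")  # the "lxml" parser needs the lxml package installed
# One {key: value} dict per row: the first <td> is the key, the second the value
rows = (tr.find_all("td") for tr in soup.find_all("tr"))
result = [{tds[0].text.strip(): tds[1].text.strip()} for tds in rows if len(tds) >= 2]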
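Since the question asks about regular expressions, and keys such as dc.subject repeat across rows, here is a minimal standard-library sketch that groups every value under its key. It is not part of the answer above. It assumes each row puts the key in its first <td> and the value in its second, with no nested tables; the names `extract_pairs`, `ROW_RE`, `TAG_RE` and `sample` are made up for illustration. Regular expressions are fragile on arbitrary HTML, so this is only reasonable because the table is so regular.

```python
import re
from collections import defaultdict
from html import unescape

# One <tr>, then its first two <td> cells (key and value).
ROW_RE = re.compile(
    r"<tr\b[^>]*>\s*<td[^>]*>(.*?)</td>\s*<td[^>]*>(.*?)</td>",
    re.DOTALL | re.IGNORECASE,
)
# Any tag nested inside a cell, such as the <img ...> in prism.publicationName.
TAG_RE = re.compile(r"<[^>]+>")


def _clean(cell):
    """Drop nested tags, decode HTML entities, trim whitespace."""
    return unescape(TAG_RE.sub("", cell)).strip()


def extract_pairs(html):
    """Return {key: [value, ...]}, keeping every value of a repeated key."""
    pairs = defaultdict(list)
    for key, value in ROW_RE.findall(html):
        pairs[_clean(key)].append(_clean(value))
    return dict(pairs)


# A shortened copy of three rows from the question's table.
sample = """
<table><tbody>
<tr class="ds-table-row odd ">
<td>dc.subject</td>
<td>host-shifts</td>
<td></td>
</tr>
<tr class="ds-table-row even ">
<td>dc.subject</td>
<td>parasites</td>
<td></td>
</tr>
<tr class="ds-table-row odd ">
<td>prism.publicationName</td>
<td>Molecular Ecology<img src="/themes/Mirage/images/authority_control/invisible.gif" class="ds-authority-confidence cf-accepted " title="This controlled term has been confirmed by the user."> </td>
<td></td>
</tr>
</tbody></table>
"""

pairs = extract_pairs(sample)
print(pairs)
# {'dc.subject': ['host-shifts', 'parasites'], 'prism.publicationName': ['Molecular Ecology']}
```

A dict of lists is used because a plain dict would keep only the last value of each repeated key. For pages that are less regular than this one, an HTML parser is the safer choice.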