Getting full names from NER

Java

梦里花落0921 2023-06-14 16:22:03

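From reading the docs and playing with the API, it looks like CoreNLP will tell me the NER tag for each token, but it won't help me extract full names from a sentence. For example:

Input: John Wayne and Mary have coffee
CoreNLP Output: (John,PERSON) (Wayne,PERSON) (and,O) (Mary,PERSON) (have,O) (coffee,O)
Desired Result: list of PERSON ==> [John Wayne, Mary]

Unless I've missed some flag, I believe that to do this I will need to walk the tokens and glue together consecutive tokens tagged PERSON.

Can someone confirm that this is indeed what I need to do? I mostly want to know whether CoreNLP has a flag or utility that does something like this for me. Bonus points if someone has a utility (ideally Java, since I'm using the Java API) that does this and is willing to share :)

Thanks!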

2 Answers

白板的微信

You are probably looking for entity mentions rather than NER tags. For example, using the Simple API:

new Sentence("Jimi Hendrix was the greatest").nerTags()

[PERSON, PERSON, O, O, O]

new Sentence("Jimi Hendrix was the greatest").mentions()

[Jimi Hendrix]

The link above also has an example for the traditional, non-Simple API using a regular StanfordCoreNLP pipeline.
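For comparison, the manual approach described in the question (gluing together consecutive tokens that share the PERSON tag) can be sketched in plain Java. This is only an illustration and not part of CoreNLP: the class name `PersonNameMerger` and the method `mergeConsecutive` are made up here, and the token and tag lists are assumed to have already been extracted from the pipeline output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not a CoreNLP class.
public class PersonNameMerger {

  // Joins each run of consecutive tokens tagged `targetTag` into one string.
  static List<String> mergeConsecutive(List<String> tokens, List<String> tags, String targetTag) {
    if (tokens.size() != tags.size()) {
      throw new IllegalArgumentException("tokens and tags must have the same length");
    }
    List<String> names = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    for (int i = 0; i < tokens.size(); i++) {
      if (targetTag.equals(tags.get(i))) {
        // Still inside a run: append the token, separated by a space.
        if (current.length() > 0) {
          current.append(' ');
        }
        current.append(tokens.get(i));
      } else if (current.length() > 0) {
        // The run just ended: store the collected name and reset the buffer.
        names.add(current.toString());
        current.setLength(0);
      }
    }
    // Flush a run that reaches the end of the sentence.
    if (current.length() > 0) {
      names.add(current.toString());
    }
    return names;
  }

  public static void main(String[] args) {
    // Tokens and tags copied from the example in the question.
    List<String> tokens = Arrays.asList("John", "Wayne", "and", "Mary", "have", "coffee");
    List<String> tags = Arrays.asList("PERSON", "PERSON", "O", "PERSON", "O", "O");
    System.out.println(mergeConsecutive(tokens, tags, "PERSON")); // prints [John Wayne, Mary]
  }
}
```

One limitation of this sketch: two different people named back to back with no token in between would be merged into a single name, because only the tag sequence is inspected.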

Answered 2023-06-14

qq_笑_17

Here is the full Java API example, which includes a section on entity mentions:

import edu.stanford.nlp.coref.data.CorefChain;
import edu.stanford.nlp.ling.*;
import edu.stanford.nlp.ie.util.*;
import edu.stanford.nlp.pipeline.*;
import edu.stanford.nlp.semgraph.*;
import edu.stanford.nlp.trees.*;
import java.util.*;

public class BasicPipelineExample {

  public static String text = "Joe Smith was born in California. " +
      "In 2017, he went to Paris, France in the summer. " +
      "His flight left at 3:00pm on July 10th, 2017. " +
      "After eating some escargot for the first time, Joe said, \"That was delicious!\" " +
      "He sent a postcard to his sister Jane Smith. " +
      "After hearing about Joe's trip, Jane decided she might go to France one day.";

  public static void main(String[] args) {
    // set up pipeline properties
    Properties props = new Properties();
    // set the list of annotators to run
    props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner,parse,depparse,coref,kbp,quote");
    // set a property for an annotator, in this case the coref annotator is being set to use the neural algorithm
    props.setProperty("coref.algorithm", "neural");
    // build pipeline
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
    // create a document object
    CoreDocument document = new CoreDocument(text);
    // annotate the document
    pipeline.annotate(document);

    // examples

    // token at index 10 of the document (indices are zero-based)
    CoreLabel token = document.tokens().get(10);
    System.out.println("Example: token");
    System.out.println(token);
    System.out.println();

    // text of the first sentence
    String sentenceText = document.sentences().get(0).text();
    System.out.println("Example: sentence");
    System.out.println(sentenceText);
    System.out.println();

    // second sentence
    CoreSentence sentence = document.sentences().get(1);

    // list of the part-of-speech tags for the second sentence
    List<String> posTags = sentence.posTags();
    System.out.println("Example: pos tags");
    System.out.println(posTags);
    System.out.println();

    // list of the ner tags for the second sentence
    List<String> nerTags = sentence.nerTags();
    System.out.println("Example: ner tags");
    System.out.println(nerTags);
    System.out.println();

    // constituency parse for the second sentence
    Tree constituencyParse = sentence.constituencyParse();
    System.out.println("Example: constituency parse");
    System.out.println(constituencyParse);
    System.out.println();

    // dependency parse for the second sentence
    SemanticGraph dependencyParse = sentence.dependencyParse();
    System.out.println("Example: dependency parse");
    System.out.println(dependencyParse);
    System.out.println();

    // kbp relations found in fifth sentence
    List<RelationTriple> relations =
        document.sentences().get(4).relations();
    System.out.println("Example: relation");
    System.out.println(relations.get(0));
    System.out.println();

    // entity mentions in the second sentence
    List<CoreEntityMention> entityMentions = sentence.entityMentions();
    System.out.println("Example: entity mentions");
    System.out.println(entityMentions);
    System.out.println();

    // coreference between entity mentions
    CoreEntityMention originalEntityMention = document.sentences().get(3).entityMentions().get(1);
    System.out.println("Example: original entity mention");
    System.out.println(originalEntityMention);
    System.out.println("Example: canonical entity mention");
    System.out.println(originalEntityMention.canonicalEntityMention().get());
    System.out.println();

    // get document wide coref info
    Map<Integer, CorefChain> corefChains = document.corefChains();
    System.out.println("Example: coref chains for document");
    System.out.println(corefChains);
    System.out.println();

    // get quotes in document
    List<CoreQuote> quotes = document.quotes();
    CoreQuote quote = quotes.get(0);
    System.out.println("Example: quote");
    System.out.println(quote);
    System.out.println();

    // original speaker of quote
    // note that quote.speaker() returns an Optional
    System.out.println("Example: original speaker of quote");
    System.out.println(quote.speaker().get());
    System.out.println();

    // canonical speaker of quote
    System.out.println("Example: canonical speaker of quote");
    System.out.println(quote.canonicalSpeaker().get());
    System.out.println();
  }
}

Answered 2023-06-14
