TAC Knowledge Base Population (KBP) 2017

Evaluation: February-November, 2017
Workshop: November 13-14, 2017

Conducted by:
U.S. National Institute of Standards and Technology (NIST)

With support from:
U.S. Department of Defense

Overview

The goal of TAC Knowledge Base Population (KBP) is to develop and evaluate technologies for populating knowledge bases (KBs) from unstructured text. KBP includes component tracks that develop specific components and capabilities for KBP, as well as an end-to-end KB construction task called "Cold Start", which builds a KB from scratch by integrating selected components as their technology matures. The capabilities required in the component tracks can be both "more" and "less" than what is exercised in the Cold Start KB task. The component tracks are "more" than Cold Start in the sense that each track may explore pilot tasks that are not immediately integrated into the Cold Start task. They are "less" in the sense that integrating the components into a single KB requires additional coordination and reconciliation of mismatches between the various components, so that the KB conforms to the KB schema (e.g., the KB cannot assert that an entity is the "PLACE" for an event if it also asserts that the entity is a "PERSON").
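As an illustration of the kind of reconciliation described above, the following minimal sketch flags entities that different components have asserted with mutually incompatible types. The assertion tuple format, the function name find_type_conflicts, and the INCOMPATIBLE set are all assumptions made for this example; they are not part of the Cold Start KB format or of any official KBP tool.

```python
from collections import defaultdict

# Hypothetical rule set: pairs of types that one entity may not hold together.
INCOMPATIBLE = {("PERSON", "PLACE")}


def find_type_conflicts(assertions, incompatible):
    """Return {entity_id: [clashing type pairs]} for inconsistent entities.

    `assertions` is an iterable of (entity_id, asserted_type, source_component)
    tuples; this layout is an assumption for illustration only.
    """
    types_by_entity = defaultdict(set)
    for entity_id, asserted_type, _source in assertions:
        types_by_entity[entity_id].add(asserted_type)

    conflicts = {}
    for entity_id, types in types_by_entity.items():
        clashing = sorted(
            pair for pair in incompatible
            if pair[0] in types and pair[1] in types
        )
        if clashing:
            conflicts[entity_id] = clashing
    return conflicts


# Toy input: one component types E1 as a PERSON, another uses it as a PLACE.
assertions = [
    ("E1", "PERSON", "EDL"),
    ("E1", "PLACE", "EAL"),
    ("E2", "PLACE", "EAL"),
]
conflicts = find_type_conflicts(assertions, INCOMPATIBLE)
print(conflicts)
```

A real integration step would also have to decide which of the conflicting assertions to keep; this sketch only detects the mismatch.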

Standalone component tasks are offered in the following KBP tracks:

Entity Discovery and Linking (EDL): The main EDL task is to extract name and nominal mentions of specific individual person (PER), organization (ORG), geopolitical entity (GPE), location (LOC), and facility (FAC) entities mentioned in the evaluation document collection, and to link each mention to its KB node (either a node in the TAC reference KB, or a newly created NIL node if it doesn't have a corresponding KB entry). Additionally, the 2017 EDL track includes an EDL pilot (which is not part of the 2017 Cold Start KB task) for 10 new languages.
Slot Filling (SF): The slot filling task is to search the document collection to fill in values for specific attributes ("slots") for specific entities.
Event: The event track aims to extract information about events from unstructured text, such that the information would be suitable as input into a structured KB. The track includes Event Nugget (EN) tasks to detect and link event nuggets (i.e., mentions of events in text), and Event Argument (EAL) tasks to extract event arguments and link arguments that belong to the same event. Additionally, an event (nugget) sequencing pilot is offered in English (but is not part of the 2017 Cold Start task).
Belief and Sentiment (BeSt): The Belief and Sentiment track detects belief and sentiment of an entity toward another entity, relation, or event.
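To make the EDL linking step in the list above concrete, here is a toy sketch of assigning each mention either an existing KB node or a newly created NIL node, with repeated unlinkable names sharing one NIL node. It assumes exact, case-insensitive string matching and invented identifiers (KB001, NIL0001); actual EDL systems rely on context and far richer evidence, and the official submission format is defined by the track, not by this example.

```python
def link_mentions(mentions, reference_kb):
    """Link each mention string to a KB node id or to a NIL cluster id.

    `reference_kb` maps a lowercased name to a KB node id. Lowercased exact
    matching is a deliberately naive assumption for illustration only.
    """
    nil_ids = {}  # lowercased name -> NIL id, so repeated names share a node
    links = []
    for mention in mentions:
        key = mention.lower()
        if key in reference_kb:
            links.append((mention, reference_kb[key]))
        else:
            if key not in nil_ids:
                nil_ids[key] = f"NIL{len(nil_ids) + 1:04d}"
            links.append((mention, nil_ids[key]))
    return links


# Hypothetical one-entry reference KB and three mentions.
reference_kb = {"nist": "KB001"}
links = link_mentions(["NIST", "Acme Corp", "acme corp"], reference_kb)
print(links)
```

Here the two "Acme Corp" mentions have no KB entry, so they are grouped under the same NIL identifier rather than each receiving a separate one.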

The end-to-end Cold Start KB Construction task is to build a KB from scratch, using a predefined KB schema and a collection of unstructured text. The KB schema for Cold Start 2017 consists of:

Entities: entities and entity mentions as defined in the main task of the EDL track.
SF Relations: entity attributes ("slots") as defined in the SF track.
Events: events (hoppers) and event nuggets, as defined in the EN track.
Event Arguments: event arguments, as defined in the EAL track.
Sentiment: Sentiment from a source entity toward a target entity, as defined in the BeSt track.

Except for the 10-language EDL pilot and the English EN sequencing pilot, all KBP 2017 tasks will be trilingual (English, Chinese, and Spanish). Participants are encouraged to participate in all three languages for each task, although diagnostic scores will also be provided for each individual language. The source corpus for the trilingual KBP 2017 tasks will comprise the same set of up to 90K English, Chinese, and Spanish newswire and discussion forum documents. Of these, approximately 500 "core" documents will be annotated with entities, relations, and events (ERE) according to the guidelines for Rich ERE, and will be used to evaluate system responses for the trilingual EDL, Event Nugget, Event Argument, and Belief and Sentiment tasks. Post-submission assessment procedures will be used to evaluate system responses for the Cold Start KB and SF tasks.

Systems in the Cold Start KB track and the SF track will operate on the 90K documents in the source corpus, while systems in the trilingual EDL track, Event track, and BeSt track will operate on only the 500 "core" documents.

Tracks

Cold Start KB (CSKB)
The Cold Start KB track builds a knowledge base from scratch using a given document collection and a predefined KB schema.
Track home page: http://tac.nist.gov/2017/KBP/ColdStart/
Track coordinators: Hoa Dang ([email protected]) and Shahzad Rajput ([email protected])

Entity Discovery and Linking (EDL)
The Entity Discovery and Linking (EDL) track aims to extract entity mentions from a source collection of textual documents in multiple languages, and link them to a reference knowledge base; an EDL system is also required to cluster mentions for those entities that don't have corresponding KB entries.
Track home page: http://nlp.cs.rpi.edu/kbp/2017/
Track coordinator: Heng Ji ([email protected])

Slot Filling (SF)
The slot filling task is to search a document collection to fill in values for predefined slots (attributes) for a given entity.
Track home page: http://tac.nist.gov/2017/KBP/ColdStart/
Track coordinators: Hoa Dang ([email protected]) and Shahzad Rajput ([email protected])

Event
The goal of the Event track is to extract information about events such that the information would be suitable as input to a knowledge base. The track includes Event Nugget (EN) tasks to detect and link event nuggets, and Event Argument (EAL) tasks to extract event arguments and link arguments that belong to the same event.
Track home page: http://tac.nist.gov/2017/KBP/Event/
Event Nugget coordinators: Eduard Hovy ([email protected]) and Teruko Mitamura ([email protected])
Event Argument coordinator: Marjorie Freedman ([email protected])

Belief/Sentiment (BeSt)
The Belief and Sentiment track detects belief and sentiment of an entity toward another entity, relation, or event.
Track home page: http://www.cs.columbia.edu/~rambow/best-eval-2017/
Track coordinator: Owen Rambow ([email protected])

Schedule

Preliminary TAC KBP 2017 Schedule
February 20	Track registration opens
June 15	Deadline for registration for track participation
June - October	Track evaluation windows (varies by track)
June 29 - July 27	Cold Start KB Evaluation Window
July 13-27	Slot Filling Evaluation Window
September 25 - October 2	Trilingual EDL Evaluation Window
September 25 - October 2	Event Argument Evaluation Window
September 25 - October 2	Event Nugget Detection and Coreference Evaluation Window
October 3-10	Event Sequencing Evaluation Window
October 3-10	Belief and Sentiment Evaluation Window
October 5-23 (revised from October 5-16)	EDL Pilot Evaluation Window
By mid-October	Release of individual evaluated results to participants (most tracks)
October 15	Deadline for short system descriptions
October 15	Deadline for workshop presentation proposals
October 20	Notification of acceptance of presentation proposals
November 1	Deadline for system reports (workshop notebook version)
November 13-14	TAC 2017 workshop in Gaithersburg, Maryland, USA
February 28, 2018	Deadline for system reports (final proceedings version)

Mailing List

Subscribe yourself to the [email protected] mailing list (if not already subscribed). Registering to participate in a track does not automatically add you to the mailing list. If you were previously subscribed to the mailing list, you do not have to re-subscribe (the mailing list is for anyone interested in TAC KBP, rather than specifically for TAC KBP participants, and thus carries over from year to year). In order for your messages to get posted to the list, you must send them from the email address used when you subscribed to the list.

To subscribe to the [email protected] mailing list, send a message to [email protected], subject = subscribe. You will receive an automatic email with a confirmation code, which you must respond to in order to complete your subscription request.

To unsubscribe from the [email protected] mailing list, send a message to [email protected], subject = unsubscribe. You will receive an automatic email with a confirmation code, which you must respond to in order to complete your request to be removed from the mailing list.

For additional information on how to use the [email protected] mailing list, send a message to [email protected], subject = help.

Organizing Committee

Hoa Trang Dang (U.S. National Institute of Standards and Technology)
Jason Duncan (MITRE)
Joe Ellis (Linguistic Data Consortium)
Marjorie Freedman (ISI)
Ralph Grishman (New York University)
Eduard Hovy (Carnegie Mellon University)
Heng Ji (Rensselaer Polytechnic Institute)
James Mayfield (Johns Hopkins University)
Teruko Mitamura (Carnegie Mellon University)
Boyan Onyshkevych (U.S. Department of Defense)
Shahzad Rajput (U.S. National Institute of Standards and Technology)
Owen Rambow (Columbia University)
Zhiyi Song (Linguistic Data Consortium)
Stephanie Strassel (Linguistic Data Consortium)

Comments to: [email protected]