TAC Knowledge Base Population (KBP) 2017
Evaluation: February-November, 2017
Workshop: November 13-14, 2017
Conducted by:
U.S. National Institute of Standards and Technology (NIST)
With support from:
U.S. Department of Defense
The goal of TAC Knowledge Base Population (KBP) is to develop and
evaluate technologies for populating knowledge bases (KBs) from
unstructured text. KBP includes component tracks that develop specific
components and capabilities for KBP, as well as an end-to-end KB
construction task called "Cold Start", which builds a KB from scratch
by integrating selected components as their technology matures. The
capabilities required in the component tracks can be both "more" and
"less" than what is exercised in the Cold Start KB task. The
component tracks are "more" than Cold Start in the sense that each
track may explore pilot tasks that are not immediately integrated into
the Cold Start task; and they are "less" in the sense that integrating the components into a single KB
requires additional coordination and reconciliation of mismatches
between the various components so that the KB conforms with the KB
schema (e.g., the KB cannot assert that an entity is the "PLACE" for
an event if it also asserts that the entity is a "PERSON").
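The kind of reconciliation described above can be illustrated with a minimal sketch of a type-consistency check. The role name, the set of allowed types, and the identifiers below are illustrative assumptions, not the official Cold Start schema or any track's submission format.

```python
# Hypothetical schema fragment: which entity types may fill an event role.
# The role name and allowed types are assumptions made for illustration.
ALLOWED_TYPES = {"PLACE": {"GPE", "LOC", "FAC"}}


def find_type_conflicts(entity_types, event_arguments):
    """Return (entity_id, role, entity_type) triples that violate ALLOWED_TYPES.

    entity_types: dict mapping entity id -> asserted entity type
    event_arguments: iterable of (event_id, role, entity_id) assertions
    """
    conflicts = []
    for _event_id, role, entity_id in event_arguments:
        allowed = ALLOWED_TYPES.get(role)
        entity_type = entity_types.get(entity_id)
        # Roles without a constraint, or entities without a type, are skipped.
        if allowed is None or entity_type is None:
            continue
        if entity_type not in allowed:
            conflicts.append((entity_id, role, entity_type))
    return conflicts


# Hypothetical output of two components: an EDL system and an event system.
entity_types = {":e1": "PER", ":e2": "GPE"}
event_arguments = [(":ev1", "PLACE", ":e1"), (":ev1", "PLACE", ":e2")]
print(find_type_conflicts(entity_types, event_arguments))
```

Here the first assertion is flagged because the entity typed as a person is also asserted as the place of an event; how a real system resolves such a conflict (dropping one assertion, re-typing the entity, and so on) is left open.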
Standalone component tasks are offered in the following KBP tracks:
- Entity Discovery and Linking (EDL): The main EDL task is to extract
name and nominal mentions of specific individual person (PER), organization (ORG), geopolitical entity (GPE), location (LOC),
and facility (FAC) entities mentioned in the evaluation document collection, and to link
each mention to its KB node (either a node in the TAC reference KB, or
a newly created NIL node if it doesn't have a corresponding KB
entry). Additionally, the 2017 EDL track includes an EDL pilot (which is not part of the 2017 Cold Start KB task) for 10 new languages.
- Slot Filling (SF): The slot filling task is to search the
document collection to fill in values for specific attributes ("slots") for specific
entities.
- Event: The event track aims to extract information about
events from unstructured text, such that the information would be
suitable as input into a structured KB. The track includes Event
Nugget (EN) tasks to detect and link event nuggets (i.e., mentions of events in text), and Event Argument (EAL)
tasks to extract event arguments and link arguments that belong to the
same event. Additionally, an event (nugget) sequencing pilot is offered in English (but is not part of the 2017 Cold Start task).
- Belief and Sentiment (BeSt): The Belief and Sentiment track
detects belief and sentiment of an entity toward another entity,
relation, or event.
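As a rough illustration of the linking step in the EDL task listed above (link each mention to a KB node, or to a newly created NIL node when no KB entry exists), the following sketch uses exact lowercased string matching as a toy stand-in for real entity linking. The function name, the KB index, the node ids, and the NIL id format are all hypothetical.

```python
def link_mentions(mentions, kb_index):
    """Map each mention string to a KB node id or a NIL cluster id.

    kb_index: dict mapping a lowercased name to a reference-KB node id
    (a toy stand-in for a real KB lookup).
    """
    nil_clusters = {}  # lowercased name -> NIL id created so far
    links = []
    for mention in mentions:
        key = mention.strip().lower()
        if key in kb_index:
            links.append((mention, kb_index[key]))
        else:
            # The first unseen name opens a new NIL cluster; repeats reuse it.
            nil_id = nil_clusters.setdefault(
                key, f"NIL{len(nil_clusters) + 1:04d}"
            )
            links.append((mention, nil_id))
    return links


# Hypothetical KB index and mentions.
kb_index = {"nist": "E001"}
print(link_mentions(["NIST", "Acme Corp", "acme corp"], kb_index))
```

Mentions that share a normalized name fall into the same NIL cluster, which mirrors the requirement that mentions without a KB entry still be grouped by entity; actual systems would rely on far richer evidence than string identity.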
The end-to-end Cold Start KB Construction task is to build a KB from scratch,
using a predefined KB schema and a collection of unstructured
text. The KB schema for Cold Start 2017 consists of:
- Entities: entities and entity mentions as defined in the main task of the EDL track.
- SF Relations: entity attributes ("slots") as defined in the SF track.
- Events: events (hoppers) and event nuggets, as defined in the EN track.
- Event Arguments: event arguments, as defined in the EAL track.
- Sentiment: Sentiment from a source entity toward a target entity, as defined in the BeSt track.
Except for the 10-language EDL pilot and the English EN
sequencing pilot, all KBP 2017 tasks will be trilingual (English,
Chinese, and Spanish). Participants are encouraged to participate in
all three languages for each task, although diagnostic scores will
also be provided for each individual language. The source corpus for
the trilingual KBP 2017 tasks will comprise the same set of up to 90K
English, Chinese, and Spanish newswire and discussion forum documents.
Of these, approximately 500 "core" documents will be annotated with
entities, relations, and events (ERE) according to the guidelines for
ERE, and will be used to evaluate system responses for the
trilingual EDL, Event Nugget, Event Argument, and Belief and Sentiment
tasks. Post-submission assessment procedures will be used to evaluate
system responses for the Cold Start KB and SF tasks.
Systems in the Cold Start KB track and the SF track will operate on
the 90K documents in the source corpus, while systems in the
trilingual EDL track, Event track, and BeSt track will operate on only
the 500 "core" documents.
- Cold Start KB (CSKB)
The Cold Start KB track builds a knowledge base from scratch using
a given document collection and a predefined KB schema.
Track home page: http://tac.nist.gov/2017/KBP/ColdStart/
Track coordinators: Hoa Dang (hoa.dang@nist.gov) and Shahzad Rajput (shahzad.rajput@nist.gov)
- Entity Discovery and Linking (EDL)
The Entity Discovery and Linking (EDL) track aims to extract entity mentions from a source collection of textual documents in multiple languages, and link them to a reference knowledge base; an EDL system is also required to cluster mentions for those entities that don't have corresponding KB entries.
Track home page: http://nlp.cs.rpi.edu/kbp/2017/
Track coordinator: Heng Ji (jih@rpi.edu)
- Slot Filling (SF)
The slot filling task is to search a document collection to fill in values for predefined slots (attributes) for a given entity.
Track home page: http://tac.nist.gov/2017/KBP/ColdStart/
Track coordinators: Hoa Dang (hoa.dang@nist.gov) and Shahzad Rajput (shahzad.rajput@nist.gov)
- Event
The goal of the Event track is to extract information about events such that the information would be suitable as input to a knowledge base. The track includes Event Nugget (EN) tasks to detect and link event nuggets, and Event Argument (EAL) tasks to extract event arguments and link arguments that belong to the same event.
Track home page: http://tac.nist.gov/2017/KBP/Event/
Event Nugget coordinators: Eduard Hovy (ehovy@andrew.cmu.edu) and Teruko Mitamura (teruko@cs.cmu.edu)
Event Argument coordinator: Marjorie Freedman (mrf@isi.edu)
- Belief/Sentiment (BeSt)
The Belief and Sentiment track detects belief and sentiment of an entity toward another entity, relation, or event.
Track home page: http://www.cs.columbia.edu/~rambow/best-eval-2017/
Track coordinator: Owen Rambow (rambow@ccls.columbia.edu)
Preliminary TAC KBP 2017 Schedule |
February 20 | Track registration opens |
June 15 | Deadline for registration for track participation |
June - October | Track evaluation windows (varies by track) |
June 29 - July 27 | Cold Start KB Evaluation Window |
July 13-27 | Slot Filling Evaluation Window |
September 25 - October 2 | Trilingual EDL Evaluation Window |
September 25 - October 2 | Event Argument Evaluation Window |
September 25 - October 2 | Event Nugget Detection and Coreference Evaluation Window |
October 3-10 | Event Sequencing Evaluation Window |
October 3-10 | Belief and Sentiment Evaluation Window |
October 5-23 | EDL Pilot Evaluation Window |
By mid October | Release of individual evaluated results to participants (most tracks) |
October 15 | Deadline for short system descriptions |
October 15 | Deadline for workshop presentation proposals |
October 20 | Notification of acceptance of presentation proposals |
November 1 | Deadline for system reports (workshop notebook version) |
November 13-14 | TAC 2017 workshop in Gaithersburg, Maryland, USA |
February 28, 2018 | Deadline for system reports (final proceedings version) |
Organizing Committee
Hoa Trang Dang (U.S. National Institute of Standards and Technology)
Jason Duncan (MITRE)
Joe Ellis (Linguistic Data Consortium)
Marjorie Freedman (ISI)
Ralph Grishman (New York University)
Eduard Hovy (Carnegie Mellon University)
Heng Ji (Rensselaer Polytechnic Institute)
James Mayfield (Johns Hopkins University)
Teruko Mitamura (Carnegie Mellon University)
Boyan Onyshkevych (U.S. Department of Defense)
Shahzad Rajput (U.S. National Institute of Standards and Technology)
Owen Rambow (Columbia University)
Zhiyi Song (Linguistic Data Consortium)
Stephanie Strassel (Linguistic Data Consortium)