
Streaming Multimedia Knowledge Base Population (SM-KBP) 2018

Evaluation: February-November, 2018
Workshop: November 13-14, 2018

Conducted by:
U.S. National Institute of Standards and Technology (NIST)

With support from:
U.S. Department of Defense


In scenarios such as natural disasters or international conflicts, analysts and the public are often confronted with a variety of information coming through multiple media sources. There is a need for technologies to analyze and extract knowledge from multimedia to develop and maintain an understanding of events, situations, and trends as they unfold around the world.

The goal of DARPA's Active Interpretation of Disparate Alternatives (AIDA) Program is to develop a multi-hypothesis semantic engine that generates explicit alternative interpretations of events, situations, and trends from a variety of unstructured sources, for use in noisy, conflicting, and potentially deceptive information environments. This engine must be capable of mapping knowledge elements (KE) automatically derived from multiple media sources into a common semantic representation, aggregating information derived from those sources, and generating and exploring multiple hypotheses about the events, situations, and trends of interest. This engine must establish confidence measures for the derived knowledge and hypotheses, based on the accuracy of the analysis and the coherence of the semantic representation of each hypothesis.

The streaming multimedia KBP track will assess the performance of systems that have been developed in support of AIDA program goals. Systems will be asked to extract knowledge elements from a stream of heterogeneous documents containing multilingual multimedia sources including text, speech, images, videos, and PDF files; aggregate the knowledge elements from multiple documents without access to the raw documents themselves (maintaining multiple interpretations and confidence values for KEs extracted or inferred from the documents); and develop semantically coherent hypotheses, each of which represents an interpretation of the document stream.

The SM-KBP tasks will be run at TAC/TRECVID 2018 as pilot evaluations whose goals are to test evaluation protocols and metrics and to learn lessons that can inform how subsequent evaluations will be structured. The purpose of the pilot is to exercise the evaluation infrastructure, not to test systems' performance. As such, the pilot is intended to be flexible while still following the protocol of the official evaluation. It is expected that the SM-KBP track will be run for 3 evaluation cycles after the initial pilot evaluation:

  • Pilot Evaluation: August-September 2018
  • Evaluation 1 (short cycle): March-April 2019
  • Evaluation 2 (18-month cycle): August-September 2020
  • Evaluation 3 (18-month cycle): March-April 2022


Each SM-KBP evaluation covers a small set of topics for a single scenario. There will be a new scenario and related set of languages for each evaluation cycle. For the 2018 pilot, the scenario is the Russian/Ukrainian conflict (2014-2015) and the scenario languages are English, Russian, and Ukrainian. Early in the evaluation cycle, all task participants will receive an ontology of entities, events, event arguments, relations, and SEC (sentiment, emotion, and cognitive state), defining the KEs that are in scope for the evaluation tasks. For the 2018 pilot, the ontology will be an extension of the DEFT Rich ERE entity, relation, and event types; a different (expanded) ontology is expected for subsequent evaluation cycles.

The SM-KBP track has three main evaluation tasks:

  • Task 1: Extraction of KEs and KE mentions from a stream of multimedia documents, including linking of mentions of the same KE within each document to produce a document-level knowledge graph for each document. Extraction and linking will be conditioned on two kinds of contexts:
    • a) generic background context
    • b) generic background context plus a "what if" hypothesis
  • Task 2: Construction of a KB by aggregating and linking document-level knowledge graphs produced by one or more Task 1 teams.
  • Task 3: Generation of hypotheses from KBs produced by one or more Task 2 teams.

Tasks 1a and 2 are open to all researchers who find the evaluation tasks of interest. Tasks 1b and 3 are limited to teams that are part of DARPA's AIDA program.

The source corpus for the pilot will comprise approximately 90K English, Russian, and Ukrainian documents. Systems in Task 1 will operate on the 90K documents in the source corpus; systems in Task 2 will operate on the output of one or more systems from Task 1a and will not have access to the source documents; systems in Task 3 will operate on the output of one or more systems from Task 2, and also will not have access to the source documents. There are many use cases in which analytic engines cannot have access to original documents; for example, provenance for an assertion may have never been recorded in the first place, or provenance may need to be redacted for legal or security reasons.
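The data flow described above, in which a Task 2 system sees only the output of Task 1 and never the source documents, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `KnowledgeElement` and `DocumentGraph` types, the `aggregate` function, the naive linking key (type plus lowercased name), and the noisy-OR rule for combining confidences are hypothetical and are not the track's actual output format or scoring method.

```python
from collections import defaultdict
from dataclasses import dataclass, field


# Hypothetical types for illustration only; the real formats are defined
# by the evaluation's ontology and task specifications.
@dataclass(frozen=True)
class KnowledgeElement:
    doc_id: str        # identifier of the originating document (a handle, not its content)
    ke_type: str       # illustrative type label
    name: str          # canonical string chosen by the Task 1 system
    confidence: float  # Task 1 system's confidence in [0, 1]


@dataclass
class DocumentGraph:
    doc_id: str
    elements: list = field(default_factory=list)


def aggregate(graphs):
    """Merge KEs across document-level graphs into one KB (Task 2 style).

    Only the graphs are read, never the source documents. KEs are linked
    by a naive key (type, lowercased name) and their confidences are
    combined with a noisy-OR; both choices are assumptions of this sketch.
    """
    buckets = defaultdict(list)
    for graph in graphs:
        for ke in graph.elements:
            buckets[(ke.ke_type, ke.name.lower())].append(ke)

    kb = {}
    for key, kes in buckets.items():
        miss = 1.0  # probability that every supporting extraction is wrong
        for ke in kes:
            miss *= 1.0 - ke.confidence
        kb[key] = {
            "confidence": 1.0 - miss,
            "docs": sorted({ke.doc_id for ke in kes}),
        }
    return kb


# Two toy document-level graphs standing in for Task 1 output.
g1 = DocumentGraph("doc1", [KnowledgeElement("doc1", "GPE", "Kyiv", 0.8)])
g2 = DocumentGraph("doc2", [
    KnowledgeElement("doc2", "GPE", "kyiv", 0.5),
    KnowledgeElement("doc2", "ORG", "NIST", 0.9),
])
kb = aggregate([g1, g2])
```

In this toy run the two "Kyiv" mentions from different documents collapse into a single KB entry supported by both documents, while the remaining KE stays separate. A real Task 2 system would need a far more robust linking strategy and would have to keep multiple competing interpretations rather than a single merged entry.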

Novel characteristics of the open evaluation tasks (Task 1a and Task 2) include:

  • Task 1: Multimodal multilingual extraction and linking of information within a document
  • Tasks 1 and 2: Processing of streaming input
  • Tasks 1 and 2: Confidence estimation and maintenance of multiple possible interpretations
  • Task 2: Cross-document aggregation and linking of information without access to original documents

Novel characteristics of the AIDA program-internal evaluation tasks (Task 1b and Task 3) include:

  • Document-level extraction and linking conditioned on "feedback hypotheses" providing context.
  • Generation of semantically coherent hypotheses, each representing a different interpretation of the document stream.


    Preliminary TAC SM-KBP 2018 Schedule

    July 15                      Deadline for registration for track participation
    August 12 - August 19        Task 1a Evaluation Window
    August 20 - September 3      Task 1b Evaluation Window
    August 20 - September 3      Task 2 Evaluation Window
    September 4 - September 7    Queries applied to output of Tasks 1 and 2
    September 4 - September 10   Task 3 Evaluation Window
    September 11 - September 13  Queries applied to output of Task 3
    Mid October                  Release of individual evaluated results to participants (most tasks)
    October 15                   Deadline for short system descriptions
    October 15                   Deadline for workshop presentation proposals
    October 20                   Notification of acceptance of presentation proposals
    November 1                   Deadline for system reports (workshop notebook version)
    November 13-14               TAC 2018 workshop in Gaithersburg, Maryland, USA
    February 15, 2019            Deadline for system reports (final proceedings version)

Mailing List

Subscribe to the sm-kbp@list.nist.gov mailing list if you are not already subscribed. Registering to participate in a track does not automatically add you to the mailing list. If you were previously subscribed, you do not have to re-subscribe: the mailing list is for anyone interested in SM-KBP, not only for SM-KBP participants, and thus carries over from year to year.

Organizing Committee

Hoa Trang Dang (U.S. National Institute of Standards and Technology)
Oleg Aulov (U.S. National Institute of Standards and Technology)
George Awad (U.S. National Institute of Standards and Technology)
Asad Butt (U.S. National Institute of Standards and Technology)
Shahzad Rajput (U.S. National Institute of Standards and Technology)
Jason Duncan (MITRE)
Boyan Onyshkevych (U.S. Department of Defense)
Stephanie Strassel (Linguistic Data Consortium)
Jennifer Tracey (Linguistic Data Consortium)

NIST is an agency of the
U.S. Department of Commerce


Last updated: Monday, 14-May-2018 11:33:15 MDT
Comments to: tac-web@nist.gov