PASCAL Recognizing Textual Entailment Challenge (RTE-6) at TAC 2010

Track coordinators:
- Luisa Bentivogli, CELCT and FBK, Italy ([email protected])
- Danilo Giampiccolo, CELCT, Italy ([email protected])
RTE-6 Call for Participation
Past RTE Data
RTE Resource Pool

RTE-6 Main and Novelty Detection Task Guidelines
RTE-6 Main and Novelty Detection Sample Topic (superseded by Development Set)
RTE-6 Main and Novelty Detection Development Set (Available as Past TAC Data)
RTE-6 Main Development Set disagreements (Available as Past TAC Data)
RTE-6 Main and Novelty Detection Test Set (Available as Past TAC Data)
RTE-6 Main and Novelty Detection Tasks Annotated Test Set (Available as Past TAC Data)
RTE-6 Main/Novelty Evaluation Results (Available as Past TAC Data)

RTE-6 KBP Validation Pilot Guidelines
RTE-6 KBP Validation Pilot Development Data (LDC2010E32 distributed by the LDC)
RTE-6 KBP Validation Pilot Test Data (LDC2010E56 distributed by the LDC)
RTE-6 KBP Validation Pilot Annotated Test Data (Available as Past TAC Data)

Introduction

Given two text fragments called 'Text' and 'Hypothesis', Textual Entailment Recognition is the task of determining whether the meaning of the Hypothesis is entailed by (can be inferred from) the Text. The goal of the first RTE Challenge was to provide the NLP community with a benchmark to test progress in recognizing textual entailment, and to compare the achievements of different groups. Since their inception in 2004, the PASCAL RTE Challenges have promoted research in textual entailment recognition as a generic task that captures major semantic inference needs across many natural language processing applications, such as Question Answering (QA), Information Retrieval (IR), Information Extraction (IE), and multi-document Summarization.

After the first three highly successful PASCAL RTE Challenges, RTE became a track at the 2008 Text Analysis Conference, which brought it together with communities working on NLP applications. The interaction has provided the opportunity to apply RTE systems to specific applications and to move the RTE task towards more realistic application scenarios.

RTE-6 Tasks

The RTE-6 tasks focus on recognizing textual entailment in two application settings: Summarization and Knowledge Base Population.

- Main Task (Summarization scenario): Given a corpus and a set of "candidate" sentences retrieved by Lucene from that corpus, RTE systems are required to identify all the sentences from among the candidate sentences that entail a given Hypothesis. The RTE-6 Main Task is based on the TAC Update Summarization Task. In the Update Summarization Task, each topic contains two sets of documents ("A" and "B"), where all the "A" documents chronologically precede all the "B" documents. An RTE-6 Main Task "corpus" consists of 10 "A" documents, while Hypotheses are taken from sentences in the "B" documents.
- KBP Validation Pilot (Knowledge Base Population scenario): Based on the TAC Knowledge Base Population (KBP) Slot-Filling task, the new KBP validation pilot task is to determine whether a given relation (Hypothesis) is supported in an associated document (Text). Each slot fill that is proposed by a system for the KBP Slot-Filling task would create one evaluation item for the RTE-KBP Validation Pilot: The Hypothesis would be a simple sentence created from the slot fill, while the Text would be the source document that was cited as supporting the slot fill.
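
To make the shape of a KBP Validation Pilot item concrete, the following Python sketch builds a Hypothesis sentence from a slot fill and pairs it with the ID of the supporting document. It is only an illustration: the template wording, the `ValidationItem` class, the `make_item` function, and the example entity and document ID are all assumptions, and the actual procedure for creating Hypotheses is the one given in the pilot guidelines.

```python
from dataclasses import dataclass

# Hypothetical templates, one per slot name. The wording actually used to
# build RTE-6 Hypotheses is defined by the pilot guidelines, not here.
TEMPLATES = {
    "per:date_of_birth": "{entity} was born on {value}.",
    "org:founded_by": "{entity} was founded by {value}.",
}


@dataclass
class ValidationItem:
    hypothesis: str   # simple sentence created from the slot fill
    text_doc_id: str  # ID of the document cited as supporting the slot fill


def make_item(entity: str, slot: str, value: str, doc_id: str) -> ValidationItem:
    """Turn one proposed slot fill into one (Hypothesis, Text) evaluation item."""
    template = TEMPLATES[slot]  # raises KeyError for slots without a template
    hypothesis = template.format(entity=entity, value=value)
    return ValidationItem(hypothesis=hypothesis, text_doc_id=doc_id)


# Illustrative input only; not taken from the pilot data.
item = make_item("Acme Corp", "org:founded_by", "Jane Doe", "DOC-001")
print(item.hypothesis)
```

A validation system would then judge whether the document identified by `text_doc_id` supports `hypothesis`.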

RTE-6 does not include the traditional RTE Main Task that was carried out in the first five RTE challenges; i.e., there will be no task to make entailment judgments over isolated T-H pairs drawn from multiple applications. Instead, the new Main Task for RTE-6 is based on only the Summarization application setting. The RTE-6 Main Task is similar to the RTE-5 Search Pilot, with the following changes:

- RTE-6 hypotheses will be taken from sentences in the "B" documents, rather than from Summary Content Units created from human-authored summaries of the "A" documents.
- Rather than searching for entailing sentences from the entire corpus, a Lucene baseline will first be run to retrieve a smaller number of candidate sentences for entailment.
- The exploratory effort on resource evaluation will continue through ablation tests for the new RTE-6 Main Task.
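
The input/output shape of the Main Task can be sketched in a few lines of Python: for one Hypothesis and its list of candidate sentences, a system returns the subset it judges to be entailing. The sketch below uses plain word overlap as a stand-in decision rule. That rule, the function names, the `(sentence_id, sentence)` pair format, and the 0.8 threshold are all assumptions made for illustration; word overlap is a crude proxy, not a real entailment method, and the official data format is the one described in the task guidelines.

```python
import re


def tokens(sentence: str) -> set[str]:
    """Lowercase a sentence and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def entailing_candidates(hypothesis, candidates, threshold=0.8):
    """Return the IDs of candidates that cover enough of the Hypothesis.

    `candidates` is a list of (sentence_id, sentence) pairs. A candidate is
    selected when at least `threshold` of the Hypothesis tokens also occur
    in it. This is a lexical-overlap heuristic, not true entailment.
    """
    hyp = tokens(hypothesis)
    if not hyp:
        return []
    selected = []
    for sent_id, sentence in candidates:
        coverage = len(hyp & tokens(sentence)) / len(hyp)
        if coverage >= threshold:
            selected.append(sent_id)
    return selected


# Illustrative sentences only; not taken from the RTE-6 data.
candidates = [
    ("s1", "The company opened a new plant in Ohio last year."),
    ("s2", "Officials declined to comment."),
]
print(entailing_candidates("The company opened a plant in Ohio.", candidates))
```

Because the candidates have already been narrowed down by the Lucene baseline, a system only needs to score this short list for each Hypothesis instead of every sentence in the corpus.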

Schedule

RTE-6 Schedule
April 30	Main Task: Release of Development Set
May 10	KBP Validation Pilot: Release of Development Set
May 21	Deadline for TAC 2010 track registration
August 17	KBP Validation Pilot: Release of Test Set
August 30	Main Task: Release of Test Set
September 9	Main Task: Deadline for task submissions
September 16	Main Task: Release of individual evaluated results
September 17	KBP Validation Pilot: Deadline for task submissions
September 24	KBP Validation Pilot: Release of individual evaluated results
September 26	Deadline for TAC 2010 workshop presentation proposals
September 30	Main Task: Deadline for ablation test submissions
October 7	Main Task: Release of individual ablation test results
October 27	Deadline for systems' reports
November 15-16	TAC 2010 Workshop

Mailing List

The mailing list for the RTE Track is [email protected]. The list is used to discuss and define the task guidelines for the track, as well as for general discussion related to textual entailment and its evaluation. To subscribe, send a message to [email protected] such that the body consists of the line:
subscribe rte <FirstName> <LastName>
In order for your messages to get posted to the list, you must send them from the email address used when you subscribed to the list. To unsubscribe, send a message from the subscribed email address to [email protected] such that the body consists of the line:
unsubscribe rte
For additional information on how to use mailing lists hosted at NIST, send a message to [email protected] such that the body consists of the line:
HELP

Organizing Committee

