RTE-5: Call for Participation

FIFTH RECOGNIZING TEXTUAL ENTAILMENT CHALLENGE
at TAC 2009 (http://www.nist.gov/tac/2009/RTE/)

Since 2004, RTE Challenges have promoted research in textual entailment recognition as a task that captures major semantic inference needs across many natural language processing applications, such as Question Answering (QA), Information Retrieval (IR), Information Extraction (IE), and multi-document summarization. Over the years the encouraging progress, in terms of both the number of researchers involved and results achieved, has spurred the community to further investigate the phenomena involved by adding innovations to the challenge every year and moving it toward more realistic scenarios.

Capitalizing on the favorable response obtained so far, the RTE Organizing Committee is glad to launch the Fifth Recognizing Textual Entailment Challenge, proposed for the second year as a track of the Text Analysis Conference (TAC).

Organizations interested in participating in the RTE-5 Challenge are invited to submit a track registration form by May 31, 2009, at the TAC 2009 web site: http://www.nist.gov/tac/2009/

WHAT IS NEW IN RTE-5

1) A Textual Entailment Search Pilot task will be proposed, based on the data used in the Summarization task at TAC 2008/2009.

2) The main RTE-5 task will be similar to the RTE-4 task, with the following changes:
* Texts will be longer, usually corresponding to a portion of the source document that a reader would naturally select, such as a paragraph or a group of related sentences.
* Texts will come from a variety of sources and will not be edited from their source documents. Thus, systems will be asked to handle real text that may include typographical errors and ungrammatical sentences.
* A development set will be released.
* The textual entailment recognition task will be based on only three application settings: QA, IE, and IR.
* Mandatory ablation tests for major knowledge resources will be required for those systems that employ these resources.

MAIN TASK

RTE is the task of recognizing that the meaning of one text, termed Hypothesis (H), can be inferred from the content of another, termed Text (T). Given a set of pairs of T's and H's as input, the systems must recognize whether each T entails the corresponding H, deciding whether:
* T entails H
* T contradicts H, or shows it false
* the veracity of H is unknown on the basis of T.

The RTE-5 main task will consist of two sub-tasks:

1) The three-way RTE task, where the system must decide whether:
* T entails H - in which case the pair will be marked as ENTAILMENT
* T contradicts H - in which case the pair will be marked as CONTRADICTION
* The truth of H cannot be determined on the basis of T - in which case the pair will be marked as UNKNOWN

2) The two-way RTE task, where the system must decide whether:
* T entails H - in which case the pair will be marked as ENTAILMENT
* T does not entail H - in which case the pair will be marked as NO ENTAILMENT

Systems can participate in either or both sub-tasks. System results will be compared to a human-annotated gold-standard test corpus. Examples of three-way judgments are given at the end of this document.

As in previous challenges, the test data sets will be based on multiple data sources, intended to be representative of typical problems encountered by applied systems.
Specifically, data types corresponding to the following application areas will be used:

1) Question Answering (QA): simulating a QA scenario in which the hypothesized answer has to be inferred from the candidate text passage
2) Information Retrieval (IR): choosing propositional queries as hypotheses, and proposing relevant and irrelevant sentences retrieved by IR systems as texts
3) Information Extraction/Relation Extraction (IE): generating T-H pairs, picking positive and negative examples of typical outputs of IE systems

More details are provided in the guidelines for participants available at the RTE-5 website (http://www.nist.gov/tac/2009/RTE/).

PILOT TASK

The Textual Entailment Search Pilot, representing a first step towards more realistic scenarios in the Textual Entailment Recognition task, is aimed at:

1) producing a data set which reflects the natural distribution of entailment in a corpus and presents problems that can arise when detecting textual entailment in a natural setting
2) analyzing the potential impact of textual entailment recognition on a real NLP application task, namely the Summarization task.

The Textual Entailment Search task consists in finding all the sentences in a set of documents that entail a given Hypothesis. The task is situated in the Summarization application setting, where the Hypothesis (H) is taken from a Summary Content Unit (SCU), and the systems must find all the entailing sentences (Ts) in a corpus of 10 newswire documents about a common topic.

The following example is taken from the development set:

Russia requested international help to rescue the AS-28.

At Moscow's request, Japan has dispatched four naval vessels to help rescue a Russian submarine snagged on the floor of the Pacific Ocean, but the ships aren't expected to arrive at the scene until early next week.
Navy spokesman Capt. Igor Dygalo said the U.S. Navy has also been asked for assistance, the RIA-Novosti news agency reported.
Russian authorities hope British and American unmanned submersibles, sent after a Russian plea for help, can cut the submarine loose.

As can be seen from the example above, in the Entailment Search task both Text and Hypothesis are to be interpreted in the context of the corpus and contain explicit and implicit references to entities, events, dates, places, situations, etc. pertaining to the topic.

As this Pilot requires the retrieval of entailing sentences only, contradicting sentences are not to be taken into account, and thus the entailment judgment may be seen as a two-way decision between "yes" and "no" entailment.

The guidelines for participants, together with one topic taken from the development set, are available at the RTE-5 website (http://www.nist.gov/tac/2009/RTE/).

THE RTE RESOURCE POOL AT ACLwiki
(http://www.aclweb.org/aclwiki/index.php?title=Textual_Entailment_Resource_Pool)

The RTE Resource Pool, set up for the first time during RTE-3, serves as a portal and forum for publicizing and tracking resources, and reporting on their use. All the RTE participants and other members of the NLP community who develop or use relevant resources are encouraged to contribute to this important resource.

This year we are also planning to update the RTE Resource Pool and extend it with a section dedicated specifically to the knowledge resources used. The new page will mainly contain a list of the "standard" RTE resources, that is, those most widely selected and used in the design of RTE systems during the RTE challenges held so far, together with links to the locations where they are made available. Moreover, a shortlist of the "top" resources will be provided, as well as some results of the data analyses conducted so far on the resources presented on the page.

TENTATIVE SCHEDULE

Pilot Development Set release: 3 April 2009
Main Development Set release: 29 May 2009
Track registration deadline: 31 May 2009
Main and Pilot Test Set release: 2 September 2009
Submissions: 9 September 2009
Release of individual evaluated results: 18 September 2009
TAC 2009 Workshop: 16-17 November 2009

TRACK COORDINATORS AND ORGANIZERS:

Luisa Bentivogli, CELCT and FBK, Italy (Track coordinator, bentivo@fbk.it)
Ido Dagan, Bar Ilan University, Israel
Hoa Trang Dang, NIST, USA
Danilo Giampiccolo, CELCT, Italy (Track coordinator, giampiccolo@celct.it)
Bernardo Magnini, FBK, Italy

----------

Examples of main task three-way judgments taken from the RTE-4 test set (downloadable from http://www.nist.gov/tac/data/):

- A 66-year-old man has been sentenced to life in prison by a French court for murdering seven girls and young women. Michel Fourniret, dubbed the "Ogre of the Ardennes", had admitted kidnapping and killing his victims between 1987 and 2001. Michel Fourniret was sentenced to life imprisonment.
- Syrian officials have said the bombed building was an empty military warehouse. They have refused to let nuclear inspectors visit the location, which was bulldozed after the bombing. Nuclear inspectors are to visit Syria.
- British and American diplomats were today attacked as they tried to investigate political violence in Zimbabwe, the US Embassy in Harare has said. Diplomats were detained in Zimbabwe.
- African Union leaders ended their summit in Egypt yesterday refusing to condemn President Mugabe, cementing his hold on power even as they urged the establishment of a national unity government in Zimbabwe. African Union leaders had a meeting in Egypt.
- Adopting just a couple of elements of the Mediterranean diet could cut the risk of cancer by 12%, say scientists. A study of 26,000 Greek people found just using more olive oil alone cut the risk by 9%. Mediterranean foods increase the risk of cancer because of olive oil.
- Speaking at a press conference held by video link from Lebanon, Shiekh Hassan Nasrallah said that the Shia Islamist group had also agreed to supply Israel with information on the airman Ron Arad, who went missing in 1986. Shiekh Hassan Nasrallah is from Lebanon.
- The acceleration of the shrinking of Arctic ice continues to threaten the survival of these animals. Scientists predict that the numbers of polar bears will fall by about a third, if sea ice in the Arctic continues to melt at its present rate. The level of Arctic ice will fall by a third.
- Much of the world has moved toward democracy and freedom, but China hasn't moved much and Russia seems headed in the opposite direction. Of the two, China is probably easier to deal with. It appears to have a collective leadership, which gives a certain continuity to its policy. China and Russia will move toward democracy.
- Political analyst Earl Ofari Hutchinson says Barack Obama has to capture the votes of Latinos for his Democratic presidential bid in the March 4 Texas primary. Latino voters are crucial for Obama in Texas.
- A new report by the International Federation of Journalists (IFJ) documents 129 cases where media workers have been killed because of their work during 2004. They expect the number to increase as more information reaches them. This could make 2004 the deadliest year ever. 49 casualties (close to 40%) occurred in Iraq, making it by far the deadliest country for journalists. At least 20 of those appeared to be cases where journalists were directly targeted because of their profession. 49 media workers were killed in Iraq in 2004.
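
----------

For participants planning to enter both main sub-tasks, the relationship between the three-way and two-way judgments can be sketched as below. This is only an illustrative sketch, not official scoring code. It assumes that CONTRADICTION and UNKNOWN both correspond to NO ENTAILMENT in the two-way task (inferred from the task definitions, not stated as a rule), and it assumes simple accuracy against the gold standard as the measure; the official evaluation measures are those given in the guidelines for participants. The function names and the sample labels are hypothetical.

```python
from typing import List

THREE_WAY_LABELS = {"ENTAILMENT", "CONTRADICTION", "UNKNOWN"}


def to_two_way(label: str) -> str:
    """Collapse a three-way judgment into a two-way judgment.

    Assumption: CONTRADICTION and UNKNOWN both count as NO ENTAILMENT.
    """
    if label not in THREE_WAY_LABELS:
        raise ValueError(f"unexpected three-way label: {label!r}")
    return "ENTAILMENT" if label == "ENTAILMENT" else "NO ENTAILMENT"


def accuracy(predicted: List[str], gold: List[str]) -> float:
    """Fraction of pairs whose predicted judgment matches the gold judgment."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot score an empty set of pairs")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Hypothetical judgments for four T-H pairs (not taken from any RTE data set).
gold = ["ENTAILMENT", "CONTRADICTION", "UNKNOWN", "ENTAILMENT"]
pred = ["ENTAILMENT", "UNKNOWN", "UNKNOWN", "CONTRADICTION"]

three_way_acc = accuracy(pred, gold)
two_way_acc = accuracy(
    [to_two_way(p) for p in pred],
    [to_two_way(g) for g in gold],
)

print(f"three-way accuracy: {three_way_acc:.2f}")
print(f"two-way accuracy:   {two_way_acc:.2f}")
```

Under these assumptions, confusing CONTRADICTION with UNKNOWN counts as an error in the three-way setting but not in the two-way setting, so the same output can score differently in the two sub-tasks.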