4TH TEXTUAL ENTAILMENT CHALLENGE AT TAC 2008
(http://www.nist.gov/tac/2008/rte/index.html)

The three Recognizing Textual Entailment (RTE) challenges held so far (see the RTE websites: http://www.pascal-network.org/Challenges/RTE/; http://www.pascal-network.org/Challenges/RTE2/; http://www.pascal-network.org/Challenges/RTE3/) have shown that interest in Textual Entailment research has grown steadily over the years. The RTE organizing committee is now glad to announce the 4th round of the Recognizing Textual Entailment (RTE) Challenge, organized as a track within NIST's new Text Analysis Conference (TAC, http://www.nist.gov/tac/).

WHAT IS NEW IN RTE4

Although RTE4 will maintain the basic structure of the previous challenges, some important changes will be introduced in order to make the task more stimulating and bring research in textual entailment to the next level. In particular:

- the task will include the 3-way classification task piloted in RTE3 (see http://nlp.stanford.edu/RTE3-pilot/), allowing systems to make a further distinction between hypotheses unknown on the basis of the texts and hypotheses contradicted or proved false by the texts (submitting 2-way classifications will still be possible, and 2-way results will be announced as well).
- the number of pairs will be increased to 300 in two of the application settings, namely Information Extraction (IE) and Information Retrieval (IR), as the analysis of the results of previous challenges has shown them to be more difficult. The test set will consist of 1000 pairs (300 each for IE and IR, 200 each for SUM and QA).
- there will be no development data set for RTE4, and participants are invited to use the past RTE data for training.
- in 2008 the RTE challenge will be carried out for the first time as a track at the Text Analysis Conference, organised by NIST (more details at http://www.nist.gov/tac/).

Registration is open until June 27, 2008, at the TAC 2008 RTE Track website (http://www.nist.gov/tac/2008/rte/index.html).

TASK AND DATA DESCRIPTION

RTE is the task of recognizing that the meaning of one text, termed H(ypothesis), can be inferred from the content of another, termed T(ext). Given a set of T-H pairs as input, the systems must decide, for each pair, whether:

- T entails H
- T contradicts H, or shows it false
- the veracity of H is unknown on the basis of T.

System results will be compared to a human-annotated gold-standard test corpus. Examples of three-way judgments taken from last year's pilot task are given at the bottom of this message.

As in previous challenges, the test data sets will be based on multiple data sources, intended to be representative of typical problems encountered by applied systems. Specifically, data types corresponding to the following application areas will be used:

- Question Answering (QA): simulating a QA scenario in which the hypothesized answer has to be inferred from the candidate text passage
- Information Retrieval (IR): choosing propositional queries as hypotheses, and proposing relevant and irrelevant sentences retrieved by IR systems as texts
- Information Extraction/Relation Extraction (IE): generating T-H pairs, picking positive and negative examples of typical outputs of IE systems
- Summarization (SUM): converting sentence pairs produced by multi-document text summarization systems into T-H pairs

More details can be found at the RTE-3 website (http://www.pascal-network.org/Challenges/RTE3/). The guidelines for participants will be available at the track website shortly.

THE RTE RESOURCE POOL AT ACLwiki
(http://aclweb.org/aclwiki/index.php?title=Recognizing_Textual_Entailment)

The RTE Resource Pool, set up for the first time during RTE3, serves as a portal and forum for publicizing and tracking resources, and reporting on their use.
RTE participants and other members of the NLP community who develop or use relevant resources are encouraged to contribute to this important resource.

TENTATIVE SCHEDULE

Registration deadline: 27 June 2008
Test Set Release: 2 September 2008
Submissions: 9 September 2008
Release of individual evaluated results: 12 September 2008
Workshop: 17-19 November 2008, at TAC 2008

TRACK COORDINATORS AND ORGANIZERS:

Danilo Giampiccolo, CELCT (Trento), Italy (Coordinator, giampiccolo@celct.it)
Hoa Dang, NIST, USA (Coordinator, hoa.dang@nist.gov)
Ido Dagan, Bar Ilan University, Israel
Bill Dolan, Microsoft Research, USA
Bernardo Magnini, FBK-irst (Trento), Italy

SCIENTIFIC COMMITTEE:

Johan Bos, University of Rome "La Sapienza", Italy
Christopher Manning, Stanford, USA
Dan Moldovan, University of Texas at Dallas, USA
Dan Roth, UIUC, USA
Annie Zaenen, Palo Alto Research Center, USA
Fabio Massimo Zanzotto, University of Rome "Tor Vergata", Italy

----------

Examples of three-way judgments taken from last year's pilot task:

T: After his release, the clean-shaven Magdy el-Nashar told reporters outside his home that he had nothing to do with the July 7 transit attacks, which killed 52 people and the four bombers.
H: 52 people and four bombers were killed on July 7.
Entailment: YES

T: Mrs. Bush's approval ratings have remained very high, above 80%, even as her husband's have recently dropped below 50%.
H: 80% approve of Mr. Bush.
Entailment: NO

T: Recent Dakosaurus research comes from a complete skull found in Argentina in 1996, studied by Diego Pol of Ohio State University, Zulma Gasparini of Argentina's National University of La Plata, and their colleagues.
H: A complete Dakosaurus was discovered by Diego Pol.
Entailment: UNKNOWN

T: The British tabloids portrayed Nicholas Leeson as a working-class villain who single-handedly brought down Barings PLC, a 233-year-old London merchant bank that helped finance the Napoleonic wars.
H: Barings was Britain's oldest merchant bank.
Entailment: UNKNOWN

T: The floods were exceptional since they affected an extensive area across Europe from the UK to Spain and as far east as the Black Sea coast. Economic losses amounted to EUR 9.2 bn in Germany, EUR 2.9 bn in Austria and EUR 2.3 bn in the Czech Republic. Total economic damage exceeds EUR 15 bn.
H: Flooding in Europe causes major economic losses.
Entailment: YES

T: Oscar-winning director Franco Zeffirelli has been awarded an honorary knighthood for his "valuable services to British performing arts".
H: Italian director is awarded an honorary Oscar.
Entailment: NO
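
For readers who want to experiment with the three-way labels, the following is a minimal Python sketch of how the six judgments above could be represented, collapsed to the two-way setting, and compared against a gold standard. It is an illustration only: the names Judgment, to_two_way, accuracy, GOLD and PREDICTED are hypothetical, the PREDICTED list is made-up system output, and the use of plain accuracy as the score is an assumption (this announcement says only that results will be compared to a human-annotated gold standard).

```python
from enum import Enum


class Judgment(Enum):
    """Three-way labels, spelled as in the examples above."""
    YES = "YES"          # T entails H
    NO = "NO"            # T contradicts H, or shows it false
    UNKNOWN = "UNKNOWN"  # the veracity of H is unknown on the basis of T


def to_two_way(judgment):
    """Collapse a three-way label: True if T entails H, False otherwise.

    Both NO and UNKNOWN fall into the single "no entailment" class.
    """
    return judgment is Judgment.YES


def accuracy(gold, predicted, two_way=False):
    """Fraction of pairs whose predicted label matches the gold label.

    Assumption: plain accuracy is used here for illustration only.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("at least one pair is required")
    if two_way:
        gold = [to_two_way(g) for g in gold]
        predicted = [to_two_way(p) for p in predicted]
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


# Gold labels of the six example pairs above, in order.
GOLD = [Judgment.YES, Judgment.NO, Judgment.UNKNOWN,
        Judgment.UNKNOWN, Judgment.YES, Judgment.NO]

# Hypothetical system output that confuses NO and UNKNOWN on two pairs.
PREDICTED = [Judgment.YES, Judgment.UNKNOWN, Judgment.UNKNOWN,
             Judgment.NO, Judgment.YES, Judgment.NO]

if __name__ == "__main__":
    print("3-way accuracy:", accuracy(GOLD, PREDICTED))
    print("2-way accuracy:", accuracy(GOLD, PREDICTED, two_way=True))
```

In this sketch the two mistakes are confusions between NO and UNKNOWN, so they count as errors in the three-way setting but disappear once both labels are merged into the single "no entailment" class of the two-way setting.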