NAACL-HLT 2012 Workshop on Evaluation Metrics and System Comparison for Automatic Summarization

June 8, 2012
Montreal, Quebec, Canada

Latest News:

05/07/2012: The workshop program has been posted. A significant part of the workshop will be devoted to presentation and discussion of a new community task on summarization of scientific literature, planned for late 2012 or early 2013.

03/29/2012: Drago Radev of the University of Michigan will give the invited talk on summarization of academic articles.

01/27/2012: Please make your hotel reservations early (now might be a good time), as hotel space will be extremely limited during the days surrounding the workshop due to people coming in for the Montreal Grand Prix Formula 1 race.

01/27/2012: The paper submission deadline has been extended to Sunday, April 1.

Workshop Description

Interest in summarization research has been steadily growing in the past decade, with numerous new methods being proposed for generic and topic-focused summarization of news. Other genres and domains, most notably related to spoken input, have also become well established, including summarization of broadcast news, meetings, spoken conversations and lectures.

At the same time, development of evaluation metrics for summarization and of resources for some genres and domains has lagged behind. Manual evaluation protocols (Pyramid scores for content selection, scores for linguistic quality and overall responsiveness) show considerable disparity between human performance and the performance of systems for multi-document summarization of news; however, ROUGE, the widely used suite for automatic evaluation of content, shows a much narrower difference between machine and human performance and sometimes fails to distinguish the two. For speech summarization, ROUGE likewise does not properly reflect the difference between human and automatic summarizers and, unlike for written news, correlates poorly with manual evaluation protocols. The challenge of automatically evaluating the linguistic quality of summaries has only recently started to be addressed.

It has also become harder to identify the most competitive approaches to summarization. This is partly due to confusing or inconsistent evidence that comes from different test sets. Evaluating the same system configuration against several test sets would allow a fairer comparison between methods and would further stimulate research on automatic evaluation metrics.

For this workshop we seek submissions on a wide range of topics related to evaluation and system comparison in summarization. Topics of interest include:

system comparison on several evaluation datasets; for example, for multi-document summarization we seek systems evaluated on multiple years of DUC/TAC data, with emphasis on measuring statistically significant differences
manual evaluation protocols for summarization in new genres where existing methods may not apply
manual evaluation protocols for abstractive summarization that assess the text-to-text generation capabilities of systems and reward successful generation
automatic evaluation metrics of linguistic quality
automatic evaluation metrics that better reflect the differences between human and machine performance
automatic metrics that significantly outperform ROUGE in content selection evaluation for news summarization
automatic metrics that perform evaluation without the use of human gold standards
analyses of domain and genre differences that expose weaknesses of currently adopted evaluation metrics, and proposals for addressing these weaknesses

Paper Submission

Submissions will consist of regular full papers of up to 8 pages, plus additional pages for references. Shorter papers are also welcome. All papers should be formatted following the NAACL-HLT 2012 guidelines. As the reviewing will be blind, papers must not include the authors' names and affiliations. Furthermore, self-references that reveal the authors' identity, e.g., "We previously showed (Smith, 1991) ...", must be avoided. Instead, use citations such as "Smith previously showed (Smith, 1991) ...".

We encourage individuals who are submitting papers on automatic methods for summarization and evaluation to evaluate their approaches using multiple publicly available datasets, such as those from DUC and the TAC Summarization track.

Both submission and review processes will be handled electronically using the Softconf submission software (https://www.softconf.com/naaclhlt2012/WEAS2012/).

The submission deadline is Sunday, April 1, 2012, at 11:59 PM Pacific Standard Time (GMT-8).

Important Dates

Apr 01: Paper due date (EXTENDED deadline)
Apr 25: Notification of acceptance
May 04: Camera-ready deadline
Jun 08: Workshop at NAACL-HLT 2012

Organizers

John Conroy (IDA Center for Computing Sciences)
Hoa Dang (National Institute of Standards and Technology)
Ani Nenkova (University of Pennsylvania)
Karolina Owczarzak (National Institute of Standards and Technology)

Program Committee

Enrique Amigo (UNED, Madrid)
Giuseppe Carenini (University of British Columbia)
Katja Filippova (Google Research)
George Giannakopoulos (NCSR Demokritos)
Dan Gillick (University of California at Berkeley)
Min-Yen Kan (National University of Singapore)
Guy Lapalme (University of Montreal)
Yang Liu (University of Texas, Dallas)
Annie Louis (University of Pennsylvania)
Kathy McKeown (Columbia University)
Gabriel Murray (University of British Columbia)
Dianne O'Leary (University of Maryland)
Drago Radev (University of Michigan)
Steve Renals (University of Edinburgh)
Horacio Saggion (Universitat Pompeu Fabra)
Judith Schlesinger (IDA Center for Computing Sciences)
Josef Steinberger (European Commission Joint Research Centre)
Stan Szpakowicz (University of Ottawa)
Lucy Vanderwende (Microsoft Research)
Stephen Wan (CSIRO ICT Centre)
Xiaodan Zhu (National Research Council Canada)

Contact

Please contact us by email: [email protected]
