=====================================================================
TAC KBP 2015 EVENT ARGUMENT EXTRACTION AND LINKING EVALUATION RESULTS
=====================================================================

Team ID: CMU_CS_event
Organization: CMU-CS

Run ID: CMU_CS_event1
Did the run access the live Web during the evaluation window: No
Did the run perform any cross-sentence reasoning: No
Did the run use any distributed representations (e.g., of words): Yes
Did the run return meaningful confidence values: Yes

Run ID: CMU_CS_event2
Did the run access the live Web during the evaluation window: No
Did the run perform any cross-sentence reasoning: No
Did the run use any distributed representations (e.g., of words): Yes
Did the run return meaningful confidence values: Yes

*************************************************************

### The following are scores from the TAC 2015 Event Argument Extraction and Linking (EAL) Evaluation.
### For all scoring breakdowns, the summaries report: Precision, Recall, F1, EAArg score, EALink score, and Overall score.
### Details of the scoring and the scoring software can be found on the TAC 2015 EAL webpage.
###
### Scores are reported on the full data set (all_genre) and broken down by genre: discussion forum only (df)
### and newswire only (nw).
###
### The official score (withRealis) incorporates the correctness of the realis distinction (ACTUAL, GENERIC, and OTHER)
### and the correctness of canonical argument string resolution. As a diagnostic, we also report (a) a score
### that ignores the realis distinction (neutralizeRealis) and (b) a score that ignores both the realis distinction
### and canonical argument string resolution (neutralizeRealisCoref).
###
### Scores are reported over two data sets. Dataset1 (all_event_types) consists of 81 documents assessed for the
### full TAC EAL event taxonomy as specified in the 2015 evaluation plan. Dataset2 (restricted_event_types)
### consists of 201 documents assessed for only 6 event types (assertions outside of the 6 were ignored). Dataset2
### includes the documents in Dataset1. Dataset2 was assessed to allow a more in-depth evaluation of event-specific
### performance (and of variance in performance across event types). The 6 event types included in Dataset2 are:
###   - Transaction.Transfer-Money
###   - Movement.Transport-Artifact
###   - Life.Marry
###   - Contact.Meet
###   - Conflict.Demonstrate
###   - Conflict.Attack
###
### One participant (ZJU) submitted a submission with an offset error. This system output was fixed automatically by
### BBN (the organizer) and, separately, by ZJU (the participant). Because the two fixes were different, both sets of
### numbers are reported.
###
### One participant (ver-CMU) participated in a "verification" version of the task. This system took as its input all
### other system submissions. Its input included the ZJU submission with the broken offsets and included neither
### BBN's fix nor ZJU's fix. It is therefore not directly comparable to the other systems, as the task it performed
### differs.
###
### The LDC submission was produced by an LDC annotator who spent 45-60 minutes on the task of extracting arguments
### and grouping them. The low recall of the LDC submission is due at least in part to this time limitation.
###
### While all scores provide interesting diagnostic information, the "official" evaluation metric is Dataset1
### (all_event_types) on both genres (all_genre) using the official (withRealis) metric.
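###
### As a reading aid (not the official scorer), the summary columns appear to be related as follows: F1 is the
### harmonic mean of Precision and Recall, and Overall is consistent, up to rounding, with the arithmetic mean of
### the EAArg and EALink scores in every row below. The short Python sketch that follows reproduces the first
### reported row under those assumptions; the authoritative definitions are in the scoring software linked from
### the TAC 2015 EAL webpage.

def f1(precision, recall):
    # Harmonic mean of precision and recall (both given in percent).
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)

def overall(ea_arg, ea_link):
    # Assumed combination: the simple average of the EAArg and EALink
    # scores. This matches the reported rows up to rounding, but it is
    # an inference from the tables, not the official definition.
    return (ea_arg + ea_link) / 2.0

# First reported row (CMU_CS_event1, all_genre, withRealis):
print(round(f1(30.5, 9.9), 1))      # -> 14.9
print(round(overall(5.2, 3.6), 1))  # -> 4.4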
####################################
######    All Event Types     ######

#######   Genre: all_genre   #######

#####   Scoring Configuration: withRealis   #####
submission        P     R     F1   EAArg  EALink  Overall
CMU_CS_event1    30.5   9.9  14.9   5.2    3.6     4.4
CMU_CS_event2    30.5  10.1  15.2   5.4    3.5     4.5

#####   Scoring Configuration: neutralizeRealisCoref   #####
submission        P     R     F1   EAArg  EALink  Overall
CMU_CS_event1    52.8  17.1  25.8  13.5    7.7    10.6
CMU_CS_event2    52.5  17.2  25.9  13.6    7.3    10.5

#####   Scoring Configuration: neutralizeRealis   #####
submission        P     R     F1   EAArg  EALink  Overall
CMU_CS_event1    34.9  11.7  17.5   6.8    3.9     5.4
CMU_CS_event2    35.4  12.0  17.9   7.1    4.0     5.5

#######   Genre: df   #######

#####   Scoring Configuration: withRealis   #####
submission        P     R     F1   EAArg  EALink  Overall
CMU_CS_event1    31.5   8.5  13.4   4.8    3.2     4.0
CMU_CS_event2    32.5   8.4  13.3   4.9    2.8     3.9

#####   Scoring Configuration: neutralizeRealisCoref   #####
submission        P     R     F1   EAArg  EALink  Overall
CMU_CS_event1    51.6  13.4  21.3  10.7    6.7     8.7
CMU_CS_event2    53.3  13.2  21.2  10.6    5.8     8.2

#####   Scoring Configuration: neutralizeRealis   #####
submission        P     R     F1   EAArg  EALink  Overall
CMU_CS_event1    34.8   9.5  14.9   5.7    3.2     4.5
CMU_CS_event2    36.3   9.5  15.1   5.9    2.9     4.4

#######   Genre: nw   #######

#####   Scoring Configuration: withRealis   #####
submission        P     R     F1   EAArg  EALink  Overall
CMU_CS_event1    30.0  10.8  15.9   5.4    3.9     4.7
CMU_CS_event2    29.7  11.1  16.2   5.7    3.9     4.8

#####   Scoring Configuration: neutralizeRealisCoref   #####
submission        P     R     F1   EAArg  EALink  Overall
CMU_CS_event1    53.4  19.4  28.5  15.3    8.3    11.8
CMU_CS_event2    52.2  19.8  28.7  15.5    8.3    11.9

#####   Scoring Configuration: neutralizeRealis   #####
submission        P     R     F1   EAArg  EALink  Overall
CMU_CS_event1    35.0  13.1  19.1   7.5    4.3     5.9
CMU_CS_event2    35.1  13.6  19.6   7.8    4.6     6.2

####################################
###### Restricted Event Types ######

#######   Genre: all_genre   #######

#####   Scoring Configuration: withRealis   #####
submission        P     R     F1   EAArg  EALink  Overall
CMU_CS_event1    31.7   9.1  14.1   6.0    3.6     4.8
CMU_CS_event2    31.7   9.2  14.3   5.9    3.3     4.6

#####   Scoring Configuration: neutralizeRealisCoref   #####
submission        P     R     F1   EAArg  EALink  Overall
CMU_CS_event1    53.3  15.3  23.8  12.8    6.7     9.8
CMU_CS_event2    53.0  15.4  23.9  12.8    6.5     9.7

#####   Scoring Configuration: neutralizeRealis   #####
submission        P     R     F1   EAArg  EALink  Overall
CMU_CS_event1    38.1  11.3  17.4   8.0    4.2     6.1
CMU_CS_event2    37.8  11.3  17.4   7.8    4.1     5.9

#######   Genre: df   #######

#####   Scoring Configuration: withRealis   #####
submission        P     R     F1   EAArg  EALink  Overall
CMU_CS_event1    25.5   5.3   8.8   3.3    1.7     2.5
CMU_CS_event2    26.3   5.5   9.1   3.3    1.5     2.4

#####   Scoring Configuration: neutralizeRealisCoref   #####
submission        P     R     F1   EAArg  EALink  Overall
CMU_CS_event1    46.3   9.5  15.8   7.8    4.5     6.2
CMU_CS_event2    46.2   9.4  15.6   7.6    3.8     5.7

#####   Scoring Configuration: neutralizeRealis   #####
submission        P     R     F1   EAArg  EALink  Overall
CMU_CS_event1    31.4   6.7  11.0   4.5    2.1     3.4
CMU_CS_event2    32.2   6.8  11.2   4.7    1.9     3.3

#######   Genre: nw   #######

#####   Scoring Configuration: withRealis   #####
submission        P     R     F1   EAArg  EALink  Overall
CMU_CS_event1    34.9  12.3  18.2   8.3    5.0     6.7
CMU_CS_event2    34.4  12.3  18.1   8.0    4.7     6.4

#####   Scoring Configuration: neutralizeRealisCoref   #####
submission        P     R     F1   EAArg  EALink  Overall
CMU_CS_event1    56.8  20.3  29.9  17.2    8.4    12.8
CMU_CS_event2    56.3  20.5  30.1  17.3    8.7    13.0

#####   Scoring Configuration: neutralizeRealis   #####
submission        P     R     F1   EAArg  EALink  Overall
CMU_CS_event1    41.5  15.2  22.3  10.9    5.8     8.4
CMU_CS_event2    40.6  15.1  22.0  10.5    5.8     8.2