=====================================================================
TAC KBP 2015 EVENT ARGUMENT EXTRACTION AND LINKING EVALUATION RESULTS
=====================================================================

Team ID: BBN
Organization: BBN

Run ID: BBN1
Did the run access the live Web during the evaluation window: No
Did the run perform any cross-sentence reasoning: Yes
Did the run use any distributed representations (e.g., of words): Yes
Did the run return meaningful confidence values: Yes

Run ID: BBN2
Did the run access the live Web during the evaluation window: No
Did the run perform any cross-sentence reasoning: Yes
Did the run use any distributed representations (e.g., of words): Yes
Did the run return meaningful confidence values: Yes

Run ID: BBN3
Did the run access the live Web during the evaluation window: No
Did the run perform any cross-sentence reasoning: Yes
Did the run use any distributed representations (e.g., of words): Yes
Did the run return meaningful confidence values: Yes

Run ID: BBN4
Did the run access the live Web during the evaluation window: No
Did the run perform any cross-sentence reasoning: Yes
Did the run use any distributed representations (e.g., of words): Yes
Did the run return meaningful confidence values: Yes

Run ID: BBN5
Did the run access the live Web during the evaluation window: No
Did the run perform any cross-sentence reasoning: Yes
Did the run use any distributed representations (e.g., of words): Yes
Did the run return meaningful confidence values: Yes

*************************************************************

### The following are scores from the TAC 2015 Event Argument and Linking Evaluation.
### For all scoring breakdowns, the summaries report: Precision, Recall, F1, EAArg score, EALink score, and Overall score.
### Details of the scoring and the scoring software can be found on the TAC 2015 EAL webpage.
###
### Scores are reported on the full data set (all_genre) and broken down by genre:
### discussion forum only (df) and newswire only (nw).
###
### The official score (withRealis) incorporates the correctness of the realis (ACTUAL, GENERIC, and OTHER) distinction
### and the correctness of canonical argument string resolution. As a diagnostic, we also report (a) a score
### that ignores the realis distinction (neutralizeRealis) and (b) a score that ignores both the realis distinction
### and canonical argument string resolution (neutralizeRealisCoref).
###
### Scores are reported over two data sets. Dataset1 (all_event_types) consists of 81 documents assessed for the
### full TAC EAL event taxonomy as specified in the 2015 evaluation plan. Dataset2 (restricted_event_types)
### consists of 201 documents assessed for only 6 event types (assertions outside of the 6 were ignored). Dataset2
### includes the documents in Dataset1. Dataset2 was assessed to allow a more in-depth evaluation of event-specific
### performance (and of variance in performance across event types). The 6 event types included in Dataset2 are:
###   - Transaction.Transfer-Money
###   - Movement.Transport-Artifact
###   - Life.Marry
###   - Contact.Meet
###   - Conflict.Demonstrate
###   - Conflict.Attack
###
### One participant (ZJU) submitted a submission with an offset error. This system output was fixed both by BBN
### (the organizer) and by ZJU (the participant). Because the two fixes differed, both sets of numbers are reported.
###
### One participant (ver-CMU) participated in a "verification" version of the task. This system took as its input all
### other system submissions. Its input included the ZJU submission with the broken offsets, and included neither
### BBN's fix nor ZJU's fix. It is therefore not directly comparable to the other systems in the task.
###
### The LDC submission was produced by an LDC annotator spending 45-60 minutes on the task of extracting arguments
### and grouping them.
### The low recall of the LDC submission is due at least in part to this time limitation.
###
### While all scores provide interesting diagnostic information, the "official" evaluation metric is Dataset1 (all_event_types)
### on both genres (all_genre) using the official (withRealis) metric.

####################################
######    All Event Types     ######
####################################

#######  Genre: all_genre  #######

#####  Scoring Configuration: withRealis  #####
submission   P     R     F1    EAArg  EALink  Overall
BBN1         36.8  39.2  38.0  23.6   23.3    23.5
BBN2         34.2  36.8  35.5  21.1   22.6    21.9
BBN3         36.9  35.8  36.3  21.5   20.3    20.9
BBN4         36.9  38.8  37.8  23.6   23.3    23.4
BBN5         46.1  29.2  35.8  21.3   16.2    18.7

#####  Scoring Configuration: neutralizeRealisCoref  #####
submission   P     R     F1    EAArg  EALink  Overall
BBN1         52.0  49.9  50.9  38.6   30.0    34.3
BBN2         49.5  48.0  48.7  36.1   29.0    32.5
BBN3         52.5  45.6  48.8  35.4   26.0    30.7
BBN4         52.3  49.6  50.9  38.5   30.0    34.3
BBN5         64.5  36.9  46.9  31.9   20.5    26.2

#####  Scoring Configuration: neutralizeRealis  #####
submission   P     R     F1    EAArg  EALink  Overall
BBN1         45.4  45.5  45.4  32.3   26.6    29.4
BBN2         42.5  43.1  42.8  29.1   25.9    27.5
BBN3         45.7  41.7  43.6  29.5   23.1    26.3
BBN4         45.7  45.4  45.5  32.2   26.6    29.4
BBN5         55.3  33.4  41.6  26.8   18.1    22.4

#######  Genre: df  #######

#####  Scoring Configuration: withRealis  #####
submission   P     R     F1    EAArg  EALink  Overall
BBN1         39.0  38.5  38.7  25.6   24.9    25.2
BBN2         35.4  35.0  35.2  22.0   23.5    22.8
BBN3         38.5  34.4  36.3  22.6   21.2    21.9
BBN4         38.6  37.9  38.2  24.9   24.6    24.7
BBN5         49.1  29.2  36.6  22.6   17.7    20.1

#####  Scoring Configuration: neutralizeRealisCoref  #####
submission   P     R     F1    EAArg  EALink  Overall
BBN1         54.0  46.8  50.1  37.4   31.2    34.3
BBN2         51.3  44.2  47.5  34.4   28.8    31.6
BBN3         56.2  43.6  49.1  35.4   26.2    30.8
BBN4         53.7  46.6  49.9  37.1   30.7    33.9
BBN5         68.5  36.5  47.6  32.5   21.6    27.1

#####  Scoring Configuration: neutralizeRealis  #####
submission   P     R     F1    EAArg  EALink  Overall
BBN1         47.6  42.9  45.1  32.1   28.0    30.1
BBN2         44.1  39.8  41.8  28.5   26.3    27.4
BBN3         48.9  39.7  43.8  29.9   23.4    26.6
BBN4         46.8  42.5  44.5  31.3   27.4    29.3
BBN5         59.5  33.0  42.5  27.8   19.6    23.7

#######  Genre: nw  #######

#####  Scoring Configuration: withRealis  #####
submission   P     R     F1    EAArg  EALink  Overall
BBN1         35.5  39.7  37.5  22.4   22.4    22.4
BBN2         33.6  37.9  35.6  20.6   22.1    21.3
BBN3         36.0  36.6  36.3  20.9   19.8    20.3
BBN4         36.0  39.4  37.6  22.8   22.6    22.7
BBN5         44.5  29.2  35.3  20.5   15.3    17.9

#####  Scoring Configuration: neutralizeRealisCoref  #####
submission   P     R     F1    EAArg  EALink  Overall
BBN1         51.0  51.8  51.4  39.4   29.3    34.3
BBN2         48.6  50.4  49.5  37.1   29.1    33.1
BBN3         50.7  46.8  48.7  35.4   25.9    30.7
BBN4         51.5  51.5  51.5  39.4   29.6    34.5
BBN5         62.3  37.2  46.6  31.6   19.9    25.7

#####  Scoring Configuration: neutralizeRealis  #####
submission   P     R     F1    EAArg  EALink  Overall
BBN1         44.3  47.2  45.7  32.4   25.7    29.1
BBN2         41.6  45.2  43.3  29.5   25.7    27.6
BBN3         44.0  42.9  43.4  29.3   23.0    26.2
BBN4         45.0  47.2  46.1  32.8   26.2    29.5
BBN5         53.0  33.6  41.1  26.2   17.1    21.6

####################################
######  Restricted Event Types #####
####################################

#######  Genre: all_genre  #######

#####  Scoring Configuration: withRealis  #####
submission   P     R     F1    EAArg  EALink  Overall
BBN1         36.7  40.7  38.6  27.3   23.5    25.4
BBN2         34.6  39.3  36.8  25.6   23.8    24.7
BBN3         37.5  38.3  37.9  26.1   21.0    23.6
BBN4         36.8  40.2  38.4  26.8   23.0    24.9
BBN5         45.2  28.8  35.2  22.3   15.6    18.9

#####  Scoring Configuration: neutralizeRealisCoref  #####
submission   P     R     F1    EAArg  EALink  Overall
BBN1         54.1  52.2  53.1  42.5   30.4    36.5
BBN2         50.7  50.5  50.6  40.1   30.3    35.2
BBN3         55.4  48.7  51.8  40.2   27.7    33.9
BBN4         54.5  51.3  52.9  41.8   29.8    35.8
BBN5         66.2  37.2  47.6  33.0   20.3    26.7

#####  Scoring Configuration: neutralizeRealis  #####
submission   P     R     F1    EAArg  EALink  Overall
BBN1         47.6  47.8  47.7  36.3   27.1    31.7
BBN2         44.1  45.9  45.0  33.8   27.3    30.5
BBN3         48.4  44.4  46.3  34.1   24.4    29.3
BBN4         47.8  47.0  47.4  35.8   26.6    31.2
BBN5         57.1  33.4  42.1  28.0   17.8    22.9

#######  Genre: df  #######

#####  Scoring Configuration: withRealis  #####
submission   P     R     F1    EAArg  EALink  Overall
BBN1         33.1  33.6  33.3  22.0   20.5    21.2
BBN2         33.0  33.8  33.4  22.1   21.0    21.6
BBN3         32.5  31.1  31.8  19.7   17.9    18.8
BBN4         33.1  33.1  33.1  21.2   19.2    20.2
BBN5         38.3  22.4  28.3  16.5   12.5    14.5

#####  Scoring Configuration: neutralizeRealisCoref  #####
submission   P     R     F1    EAArg  EALink  Overall
BBN1         52.5  44.9  48.4  36.8   27.3    32.0
BBN2         50.6  43.5  46.8  34.8   27.5    31.2
BBN3         53.2  42.0  46.9  34.6   25.1    29.9
BBN4         52.5  43.9  47.8  35.8   26.3    31.1
BBN5         61.8  30.5  40.8  26.9   16.6    21.7

#####  Scoring Configuration: neutralizeRealis  #####
submission   P     R     F1    EAArg  EALink  Overall
BBN1         45.1  40.0  42.4  30.7   23.2    26.9
BBN2         44.0  39.6  41.7  30.0   24.3    27.2
BBN3         46.1  37.8  41.5  28.9   21.4    25.1
BBN4         45.0  39.2  41.9  29.8   22.0    25.9
BBN5         52.3  26.7  35.4  22.1   13.8    17.9

#######  Genre: nw  #######

#####  Scoring Configuration: withRealis  #####
submission   P     R     F1    EAArg  EALink  Overall
BBN1         39.3  46.7  42.7  31.7   26.0    28.9
BBN2         35.7  44.0  39.4  28.5   26.1    27.3
BBN3         41.2  44.5  42.8  31.6   23.5    27.5
BBN4         39.5  46.2  42.6  31.6   26.0    28.8
BBN5         50.3  34.1  40.6  27.1   18.0    22.6

#####  Scoring Configuration: neutralizeRealisCoref  #####
submission   P     R     F1    EAArg  EALink  Overall
BBN1         55.3  58.6  56.9  47.4   32.9    40.2
BBN2         50.8  56.5  53.5  44.7   32.5    38.6
BBN3         56.9  54.5  55.7  45.0   29.8    37.4
BBN4         55.9  57.7  56.8  47.0   32.6    39.8
BBN5         69.3  43.0  53.1  38.4   23.4    30.9

#####  Scoring Configuration: neutralizeRealis  #####
submission   P     R     F1    EAArg  EALink  Overall
BBN1         49.3  54.5  51.8  41.2   30.4    35.8
BBN2         44.2  51.3  47.5  37.1   29.6    33.4
BBN3         50.1  50.1  50.1  38.6   26.9    32.8
BBN4         49.7  53.9  51.7  40.9   30.4    35.6
BBN5         60.4  39.3  47.6  33.0   21.1    27.0
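### Note on the reported columns: the official scorer (see the TAC 2015 EAL webpage) defines EAArg and
### EALink; this sketch is NOT that scorer. It only illustrates, under stated assumptions, how the derived
### columns relate to the others: F1 is the standard harmonic mean of Precision and Recall, and the Overall
### column is consistent with the arithmetic mean of EAArg and EALink (checked against the BBN1
### all_event_types / all_genre / withRealis row; small deviations in the tables come from rounding).

```python
def f1(precision: float, recall: float) -> float:
    """Standard F1: harmonic mean of precision and recall (both in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def overall(ea_arg: float, ea_link: float) -> float:
    """Assumed combination: arithmetic mean of the EAArg and EALink scores."""
    return (ea_arg + ea_link) / 2

# BBN1, All Event Types, all_genre, withRealis:
# P=36.8, R=39.2, F1=38.0, EAArg=23.6, EALink=23.3, Overall=23.5
print(f"{f1(36.8, 39.2):.1f}")       # prints 38.0, matching the reported F1
print(f"{overall(23.6, 23.3):.2f}")  # prints 23.45, reported (rounded) as 23.5
```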