=========================================
TAC 2019 SM-KBP TASK 3 EVALUATION RESULTS
=========================================

Team ID: SAMSON
Organization: Raytheon BBN Technologies

**********************
Description of Columns in Task 3 scores:

- Column 1: Topic ID
- Column 2: edges_submitted
    * number of edges in the truncated submitted hypotheses
- Column 3: edges_correct
    * number of edges that have a correct justification (i.e., predicate justification is correct and linkable to object justification)
- Column 4: edges_wrong
    * number of edges that have no correct justification
- Column 5: edges_duplicate
    * number of correct edges that are in the same edge equivalence class as a previously seen correct edge for the same hypothesis
- Column 6: edges_skipped
    * number of correct edges (for relations) that cannot be assigned to an edge equivalence class because the assessment is missing a subject equivalence class
    * these edges are ignored for the purposes of computing correctness
    * these edges are counted for all other purposes (e.g. when reporting coherence, relevance, and coverage)
- Column 7: edges_coherent
    * number of edges assessed as coherent
- Column 8: KE_submitted
    * number of event or relation clusters in the truncated submitted hypotheses
- Column 9: KE_coherent
    * number of event or relation clusters assessed as coherent
- Column 10: KE_Frel
    * number of event or relation clusters assessed as FullyRelevant
- Column 11: KE_Prel
    * number of event or relation clusters assessed as PartiallyRelevant
- Column 12: hypotheses_submitted
    * number of hypotheses submitted
- Column 13: theories_matched
    * number of prevailing theories (partially) matched by the submitted hypotheses
- Column 14: Mean hypothesis-level correctness
    * hypothesis-level correctness = (edges_correct - edges_duplicate - edges_skipped) / (edges_submitted - edges_duplicate - edges_skipped), restricted to edges in hypothesis
- Column 15: Mean hypothesis-level edge_coherence
    * hypothesis-level edge_coherence = edges_coherent / edges_submitted, restricted to edges in hypothesis
- Column 16: Mean hypothesis-level KE_coherence
    * hypothesis-level KE_coherence = KE_coherent / KE_submitted, restricted to KEs in hypothesis
- Column 17: Mean hypothesis-level relevance_strict
    * hypothesis-level relevance_strict = KE_Frel / KE_submitted, restricted to KEs in hypothesis
- Column 18: Mean hypothesis-level relevance_lenient
    * hypothesis-level relevance_lenient = (KE_Frel + KE_Prel) / KE_submitted, restricted to KEs in hypothesis
- Column 19: coverage
    * recall of edges in prevailing theories
- Column 20: Run ID

**********************
Task 3 scores for each topic

Topic   #2   #3   #4   #5   #6   #7   #8   #9  #10  #11  #12  #13     #14     #15     #16     #17     #18     #19  Run_ID
E101    47    4   43    0    0    4   33    4    2    0   14    0  0.0679  0.0679  0.0893  0.0417  0.0417  0.0000  BBN_1.BBN_TA2_v3.BBN_TA3_v1a
E101    27   26    1    0    0   26   26   26   25    1   14    0  0.9762  0.9762  1.0000  0.9643  1.0000  0.0000  LDC_1.LDC_1.BBN_TA3_v1a
E101   479  372  107   51    0  354  144  144   98   46   15    0  0.7494  0.7398  1.0000  0.6800  1.0000  0.0000  LDC_2.LDC_2.BBN_TA3_v2a
E101    43   13   30    0    0   13   41   13    9    0   14    0  0.2976  0.2976  0.3095  0.2143  0.2143  0.0000  OPERA_TA1a_hans_V3.OPERA_TA2_hans_V5.BBN_TA3_v1a
E102    25   12   13    0    0   10   22   10   10    0   14    0  0.5238  0.4524  0.4643  0.4643  0.4643  0.0000  BBN_1.BBN_TA2_v3.BBN_TA3_v1a
E102    47   37   10    0    0   37   31   23   21    2   14    1  0.8095  0.8095  0.7857  0.7143  0.7857  0.0051  LDC_1.LDC_1.BBN_TA3_v1a
E102   319  241   78   60    0  226   62   54   19   35   15    5  0.7017  0.7062  0.8767  0.2989  0.8767  0.0325  LDC_2.LDC_2.BBN_TA3_v2a
E102     7    7    0    0    0    7    7    7    7    0    4    1  1.0000  1.0000  1.0000  1.0000  1.0000  0.0105  OPERA_TA1a_hans_V2.BBN_TA2_v1.BBN_TA3_v2a
E102    46   44    2    0    0   44   28   26    8    0   14    0  0.9286  0.9286  0.9286  0.2857  0.2857  0.0000  OPERA_TA1a_hans_V3.OPERA_TA2_hans_V5.BBN_TA3_v1a
E103    41   22   19    1    0   22   17   15    5    8   17    0  0.6148  0.6178  0.8824  0.2941  0.7647  0.0000  BBN_1.BBN_TA2_v2.BBN_TA3_v2a
E103    20    0   20    0    0    0   20    0    0    0   14    0  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  BBN_1.BBN_TA2_v3.BBN_TA3_v1a
E103   110  106    4   21    0  106   37   37   34    3   15    1  0.9685  0.9746  1.0000  0.9000  1.0000  0.0222  LDC_2.LDC_2.BBN_TA3_v2a
E103     3    3    0    0    0    3    3    3    3    0    3    0  1.0000  1.0000  1.0000  1.0000  1.0000  0.0000  OPERA_TA1a_hans_V2.BBN_TA2_v1.BBN_TA3_v2a
ALL     41   22   19    1    0   22   17   15    5    8   17    0  0.6148  0.6178  0.8824  0.2941  0.7647  0.0000  BBN_1.BBN_TA2_v2.BBN_TA3_v2a
ALL     92   16   76    0    0   14   75   14   12    0   42    0  0.1972  0.1734  0.1845  0.1687  0.1687  0.0000  BBN_1.BBN_TA2_v3.BBN_TA3_v1a
ALL     74   63   11    0    0   63   57   49   46    3   28    1  0.8929  0.8929  0.8929  0.8393  0.8929  0.0017  LDC_1.LDC_1.BBN_TA3_v1a
ALL    908  719  189  132    0  686  243  235  151   84   45    6  0.8065  0.8069  0.9589  0.6263  0.9589  0.0153  LDC_2.LDC_2.BBN_TA3_v2a
ALL     10   10    0    0    0   10   10   10   10    0    7    1  1.0000  1.0000  1.0000  1.0000  1.0000  0.0035  OPERA_TA1a_hans_V2.BBN_TA2_v1.BBN_TA3_v2a
ALL     89   57   32    0    0   57   69   39   17    0   28    0  0.6131  0.6131  0.6190  0.2500  0.2500  0.0000  OPERA_TA1a_hans_V3.OPERA_TA2_hans_V5.BBN_TA3_v1a
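
**********************
Illustrative sketch of the hypothesis-level formulas (columns 14 to 18)

The formulas for columns 14 to 18 are defined per hypothesis and then averaged over the hypotheses of a run. The Python sketch below is one possible reading of those definitions, not the official scorer. The names HypothesisCounts, hypothesis_scores and mean_scores are hypothetical, the example counts are invented for illustration, and returning 0.0 for an empty denominator is an assumption, since the report does not say how that case is handled.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class HypothesisCounts:
    """Counts for ONE hypothesis, named after columns 2-11 (hypothetical container)."""
    edges_submitted: int
    edges_correct: int
    edges_duplicate: int
    edges_skipped: int
    edges_coherent: int
    ke_submitted: int
    ke_coherent: int
    ke_frel: int
    ke_prel: int


def _ratio(numerator, denominator):
    # Assumption: an empty denominator scores 0.0 (not stated in the report).
    return numerator / denominator if denominator else 0.0


def hypothesis_scores(c):
    """Apply the column 14-18 formulas to a single hypothesis."""
    # Duplicate and skipped edges are removed from both sides of correctness only.
    ignored = c.edges_duplicate + c.edges_skipped
    return {
        "correctness": _ratio(c.edges_correct - ignored, c.edges_submitted - ignored),
        "edge_coherence": _ratio(c.edges_coherent, c.edges_submitted),
        "KE_coherence": _ratio(c.ke_coherent, c.ke_submitted),
        "relevance_strict": _ratio(c.ke_frel, c.ke_submitted),
        "relevance_lenient": _ratio(c.ke_frel + c.ke_prel, c.ke_submitted),
    }


def mean_scores(hypotheses):
    """Average each hypothesis-level score over all hypotheses of a run."""
    per_hypothesis = [hypothesis_scores(h) for h in hypotheses]
    return {key: mean(s[key] for s in per_hypothesis) for key in per_hypothesis[0]}


# Invented counts for two hypotheses, for illustration only.
example = [
    HypothesisCounts(10, 8, 2, 0, 8, 5, 4, 2, 1),
    HypothesisCounts(4, 4, 0, 0, 4, 4, 4, 4, 0),
]
print(mean_scores(example))
```

Because the reported values are means of per-hypothesis ratios, dividing the pooled counts of a table row will not in general reproduce them. For example, the E101 row for LDC_1.LDC_1.BBN_TA3_v1a has 26 correct edges out of 27 submitted (a pooled ratio of about 0.9630), while column 14 reports 0.9762.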