=======================================================
TAC KBP 2016 CHINESE KB CONSTRUCTION EVALUATION RESULTS
=======================================================

Team ID: Stanford
Organization: Stanford University

*************************************************************
Run ID: Stanford_KB_CMN_1
Did the run access the live Web during the evaluation window: No
Did the run extract relations from the Cold Start source corpus: Yes
Did the run generate meaningful confidence values: No

Entity Discovery Evaluation:
ONLY Chinese documents:
Prec    Recall  F1      Metric
0.774   0.431   0.554   strong_mention_match
0.694   0.386   0.496   strong_typed_mention_match
0.000   0.000   0.000   entity_match
0.728   0.273   0.397   b_cubed
0.734   0.408   0.524   mention_ceaf
0.661   0.368   0.472   typed_mention_ceaf

Slot Filling Evaluation:
Metric              RunID              Hop  Prec    Recall  F1
SF-ALL-Micro        Stanford_KB_CMN_1  0    0.7500  0.0599  0.1110
SF-ALL-Micro        Stanford_KB_CMN_1  1    0.5714  0.0174  0.0338
SF-ALL-Micro        Stanford_KB_CMN_1  ALL  0.7313  0.0499  0.0935
SF-ALL-Macro        Stanford_KB_CMN_1  0    0.0992  0.0797  0.0846
SF-ALL-Macro        Stanford_KB_CMN_1  1    0.0199  0.0265  0.0221
SF-ALL-Macro        Stanford_KB_CMN_1  ALL  0.0624  0.0551  0.0556
LDC-MAX-ALL-Micro   Stanford_KB_CMN_1  0    0.7368  0.0785  0.1419
LDC-MAX-ALL-Micro   Stanford_KB_CMN_1  1    0.5714  0.0217  0.0419
LDC-MAX-ALL-Micro   Stanford_KB_CMN_1  ALL  0.7188  0.0640  0.1175
LDC-MAX-ALL-Macro   Stanford_KB_CMN_1  0    0.1146  0.0917  0.0970
LDC-MAX-ALL-Macro   Stanford_KB_CMN_1  1    0.0273  0.0364  0.0303
LDC-MAX-ALL-Macro   Stanford_KB_CMN_1  ALL  0.0752  0.0667  0.0669
LDC-MEAN-ALL-Macro  Stanford_KB_CMN_1  0    0.1005  0.0812  0.0859
LDC-MEAN-ALL-Macro  Stanford_KB_CMN_1  1    0.0136  0.0182  0.0152
LDC-MEAN-ALL-Macro  Stanford_KB_CMN_1  ALL  0.0614  0.0528  0.0540

*ALL-Macro Prec, Recall and F1 refer to mean-precision, mean-recall and mean-F1.
NIL-DETECTION P/R/F1: 0.3120 0.9919 0.4747

*************************************************************
Run ID: Stanford_KB_CMN_2
Did the run access the live Web during the evaluation window: No
Did the run extract relations from the Cold Start source corpus: Yes
Did the run generate meaningful confidence values: No

Entity Discovery Evaluation:
ONLY Chinese documents:
Prec    Recall  F1      Metric
0.775   0.431   0.554   strong_mention_match
0.695   0.386   0.497   strong_typed_mention_match
0.000   0.000   0.000   entity_match
0.729   0.273   0.397   b_cubed
0.734   0.408   0.525   mention_ceaf
0.661   0.368   0.473   typed_mention_ceaf

Slot Filling Evaluation:
Metric              RunID              Hop  Prec    Recall  F1
SF-ALL-Micro        Stanford_KB_CMN_2  0    0.1622  0.2423  0.1943
SF-ALL-Micro        Stanford_KB_CMN_2  1    0.0155  0.1391  0.0278
SF-ALL-Micro        Stanford_KB_CMN_2  ALL  0.0670  0.2181  0.1025
SF-ALL-Macro        Stanford_KB_CMN_2  0    0.1188  0.1966  0.1309
SF-ALL-Macro        Stanford_KB_CMN_2  1    0.1009  0.1523  0.1140
SF-ALL-Macro        Stanford_KB_CMN_2  ALL  0.1105  0.1761  0.1231
LDC-MAX-ALL-Micro   Stanford_KB_CMN_2  0    0.1633  0.3290  0.2182
LDC-MAX-ALL-Micro   Stanford_KB_CMN_2  1    0.0167  0.1739  0.0304
LDC-MAX-ALL-Micro   Stanford_KB_CMN_2  ALL  0.0694  0.2893  0.1119
LDC-MAX-ALL-Macro   Stanford_KB_CMN_2  0    0.1419  0.2357  0.1556
LDC-MAX-ALL-Macro   Stanford_KB_CMN_2  1    0.1386  0.2091  0.1565
LDC-MAX-ALL-Macro   Stanford_KB_CMN_2  ALL  0.1404  0.2237  0.1560
LDC-MEAN-ALL-Macro  Stanford_KB_CMN_2  0    0.1325  0.2221  0.1453
LDC-MEAN-ALL-Macro  Stanford_KB_CMN_2  1    0.1209  0.1818  0.1369
LDC-MEAN-ALL-Macro  Stanford_KB_CMN_2  ALL  0.1272  0.2039  0.1415

*ALL-Macro Prec, Recall and F1 refer to mean-precision, mean-recall and mean-F1.

NIL-DETECTION P/R/F1: 0.2827 0.8618 0.4257
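
Note on reading the Micro and Macro rows: the footnote says the *ALL-Macro columns are means of per-query precision, recall and F1, so a Macro F1 is generally not the harmonic mean of the Macro Prec and Recall printed beside it, while a Micro F1 should be (up to rounding). The sketch below is only an illustration and is not the official scorer. It checks this against the hop-0 SF-ALL rows for Stanford_KB_CMN_1; the helper name f1 and the two per-query (precision, recall) pairs are hypothetical, chosen to show the effect.

```python
def f1(p, r):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)

# (precision, recall, reported F1) copied from the hop-0 rows for Stanford_KB_CMN_1.
micro = (0.7500, 0.0599, 0.1110)  # SF-ALL-Micro
macro = (0.0992, 0.0797, 0.0846)  # SF-ALL-Macro

# Gap between the reported F1 and the harmonic mean of the reported P and R.
micro_gap = abs(f1(micro[0], micro[1]) - micro[2])  # small: rounding only
macro_gap = abs(f1(macro[0], macro[1]) - macro[2])  # larger: mean-F1 is a different quantity

# Hypothetical per-query scores (not from the evaluation) showing why the two differ.
queries = [(1.0, 0.2), (0.2, 1.0)]
mean_p = sum(p for p, _ in queries) / len(queries)
mean_r = sum(r for _, r in queries) / len(queries)
mean_f1 = sum(f1(p, r) for p, r in queries) / len(queries)

print(f"micro gap: {micro_gap:.4f}  macro gap: {macro_gap:.4f}")
print(f"mean-F1: {mean_f1:.4f}  F1 of mean P/R: {f1(mean_p, mean_r):.4f}")
```

For the same reason, Macro rows should be compared with other Macro rows only, not recomputed from their own Prec and Recall columns.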