====================================================
TAC KBP 2017 SPANISH SLOT FILLING EVALUATION RESULTS
====================================================

Team ID: STANFORD
Organization: Stanford University

*************************************************************
Run ID: STANFORD_SF_SPA_1
Did the run access the live Web during the evaluation window: No
Did the run extract relations from the Cold Start source corpus: Yes
Did the run generate meaningful confidence values: Yes

Slot Filling Evaluation (Queries involve ONLY SF slots):

Scores based on P/R/F1; only k=1 justification allowed:

Metric              RunID              Hop  Prec    Recall  F1
SF-ALL-Micro        STANFORD_SF_SPA_1  0    0.4437  0.2414  0.3127
SF-ALL-Micro        STANFORD_SF_SPA_1  1    0.0753  0.0452  0.0565
SF-ALL-Micro        STANFORD_SF_SPA_1  ALL  0.2979  0.1683  0.2151
SF-ALL-Macro        STANFORD_SF_SPA_1  0    0.2019  0.2069  0.1881
SF-ALL-Macro        STANFORD_SF_SPA_1  1    0.0327  0.0421  0.0358
SF-ALL-Macro        STANFORD_SF_SPA_1  ALL  0.1397  0.1463  0.1321
LDC-MAX-ALL-Micro   STANFORD_SF_SPA_1  0    0.4819  0.2667  0.3433
LDC-MAX-ALL-Micro   STANFORD_SF_SPA_1  1    0.0889  0.0430  0.0580
LDC-MAX-ALL-Micro   STANFORD_SF_SPA_1  ALL  0.3438  0.1811  0.2372
LDC-MAX-ALL-Macro   STANFORD_SF_SPA_1  0    0.2373  0.2312  0.2200
LDC-MAX-ALL-Macro   STANFORD_SF_SPA_1  1    0.0333  0.0417  0.0361
LDC-MAX-ALL-Macro   STANFORD_SF_SPA_1  ALL  0.1653  0.1643  0.1551
LDC-MEAN-ALL-Macro  STANFORD_SF_SPA_1  0    0.2069  0.2102  0.1937
LDC-MEAN-ALL-Macro  STANFORD_SF_SPA_1  1    0.0292  0.0375  0.0319
LDC-MEAN-ALL-Macro  STANFORD_SF_SPA_1  ALL  0.1442  0.1492  0.1366

*ALL-Macro Prec, Recall and F1 refer to mean-precision, mean-recall and mean-F1.

Scores based on confidence values and Average Precision (AP); only k=1 justification allowed:

Metric              RunID              Hop  AP
SF-ALL-Macro        STANFORD_SF_SPA_1  0    0.1989
SF-ALL-Macro        STANFORD_SF_SPA_1  1    0.0089
SF-ALL-Macro        STANFORD_SF_SPA_1  ALL  0.1333
LDC-MEAN-ALL-Macro  STANFORD_SF_SPA_1  0    0.2035
LDC-MEAN-ALL-Macro  STANFORD_SF_SPA_1  1    0.0075
LDC-MEAN-ALL-Macro  STANFORD_SF_SPA_1  ALL  0.1343  (PRIMARY METRIC)

*ALL-Macro AP refers to the mean of the corresponding AP values.

*************************************************************
Run ID: STANFORD_SF_SPA_2
Did the run access the live Web during the evaluation window: No
Did the run extract relations from the Cold Start source corpus: Yes
Did the run generate meaningful confidence values: Yes

Slot Filling Evaluation (Queries involve ONLY SF slots):

Scores based on P/R/F1; only k=1 justification allowed:

Metric              RunID              Hop  Prec    Recall  F1
SF-ALL-Micro        STANFORD_SF_SPA_2  0    0.5079  0.2452  0.3307
SF-ALL-Micro        STANFORD_SF_SPA_2  1    0.1111  0.0452  0.0642
SF-ALL-Micro        STANFORD_SF_SPA_2  ALL  0.3757  0.1707  0.2347
SF-ALL-Macro        STANFORD_SF_SPA_2  0    0.2148  0.2096  0.1999
SF-ALL-Macro        STANFORD_SF_SPA_2  1    0.0327  0.0421  0.0358
SF-ALL-Macro        STANFORD_SF_SPA_2  ALL  0.1479  0.1480  0.1396
LDC-MAX-ALL-Micro   STANFORD_SF_SPA_2  0    0.5256  0.2733  0.3596
LDC-MAX-ALL-Micro   STANFORD_SF_SPA_2  1    0.1290  0.0430  0.0645
LDC-MAX-ALL-Micro   STANFORD_SF_SPA_2  ALL  0.4128  0.1852  0.2557
LDC-MAX-ALL-Macro   STANFORD_SF_SPA_2  0    0.2493  0.2357  0.2315
LDC-MAX-ALL-Macro   STANFORD_SF_SPA_2  1    0.0333  0.0417  0.0361
LDC-MAX-ALL-Macro   STANFORD_SF_SPA_2  ALL  0.1731  0.1672  0.1625
LDC-MEAN-ALL-Macro  STANFORD_SF_SPA_2  0    0.2189  0.2147  0.2051
LDC-MEAN-ALL-Macro  STANFORD_SF_SPA_2  1    0.0292  0.0375  0.0319
LDC-MEAN-ALL-Macro  STANFORD_SF_SPA_2  ALL  0.1519  0.1522  0.1440

*ALL-Macro Prec, Recall and F1 refer to mean-precision, mean-recall and mean-F1.

Scores based on confidence values and Average Precision (AP); only k=1 justification allowed:

Metric              RunID              Hop  AP
SF-ALL-Macro        STANFORD_SF_SPA_2  0    0.2031
SF-ALL-Macro        STANFORD_SF_SPA_2  1    0.0089
SF-ALL-Macro        STANFORD_SF_SPA_2  ALL  0.1361
LDC-MEAN-ALL-Macro  STANFORD_SF_SPA_2  0    0.2093
LDC-MEAN-ALL-Macro  STANFORD_SF_SPA_2  1    0.0075
LDC-MEAN-ALL-Macro  STANFORD_SF_SPA_2  ALL  0.1382  (PRIMARY METRIC)

*ALL-Macro AP refers to the mean of the corresponding AP values.
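
The Micro rows above appear consistent with F1 being the harmonic mean of Prec and Recall, while the Macro rows report a mean of per-query F1 values and so cannot be recomputed from the Macro Prec and Recall columns. The sketch below checks this against the SF-ALL-Micro rows of both runs. It assumes the standard F1 definition; the official scorer is not shown in this report, and the helper `f1` and the tolerance `TOL` are illustrative names, not part of the evaluation tooling.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition); 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (run, hop, Prec, Recall, reported F1), copied from the SF-ALL-Micro rows above.
micro_rows = [
    ("STANFORD_SF_SPA_1", "0", 0.4437, 0.2414, 0.3127),
    ("STANFORD_SF_SPA_1", "1", 0.0753, 0.0452, 0.0565),
    ("STANFORD_SF_SPA_1", "ALL", 0.2979, 0.1683, 0.2151),
    ("STANFORD_SF_SPA_2", "0", 0.5079, 0.2452, 0.3307),
    ("STANFORD_SF_SPA_2", "1", 0.1111, 0.0452, 0.0642),
    ("STANFORD_SF_SPA_2", "ALL", 0.3757, 0.1707, 0.2347),
]

# Inputs are rounded to four decimals, so allow a small tolerance (illustrative choice).
TOL = 5e-4

# Rows whose reported F1 differs from the recomputed value by more than TOL.
mismatches = [
    (run, hop)
    for run, hop, p, r, reported in micro_rows
    if abs(f1(p, r) - reported) > TOL
]

# Macro F1 is a mean of per-query F1 values, so applying f1() to the Macro
# Prec/Recall columns does not reproduce it (SF-ALL-Macro, STANFORD_SF_SPA_2, hop 0).
macro_gap = abs(f1(0.2148, 0.2096) - 0.1999)

print("micro mismatches:", mismatches)
print("macro gap:", round(macro_gap, 4))
```

An empty `mismatches` list means every Micro F1 in the tables is reproduced within rounding, whereas a non-trivial `macro_gap` reflects the footnote that the Macro columns are per-query means.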