==========================================================
TAC KBP 2016 CROSS-LINGUAL SLOT FILLING EVALUATION RESULTS
==========================================================

Team ID: Stanford
Organization: Stanford University

*************************************************************
Run ID: Stanford_SF_XLING_1
Did the run access the live Web during the evaluation window: No
Did the run extract relations from the Cold Start source corpus: Yes
Did the run generate meaningful confidence values: Yes
Run number of the English SF system that is most closely configured to the English component of this run: 1
Run number of the Spanish SF system that is most closely configured to the Spanish component of this run: NA
Run number of the Chinese SF system that is most closely configured to the Chinese component of this run: 1

Slot Filling Evaluation:

Metric              RunID                Hop  Prec    Recall  F1
SF-ALL-Micro        Stanford_SF_XLING_1  0    0.4313  0.0552  0.0979
SF-ALL-Micro        Stanford_SF_XLING_1  1    0.1616  0.0188  0.0337
SF-ALL-Micro        Stanford_SF_XLING_1  ALL  0.3493  0.0434  0.0772
SF-ALL-Macro        Stanford_SF_XLING_1  0    0.1034  0.0819  0.0807
SF-ALL-Macro        Stanford_SF_XLING_1  1    0.0235  0.0235  0.0228
SF-ALL-Macro        Stanford_SF_XLING_1  ALL  0.0626  0.0521  0.0511
LDC-MAX-ALL-Micro   Stanford_SF_XLING_1  0    0.5000  0.1216  0.1956
LDC-MAX-ALL-Micro   Stanford_SF_XLING_1  1    0.1543  0.0410  0.0648
LDC-MAX-ALL-Micro   Stanford_SF_XLING_1  ALL  0.3772  0.0946  0.1512
LDC-MAX-ALL-Macro   Stanford_SF_XLING_1  0    0.2188  0.1754  0.1747
LDC-MAX-ALL-Macro   Stanford_SF_XLING_1  1    0.0508  0.0497  0.0487
LDC-MAX-ALL-Macro   Stanford_SF_XLING_1  ALL  0.1357  0.1132  0.1124
LDC-MEAN-ALL-Macro  Stanford_SF_XLING_1  0    0.1083  0.0898  0.0877
LDC-MEAN-ALL-Macro  Stanford_SF_XLING_1  1    0.0291  0.0276  0.0273
LDC-MEAN-ALL-Macro  Stanford_SF_XLING_1  ALL  0.0691  0.0590  0.0578

*ALL-Macro Prec, Recall and F1 refer to mean-precision, mean-recall and mean-F1.
NIL-DETECTION P/R/F1: 0.2940 0.9106 0.4445

*************************************************************
Run ID: Stanford_SF_XLING_2
Did the run access the live Web during the evaluation window: No
Did the run extract relations from the Cold Start source corpus: Yes
Did the run generate meaningful confidence values: Yes
Run number of the English SF system that is most closely configured to the English component of this run: 1
Run number of the Spanish SF system that is most closely configured to the Spanish component of this run: NA
Run number of the Chinese SF system that is most closely configured to the Chinese component of this run: 1

Slot Filling Evaluation:

Metric              RunID                Hop  Prec    Recall  F1
SF-ALL-Micro        Stanford_SF_XLING_2  0    0.4099  0.0584  0.1022
SF-ALL-Micro        Stanford_SF_XLING_2  1    0.0947  0.0254  0.0401
SF-ALL-Micro        Stanford_SF_XLING_2  ALL  0.2601  0.0477  0.0806
SF-ALL-Macro        Stanford_SF_XLING_2  0    0.1235  0.0902  0.0936
SF-ALL-Macro        Stanford_SF_XLING_2  1    0.0301  0.0298  0.0291
SF-ALL-Macro        Stanford_SF_XLING_2  ALL  0.0757  0.0593  0.0606
LDC-MAX-ALL-Micro   Stanford_SF_XLING_2  0    0.4697  0.1282  0.2014
LDC-MAX-ALL-Micro   Stanford_SF_XLING_2  1    0.1020  0.0590  0.0748
LDC-MAX-ALL-Micro   Stanford_SF_XLING_2  ALL  0.2796  0.1050  0.1527
LDC-MAX-ALL-Macro   Stanford_SF_XLING_2  0    0.2659  0.1981  0.2054
LDC-MAX-ALL-Macro   Stanford_SF_XLING_2  1    0.0661  0.0637  0.0629
LDC-MAX-ALL-Macro   Stanford_SF_XLING_2  ALL  0.1670  0.1316  0.1349
LDC-MEAN-ALL-Macro  Stanford_SF_XLING_2  0    0.1331  0.1034  0.1061
LDC-MEAN-ALL-Macro  Stanford_SF_XLING_2  1    0.0381  0.0363  0.0360
LDC-MEAN-ALL-Macro  Stanford_SF_XLING_2  ALL  0.0861  0.0702  0.0714

*ALL-Macro Prec, Recall and F1 refer to mean-precision, mean-recall and mean-F1.
NIL-DETECTION P/R/F1: 0.2995 0.9350 0.4537

*************************************************************
Run ID: Stanford_SF_XLING_3
Did the run access the live Web during the evaluation window: No
Did the run extract relations from the Cold Start source corpus: Yes
Did the run generate meaningful confidence values: Yes
Run number of the English SF system that is most closely configured to the English component of this run: 1
Run number of the Spanish SF system that is most closely configured to the Spanish component of this run: NA
Run number of the Chinese SF system that is most closely configured to the Chinese component of this run: 1

Slot Filling Evaluation:

Metric              RunID                Hop  Prec    Recall  F1
SF-ALL-Micro        Stanford_SF_XLING_3  0    0.2010  0.0889  0.1233
SF-ALL-Micro        Stanford_SF_XLING_3  1    0.0348  0.0443  0.0390
SF-ALL-Micro        Stanford_SF_XLING_3  ALL  0.1047  0.0744  0.0870
SF-ALL-Macro        Stanford_SF_XLING_3  0    0.1242  0.1218  0.1091
SF-ALL-Macro        Stanford_SF_XLING_3  1    0.0418  0.0534  0.0442
SF-ALL-Macro        Stanford_SF_XLING_3  ALL  0.0821  0.0868  0.0759
LDC-MAX-ALL-Micro   Stanford_SF_XLING_3  0    0.2837  0.1985  0.2336
LDC-MAX-ALL-Micro   Stanford_SF_XLING_3  1    0.0441  0.0951  0.0603
LDC-MAX-ALL-Micro   Stanford_SF_XLING_3  ALL  0.1379  0.1638  0.1497
LDC-MAX-ALL-Macro   Stanford_SF_XLING_3  0    0.2445  0.2485  0.2249
LDC-MAX-ALL-Macro   Stanford_SF_XLING_3  1    0.0925  0.1169  0.0975
LDC-MAX-ALL-Macro   Stanford_SF_XLING_3  ALL  0.1693  0.1834  0.1619
LDC-MEAN-ALL-Macro  Stanford_SF_XLING_3  0    0.1321  0.1317  0.1186
LDC-MEAN-ALL-Macro  Stanford_SF_XLING_3  1    0.0451  0.0567  0.0469
LDC-MEAN-ALL-Macro  Stanford_SF_XLING_3  ALL  0.0891  0.0946  0.0832

*ALL-Macro Prec, Recall and F1 refer to mean-precision, mean-recall and mean-F1.
NIL-DETECTION P/R/F1: 0.2690 0.8049 0.4032

*************************************************************
Run ID: Stanford_SF_XLING_4
Did the run access the live Web during the evaluation window: No
Did the run extract relations from the Cold Start source corpus: Yes
Did the run generate meaningful confidence values: Yes
Run number of the English SF system that is most closely configured to the English component of this run: 4
Run number of the Spanish SF system that is most closely configured to the Spanish component of this run: NA
Run number of the Chinese SF system that is most closely configured to the Chinese component of this run: 4

Slot Filling Evaluation:

Metric              RunID                Hop  Prec    Recall  F1
SF-ALL-Micro        Stanford_SF_XLING_4  0    0.5756  0.0437  0.0813
SF-ALL-Micro        Stanford_SF_XLING_4  1    0.2544  0.0148  0.0279
SF-ALL-Micro        Stanford_SF_XLING_4  ALL  0.4894  0.0343  0.0642
SF-ALL-Macro        Stanford_SF_XLING_4  0    0.0990  0.0708  0.0740
SF-ALL-Macro        Stanford_SF_XLING_4  1    0.0189  0.0174  0.0174
SF-ALL-Macro        Stanford_SF_XLING_4  ALL  0.0580  0.0435  0.0451
LDC-MAX-ALL-Micro   Stanford_SF_XLING_4  0    0.6333  0.0943  0.1641
LDC-MAX-ALL-Micro   Stanford_SF_XLING_4  1    0.2958  0.0344  0.0617
LDC-MAX-ALL-Micro   Stanford_SF_XLING_4  ALL  0.5378  0.0742  0.1304
LDC-MAX-ALL-Macro   Stanford_SF_XLING_4  0    0.2109  0.1557  0.1615
LDC-MAX-ALL-Macro   Stanford_SF_XLING_4  1    0.0440  0.0378  0.0387
LDC-MAX-ALL-Macro   Stanford_SF_XLING_4  ALL  0.1283  0.0974  0.1007
LDC-MEAN-ALL-Macro  Stanford_SF_XLING_4  0    0.1081  0.0846  0.0866
LDC-MEAN-ALL-Macro  Stanford_SF_XLING_4  1    0.0265  0.0230  0.0235
LDC-MEAN-ALL-Macro  Stanford_SF_XLING_4  ALL  0.0677  0.0541  0.0554

*ALL-Macro Prec, Recall and F1 refer to mean-precision, mean-recall and mean-F1.
NIL-DETECTION P/R/F1: 0.3085 0.9756 0.4688
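
*************************************************************
Note on reading the Micro and Macro rows: the footnote says the Macro columns are means of per-query precision, recall and F1, so a Macro F1 need not equal the harmonic mean of the Macro Prec and Recall printed beside it. The sketch below illustrates this with the hop-0 rows of Stanford_SF_XLING_1. It assumes the Micro F1 is the usual harmonic mean 2PR/(P+R); the scorer's own implementation is not shown in this report, and the helper name f1 is hypothetical.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition of micro F1)."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# SF-ALL-Micro, Stanford_SF_XLING_1, hop 0: reported F1 is 0.0979.
micro_f1 = f1(0.4313, 0.0552)

# SF-ALL-Macro, same run and hop: reported F1 is 0.0807. The harmonic mean of
# the mean precision and mean recall is a different quantity from the mean of
# per-query F1 values, so it is not expected to match the reported figure.
macro_harmonic = f1(0.1034, 0.0819)

print(f"micro F1 recomputed:          {micro_f1:.4f}")
print(f"harmonic mean of macro P, R:  {macro_harmonic:.4f}")
```

The recomputed Micro value agrees with the reported one up to rounding of the four-decimal inputs, while the harmonic mean of the Macro columns differs from the reported Macro F1, which is consistent with the footnote.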