=========================================================================
TAC KBP 2016 SLOT FILLER VALIDATION CHINESE ENSEMBLING EVALUATION RESULTS
=========================================================================

Team ID: SAFT_ISI
Organization: USC Information Sciences Institute

*************************************************************
Run ID: SAFT_ISI1

Did the run access the live Web during the evaluation window: No
Did this run judge each candidate slot filler independently of all other candidate slot fillers in the evaluation dataset: Yes
Did this run judge candidate slot fillers for each slot-filling run independently of all other slot-filling runs in the evaluation dataset: No
Did this run judge candidate slot fillers for each slot-filling team independently of all other slot-filling teams in the evaluation dataset: No
Did this run make use of the slot filler or justification offsets provided for each candidate slot filler: Yes
Did this run make use of the confidence values provided for each candidate slot filler: No
Did this run make use of the system profiles for the slot filling runs: No
Did this run make use of the preliminary assessments provided for some of the slot filler candidates: Yes

CSSF micro-average Precision, Recall, and F1, at each hop level:
                                        hop0_P  hop0_R  hop0_F  hop1_P  hop1_R  hop1_F  ALL_P   ALL_R   ALL_F
SAFT_ISI1.CMN.ensemble                  0.4561  0.0399  0.0733  0.3182  0.0341  0.0617  0.4177  0.0385  0.0705
hltcoe_KB_CMN_4 (best hop0 F1 input)    0.5200  0.1994  0.2882  0.2905  0.2537  0.2708  0.4242  0.2124  0.2830
hltcoe_KB_CMN_3 (best hop1 F1 input)    0.5246  0.1963  0.2857  0.2989  0.2537  0.2744  0.4306  0.2100  0.2824
hltcoe_KB_CMN_4 (best ALL F1 input)     0.5200  0.1994  0.2882  0.2905  0.2537  0.2708  0.4242  0.2124  0.2830

MAX micro-average Precision, Recall, and F1, at each hop level:
                                        hop0_P  hop0_R  hop0_F  hop1_P  hop1_R  hop1_F  ALL_P   ALL_R   ALL_F
SAFT_ISI1.CMN.ensemble                  0.4583  0.0466  0.0846  0.3182  0.0424  0.0749  0.4143  0.0455  0.0820
hltcoe_KB_CMN_4 (best hop0 F1 input)    0.5378  0.2564  0.3472  0.2905  0.3152  0.3023  0.4282  0.2716  0.3324
hltcoe_KB_CMN_1 (best hop1 F1 input)    0.5459  0.2521  0.3449  0.3023  0.3152  0.3086  0.4385  0.2684  0.3330
hltcoe_KB_CMN_1 (best ALL F1 input)     0.5459  0.2521  0.3449  0.3023  0.3152  0.3086  0.4385  0.2684  0.3330

MEAN macro-average Precision, Recall, and F1, at each hop level:
                                        hop0_P  hop0_R  hop0_F  hop1_P  hop1_R  hop1_F  ALL_P   ALL_R   ALL_F
SAFT_ISI1.CMN.ensemble                  0.1053  0.0980  0.0891  0.0090  0.0024  0.0038  0.0610  0.0540  0.0499
hltcoe_KB_CMN_4 (best hop0 F1 input)    0.1856  0.1796  0.1752  0.1327  0.1465  0.1348  0.1613  0.1644  0.1566
Stanford_KB_CMN_2 (best hop1 F1 input)  0.1206  0.2031  0.1324  0.1242  0.1804  0.1391  0.1223  0.1927  0.1355
hltcoe_KB_CMN_4 (best ALL F1 input)     0.1856  0.1796  0.1752  0.1327  0.1465  0.1348  0.1613  0.1644  0.1566

*************************************************************
Run ID: SAFT_ISI2

Did the run access the live Web during the evaluation window: No
Did this run judge each candidate slot filler independently of all other candidate slot fillers in the evaluation dataset: Yes
Did this run judge candidate slot fillers for each slot-filling run independently of all other slot-filling runs in the evaluation dataset: No
Did this run judge candidate slot fillers for each slot-filling team independently of all other slot-filling teams in the evaluation dataset: No
Did this run make use of the slot filler or justification offsets provided for each candidate slot filler: Yes
Did this run make use of the confidence values provided for each candidate slot filler: No
Did this run make use of the system profiles for the slot filling runs: No
Did this run make use of the preliminary assessments provided for some of the slot filler candidates: Yes

CSSF micro-average Precision, Recall, and F1, at each hop level:
                                        hop0_P  hop0_R  hop0_F  hop1_P  hop1_R  hop1_F  ALL_P   ALL_R   ALL_F
SAFT_ISI2.CMN.ensemble                  0.4561  0.0399  0.0733  0.3182  0.0341  0.0617  0.4177  0.0385  0.0705
hltcoe_KB_CMN_4 (best hop0 F1 input)    0.5200  0.1994  0.2882  0.2905  0.2537  0.2708  0.4242  0.2124  0.2830
hltcoe_KB_CMN_3 (best hop1 F1 input)    0.5246  0.1963  0.2857  0.2989  0.2537  0.2744  0.4306  0.2100  0.2824
hltcoe_KB_CMN_4 (best ALL F1 input)     0.5200  0.1994  0.2882  0.2905  0.2537  0.2708  0.4242  0.2124  0.2830

MAX micro-average Precision, Recall, and F1, at each hop level:
                                        hop0_P  hop0_R  hop0_F  hop1_P  hop1_R  hop1_F  ALL_P   ALL_R   ALL_F
SAFT_ISI2.CMN.ensemble                  0.4583  0.0466  0.0846  0.3182  0.0424  0.0749  0.4143  0.0455  0.0820
hltcoe_KB_CMN_4 (best hop0 F1 input)    0.5378  0.2564  0.3472  0.2905  0.3152  0.3023  0.4282  0.2716  0.3324
hltcoe_KB_CMN_1 (best hop1 F1 input)    0.5459  0.2521  0.3449  0.3023  0.3152  0.3086  0.4385  0.2684  0.3330
hltcoe_KB_CMN_1 (best ALL F1 input)     0.5459  0.2521  0.3449  0.3023  0.3152  0.3086  0.4385  0.2684  0.3330

MEAN macro-average Precision, Recall, and F1, at each hop level:
                                        hop0_P  hop0_R  hop0_F  hop1_P  hop1_R  hop1_F  ALL_P   ALL_R   ALL_F
SAFT_ISI2.CMN.ensemble                  0.1053  0.0980  0.0891  0.0090  0.0024  0.0038  0.0610  0.0540  0.0499
hltcoe_KB_CMN_4 (best hop0 F1 input)    0.1856  0.1796  0.1752  0.1327  0.1465  0.1348  0.1613  0.1644  0.1566
Stanford_KB_CMN_2 (best hop1 F1 input)  0.1206  0.2031  0.1324  0.1242  0.1804  0.1391  0.1223  0.1927  0.1355
hltcoe_KB_CMN_4 (best ALL F1 input)     0.1856  0.1796  0.1752  0.1327  0.1465  0.1348  0.1613  0.1644  0.1566
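
Note on reading the F1 columns: the sketch below is an illustrative consistency check, not part of the official scorer. It assumes the standard definition F1 = 2PR / (P + R); the function names and the tolerance are hypothetical. Under that assumption, each micro-average F1 above should match the harmonic mean of its row's P and R up to rounding. The macro-average rows need not satisfy the identity, presumably because the macro F1 is averaged per query rather than derived from the averaged P and R.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def consistent(precision: float, recall: float, reported_f1: float,
               tol: float = 5e-4) -> bool:
    """True if reported_f1 matches f1(precision, recall) within tol.

    The tolerance is an assumption: it allows for the four-decimal
    rounding of the P, R, and F values in the tables.
    """
    return abs(f1(precision, recall) - reported_f1) <= tol


# CSSF micro-average row for SAFT_ISI1.CMN.ensemble: (P, R, F) per hop level.
cssf_micro = {
    "hop0": (0.4561, 0.0399, 0.0733),
    "hop1": (0.3182, 0.0341, 0.0617),
    "ALL": (0.4177, 0.0385, 0.0705),
}

for level, (p, r, f) in cssf_micro.items():
    print(level, round(f1(p, r), 4), consistent(p, r, f))

# MEAN macro-average hop0 values for the same run, for contrast.
print("macro hop0", consistent(0.1053, 0.0980, 0.0891))
```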