=========================================================================
TAC KBP 2016 SLOT FILLER VALIDATION ENGLISH ENSEMBLING EVALUATION RESULTS
=========================================================================


Team ID:  SAFT_ISI
Organization:  USC Information Sciences Institute


*************************************************************

Run ID:  SAFT_ISI1
Did the run access the live Web during the evaluation window:  No
Did this run judge each candidate slot filler independently of all other candidate slot fillers in the evaluation dataset: Yes
Did this run judge candidate slot fillers for each slot-filling run independently of all other slot-filling runs in the evaluation dataset: No
Did this run judge candidate slot fillers for each slot-filling team independently of all other slot-filling teams in the evaluation dataset: No
Did this run make use of the slot filler or justification offsets provided for each candidate slot filler: Yes
Did this run make use of the confidence values provided for each candidate slot filler: No
Did this run make use of the system profiles for the slot filling runs: No
Did this run make use of the preliminary assessments provided for some of the slot filler candidates: Yes

CSSF micro-average Precision, Recall, and F1, at each hop level:
					hop0_P	hop0_R	hop0_F		hop1_P	hop1_R	hop1_F		ALL_P	ALL_R	ALL_F
SAFT_ISI1.ENG.ensemble			0.3952	0.3146	0.3503		0.1849	0.1290	0.1520		0.3308	0.2525	0.2864
BBN_KB_ENG_4 (best hop0 F1 input)	0.4609	0.2437	0.3188		0.2528	0.1320	0.1734		0.3918	0.2063	0.2703
BBN_KB_ENG_1 (best hop1 F1 input)	0.4657	0.2408	0.3174		0.2542	0.1320	0.1737		0.3947	0.2043	0.2693
BBN_KB_ENG_4 (best ALL F1 input)	0.4609	0.2437	0.3188		0.2528	0.1320	0.1734		0.3918	0.2063	0.2703

MAX micro-average Precision, Recall, and F1, at each hop level:
					hop0_P	hop0_R	hop0_F		hop1_P	hop1_R	hop1_F		ALL_P	ALL_R	ALL_F
SAFT_ISI1.ENG.ensemble			0.4060	0.3109	0.3522		0.1675	0.1236	0.1422		0.3288	0.2487	0.2832
BBN_KB_ENG_4 (best hop0 F1 input)	0.4839	0.2591	0.3375		0.2237	0.1313	0.1655		0.3921	0.2167	0.2791
BBN_KB_ENG_1 (best hop1 F1 input)	0.4838	0.2572	0.3358		0.2252	0.1313	0.1659		0.3925	0.2154	0.2781
BBN_KB_ENG_4 (best ALL F1 input)	0.4839	0.2591	0.3375		0.2237	0.1313	0.1655		0.3921	0.2167	0.2791

MEAN macro-average Precision, Recall, and F1, at each hop level:
					hop0_P	hop0_R	hop0_F		hop1_P	hop1_R	hop1_F		ALL_P	ALL_R	ALL_F
SAFT_ISI1.ENG.ensemble			0.3045	0.3259	0.2880		0.0721	0.1142	0.0833		0.2127	0.2422	0.2071
Stanford_SF_ENG_3 (best hop0 F1 input)	0.2716	0.3016	0.2640		0.1567	0.1977	0.1638		0.2262	0.2605	0.2244
Stanford_SF_ENG_3 (best hop1 F1 input)	0.2716	0.3016	0.2640		0.1567	0.1977	0.1638		0.2262	0.2605	0.2244
Stanford_SF_ENG_3 (best ALL F1 input)	0.2716	0.3016	0.2640		0.1567	0.1977	0.1638		0.2262	0.2605	0.2244


*************************************************************

Run ID:  SAFT_ISI2
Did the run access the live Web during the evaluation window:  No
Did this run judge each candidate slot filler independently of all other candidate slot fillers in the evaluation dataset: Yes
Did this run judge candidate slot fillers for each slot-filling run independently of all other slot-filling runs in the evaluation dataset: No
Did this run judge candidate slot fillers for each slot-filling team independently of all other slot-filling teams in the evaluation dataset: No
Did this run make use of the slot filler or justification offsets provided for each candidate slot filler: Yes
Did this run make use of the confidence values provided for each candidate slot filler: No
Did this run make use of the system profiles for the slot filling runs: No
Did this run make use of the preliminary assessments provided for some of the slot filler candidates: Yes

CSSF micro-average Precision, Recall, and F1, at each hop level:
					hop0_P	hop0_R	hop0_F		hop1_P	hop1_R	hop1_F		ALL_P	ALL_R	ALL_F
SAFT_ISI2.ENG.ensemble			0.3952	0.3146	0.3503		0.1849	0.1290	0.1520		0.3308	0.2525	0.2864
BBN_KB_ENG_4 (best hop0 F1 input)	0.4609	0.2437	0.3188		0.2528	0.1320	0.1734		0.3918	0.2063	0.2703
BBN_KB_ENG_1 (best hop1 F1 input)	0.4657	0.2408	0.3174		0.2542	0.1320	0.1737		0.3947	0.2043	0.2693
BBN_KB_ENG_4 (best ALL F1 input)	0.4609	0.2437	0.3188		0.2528	0.1320	0.1734		0.3918	0.2063	0.2703

MAX micro-average Precision, Recall, and F1, at each hop level:
					hop0_P	hop0_R	hop0_F		hop1_P	hop1_R	hop1_F		ALL_P	ALL_R	ALL_F
SAFT_ISI2.ENG.ensemble			0.4060	0.3109	0.3522		0.1675	0.1236	0.1422		0.3288	0.2487	0.2832
BBN_KB_ENG_4 (best hop0 F1 input)	0.4839	0.2591	0.3375		0.2237	0.1313	0.1655		0.3921	0.2167	0.2791
BBN_KB_ENG_1 (best hop1 F1 input)	0.4838	0.2572	0.3358		0.2252	0.1313	0.1659		0.3925	0.2154	0.2781
BBN_KB_ENG_4 (best ALL F1 input)	0.4839	0.2591	0.3375		0.2237	0.1313	0.1655		0.3921	0.2167	0.2791

MEAN macro-average Precision, Recall, and F1, at each hop level:
					hop0_P	hop0_R	hop0_F		hop1_P	hop1_R	hop1_F		ALL_P	ALL_R	ALL_F
SAFT_ISI2.ENG.ensemble			0.3045	0.3259	0.2880		0.0721	0.1142	0.0833		0.2127	0.2422	0.2071
Stanford_SF_ENG_3 (best hop0 F1 input)	0.2716	0.3016	0.2640		0.1567	0.1977	0.1638		0.2262	0.2605	0.2244
Stanford_SF_ENG_3 (best hop1 F1 input)	0.2716	0.3016	0.2640		0.1567	0.1977	0.1638		0.2262	0.2605	0.2244
Stanford_SF_ENG_3 (best ALL F1 input)	0.2716	0.3016	0.2640		0.1567	0.1977	0.1638		0.2262	0.2605	0.2244


*************************************************************

Run ID:  SAFT_ISI3
Did the run access the live Web during the evaluation window:  No
Did this run judge each candidate slot filler independently of all other candidate slot fillers in the evaluation dataset: Yes
Did this run judge candidate slot fillers for each slot-filling run independently of all other slot-filling runs in the evaluation dataset: No
Did this run judge candidate slot fillers for each slot-filling team independently of all other slot-filling teams in the evaluation dataset: No
Did this run make use of the slot filler or justification offsets provided for each candidate slot filler: Yes
Did this run make use of the confidence values provided for each candidate slot filler: No
Did this run make use of the system profiles for the slot filling runs: No
Did this run make use of the preliminary assessments provided for some of the slot filler candidates: Yes

CSSF micro-average Precision, Recall, and F1, at each hop level:
					hop0_P	hop0_R	hop0_F		hop1_P	hop1_R	hop1_F		ALL_P	ALL_R	ALL_F
SAFT_ISI3.ENG.ensemble			0.2677	0.3752	0.3124		0.1753	0.1496	0.1614		0.2460	0.2996	0.2702
BBN_KB_ENG_4 (best hop0 F1 input)	0.4609	0.2437	0.3188		0.2528	0.1320	0.1734		0.3918	0.2063	0.2703
BBN_KB_ENG_1 (best hop1 F1 input)	0.4657	0.2408	0.3174		0.2542	0.1320	0.1737		0.3947	0.2043	0.2693
BBN_KB_ENG_4 (best ALL F1 input)	0.4609	0.2437	0.3188		0.2528	0.1320	0.1734		0.3918	0.2063	0.2703

MAX micro-average Precision, Recall, and F1, at each hop level:
					hop0_P	hop0_R	hop0_F		hop1_P	hop1_R	hop1_F		ALL_P	ALL_R	ALL_F
SAFT_ISI3.ENG.ensemble			0.2832	0.3685	0.3203		0.1631	0.1467	0.1545		0.2525	0.2949	0.2720
BBN_KB_ENG_4 (best hop0 F1 input)	0.4839	0.2591	0.3375		0.2237	0.1313	0.1655		0.3921	0.2167	0.2791
BBN_KB_ENG_1 (best hop1 F1 input)	0.4838	0.2572	0.3358		0.2252	0.1313	0.1659		0.3925	0.2154	0.2781
BBN_KB_ENG_4 (best ALL F1 input)	0.4839	0.2591	0.3375		0.2237	0.1313	0.1655		0.3921	0.2167	0.2791

MEAN macro-average Precision, Recall, and F1, at each hop level:
					hop0_P	hop0_R	hop0_F		hop1_P	hop1_R	hop1_F		ALL_P	ALL_R	ALL_F
SAFT_ISI3.ENG.ensemble			0.2996	0.4219	0.3101		0.0811	0.1460	0.0974		0.2133	0.3129	0.2261
Stanford_SF_ENG_3 (best hop0 F1 input)	0.2716	0.3016	0.2640		0.1567	0.1977	0.1638		0.2262	0.2605	0.2244
Stanford_SF_ENG_3 (best hop1 F1 input)	0.2716	0.3016	0.2640		0.1567	0.1977	0.1638		0.2262	0.2605	0.2244
Stanford_SF_ENG_3 (best ALL F1 input)	0.2716	0.3016	0.2640		0.1567	0.1977	0.1638		0.2262	0.2605	0.2244