TAC 2010 Knowledge Base Population (KBP2010) Track

Entity Linking Scoring

We will use the scoring script (by Paul McNamee) from the KBP2009 Entity Linking task.

Slot Filling Scoring (by Ralph Grishman)

In contrast to the 2009 evaluation, a uniform scoring metric will be used, based on the traditional measures of recall, precision, and F-measure, computed from counts of correct, missing, and spurious responses. A non-NIL response is correct if it matches a verified non-NIL entry in the key (the human assessment file); all other non-NIL responses are spurious. A NIL response where the key has a verified non-NIL entry is counted as missing. NIL system responses that match verified NIL entries in the key are not counted.

For single-valued slots, only a single system response will be accepted. For list-valued slots, the verified non-NIL responses will be grouped into equivalence classes. Multiple responses to a query must come from different equivalence classes to be counted as correct; other responses are counted as spurious.
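
The counting rules above can be sketched as follows. This is a minimal illustration that assumes the standard definitions precision = correct / (correct + spurious) and recall = correct / (correct + missing); it is not taken from SFScore.java, whose exact computation may differ. The class name SlotScoreSketch, its method names, the sample answer strings, and the map from answer string to equivalence-class id are all assumptions made for the example.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only; hypothetical names, not the SFScore.java implementation.
class SlotScoreSketch {

    // Precision: fraction of non-NIL system responses that were correct.
    static double precision(int correct, int spurious) {
        int returned = correct + spurious;
        return returned == 0 ? 0.0 : (double) correct / returned;
    }

    // Recall: fraction of verified non-NIL key entries that the system found.
    static double recall(int correct, int missing) {
        int expected = correct + missing;
        return expected == 0 ? 0.0 : (double) correct / expected;
    }

    // F-measure: harmonic mean of precision and recall.
    static double fMeasure(double p, double r) {
        return (p + r) == 0.0 ? 0.0 : 2.0 * p * r / (p + r);
    }

    // Tallies the non-NIL responses for one list-valued slot.
    // keyClasses maps each verified answer string to its equivalence-class id
    // (an assumed representation of the key). Returns {correct, spurious}.
    static int[] tallyListSlot(List<String> responses, Map<String, Integer> keyClasses) {
        Set<Integer> matchedClasses = new HashSet<>();
        int correct = 0;
        int spurious = 0;
        for (String response : responses) {
            Integer cls = keyClasses.get(response);
            // Correct only if the answer is verified and its class is not yet matched.
            if (cls != null && matchedClasses.add(cls)) {
                correct++;
            } else {
                spurious++;
            }
        }
        return new int[] {correct, spurious};
    }

    public static void main(String[] args) {
        // Made-up key: two answer strings share equivalence class 1.
        Map<String, Integer> key = Map.of(
                "IBM", 1,
                "International Business Machines", 1,
                "Acme", 2);
        List<String> responses = List.of(
                "IBM", "International Business Machines", "Acme", "Unknown Corp");

        int[] tally = tallyListSlot(responses, key);
        double p = precision(tally[0], tally[1]);
        double r = recall(tally[0], 1); // assume one missing response elsewhere
        System.out.println("correct=" + tally[0] + " spurious=" + tally[1]);
        System.out.println("P=" + p + " R=" + r + " F=" + fMeasure(p, r));
    }
}
```

In this sketch, a second response from an already matched equivalence class and a response that is absent from the key are both counted as spurious, which is one reading of the rule stated above.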

Release 1.1 of the scorer is now available.

To run, download SFScore.java, then compile and run it:

javac SFScore.java
java SFScore response-file key-file [flags ...]

where the possible flags are

trace  -- print a line with assessment of each system response
anydoc -- judge response based only on answer string, ignoring doc id
nocase -- ignore case in matching answer string
slots=slotfile -- take list of entityId:slot pairs from slotfile
                 (otherwise list of pairs is taken from system response)

The slotfile controls which slots are evaluated; if you want your system evaluated on all slots for which it generates output, the "slots" parameter is not needed. In that case, it is important for your system to generate explicit NILs for slots it cannot fill.

As the key file, you can use one of the newly produced annotation files, or you can run UpdateSFKey (see the tools folder) on the 2009 judgments file.

The anydoc and nocase flags are designed to make the scorer more useful for development by supporting soft matching, but they will not be used for official scoring.
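
The effect of the two soft-match flags can be pictured with a small sketch. It is only an assumption about how such matching could work, based on the flag descriptions above; the class SoftMatchSketch, the method matches, and its parameters are hypothetical and do not come from SFScore.java.

```java
// Illustrative sketch only; hypothetical names, not the SFScore.java implementation.
class SoftMatchSketch {

    // Returns true if a system response matches a key entry.
    // anydoc: judge on the answer string alone, ignoring the document id.
    // nocase: ignore case when comparing answer strings.
    static boolean matches(String responseAnswer, String responseDocId,
                           String keyAnswer, String keyDocId,
                           boolean anydoc, boolean nocase) {
        boolean sameAnswer = nocase
                ? responseAnswer.equalsIgnoreCase(keyAnswer)
                : responseAnswer.equals(keyAnswer);
        boolean sameDoc = anydoc || responseDocId.equals(keyDocId);
        return sameAnswer && sameDoc;
    }

    public static void main(String[] args) {
        // Made-up answer strings and document ids.
        System.out.println(matches("ibm", "DOC_A", "IBM", "DOC_B", false, false));
        System.out.println(matches("ibm", "DOC_A", "IBM", "DOC_B", true, true));
    }
}
```

With both flags off, the sketch requires an exact answer string and the same document id, which corresponds to the strict matching described for official scoring.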

Archive of releases

  • Scorer V1.1 (Final Release): SFScore.java

  • Scorer V1.0 (Updated to penalize responses marked REDUNDANT in key.): download

  • Scorer V0.9 (Updated to use 2010 format files.): download

  • Scorer V0.8 (First release for 2010, using 2009 format files.): download