Data
Temporal Dependency Trees in Children's Stories
As part of the TERENCE project, we annotated a small corpus of children's fables with temporal dependency trees, in which each timeline is annotated as a tree-structured graph of temporal links between events. More details about the annotation process are available in the annotation paper, and a description of parsing models for temporal dependency trees is available in the machine learning paper.
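To make the tree idea concrete, here is a minimal Python sketch of one way such a structure could be held in memory. It is not the corpus file format: the class `EventNode`, its methods, the event words, and the `AFTER` label are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class EventNode:
    """One event in a hypothetical temporal dependency tree."""
    text: str
    # Label on the link from this event to its parent; None for the root.
    relation: Optional[str] = None
    children: List["EventNode"] = field(default_factory=list)

    def add(self, text: str, relation: str) -> "EventNode":
        """Attach a child event linked to this event by `relation`."""
        child = EventNode(text, relation)
        self.children.append(child)
        return child

    def links(self) -> Iterator[Tuple[str, str, str]]:
        """Yield (child, relation, parent) for every link in the tree."""
        for child in self.children:
            yield (child.text, child.relation, self.text)
            yield from child.links()


# Invented three-event story; the events and labels are illustrative only.
root = EventNode("woke")
ate = root.add("ate", "AFTER")
ate.add("left", "AFTER")

tree_links = list(root.links())
for child, relation, parent in tree_links:
    print(f"({child} {relation} {parent})")
```

Because every event except the root has exactly one parent, a tree over n events always carries n - 1 temporal links.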
Re-annotated TempEval 2010 Time Expressions
In working with the TempEval 2010 time expression recognition task, we found that a few time expressions, such as 23-year, a few days, and third-quarter, were missed by the annotators in the official test data. Our re-annotated version of the TempEval time expression test data can be used as a drop-in replacement for the original data. More details about the re-annotation process and reasons to use this data instead of the original are available in the paper.
Conjoined-Event Temporal and Causal Relations
We annotated a small corpus of conjoined-event temporal-causal relations. For example, given the sentence:
Fuel tanks had leaked and contaminated the soil.
we annotated the relations (leaked BEFORE contaminated) and
(leaked CAUSED contaminated).
The corpus includes 1000 pairs of events taken from the Wall Street Journal,
with each event pair assigned both a temporal and a causal relation.
More details about the data and annotation process are available in
the paper.
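As a rough illustration of what one annotated item carries, the sketch below stores the example sentence above with its two relations. The `EventPair` class and its field names are hypothetical and do not describe the distributed file format; only the sentence and the two relation labels come from the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class EventPair:
    """A conjoined event pair with one temporal and one causal label."""
    sentence: str
    first: str
    second: str
    temporal: str
    causal: str

    def relations(self) -> List[Tuple[str, str, str]]:
        """Return both relations as (first, label, second) triples."""
        return [
            (self.first, self.temporal, self.second),
            (self.first, self.causal, self.second),
        ]


pair = EventPair(
    sentence="Fuel tanks had leaked and contaminated the soil.",
    first="leaked",
    second="contaminated",
    temporal="BEFORE",
    causal="CAUSED",
)

for first, label, second in pair.relations():
    print(f"({first} {label} {second})")
```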
Verb-Clause Temporal Relations
We annotated a small corpus of verb-clause temporal relations. For example, given the sentence:
International Business Machines Corp. and Compaq Computer Corp. say the bugs will delay products.
we annotated the temporal relation (say BEFORE delay).
The corpus includes 895 such pairs, taken from the Wall Street Journal section
of the TimeBank.
More details about the data and annotation process are available in the paper.
Opinions and Opinion Holders
We annotated opinions and opinion holders for verbs with clausal complements in both FrameNet data and PropBank data. For example, given the sentence:
Still, Vista officials realize they're relatively fortunate.
we annotated Vista officials as the opinion holder, and they're relatively fortunate as the opinion. More details about the data and the models we built from it are available in the paper.
The corpora on which this annotation was performed have changed somewhat since this work was done, but I've managed to re-align most of the data. The PropBank data should be almost identical to that used in the paper, while the FrameNet data is about 50 sentences smaller, since some of the sentences we annotated are missing from the newer versions of FrameNet.