Frequently Asked Questions
From CompSemWiki
Revision as of 11:29, 9 March 2010 by Leebecker (talk | contribs) (moved WSJ Treebank Sections Explained to Frequently Asked Questions)
This information comes courtesy of Nianwen Bert Xue:
The standard dev set is Section 1 and the standard test set is Section 23. Most people don't use Sections 24 and 25. Mitch's explanation, which I think is a plausible one, is that Section 23 has more "mature" annotation, since it was done after the annotators had been well trained, whereas Section 00 was annotated when the annotators had just started learning to annotate.