OntoHierarchy

From CompSemWiki

Directory Hierarchy

  • Root directory
  • Treebank directory
    • $CORPORA/$LANGUAGE/annotations/parse/
  • Propbank directory
    • $CORPORA/$LANGUAGE/annotations/prop/
  • Sense directory
    • $CORPORA/$LANGUAGE/annotations/sense/
  • Raw directory 1 (all tokens including traces)
    • $CORPORA/$LANGUAGE/annotations/tokens/
  • Raw directory 2 (all tokens excluding traces)
    • $CORPORA/$LANGUAGE/annotations/words/
  • Frameset directory
    • $CORPORA/$LANGUAGE/metadata/frames/
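
The layout above can be resolved into concrete paths with a small helper. The following is a minimal sketch, assuming the values of `$CORPORA` and `$LANGUAGE` are supplied by the caller. The function name `annotation_dirs`, the dictionary keys, and the example root `/data/corpora` are hypothetical and not part of any corpus tooling.

```python
from pathlib import PurePosixPath

# Relative locations copied from the list above.
# The keys are illustrative labels, not official names.
LAYOUT = {
    "treebank": "annotations/parse",
    "propbank": "annotations/prop",
    "sense": "annotations/sense",
    "tokens": "annotations/tokens",  # all tokens including traces
    "words": "annotations/words",    # all tokens excluding traces
    "frames": "metadata/frames",
}


def annotation_dirs(corpora, language):
    """Return a dict mapping each layer label to its directory path."""
    base = PurePosixPath(corpora) / language
    return {name: base / rel for name, rel in LAYOUT.items()}


if __name__ == "__main__":
    # "/data/corpora" is a placeholder for the real $CORPORA root.
    for name, path in annotation_dirs("/data/corpora", "english").items():
        print(f"{name:9s}{path}/")
```

Keeping the relative locations in one table means a change to the layout only has to be made in one place.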

English Treebank Hierarchy

  • $CORPORA/english/parse/$GENRE/$SOURCE/$SECTION/
    • GENRE = bc | bn | mz | nw | wb
    • SOURCE = source of the corpus (e.g. wsj, sinorama)
    • SECTION = ## (two-digit section number)
  • $CORPORA/english/parse/
    • bc/: broadcast conversations
      • cctv/
      • cnn/
      • ebc/
      • msnbc/
      • phoenix/
      • p2.5_a2e/
      • p2.5_c2e/
    • bn/: broadcast news
      • abc/
      • cnn/
      • mnb/
      • nbc/
      • pri/
      • voa/
      • p2.5_a2e/
      • p2.5_c2e/
    • mz/: magazines
      • sinorama/
    • nw/: newswire
      • xinhua/
      • wsj/
      • p2.5_a2e/
      • p2.5_c2e/
    • wb/: web text
      • ng_a2e/
      • ng_c2e/
      • ng_eng/
      • ng_p2.5_a2e/
      • ng_p2.5_c2e/
      • wl_c2e/
      • wl_eng/
      • wl_p2.5_a2e/
      • wl_p2.5_c2e/
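
The `$GENRE/$SOURCE/$SECTION` template can be turned into a path builder that rejects unknown genres. This is a minimal sketch that follows the template exactly as written in this section (`$CORPORA/english/parse/...`). It assumes that `##` stands for a zero-padded two-digit number. The function name `treebank_section_dir` and the example root `/data/corpora` are hypothetical.

```python
from pathlib import PurePosixPath

# Genre codes listed in this section.
GENRES = {"bc", "bn", "mz", "nw", "wb"}


def treebank_section_dir(corpora, genre, source, section):
    """Build $CORPORA/english/parse/$GENRE/$SOURCE/$SECTION as a path."""
    if genre not in GENRES:
        raise ValueError(f"unknown genre: {genre!r}")
    # Assumption: SECTION ("##") is a zero-padded two-digit number.
    section_name = f"{int(section):02d}"
    return PurePosixPath(corpora) / "english" / "parse" / genre / source / section_name


if __name__ == "__main__":
    # "/data/corpora" is a placeholder for the real $CORPORA root.
    print(treebank_section_dir("/data/corpora", "nw", "wsj", 2))
```

The source name is passed through unchecked, since the set of source directories differs per genre and may change between releases.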