ClearTK Workshop I

From CompSemWiki
Revision as of 09:07, 2 February 2010 by Trood (talk | contribs)

Introduction

Abstraction Paradise / Configuration Hell

  • We will avoid many of the complexities of UIMA by using UUTUC, which simplifies the use of UIMA and ClearTK for this workshop.

Use of Twitter Data

We have been given permission to use Twitter data from the EPIC project under the following conditions (quoting Leysia Palen):

  1. Confirm with me that all attendees are CU people. If not, tell me who else is there, and how many.
  2. The data are not to be made public, and are therefore to put behind a password-protected site. The data must be taken down when workshop is over, and you must confirm to me that this has been done.
  3. you give full credit to Project EPIC for the data source.

The data we have been given is subject to IRB protocols. Here's another quote from Leysia:

 ...we are not prepared to have it be public.  This is data that has been collected by many students on our project, and they and we have the rights to it.... [posting the data] will be violating IRB protocols as well, protocols that we have to adhere to that are federal guidelines. Not only can grant funds be yanked for such a move, a whole university can go under audit for inappropriate posting of human subjects data.

You are expected to respect the above concerns.

Setup

  1. Download and install Eclipse from here. Select "Eclipse IDE for Java Developers"; this should take you to a page with a large green arrow on it. Click on that green arrow. Here are some installation instructions.
  2. Start Eclipse and choose a new workspace directory. If you get the welcome screen, click on the image with a silver and gold swooshing arrow (labeled "Workbench" when you hover your mouse over it).
  3. Install the UIMA Eclipse plugins
    • Select Help -> Install New Software
    • Enter the URL
      http://www.apache.org/dist/incubator/uima/eclipse-update-site/
      into the "Work with" text box
    • Restart Eclipse
  4. Obtain cleartk-workshop-feb-2010.zip directly from me (Philip Ogren). It is an exported Eclipse project that contains source code, data, and dependencies.