Difference between revisions of "Main Page"

Welcome to the THYME project

Welcome to the Temporal Histories of Your Medical Event (THYME) project.

The overarching long-term vision of our research is to create novel technologies for processing clinical free text. Such technologies will enable sophisticated and efficient indexing, retrieval and data mining over the ever-increasing amounts of electronic clinical data. Processing free text poses a number of challenges to which the fields of artificial intelligence, natural language processing and computer science in general have made advances. Methods for processing free text are informed by linguistic theory combined with the power of statistical inference. A key component of the next step, natural language understanding, is discovering events and their relations on a timeline. Temporal relations are of prime importance in biomedicine as they are intrinsically linked to diseases, signs and symptoms, and treatments. Understanding the timeline of clinically relevant events is key to the next generation of translational research where the importance of generalizing over large amounts of data holds the promise of deciphering biomedical puzzles.

The goal of our current proposal is to discover temporal relations from clinical free text through achieving four specific aims:

Specific Aim 1: Develop (1) a temporal relation annotation schema and guidelines for clinical free text based on TimeML, which will require extending the Treebank, PropBank and VerbNet annotation guidelines to the clinical domain, (2) an annotated corpus (500K words of clinical narrative) following the temporal relations schema with additions to Treebank, PropBank and VerbNet, and (3) a descriptive study comparing temporal relations in the clinical and general domains.

Specific Aim 2: Extend and evaluate existing methods and/or develop new algorithms for temporal relation discovery in the clinical domain. Component-level evaluation.

Specific Aim 3: Integrate the best method and/or a variety of methods for temporal relation discovery into Apache cTAKES (ctakes.apache.org) and release them as open source annotators in the pipeline. Functional testing. Dissemination activities.

Specific Aim 4: System-level evaluation. Test the functionality of the enhanced Apache cTAKES (ctakes.apache.org) on translational research use cases, e.g. the progression of colon cancer as documented in clinical notes and pathology reports, and the progression of brain tumors as documented in radiology reports.

The methods we will use for temporal relation discovery are based on machine learning, e.g., Support Vector Machine technology. Such methods require the annotation of a reference standard from which the computations are derived. The best methods will be released as part of cTAKES for the larger community to use and contribute to. We will test the methods against biomedical queries.

ACKNOWLEDGMENT: The project described is supported by Grant Number R01LM010090 from the National Library of Medicine. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Library of Medicine or the National Institutes of Health.

The project period is October, 2010 - September, 2014.

Who We Are

University of Colorado
- Martha Palmer (PI)
- Jim Martin
- Wayne Ward
- Steven Bethard
- William Styler
- Arrick Lanfranchi (through August, 2012)
- Anwen Fredricksen
- and several Linguistics and Computer Science graduate students

Boston Children's Hospital/Harvard Medical School
- Guergana Savova (PI)
- Dmitriy Dligach
- Timothy Miller
- Sameer Pradhan
- Sean Finan
- Chen Lin
- David Harris
- Jennifer Green

Mayo Clinic
- Piet de Groen
- Brad Erickson
- James Masanz
- Donna Ihrke (through December, 2012)
- Pauline Funk

Brandeis University
- James Pustejovsky

THYME Annotation Guidelines

These guidelines were provided to the organizers of the i2b2 challenge for consideration during planning, and reflect an earlier stage of our guidelines. As such, although representative, these guidelines are out of date. Please check back in mid-July for a more up-to-date copy of the guidelines.

i2b2 Simplified THYME Guidelines (PDF)

Annotations and availability

The annotation layers are Treebank and PropBank annotations as well as temporal annotations for events, temporal expressions and temporal relations. The corpus will be made available to the research community under a data use agreement. Instructions for obtaining the corpus will be posted soon.

THYME system

The THYME system is available as part of Apache cTAKES at ctakes.apache.org.

Relevant Papers

Venues for manuscript submissions

Venues for manuscript submissions/publications

Project materials

Project Charter

Tasks, leads, teams and deadlines

Progress reports

Clinical Temporal Relations Annotation Guidelines - Release notes and latest versions

Annotations - Describes the corpus, the layers of annotations and annotation progress

Annotation Tools - Describes the progress and information pertaining to the Anafora annotation tool

Software - Describes the software modules and their organization

Train/Development/Test splits

Use this split for experiments with the THYME data (set number % 8)!
Train sets: 1, 2, 3, 8, 9, 10, 11, 16, 17, 18, 19, 24, 25, 26, 27
Development sets: 4, 12, 13, 20, 21
Test sets: 6, 7, 14, 15, 22, 23
Protege/Knowtator and Anafora annotation tools: annotations
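
The set lists above are consistent with a simple rule on the set number modulo 8: remainders 0-3 go to train, 4-5 to development, and 6-7 to test. The following is a minimal sketch of that reading, not project code; the function name `split_for` and the constant names are hypothetical, and the rule itself is an assumption inferred from the "% 8" note and the listed sets. Under this rule set 5 would fall in development, but the page does not list it, so treat the lists themselves as authoritative.

```python
def split_for(set_id: int) -> str:
    """Return 'train', 'dev', or 'test' for a set number.

    Assumption: the split is decided by set_id % 8
    (0-3 train, 4-5 dev, 6-7 test), as inferred from the page.
    """
    remainder = set_id % 8
    if remainder <= 3:
        return "train"
    if remainder <= 5:
        return "dev"
    return "test"


# Set lists copied verbatim from the page.
TRAIN_SETS = [1, 2, 3, 8, 9, 10, 11, 16, 17, 18, 19, 24, 25, 26, 27]
DEV_SETS = [4, 12, 13, 20, 21]
TEST_SETS = [6, 7, 14, 15, 22, 23]

# Check that the assumed rule reproduces every listed assignment.
for name, ids in (("train", TRAIN_SETS), ("dev", DEV_SETS), ("test", TEST_SETS)):
    assert all(split_for(i) == name for i in ids), name
```

Deriving the split from the set number, rather than hard-coding the lists, keeps the assignment stable if further sets are added, provided the modulo reading is in fact the intended one.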

Communication

Bi-weekly meetings, Wed 11-noon ET
- Call in details
Distribution Lists

Meeting Notes

July 17, 2013 Agenda and notes
No conference call on July 3, 2013. Happy 4th of July!
June 19, 2013 Agenda and notes
June 5, 2013 Agenda and notes
May 31, 2013 Methods meeting agenda and notes
May 24, 2013 Methods meeting agenda and notes
May 22, 2013 Agenda and notes
May 17, 2013 Methods meeting agenda and notes
May 9, 2013 Methods meeting agenda and notes
May 8, 2013 Agenda and notes
May 3, 2013 Methods meeting agenda and notes
April 24, 2013 Agenda and notes
April 10, 2013 Agenda and notes
March 27, 2013 Agenda and notes
March 13, 2013 Agenda and notes
February 27, 2013 Agenda and notes
February 13, 2013 Agenda and notes
January 30, 2013 Agenda and notes
January 28, 2013 (annotations subgroup) Agenda and notes
January 16, 2013 Agenda and notes
January 2, 2013 Agenda and notes
December 19, 2012 Agenda and notes
December 5, 2012 Agenda and notes
November 21, 2012 Agenda and notes
November 6, 2012 Agenda and notes
October 24, 2012 Agenda and notes
October 10, 2012 Agenda and notes
September 12, 2012 Agenda and notes
August 29, 2012 Agenda and notes
August 15, 2012 Agenda and notes
August 1, 2012 Agenda and notes
July 18, 2012 Agenda and notes
June 22, 2012 Agenda and notes
June 20, 2012 Agenda and notes
June 6, 2012 Agenda and notes
May 23, 2012 Agenda and notes
May 9, 2012 Agenda and notes
April 25, 2012 Agenda and notes
April 11, 2012 Agenda and notes
March 28, 2012 Agenda and notes
March 14, 2012 Agenda and notes
Feb 29, 2012 Agenda and notes
Feb 14, 2012 Agenda and notes
Feb 1, 2012 Agenda and notes

Getting started

Contact

If you need assistance and/or if you have questions about the project, feel free to send e-mail to steven.bethard at colorado dot edu OR Guergana.Savova at childrens dot harvard dot edu

@@ Line 52: / Line 52: @@
 ** James Pustejovsky
-== THYME annotation guidelines ==
+== THYME Annotation Guidelines ==
-They will be posted in time for an upcoming BioNLP publication. If you came here from that paper and this message is still here, please email Timothy Miller and bug him. (Email address is ''firstname''.''lastname''@childrens.harvard.edu).
+These guidelines were provided to the organizers of the i2b2 challenge for consideration during planning, and reflect an earlier stage of our guidelines.  As such, although representative, '''These guidelines are out of date.'''  Please check back Mid-July for a more up-to-date copy of the guidelines.
+* [[Media:i2b2simplifiedthymeguidelines.pdf|i2b2 Simplified THYME Guidelines (PDF)]]
 == Annotations and availability ==


Revision as of 13:29, 19 June 2013
