Meeting Schedule
Location: Hybrid - Buchanan 430 and the Zoom link below
Time: Wednesdays at 10:30am, Mountain Time
Zoom link: https://cuboulder.zoom.us/j/97014876908
Date | Title |
---|---|
08/30/2023 | Planning, introductions, welcome! |
09/06/2023 | ACL talk videos (Geoffrey Hinton) |
09/13/2023 | Ongoing projects talks (Susan: AIDA, KAIROS, DWD) |
09/20/2023 | Brunch and garden party outside in the Shakespeare Garden! (no Zoom) |
09/27/2023 | Felix Zheng - practice talk, Ongoing projects (Martha: UMR. Jim: ISAT. Rehan: Event Coref Projects) |
10/04/2023 | Ongoing projects talks, focus on low-resource and endangered languages (UMR2, LECS lab, NALA) |
10/11/2023 | Ongoing projects talks, LECS lab and BLAST lab |
10/18/2023 | Téa Wright thesis proposal, BLAST lab
Téa Wright Research Proposal: Pretrained Multilingual Model Adaptation for Low Resource Languages with OCR. Abstract: Pretrained multilingual models (PMMs) have advanced the natural language processing (NLP) field over recent years, but they often struggle when confronted with low-resource languages. This proposal will explore the challenges of adapting PMMs to such languages, with a current focus on Lakota and Dakota. Much of the data available for endangered languages is in formats that are not machine-readable. As a result, endangered languages are left out of NLP technologies. Using optical character recognition (OCR) to digitize these resources helps address this problem, but also introduces noise. The goal of this research is to determine how this noise affects model adaptation and performance in zero-shot and few-shot learning for low-resource languages. The project will involve data collection and scanning, annotation for a gold evaluation dataset, and evaluation of multiple language models across different adaptation methods and levels of noise. Additionally, we hope to expand this pipeline to more scripts and languages. The potential implications of this study are broad: generalizability to languages not included in the study, as well as insight into how noise affects model adaptation and which types of noise are most harmful. This project aims to address the unique challenges of Lakota and Dakota as well as develop the field’s understanding of how models may be adapted to include low-resource languages, working towards more inclusive NLP technologies.
|
10/25/2023 | Daniel Acuna (starting at 11:20) |
11/01/2023 | TBD |
11/08/2023 | Luke Gessler |
11/15/2023 | TBD |
11/22/2023 | *** fall break *** |
11/29/2023 | Jon's Proposal |
12/06/2023 | Adam's Proposal |
12/13/2023 | Elizabeth's Proposal |
12/20/2023 | Rehan's Dissertation |
Past Schedules
- Spring 2023 Schedule
- Fall 2022 Schedule
- Spring 2022 Schedule
- Fall 2021 Schedule
- Spring 2021 Schedule
- Fall 2020 Schedule
- Spring 2020 Schedule
- Fall 2019 Schedule
- Spring 2019 Schedule
- Fall 2018 Schedule
- Summer 2018 Schedule
- Spring 2018 Schedule
- Fall 2017 Schedule
- Summer 2017 Schedule
- Spring 2017 Schedule
- Fall 2016 Schedule
- Spring 2016 Schedule
- Fall 2015 Schedule
- Spring 2015 Schedule
- Fall 2014 Schedule
- Spring 2014 Schedule
- Fall 2013 Schedule
- Summer 2013 Schedule
- Spring 2013 Schedule
- Fall 2012 Schedule
- Spring 2012 Schedule
- Fall 2011 Schedule
- Summer 2011 Schedule
- Spring 2011 Schedule
- Fall 2010 Schedule
- Summer 2010 Schedule
- Spring 2010 Schedule
- Fall 2009 Schedule
- Summer 2009 Schedule
- Spring 2009 Schedule
- Fall 2008 Schedule
- Summer 2008 Schedule