Meeting Schedule
Latest revision as of 15:45, 18 November 2024
Location: Hybrid - Muenzinger D430, and the zoom link below
Time: Wednesdays at 11:30am, Mountain Time
Zoom link: https://cuboulder.zoom.us/j/97014876908
Date | Title |
---|---|
08/28/2024 | Planning, introductions, welcome! |
09/04/2024 | Brunch Social |
09/11/2024 | Watch and discuss NLP keynote
Winner: Barbara Plank’s “Are LLMs Narrowing our Horizon? Let’s Embrace Variation in NLP!” |
09/18/2024 | CLASIC presentations |
09/25/2024 | Invited talks/discussions from Leeds and Anschutz folks: Liu Liu, Abe Handler, Yanjun Gao |
10/02/2024 | Martha Palmer, Annie Zaenen, Susan Brown, Alexis Cooper.
Title: Testing GPT4's interpretation of the Caused-Motion Construction
Abstract: The fields of Artificial Intelligence and Natural Language Processing have been revolutionized by the advent of Large Language Models such as GPT4. They are perceived as being language experts and there is a lot of speculation about how intelligent they are, with claims being made about “Sparks of General Artificial Intelligence.” This talk will describe in detail an English linguistic construction, the Caused Motion Construction, and compare prior interpretation approaches with current LLM interpretations. The prior approaches are based on VerbNet. Its unique contributions to prior approaches will be outlined. Then the results of a recent preliminary study probing GPT4’s analysis of the same constructions will be presented. Not surprisingly, this analysis illustrates both strengths and weaknesses of GPT4’s ability to interpret Caused Motion Constructions and to generalize this interpretation.
Recording: https://o365coloradoedu-my.sharepoint.com/:v:/r/personal/mpalmer_colorado_edu/Documents/BoulderNLP-Palmer-Oct2-2024.mp4?csf=1&web=1&nav=eyJyZWZlcnJhbEluZm8iOnsicmVmZXJyYWxBcHAiOiJPbmVEcml2ZUZvckJ1c2luZXNzIiwicmVmZXJyYWxBcHBQbGF0Zm9ybSI6IldlYiIsInJlZmVycmFsTW9kZSI6InZpZXciLCJyZWZlcnJhbFZpZXciOiJNeUZpbGVzTGlua0NvcHkifX0&e=aCHeN8
|
10/09/2024 | NAACL Paper Clinic: Come get feedback on your submission drafts! |
10/16/2024 | Senior Thesis Proposals:
Alexandra Barry
Title: Benchmarking LLM Handling of Cross-Dialectal Spanish
Abstract: This proposal introduces current issues and gaps in cross-dialectal NLP in Spanish as well as the lack of resources available for Latin American dialects. The presentation will cover past work in dialect detection, translation, and benchmarking in order to build a foundation for a proposal that aims to create a benchmark that analyses LLM robustness across a series of tasks in different Spanish dialects.
Tavin Turner
Title: Agreeing to Disagree: Statutory Relational Stance Modeling
Abstract: Policy division deeply affects which bills get passed in legislature, and how. So far, statutory NLP has predicted voting breakdowns, interpreted stakeholder benefit, informed legal decision support systems, and much more. In practice, legislation demands compromise and concession to pass important policy, yet models often struggle to reason over the whole act. Leveraging neuro-symbolic models, we seek to intermediate this challenge with relational structures of statutes’ sectional stances – modeling stance agreement, exception, etc. Beyond supporting downstream statutory analysis tasks, these structures could help stakeholders understand how a bill impacts them, litmus the cooperation within a legislature, and reveal patterns of compromise that aid a bill through ratification. |
10/23/2024 | Ananya Ganesh's PhD Dissertation Proposal
Title: Reliable Language Technology for Classroom Dialog Understanding
Abstract: In this proposal, I will lay out how NLP models can be developed to address realistic use cases in analyzing classroom dialogue. Towards this goal, I will first introduce a new task and corresponding dataset, focused on detecting off-task utterances in small-group discussions. I will then propose a method to solve this task that considers how the inherent structure in the dialog can be used to learn richer representations of the dialog context. Next, I will introduce preliminary work on applying LLMs in the in-context learning setting for a broad range of tasks pertaining to qualitative coding of classroom dialog, and discuss potential follow-up work. Finally, keeping in mind our goals of serving many independent stakeholders, I will propose a study to incorporate differing stakeholders’ subjective judgments while curating gold-standard data for classroom discourse analysis. |
10/30/2024 | Marie McGregor's area exam
Title: Adapting AMR Metrics to UMR Graphs
Abstract: Uniform Meaning Representation (UMR) expands on the capabilities of Abstract Meaning Representation (AMR) by supporting document-level annotation, suitability for low-resource languages, and support for logical inference. As a framework for any sort of representation is developed, a way to measure the similarities or differences between two representations must be developed in tandem to support the creation of parsers and for computing inter-annotator agreement (IAA). Fortunately, there exists robust research into metrics to assess the similarity of AMR graphs. The usefulness of these metrics to UMRs depends on four key aspects: scalability, correctness, interpretability, and cross-lingual suitability. This paper investigates the applicability of AMR metrics to UMR graphs along these aspects in order to create useful and reliable UMR metrics. |
11/06/2024 | Short presentations / discussions: Curry Guinn, Yifu Wu, Kevin Stowe |
11/13/2024 | Invited talk by Nick Dronen and Seminar Lunch
Title: SETLEXSEM CHALLENGE: Using Set Operations to Evaluate the Lexical and Semantic Robustness of Language Models
Abstract: Set theory is foundational to mathematics and, when sets are finite, to reasoning about the world. An intelligent system should perform set operations consistently, regardless of superficial variations in the operands. Initially designed for semantically-oriented NLP tasks, large language models (LLMs) are now being evaluated on algorithmic tasks. Because sets are comprised of arbitrary symbols (e.g. numbers, words), they provide an opportunity to test, systematically, the invariance of LLMs’ algorithmic abilities under simple lexical or semantic variations. To this end, we present the SETLEXSEM CHALLENGE, a synthetic benchmark that evaluates the performance of LLMs on set operations. SETLEXSEM assesses the robustness of LLMs’ instruction-following abilities under various conditions, focusing on the set operations and the nature and construction of the set members. Evaluating seven LLMs with SETLEXSEM, we find that they exhibit poor robustness to variation in both operation and operands. We show – via the framework’s systematic sampling of set members along lexical and semantic dimensions – that LLMs are not only not robust to variation along these dimensions but demonstrate unique failure modes in particular, easy-to-create semantic groupings of "deceptive" sets. We find that rigorously measuring language model robustness to variation in frequency and length is challenging and present an analysis that measures them independently. |
11/20/2024 | Abteen’s proposal
When: Wed. Nov 20, 11:30 am
Where: MUEN D430 and zoom https://cuboulder.zoom.us/j/97014876908
Title: Extending Benchmarks and Multilingual Models to Truly Low-Resource Languages
Abstract: Driven by successes in large-scale data collection and training efforts, the field of natural language processing (NLP) has seen a dramatic surge in model performance. However, the vast majority of the roughly 7,000 languages spoken across the globe do not have the necessary amounts of easily available text resources and have not been able to share in these advancements. In this proposal, we focus on how best to improve pretrained model performance for these languages, which we refer to as truly low-resource. First, we discuss model adaptation techniques which leverage unlabeled data and discuss experiments which evaluate these approaches in a realistic setting. Next, we address a limitation of prior work, and describe two data collection efforts for low-resource languages. We further present a synthetic evaluation resource which tests a model's understanding of a specific linguistic phenomenon: lexical gaps. Finally, we propose additional analysis experiments that aim to address disagreements across prior work, and extend these experiments to include low-resource languages.
Alex’s area exam:
When: Wed. Nov 20, 1:30 pm
Where: MUEN E214 and zoom https://cuboulder.zoom.us/j/97014876908
Title: Computational Media Framing Analysis through Rhetorical Devices and Linguistic Features
Abstract: Over the past decade, there has been an increased focus on media framing in the Natural Language Processing (NLP) community. Framing has been defined as “select[ing] some aspects of a perceived reality and mak[ing] them more salient in a communicating text, in such a way as to promote a particular problem definition, causal interpretation, moral evaluation, and/or treatment recommendation for the item described” (Entman, 1993). This computational work generally seeks to quantify framing on a large scale to raise awareness about media bias. A prevalent paradigm for computational framing analysis focuses on studying high-level topical information. Though highly generalizable, this approach addresses only emphasis framing: when a writer or speaker highlights a particular aspect of a topic more frequently than others. However, prior framing work is broad, encompassing many other facets and types of framing present in the media. In recognition of this, there has been a recent line of work seeking to subvert the earlier focus on topical information. In this survey, we present an analysis of work which is both in line with goals of expanding the breadth of computational framing analysis and is generalizable. We focus on work which analyzes the role of rhetorical devices and linguistic features to reveal insights about media framing. |
11/27/2024 | No meeting: Fall break |
12/04/2024 | Enora's prelim |
12/11/2024 | |
01/23/2025 | Chenhao Tan CS Colloquium, 3:30pm |
Past Schedules
- Spring 2024 Schedule
- Fall 2023 Schedule
- Spring 2023 Schedule
- Fall 2022 Schedule
- Spring 2022 Schedule
- Fall 2021 Schedule
- Spring 2021 Schedule
- Fall 2020 Schedule
- Spring 2020 Schedule
- Fall 2019 Schedule
- Spring 2019 Schedule
- Fall 2018 Schedule
- Summer 2018 Schedule
- Spring 2018 Schedule
- Fall 2017 Schedule
- Summer 2017 Schedule
- Spring 2017 Schedule
- Fall 2016 Schedule
- Spring 2016 Schedule
- Fall 2015 Schedule
- Spring 2015 Schedule
- Fall 2014 Schedule
- Spring 2014 Schedule
- Fall 2013 Schedule
- Summer 2013 Schedule
- Spring 2013 Schedule
- Fall 2012 Schedule
- Spring 2012 Schedule
- Fall 2011 Schedule
- Summer 2011 Schedule
- Spring 2011 Schedule
- Fall 2010 Schedule
- Summer 2010 Schedule
- Spring 2010 Schedule
- Fall 2009 Schedule
- Summer 2009 Schedule
- Spring 2009 Schedule
- Fall 2008 Schedule
- Summer 2008 Schedule