Difference between revisions of "Meeting Schedule"

From CompSemWiki
 
(35 intermediate revisions by the same user not shown)
Line 1: Line 1:
'''Location:''' Hybrid - Buchanan 430, and the zoom link below
+
'''Location:''' Hybrid - Muenzinger D430, and the zoom link below
  
'''Time:''' Wednesdays at 10:30am, Mountain Time
+
'''Time:''' Wednesdays at 11:30am, Mountain Time
  
 
'''Zoom link:''' https://cuboulder.zoom.us/j/97014876908
 
'''Zoom link:''' https://cuboulder.zoom.us/j/97014876908
Line 13: Line 13:
  
 
|- style="border-top: 2px solid DarkGray;"
 
|- style="border-top: 2px solid DarkGray;"
| 01/24/2024 || '''Planning, introductions, welcome!'''
+
| 08/28/2024 || '''Planning, introductions, welcome!'''
  
 
|- style="border-top: 2px solid DarkGray;"
 
|- style="border-top: 2px solid DarkGray;"
| 01/31/2024 || Brunch Social  
+
| 09/04/2024 || Brunch Social  
  
 
|- style="border-top: 2px solid DarkGray;"
 
|- style="border-top: 2px solid DarkGray;"
| 02/07/2024 || '''No Meeting''' - Virtual PhD Open House
+
| 09/11/2024 || Watch and discuss NLP keynote
 +
 
 +
'''Winner:''' Barbara Plank’s “Are LLMs Narrowing our Horizon? Let’s Embrace Variation in NLP!”
  
 
|- style="border-top: 2px solid DarkGray;"
 
|- style="border-top: 2px solid DarkGray;"
| 02/14/2024 || ACL paper clinic
+
| 09/18/2024 || CLASIC presentations
  
 
|- style="border-top: 2px solid DarkGray;"
 
|- style="border-top: 2px solid DarkGray;"
| 02/21/2024 || Cancelled in favor of LING Circle talk by Professor Gibbs
+
| 09/25/2024 || Invited talks/discussions from Leeds and Anschutz folks: Liu Liu, Abe Handler, Yanjun Gao, Curry Guinn
  
  
 
|- style="border-top: 2px solid DarkGray;"
 
|- style="border-top: 2px solid DarkGray;"
| 02/28/2024 || Short talks by Kathy McKeown and Robin Burke
+
| 10/02/2024 || Martha Palmer, Annie Zaenen, Susan Brown, Alexis Cooper.
  
Kathy's web page: https://www.cs.columbia.edu/~kathy/
+
'''Title:''' Testing GPT4's interpretation of the Caused-Motion Construction
  
Title: Addressing Large Language Models that Lie: Case Studies in Summarization
+
'''Abstract:''' The fields of Artificial Intelligence and Natural Language Processing have been revolutionized by the advent of Large Language Models such as GPT4. They are perceived as being language experts and there is a lot of speculation about how intelligent they are, with claims being made about “Sparks of General Artificial Intelligence.” This talk will describe in detail an English linguistic construction, the Caused Motion Construction, and compare prior interpretation approaches with current LLM interpretations. The prior approaches are based on VerbNet. Its unique contributions to prior approaches will be outlined. Then the results of a recent preliminary study probing GPT4’s analysis of the same constructions will be presented. Not surprisingly, this analysis illustrates both strengths and weaknesses of GPT4’s ability to interpret Caused Motion Constructions and to generalize this interpretation.
  
Kathleen McKeown
+
Recording: https://o365coloradoedu-my.sharepoint.com/:v:/r/personal/mpalmer_colorado_edu/Documents/BoulderNLP-Palmer-Oct2-2024.mp4?csf=1&web=1&nav=eyJyZWZlcnJhbEluZm8iOnsicmVmZXJyYWxBcHAiOiJPbmVEcml2ZUZvckJ1c2luZXNzIiwicmVmZXJyYWxBcHBQbGF0Zm9ybSI6IldlYiIsInJlZmVycmFsTW9kZSI6InZpZXciLCJyZWZlcnJhbFZpZXciOiJNeUZpbGVzTGlua0NvcHkifX0&e=aCHeN8
Columbia University
 
 
The advent of large language models promises a new level of performance in generation of text of all kinds, enabling generation of text that is far more fluent, coherent and relevant than was previously possible. However, they also introduce a major new problem: they wholly hallucinate facts out of thin air. When summarizing an input document, they may incorrectly intermingle facts from the input, they may introduce facts that were not mentioned at all, and worse yet, they may even make up things that are not true in the real world. In this talk, I will discuss our work in characterizing the kinds of errors that can occur and methods that we have developed to help mitigate hallucination in language modeling approaches to text summarization for a variety of genres.
 
 
Kathleen R. McKeown is the Henry and Gertrude Rothschild Professor of Computer Science at Columbia University and the Founding Director of the Data Science Institute, serving as Director from 2012 to 2017. In earlier years, she served as Department Chair (1998-2003) and as Vice Dean for Research for the School of Engineering and Applied Science (2010-2012). A leading scholar and researcher in the field of natural language processing, McKeown focuses her research on the use of data for societal problems; her interests include text summarization, question answering, natural language generation, social media analysis and multilingual applications. She has received numerous honors and awards, including 2023 IEEE Innovation in Societal Infrastructure Award, American Philosophical Society Elected member, American Academy of Arts and Science elected member, American Association of Artificial Intelligence Fellow, a Founding Fellow of the Association for Computational Linguistics and an Association for Computing Machinery Fellow. Early on she received the National Science Foundation Presidential Young Investigator Award, and a National Science Foundation Faculty Award for Women. In 2010, she won both the Columbia Great Teacher Award—an honor bestowed by the students—and the Anita Borg Woman of Vision Award for Innovation.
 
 
 
  
Title: Multistakeholder fairness in recommender systems
 
  
Robin Burke
+
|- style="border-top: 2px solid DarkGray;"
University of Colorado Boulder
+
| 10/09/2024 || NAACL Paper Clinic: Come get feedback on your submission drafts!
 
Abstract: Research in machine learning fairness makes two key simplifying assumptions that have proven challenging to move beyond. One assumption is that we can productively concentrate on a uni-dimensional version of the problem: achieving fairness for a single protected group defined by a single sensitive feature. The second assumption is that technical solutions need not engage with the essentially political nature of claims surrounding fairness. I argue that relaxing these assumptions is necessary for machine learning fairness to achieve practical utility. While some recent research in rich subgroup fairness has considered ways to relax the first assumption, these approaches require that fairness be defined in the same way for all groups, which amounts to a hardening of the second assumption. In this talk, I argue for a formulation of machine learning fairness based on social choice and exemplify the approach in the area of recommender systems. Social choice is inherently multi-agent, escaping the single group assumption and, in its classic formulation, places no constraints on agents' preferences. In addition, social choice was developed to formalize political decision-making mechanisms, such as elections, and therefore offers some hope of directly addressing the inherent politics of fairness. Social choice has complexities of its own, however, and the talk will outline a research agenda aimed at understanding the challenges and opportunities afforded by this approach to machine learning fairness.
 
 
Bio: Information Science Department Chair and Professor Robin Burke conducts research in personalized recommender systems, a field he helped found and develop. His most recent projects explore fairness, accountability and transparency in recommendation through the integration of objectives from diverse stakeholders. Professor Burke is the author of more than 150 peer-reviewed articles in various areas of artificial intelligence including recommender systems, machine learning and information retrieval. His work has received support from the National Science Foundation, the National Endowment for the Humanities, the Fulbright Commission and the MacArthur Foundation, among others.
 
  
 
|- style="border-top: 2px solid DarkGray;"
 
|- style="border-top: 2px solid DarkGray;"
| 03/06/2024 || '''Jon Cai''', CU Boulder Computer Science, PhD proposal defense
+
| 10/16/2024 || Senior Thesis Proposals:
  
'''Title:'''
 
Learning Fast and Slow with Semantics
 
  
'''Abstract:'''
+
'''Alexandra Barry'''
Abstract Meaning Representation (AMR) is a linguistic formalism that captures and encodes the semantics of natural language. It is one of the most widely accepted implementations of the truth-value-based theory of meaning. Since its introduction, the impact of AMR has broadened from its original design objective of helping machine translation to more NLP tasks such as information extraction, summarization, and multi-modal semantic alignment. Meanwhile, AMR serves as a theoretical tool for computational semantics research to advance semantic theories. Being able to model holistic semantics has thus become one of the ultimate goals for the NLP and computational linguistics community. Despite the amazing advancement of LLMs in recent years, we still see gaps between the shallow and deep semantic understanding of machine learning models. In this proposal, we go through the generalization issues that AMR parsing models exhibit and our proposed solutions for designing new methodologies and analytical tools to help us navigate the labyrinth of modeling semantics via AMR.
 
  
|- style="border-top: 2px solid DarkGray;"
+
'''Title''': Benchmarking LLM Handling of Cross-Dialectal Spanish
| 03/13/2024 || Veronica Qing Lyu,
 
  
'''Title:''' Faithful Chain of Thought Reasoning (https://aclanthology.org/2023.ijcnlp-main.20/)
+
'''Abstract''': This proposal introduces current issues and gaps in cross-dialectal NLP in Spanish as well as the lack of resources available for Latin American dialects. The presentation will cover past work in dialect detection, translation, and benchmarking in order to build a foundation for a proposal that aims to create a benchmark that analyses LLM robustness across a series of tasks in different Spanish dialects.
  
'''Abstract:'''
 
While Chain-of-Thought (CoT) prompting boosts Language Models' (LM) performance on a gamut of complex reasoning tasks, the generated reasoning chain does not necessarily reflect how the model arrives at the answer (aka. faithfulness). We propose Faithful CoT, a reasoning framework involving two stages: Translation (Natural Language query → symbolic reasoning chain) and Problem Solving (reasoning chain → answer), using an LM and a deterministic solver respectively. This guarantees that the reasoning chain provides a faithful explanation of the final answer. Aside from interpretability, Faithful CoT also improves empirical performance: it outperforms standard CoT on 9 of 10 benchmarks from 4 diverse domains, with a relative accuracy gain of 6.3% on Math Word Problems (MWP), 3.4% on Planning, 5.5% on Multi-hop Question Answering (QA), and 21.4% on Relational Inference. Furthermore, with GPT-4 and Codex, it sets the new state-of-the-art few-shot performance on 7 datasets (with 95.0+ accuracy on 6 of them), showing a strong synergy between faithfulness and accuracy.
 
  
'''Bio:'''
 
Veronica Qing Lyu is a fifth-year PhD student in Computer and Information Science at the University of Pennsylvania, advised by Chris Callison-Burch and Marianna Apidianaki. Her current research interests lie in the intersection of linguistics and natural language processing, explainable AI, and reasoning. Her paper "Faithful Chain-of-Thought Reasoning" received the Area Chair Award at IJCNLP-AACL 2023 (Interpretability and Analysis of Models for NLP track). She will co-organize a tutorial on “Explanations in the Era of Large Language Models” in NAACL 2024. Before Penn, she studied linguistics as an undergraduate student at the Department of Foreign Languages and Literatures at Tsinghua University.
 
  
|- style="border-top: 2px solid DarkGray;"
+
'''Tavin Turner'''
| 03/20/2024 || Jie Cao, CU Boulder/iSAT, practice talk
 
  
'''Title:''' Modularized Conversational Modeling for Efficient, Controllable, and Robust Real-World Applications
+
'''Title''': Agreeing to Disagree: Statutory Relational Stance Modeling
 
'''Abstract:''' Large Language Models (LLMs) make conversational AI accessible to everyone. Their general-purpose design benefits people across different domains, offering a powerful natural language interface to generate text, images, videos, and a broad range of AI services. However, a single monolithic black box is hard to maintain, scale, and control for all our communication goals, and it is often fragile and hallucinatory. To build robust conversational applications for high-stakes domains such as healthcare and education, we must tackle various challenges carefully, e.g., hard-to-obtain data and annotations, controlling the model behaviors, etc. In this talk, I will discuss my research agenda on modularized conversational modeling, focusing on efficient modeling under minimal supervision and controllable modules via neurosymbolic interfaces. I will begin by introducing zero-shot dialogue state tracking via modeling the natural language descriptions of the functionalities of intent and slots and factorizing the tasks for supplementary pretraining. Next, I will describe managing uncertain controls via discrete latent variables for structured prediction and conditional generation tasks. Finally, I will demo a case study on educational AI agent design for a form of collaborative learning called Jigsaw Classroom by showing its challenges in data collection, analysis, evaluation, and deployment issues of noisy speech. I will end this talk by highlighting the future directions for better modularized conversational modeling and its applications.
 
  
 +
'''Abstract''': Policy division deeply affects which bills get passed in legislature, and how. So far, statutory NLP has predicted voting breakdowns, interpreted stakeholder benefit, informed legal decision support systems, and much more. In practice, legislation demands compromise and concession to pass important policy, yet models often struggle to reason over the whole act. Leveraging neuro-symbolic models, we seek to intermediate this challenge with relational structures of statutes’ sectional stances – modeling stance agreement, exception, etc. Beyond supporting downstream statutory analysis tasks, these structures could help stakeholders understand how a bill impacts them, litmus the cooperation within a legislature, and reveal patterns of compromise that aid a bill through ratification.
  
 
|- style="border-top: 2px solid DarkGray;"
 
|- style="border-top: 2px solid DarkGray;"
| 03/27/2024 || '''No Meeting''' - Spring Break
+
| 10/23/2024 || '''Ananya Ganesh''''s PhD Dissertation Proposal
  
|- style="border-top: 2px solid DarkGray;"
+
'''Title''': Reliable Language Technology for Classroom Dialog Understanding
| 04/03/2024 || CLASIC Industry Day
 
  
|- style="border-top: 2px solid DarkGray;"
+
'''Abstract''': In this proposal, I will lay out how NLP models can be developed to address realistic use cases in analyzing classroom dialogue. Towards this goal, I will first introduce a new task and corresponding dataset, focused on detecting off-task utterances in small-group discussions. I will
| 04/10/2024 || iSAT Dry Run or other?
+
then propose a method to solve this task that considers how the inherent structure in the dialog can be used to learn richer representations of the dialog context. Next, I will introduce preliminary work on applying LLMs in the in-context learning setting for a broad range of tasks pertaining to qualitative coding of classroom dialog, and discuss potential follow-up work. Finally, keeping in mind our goals of serving many independent stakeholders, I will propose a study to incorporate differing stakeholders’ subjective judgments while curating gold-standard data for classroom discourse analysis.
  
 
|- style="border-top: 2px solid DarkGray;"
 
|- style="border-top: 2px solid DarkGray;"
| 04/17/2024 || '''Maggie Perkoff''', dissertation proposal defense
+
| 10/30/2024 || '''Marie McGregor''''s area exam
  
'''Title:''' Bringing Everyone In: The Future of Collaboration with
+
'''Title''': Adapting AMR Metrics to UMR Graphs
Conversational AI
+
 +
'''Abstract''': Uniform Meaning Representation (UMR) expands on the capabilities of Abstract Meaning Representation (AMR) by supporting document-level annotation, suitability for low-resource languages, and support for logical inference. As a framework for any sort of representation is developed, a way to measure the similarities or differences between two representations must be developed in tandem to support the creation of parsers and for computing inter-annotator agreement (IAA). Fortunately, there exists robust research into metrics to assess the similarity of AMR graphs. The usefulness of these metrics to UMRs depends on four key aspects: scalability, correctness, interpretability, and cross-lingual suitability. This paper investigates the applicability of AMR metrics to UMR graphs along these aspects in order to create useful and reliable UMR metrics.
  
 +
|- style="border-top: 2px solid DarkGray;"
 +
| 11/06/2024 || Kevin Stowe - on Zoom
  
 
|- style="border-top: 2px solid DarkGray;"
 
|- style="border-top: 2px solid DarkGray;"
| 04/24/2024 || '''Téa Wright''', Senior Thesis Defense
+
| 11/13/2024 || Invited talk by Nick Dronen and Seminar Lunch
  
'''Title:''' PMM Adaptation for Lakota and Dakota with Noisy Data from OCR
+
|- style="border-top: 2px solid DarkGray;"
 +
| 11/20/2024 || Abteen's proposal
  
'''Abstract:''' This research addresses the challenge of integrating low-resource languages, specifically Lakota and Dakota, into Natural Language Processing (NLP) technologies such as Pretrained Multilingual Models (PMMs). These languages are critically underrepresented in digital linguistic resources, worsening risks of linguistic erosion. Our study seeks to explore this problem by creating authentic, Optical Character Recognition (OCR)-derived datasets to examine the capabilities of PMMs in handling these underrepresented languages. We document and create annotated datasets for these languages to perform a basic evaluation of PMMs on word alignment under realistic, noisy data conditions. We investigate the zero-shot capabilities and analyze how variations in language and the presence of noise from handwriting or formatting in adaptation data affect performance. By contributing datasets for Lakota and Dakota as well as aiming to highlight strengths and weaknesses in existing NLP tools, we hope to promote more inclusive approaches in technological advancements.
+
|- style="border-top: 2px solid DarkGray;"
 +
| 11/27/2024 || '''No meeting:''' Fall break
  
 
|- style="border-top: 2px solid DarkGray;"
 
|- style="border-top: 2px solid DarkGray;"
| 05/01/2024 || Sagi's Proposal
+
| 12/04/2024 || Enora's prelim
  
 
|- style="border-top: 2px solid DarkGray;"
 
|- style="border-top: 2px solid DarkGray;"
| 05/08/2024 || Mary's Prelim
+
| 12/11/2024 || DJ's prelim
  
 +
|- style="border-top: 2px solid DarkGray;"
 +
| 01/23/2025 || Chenhao Tan CS Colloquium, 3:30pm
  
  
Line 114: Line 101:
  
 
=Past Schedules=
 
=Past Schedules=
 +
* [[Spring 2024 Schedule]]
 
* [[Fall 2023 Schedule]]
 
* [[Fall 2023 Schedule]]
 
* [[Spring 2023 Schedule]]
 
* [[Spring 2023 Schedule]]

Latest revision as of 11:32, 29 October 2024

Location: Hybrid - Muenzinger D430, and the zoom link below

Time: Wednesdays at 11:30am, Mountain Time

Zoom link: https://cuboulder.zoom.us/j/97014876908

Date Title
08/28/2024 Planning, introductions, welcome!
09/04/2024 Brunch Social
09/11/2024 Watch and discuss NLP keynote

Winner: Barbara Plank’s “Are LLMs Narrowing our Horizon? Let’s Embrace Variation in NLP!”

09/18/2024 CLASIC presentations
09/25/2024 Invited talks/discussions from Leeds and Anschutz folks: Liu Liu, Abe Handler, Yanjun Gao, Curry Guinn


10/02/2024 Martha Palmer, Annie Zaenen, Susan Brown, Alexis Cooper.

Title: Testing GPT4's interpretation of the Caused-Motion Construction

Abstract: The fields of Artificial Intelligence and Natural Language Processing have been revolutionized by the advent of Large Language Models such as GPT4. They are perceived as being language experts and there is a lot of speculation about how intelligent they are, with claims being made about “Sparks of General Artificial Intelligence.” This talk will describe in detail an English linguistic construction, the Caused Motion Construction, and compare prior interpretation approaches with current LLM interpretations. The prior approaches are based on VerbNet. Its unique contributions to prior approaches will be outlined. Then the results of a recent preliminary study probing GPT4’s analysis of the same constructions will be presented. Not surprisingly, this analysis illustrates both strengths and weaknesses of GPT4’s ability to interpret Caused Motion Constructions and to generalize this interpretation.

Recording: https://o365coloradoedu-my.sharepoint.com/:v:/r/personal/mpalmer_colorado_edu/Documents/BoulderNLP-Palmer-Oct2-2024.mp4?csf=1&web=1&nav=eyJyZWZlcnJhbEluZm8iOnsicmVmZXJyYWxBcHAiOiJPbmVEcml2ZUZvckJ1c2luZXNzIiwicmVmZXJyYWxBcHBQbGF0Zm9ybSI6IldlYiIsInJlZmVycmFsTW9kZSI6InZpZXciLCJyZWZlcnJhbFZpZXciOiJNeUZpbGVzTGlua0NvcHkifX0&e=aCHeN8


10/09/2024 NAACL Paper Clinic: Come get feedback on your submission drafts!
10/16/2024 Senior Thesis Proposals:


Alexandra Barry

Title: Benchmarking LLM Handling of Cross-Dialectal Spanish

Abstract: This proposal introduces current issues and gaps in cross-dialectal NLP in Spanish as well as the lack of resources available for Latin American dialects. The presentation will cover past work in dialect detection, translation, and benchmarking in order to build a foundation for a proposal that aims to create a benchmark that analyses LLM robustness across a series of tasks in different Spanish dialects.


Tavin Turner

Title: Agreeing to Disagree: Statutory Relational Stance Modeling

Abstract: Policy division deeply affects which bills get passed in legislature, and how. So far, statutory NLP has predicted voting breakdowns, interpreted stakeholder benefit, informed legal decision support systems, and much more. In practice, legislation demands compromise and concession to pass important policy, yet models often struggle to reason over the whole act. Leveraging neuro-symbolic models, we seek to intermediate this challenge with relational structures of statutes’ sectional stances – modeling stance agreement, exception, etc. Beyond supporting downstream statutory analysis tasks, these structures could help stakeholders understand how a bill impacts them, litmus the cooperation within a legislature, and reveal patterns of compromise that aid a bill through ratification.

10/23/2024 Ananya Ganesh's PhD Dissertation Proposal

Title: Reliable Language Technology for Classroom Dialog Understanding

Abstract: In this proposal, I will lay out how NLP models can be developed to address realistic use cases in analyzing classroom dialogue. Towards this goal, I will first introduce a new task and corresponding dataset, focused on detecting off-task utterances in small-group discussions. I will then propose a method to solve this task that considers how the inherent structure in the dialog can be used to learn richer representations of the dialog context. Next, I will introduce preliminary work on applying LLMs in the in-context learning setting for a broad range of tasks pertaining to qualitative coding of classroom dialog, and discuss potential follow-up work. Finally, keeping in mind our goals of serving many independent stakeholders, I will propose a study to incorporate differing stakeholders’ subjective judgments while curating gold-standard data for classroom discourse analysis.

10/30/2024 Marie McGregor's area exam

Title: Adapting AMR Metrics to UMR Graphs

Abstract: Uniform Meaning Representation (UMR) expands on the capabilities of Abstract Meaning Representation (AMR) by supporting document-level annotation, suitability for low-resource languages, and support for logical inference. As a framework for any sort of representation is developed, a way to measure the similarities or differences between two representations must be developed in tandem to support the creation of parsers and for computing inter-annotator agreement (IAA). Fortunately, there exists robust research into metrics to assess the similarity of AMR graphs. The usefulness of these metrics to UMRs depends on four key aspects: scalability, correctness, interpretability, and cross-lingual suitability. This paper investigates the applicability of AMR metrics to UMR graphs along these aspects in order to create useful and reliable UMR metrics.

11/06/2024 Kevin Stowe - on Zoom
11/13/2024 Invited talk by Nick Dronen and Seminar Lunch
11/20/2024 Abteen's proposal
11/27/2024 No meeting: Fall break
12/04/2024 Enora's prelim
12/11/2024 DJ's prelim
01/23/2025 Chenhao Tan CS Colloquium, 3:30pm


Past Schedules