Fall 2022 Schedule

Location: Hybrid (starting Feb 23) - Fleming 279, and the zoom link below

Time: Wednesdays at 10:30am, Mountain Time

Zoom link: https://cuboulder.zoom.us/j/97014876908

Date Title
01.12.22 Planning, introductions, welcome!

CompSem meetings will be virtual until further notice (https://cuboulder.zoom.us/j/97014876908)

01.19.22 Kai Larsen, CU Boulder Leeds School of Business

Validity in Design Research

Research in design science has always recognized the importance of evaluating its knowledge outcomes, particularly of assessing the efficacy, utility, and attributes of the artifacts produced (e.g., A.I. systems, machine learning models, theories, frameworks). However, demonstrating the validity of design science research (DSR) is challenging and not well understood. This paper defines DSR validity and proposes a DSR Validity Framework. We evaluate the framework by assembling and analyzing an extensive data set of research validity papers from various disciplines, including design science. We then analyze the use of validity concepts in DSR and validate the framework. The results demonstrate that the DSR Validity Framework may be used to guide how validity can, and should, be used as an integral aspect of design science research. We further describe the steps for selecting appropriate validities for projects and formulate efficacy validity and characteristic validity claims suitable for inclusion in manuscripts.

Keywords: Design science research (DSR), research validity, validity framework, artifact, evaluation, efficacy validity, characteristic validity.

01.26.22 Elizabeth Spaulding, prelim

Prelim topic: Evaluation for Abstract Meaning Representations

Abstract Meaning Representation (AMR) is a semantic representation language that provides a way to represent the meaning of a sentence in the form of a graph. The task of AMR parsing, automatically extracting AMR graphs from natural language text, requires evaluation metrics in order to develop and compare neural parsers. My prelim is a review of AMR evaluation metrics and the strengths and weaknesses of each approach, as well as a discussion of gaps and unexplored questions in the current literature.
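The standard metric in this literature is Smatch (Cai and Knight, 2013), which scores a predicted graph against a gold graph by the F1 of their matching triples, maximized over variable alignments. The sketch below shows only the triple-overlap core, under the simplifying assumption that variables are already aligned; the example graphs are invented for illustration.

```python
# Smatch-style AMR evaluation reduced to its core: F1 over the
# (source, relation, target) triples shared by two graphs. Full Smatch
# also searches over variable alignments; here variables are assumed
# to be pre-aligned, a simplification for illustration.

def triple_f1(gold: set, predicted: set) -> float:
    matched = len(gold & predicted)
    if matched == 0:
        return 0.0
    precision = matched / len(predicted)
    recall = matched / len(gold)
    return 2 * precision * recall / (precision + recall)

# Toy graphs for "The boy wants to go", as instance and role triples.
gold = {
    ("w", "instance", "want-01"), ("b", "instance", "boy"),
    ("g", "instance", "go-02"), ("w", "ARG0", "b"),
    ("w", "ARG1", "g"), ("g", "ARG0", "b"),
}
predicted = gold - {("g", "ARG0", "b")}  # parser missed the reentrant edge
print(f"{triple_f1(gold, predicted):.3f}")  # 0.909
```

Full Smatch approximates the alignment search by hill climbing with restarts, and the reliability of that approximation is one of the questions a review of these metrics has to weigh.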

02.02.22 NO MEETING
02.09.22 SCiL live session!
02.16.22 NO MEETING
02.23.22 CompSem meetings go back to being hybrid! (Fleming 279 or https://cuboulder.zoom.us/j/97014876908)


Invited talk: Aniello de Santo, University of Utah

Bridging Typology and Learnability via Formal Language Theory

The complexity of linguistic patterns is the object of extensive debate in research programs focused on probing the inherent structure of human language abilities. But in what sense is one linguistic phenomenon more complex than another, and what can complexity tell us about the connection between linguistic typology and human cognition? In this talk, I survey a line of research approaching these questions from the perspective of recent advances in formal language theory.

I will first broadly discuss how language theoretical characterizations allow us to focus on essential properties of linguistic patterns under study. I will emphasize how typological insights can help us refine existing mathematical characterizations, arguing for a two-way bridge between disciplines, and show how the theoretical predictions made by logic/algebraic formalization of typological generalizations can be used to test learning biases in humans (and machines).

In doing so, I aim to illustrate the relevance of mathematically grounded approaches to cognitive investigations into linguistic generalizations, and thus further fruitful cross-disciplinary collaborations.
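As one invented illustration of the kind of formal-language-theoretic notion this line of work draws on: a strictly 2-local (SL2) grammar licenses a string just in case none of its adjacent symbol pairs, edge markers included, is forbidden, and many local phonotactic patterns sit at this low level of complexity. A minimal sketch, with a toy alphabet and constraint:

```python
# A strictly 2-local (SL2) grammar: a string is well-formed iff no
# forbidden bigram occurs in it once edge markers are added. Many local
# phonotactic patterns (e.g., a ban on word-final voiced obstruents)
# can be stated at this level. The toy data below are invented.

EDGE = "#"

def sl2_accepts(word: str, forbidden: set) -> bool:
    marked = EDGE + word + EDGE
    return all(marked[i:i + 2] not in forbidden
               for i in range(len(marked) - 1))

forbidden = {"b#", "d#", "g#"}  # toy "final devoicing" constraint
for w in ["tap", "tab", "bad", "bat"]:
    print(w, sl2_accepts(w, forbidden))
# tap True, tab False, bad False, bat True
```

Where a pattern falls among such subregular classes gives one precise sense in which it is "more complex" than another, and that placement is what yields testable predictions about learning biases.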


Bio Sketch:

Aniello De Santo is an Assistant Professor in the Linguistics Department at the University of Utah.

Before joining Utah, he received a PhD in Linguistics from Stony Brook University. His research broadly lies at the intersection of computational, theoretical, and experimental linguistics. He is particularly interested in investigating how linguistic representations interact with general cognitive processes, with a focus on sentence processing and learnability. In his past work, he has mostly made use of symbolic approaches grounded in formal language theory and rich grammar formalisms (Minimalist Grammars, Tree Adjoining Grammars).

03.02.22 Kevin Cohen, chalk talk
03.09.22 Ghazaleh Kazeminejad, proposal defense

Topic: Neural-Symbolic NLP: exploiting computational lexical resources

Recent major advances in Natural Language Processing (NLP) have relied on a distributional approach, representing language numerically to enable complex mathematical operations and algorithms. These numeric representations have been based on the probabilistic distributions of linguistic units. The main recent breakthrough in NLP has been the result of feeding massive amounts of data to the machine and using neural network architectures, allowing the machine to learn a model that approximates a given language (grammar and lexicon). Following this paradigm shift, NLP researchers introduced transfer learning, enabling researchers with less powerful computational resources to use pre-trained language models and transfer what the machine has learned to a new downstream NLP task. However, there are some NLP tasks, particularly in the realm of Natural Language Understanding (NLU), where surface-level representations and purely statistical models may benefit from symbolic knowledge and deeper-level representations. In this work, we explore the contributions that symbolic computational lexical resources can still make to system performance on two different tasks. In particular, we propose to expose the model to symbolic knowledge, including external world knowledge (e.g., typical features of entities, such as their typical functions or whereabouts) as well as linguistic knowledge (e.g., syntactic dependencies and semantic relationships among the constituents). One of our goals for this work is finding an appropriate numeric representation for this type of symbolic knowledge.
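By way of illustration only, one common way to give symbolic labels a numeric form is to learn an embedding per label and concatenate it with the token representation. The label inventory, dimensions, and wiring below are assumptions made for the sketch, not the proposal's actual design:

```python
# A hedged sketch of one numeric representation for symbolic knowledge:
# learn an embedding for each symbolic label (here, syntactic dependency
# relations) and concatenate it to the token embedding. All sizes and
# the label set are illustrative assumptions.
import torch
import torch.nn as nn

DEP_LABELS = ["nsubj", "obj", "obl", "root", "amod"]  # toy inventory

class SymbolicAugmentedEmbedding(nn.Module):
    def __init__(self, vocab_size=30000, tok_dim=768, dep_dim=32):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, tok_dim)
        self.dep_emb = nn.Embedding(len(DEP_LABELS), dep_dim)
        self.proj = nn.Linear(tok_dim + dep_dim, tok_dim)  # back to model width

    def forward(self, token_ids, dep_ids):
        # token_ids, dep_ids: (batch, seq_len)
        combined = torch.cat([self.tok_emb(token_ids),
                              self.dep_emb(dep_ids)], dim=-1)
        return self.proj(combined)

emb = SymbolicAugmentedEmbedding()
tokens = torch.randint(0, 30000, (2, 10))
deps = torch.randint(0, len(DEP_LABELS), (2, 10))
print(emb(tokens, deps).shape)  # torch.Size([2, 10, 768])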

We propose to utilize the semantic predicates from VerbNet, semantic roles from VerbNet and PropBank, syntactic dependency labels, and world knowledge from ConceptNet as symbolic knowledge, going beyond the types of symbolic knowledge used so far in neural-symbolic approaches. We will expose a pre-trained language model to symbolic knowledge in two ways. First, we will embed these relations into a neural network architecture by modifying the input representations. Second, we will treat the knowledge as constraints on the output, penalizing the model at the end of each training step if the constraints are not met in the model predictions at that step.
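A hedged sketch of the second route, with an invented constraint (a semantic role should not be predicted in a sentence that lacks any predicted predicate) standing in for the actual VerbNet/PropBank constraints; every name, label index, and tensor shape here is an assumption, not the proposal's design:

```python
# Treating symbolic knowledge as a soft output constraint: add a penalty
# term to the loss when predictions violate the constraint, so the model
# is penalized at each training step, as described above.
import torch
import torch.nn.functional as F

def constrained_loss(logits, labels, role_ids, pred_ids, penalty_weight=0.5):
    """Cross-entropy plus a penalty when role labels are predicted in a
    sentence with no predicted predicate (the toy symbolic constraint)."""
    base = F.cross_entropy(logits.view(-1, logits.size(-1)), labels.view(-1))
    probs = logits.softmax(dim=-1)                            # (batch, seq, n_labels)
    role_mass = probs[..., role_ids].sum(-1).max(-1).values   # strongest role per sentence
    pred_mass = probs[..., pred_ids].sum(-1).max(-1).values   # strongest predicate per sentence
    violation = torch.relu(role_mass - pred_mass).mean()      # roles lacking predicate support
    return base + penalty_weight * violation

logits = torch.randn(2, 10, 8, requires_grad=True)  # batch=2, seq=10, 8 labels
labels = torch.randint(0, 8, (2, 10))
loss = constrained_loss(logits, labels, role_ids=[1, 2], pred_ids=[3])
loss.backward()  # the penalty shapes gradients alongside the base loss
```

On a design like this the constraint is soft: violating it incurs a differentiable cost rather than a hard filter, matching the abstract's description of penalizing the model's predictions at the end of each training step.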

To evaluate this approach, we propose to test it on two downstream NLP tasks: Event Extraction and Entity State Tracking. We propose a thorough investigation of the two tasks, particularly focusing on where they have benefitted from a neural-symbolic approach, and whether and how we could further improve the performance on these tasks by introducing both linguistic and world knowledge to the model.

03.16.22 Chelsea Chandler, defense (TBC)
03.23.22 ***Spring Break***
03.30.22 CLASIC Open House
04.06.22 Abteen Ebrahimi, prelim (TBC)
04.13.22 Ananya Ganesh, prelim (TBC)
04.20.22 Adam Wiemerslage, prelim (TBC)
04.27.22 Sagi Shaier, prelim (TBC)


Past Schedules