Fall 2021 Schedule
Location: 279, Fleming Building.
Time: 10:30am, Mountain Time
Zoom link: https://cuboulder.zoom.us/j/97014876908
Date | Title |
---|---|
9.1.21 | Planning, introductions, welcome!
CompSem meetings will be hybrid this semester - in person at Fleming 279, and online here: https://cuboulder.zoom.us/j/97014876908 |
9.8.21 | 10am (NOTE: special start time)
Yoshinari Fujinuma thesis defense
Analysis and Applications of Cross-Lingual Models in Natural Language Processing
Human languages vary both typologically and in data availability. A typical machine learning-based approach for natural language processing (NLP) requires training data from the language of interest. However, because machine learning-based approaches rely heavily on the amount of data available in each language, the quality of models trained for languages without a large amount of data is poor. One way to overcome the lack of data in each language is to conduct cross-lingual transfer learning from resource-rich languages to resource-scarce languages. Cross-lingual word embeddings and multilingual contextualized embeddings are commonly used to conduct cross-lingual transfer learning. However, the lack of resources still makes it challenging to either evaluate or improve such models. This dissertation first proposes a graph-based method to overcome the lack of evaluation data in low-resource languages by focusing on the structure of cross-lingual word embeddings. It then discusses approaches to improve cross-lingual transfer learning by using retrofitting methods and by focusing on a specific task. Finally, it provides an analysis of the effect of adding different languages when pretraining multilingual models. |
9.15.21 | ACL best paper recaps |
9.22.21 | Introduction to AI Institute (short talks) |
9.29.21 | *** CANCELLED *** |
10.6.21 | Invited talk: Artemis Panagopoulou, University of Pennsylvania
Metaphor and Entailment: Looking at metaphors through the lens of textual entailment
Metaphors are very intriguing elements of human language that are surprisingly prevalent in our everyday communications. Humans are pretty good at understanding metaphors, even if it is the first time they encounter them. Empirical studies indicate that 20% of our daily language use is metaphorical. Naturally, the ubiquity of metaphors drew the attention of psychologists, who showed that the human brain processes conventional metaphors at the same speed as literal language. Nevertheless, the computational linguistics literature consistently treats metaphors as a separate domain from literal language. Earlier work has shown that traditional pipelines do not perform well on metaphoric datasets. At the same time, the literature on computational understanding of metaphors has largely focused on developing concrete metaphor detection systems, coupled with interpretation systems targeted solely at metaphors. This tendency appears across various aspects of the field, such as the purposeful exclusion of figurative language from large-scale datasets. This study investigates the potential of constructing systems that can jointly handle metaphoric and literal sentences by leveraging the newfound capabilities of deep learning systems. We narrow the scope of the report, following earlier work, to evaluate deep learning systems fine-tuned on the task of textual entailment (TE). We argue that TE is a task naturally suited to the interpretation of metaphoric language. We show that TE systems can improve significantly in metaphoric performance by being fine-tuned on a small dataset with metaphoric premises. Even though the improvement in performance on metaphors is typically accompanied by a drop in performance on the original dataset, we note that auto-regressive models seem to show a smaller drop in performance on literal examples compared to other types of models. |
10.13.21 | Invited guest - in person!: Arya McCarthy, Johns Hopkins University
Kilolanguage Processing by Projection
The breadth of information digitized in the world’s languages gives opportunities for linguistic insights and computational tools with a pan-lingual perspective. We can achieve this by projecting lexical information across languages, either at the type or token level. First, we project information between thousands of languages at the type level to investigate the classic color word hypotheses of Berlin and Kay. Applying fourteen computational linguistic measures of color word basicness/secondariness, we find cross-linguistic credence and add further nuance. Second, we project information between thousands of languages at the token level to create fine-grained morphological analyzers and generators. We begin by creating a corpus of the Bible in over 1600 languages. Independent web-scraping and aggregation, alignment, and normalization create a ripe multilingual dataset. We then show applications to pronoun clusivity and multilingual MT. Finally, we produce morphological tools grounded in UniMorph that improve on strong initial models and generalize across languages. |
10.20.21 | |
10.27.21 | Invited talk: Lisa Miracchi, University of Pennsylvania
The Practical Emergence Approach to Meaning: Avoiding Echo Chambers
I argue for what I call a stance of practical emergence towards intelligence and related kinds such as knowledge and linguistic competence. Practical emergence is a commitment in explanatory practice to treating higher-level kinds as distinct from lower-level kinds, such that they cannot be reductively identified in lower-level terms, and to assuming that explanations of them in terms of lower-level kinds may be substantive, in that behavior of higher-level kinds cannot be logically or mathematically deduced from lower-level behavior. I’ll flesh out this stance using the Generative Framework for explaining how higher-level kinds obtain in virtue of lower-level kinds. Then I’ll show how this stance of practical emergence, bolstered by the Generative Framework, helps us avoid the pitfall of creating echo chambers, where the reductive hypotheses about intelligence kinds are amplified, not because they are empirically supported, but because they allow for simpler interdisciplinary communication. I'll use as examples recent work on vector representations of word meanings (such as Word2Vec) and alleged implications for heuristic reasoning. Lastly, I’ll discuss some important ethical implications of these echo chambers. I'll argue that the more ethically responsible approach is to adopt practical emergence, because that will help us proactively identify and address the social and ethical implications of differences between vector representations and manipulations of them, on the one hand, and genuinely intelligent semantic knowledge and reasoning, on the other. |
11.3.21 | EMNLP practice talks/preview |
11.10.21 | EMNLP - no meeting |
11.17.21 | Elizabeth Spaulding prelim |
11.24.21 | Fall break - no meeting |
12.1.21 | Invited talk: Abe Handler |
12.8.21 | Abhidip Bhattacharyya proposal defense |
Past Schedules
- Spring 2021 Schedule
- Fall 2020 Schedule
- Spring 2020 Schedule
- Fall 2019 Schedule
- Spring 2019 Schedule
- Fall 2018 Schedule
- Summer 2018 Schedule
- Spring 2018 Schedule
- Fall 2017 Schedule
- Summer 2017 Schedule
- Spring 2017 Schedule
- Fall 2016 Schedule
- Spring 2016 Schedule
- Fall 2015 Schedule
- Spring 2015 Schedule
- Fall 2014 Schedule
- Spring 2014 Schedule
- Fall 2013 Schedule
- Summer 2013 Schedule
- Spring 2013 Schedule
- Fall 2012 Schedule
- Spring 2012 Schedule
- Fall 2011 Schedule
- Summer 2011 Schedule
- Spring 2011 Schedule
- Fall 2010 Schedule
- Summer 2010 Schedule
- Spring 2010 Schedule
- Fall 2009 Schedule
- Summer 2009 Schedule
- Spring 2009 Schedule
- Fall 2008 Schedule
- Summer 2008 Schedule