Fall 2022 Schedule

From CompSemWiki
Revision as of 15:47, 12 October 2020 by CompSemUser (talk | contribs)
Date Title
9.2.20 Planning
9.9.20 More planning
9.16.20 Vivek Srikumar - Title: Fads, Fallacies and Fantasies in the Name of Machine Learning

Abstract: The pervasiveness of machine learning, and artificial intelligence powered by it, is clear from even a cursory overview of the last several years of academic literature and mainstream technology reporting. The goal of this talk is to provoke thought and discussions about the future of the field. To this end, I will talk about how, as a field, applied machine learning may be starting to bind itself into an intellectual monoculture. In particular, I will describe specific blinders that we may find hard to shake off: (a) the obsession with ranking and leaderboarding, (b) the assumption that purely data-driven computing is always the right answer, and (c) the excessive focus on clean toy problems in lieu of working with real data. Along the way, we will see several examples of questions that we may be able to think about if we cast aside these blinders.

Bio: Vivek Srikumar is an associate professor in the School of Computing at the University of Utah. His research lies in the areas of natural language processing and machine learning and has primarily been driven by questions arising from the need to reason about textual data with limited explicit supervision and to scale NLP to large problems. His work has been published in various AI, NLP and machine learning venues and has been recognized by paper awards from EMNLP and CoNLL. His work has been supported by awards from NSF, BSF and NIH, and also from several companies. He obtained his Ph.D. from the University of Illinois at Urbana-Champaign in 2013 and was a post-doctoral scholar at Stanford University.

Recording of Vivek's presentation: [1]

9.23.20 2 papers:

Paper 1 - Jonas Pfeiffer (grad student, TU Darmstadt): MAD-X: An Adapter-Based Framework for Multi-Task Cross-Lingual Transfer. Jonas Pfeiffer, Ivan Vulić, Iryna Gurevych, Sebastian Ruder, EMNLP 2020. [2]

Abstract: The main goal behind state-of-the-art pretrained multilingual models such as multilingual BERT and XLM-R is enabling and bootstrapping NLP applications in low-resource languages through zero-shot or few-shot cross-lingual transfer. However, due to limited model capacity, their transfer performance is the weakest exactly on such low-resource languages and languages unseen during pretraining. We propose MAD-X, an adapter-based framework that enables high portability and parameter-efficient transfer to arbitrary tasks and languages by learning modular language and task representations. In addition, we introduce a novel invertible adapter architecture and a strong baseline method for adapting a pretrained multilingual model to a new language. MAD-X outperforms the state of the art in cross-lingual transfer across a representative set of typologically diverse languages on named entity recognition and achieves competitive results on question answering.
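As a rough illustration of the adapter idea the abstract refers to: an adapter is a small bottleneck network inserted into a frozen pretrained model, so that only the adapter's parameters are trained per language or task. This is a minimal sketch of that mechanism, not the authors' MAD-X implementation; the sizes and random weights here are arbitrary.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

class BottleneckAdapter:
    """Minimal bottleneck adapter: down-project, nonlinearity,
    up-project, then a residual connection. Hidden and bottleneck
    sizes are illustrative, not MAD-X's actual settings."""
    def __init__(self, hidden=8, bottleneck=2, seed=0):
        rng = np.random.default_rng(seed)
        self.W_down = rng.normal(scale=0.1, size=(hidden, bottleneck))
        self.W_up = rng.normal(scale=0.1, size=(bottleneck, hidden))

    def __call__(self, h):
        # h: (batch, hidden) activations from a frozen transformer layer
        return h + relu(h @ self.W_down) @ self.W_up

# Swapping in a different adapter switches the language or task
# while the large base model stays frozen and shared.
h = np.ones((1, 8))
out = BottleneckAdapter()(h)
```

Because each adapter is tiny relative to the base model, MAD-X can keep one adapter per language and per task and compose them at transfer time.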


Paper 2 - Extending Multilingual BERT to Low-Resource Languages [3]

9.30.20 Paper 1 - What determines the order of adjectives in English? Comparing Efficiency-Based Theories Using Dependency Treebanks [4] (Elizabeth)
10.7.20 Peter Foltz "NLP for Team Communication Analysis" Slides:[5]
10.14.20 2 papers:

Paper 1 - Tao Li (Utah CS PhD student) - Structured Tuning for Semantic Role Labeling, ACL 2020. Authors: Tao Li, Parth Anand Jawale, Martha Palmer, Vivek Srikumar. [6]

Abstract: Recent neural network-driven semantic role labeling (SRL) systems have shown impressive improvements in F1 scores. These improvements are due to expressive input representations, which, at least at the surface, are orthogonal to knowledge-rich constrained decoding mechanisms that helped linear SRL models. Introducing the benefits of structure to inform neural models presents a methodological challenge. In this paper, we present a structured tuning framework to improve models using softened constraints only at training time. Our framework leverages the expressiveness of neural networks and provides supervision with structured loss components. We start with a strong baseline (RoBERTa) to validate the impact of our approach, and show that our framework outperforms the baseline by learning to comply with declarative constraints. Additionally, our experiments with smaller training sizes show that we can achieve consistent improvements under low-resource scenarios.
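To give a flavor of "softened constraints as training-time loss components": a hard declarative constraint (e.g., a role appearing at most once in a sentence) can be relaxed into a differentiable-style penalty added to the regular loss. The sketch below is an illustrative hinge-style penalty, not the paper's exact formulation.

```python
import numpy as np

def softened_constraint_penalty(probs):
    """Soft 'at most one' constraint on a role across a sentence:
    penalize the amount by which the summed per-token probabilities
    exceed 1. Zero when the constraint is satisfied in expectation.
    (Illustrative SRL-style constraint, not the paper's formulation.)"""
    return max(0.0, float(np.sum(probs)) - 1.0)

# Per-token probabilities that some role applies; two confident tokens
# violate the at-most-one constraint, so the penalty is positive.
p = np.array([0.6, 0.7, 0.1])
loss_extra = softened_constraint_penalty(p)
```

At training time, such penalties are added to the task loss so the model learns to comply with the constraint; at test time no constrained decoding is needed.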


Paper 2 - Stéphane Aroca-Ouellette's EMNLP 2020 practice talk on exploring auxiliary tasks for BERT

10.21.20 Two Brian Keegan PhD students: Arcadia Zhang and Jordan Wirfs-Brock

Talk 1

Title: Learning to Listen to Data: A Case Study of Narrative Sonification

Abstract: Sonification—conveying data through sounds—is a thriving practice that has nonetheless not achieved the widespread appeal of data visualization. Traditionally, sonification practitioners have focused more on generating sounds than on teaching people how to listen to them. This talk will recount the process of designing two “narrative sonification” pieces that aired on the radio show Marketplace as a means to explore the role narrative plays in supporting people as they learn to listen to data. By revisiting these narrative sonifications through the lens of adjacent fields (guidance design, data storytelling, visualization literacy, and sound studies), we identify design principles that interaction designers, especially those working in voice and sound, can adapt as they support users in understanding data. We also speculate about how we might rethink these radio pieces as interactive sonifications using conversation- and voice-based technologies. A holistic approach to designing narratives and sounds together can make sonification more accessible for general audiences as voice interfaces become more powerful and pervasive.

Bio: Jordan Wirfs-Brock is working on a PhD in Information Science at the University of Colorado Boulder with adviser Brian Keegan. Her research explores how voice interaction, sonification, and narrative support people as they learn to listen to data, producing more meaningful and engaging experiences with information. During her time as a PhD student, she has completed industry internships with Yahoo, Mozilla, and Spotify, and her research has been published and recognized at top conferences like ACM CHI, DIS, and TEI. Previously, she was a data journalist at Inside Energy, a public media collaboration, where she used data to demystify energy topics. She loves using animation, visualization, podcasting, and other creative ways to tell complex stories in approachable ways.

Talk 2 Title: Software patches and online gaming communities

Abstract: Software patches, often overlooked by users, play an important role in online gaming communities such as Dota 2. Software patches can dominate community discussion, render play strategies obsolete, and force players to learn and adapt. In this talk, I will present a project from my dissertation studying the impact of software patches in the popular online game Dota 2. Using historical game data and quantitative analysis methods, I will show how patches act as a source of disruption to players’ strategies in the game client. I propose a way to measure the severity of software patches in Dota 2 and demonstrate that the severity of patches correlates with observed changes in player behavior. This project shows that software patches present a unique and exciting opportunity to study how software engineering practices influence collective social behavior.

Bio: Arcadia Zhang is a 4th-year PhD student in Computer Science at the University of Colorado Boulder. She is advised by Prof. Brian Keegan, and her research uses computational social science approaches to characterize the collective social behavior of online gaming communities in the aftermath of disruptions like software patches and international tournaments. During her time as a PhD student she has completed internships with Amazon and EA Games, and her research has been supported by grants from NSF, Microsoft Azure, and Oracle Cloud.

10.28.20 Chelsea Proposal
11.4.20 Skatje Proposal?
11.11.20 Rehan Proposal
11.18.20 NAACL submission workshop
11.25.20 Fall Break
12.2.20 Abhidip's proposal (tentative)
12.9.20 Vivian's proposal (tentative)

Past Schedules