Spring 2023 has been an exhilarating season for the PittCSS community! We’ve had the incredible opportunity to learn from a remarkable lineup of external speakers who graced our PittCSS seminars with their expertise and shared their research:
- Scott Renshaw
- Bio: Scott Leo Renshaw is a computational sociologist specializing in social network analysis with research interests in communication dynamics, belief formation, and hazards and disasters. He is a current Post-Doctoral Research Associate at the Center for Informed Democracy & Social-Cybersecurity (IDeas) under Dr. Kathleen Carley.
- Talk Title: Communication Across the Spectrum of Hazards and Disasters
- Abstract: This talk presents the micro-communication landscape across the spectrum of hazards and disasters, from the quotidian to the exotic, in the contexts of retransmission and communication dynamics. The first study focuses on hazard communication during quotidian and atypical hazards in the context of the National Weather Service’s use of Twitter from 2009 to 2021, where several micro-structural, content, and style-related message features are analyzed to understand the properties that make a message more likely to be retransmitted. Our second study looks into the communication occurring in the range of the exotic and atypical end of the spectrum by analyzing public-health communicators on Twitter during the first eight months of the unfolding COVID-19 pandemic, with special emphasis on the use of images and infographics. The final study presented focuses solely on the exotic kind of hazards and disasters through an investigation of 17 radio communication networks during the unfolding events of the 2001 World Trade Center Disaster. In this paper we model these 17 dynamic radio networks through a Relational Event Model approach to understand the role that the social mechanisms of preferential attachment, Institutionalized Coordination Roles (ICR), and conversational inertia play in the communication process of a disrupted environment.
- Matt Groh
- Bio: Matt Groh is a PhD candidate in the Affective Computing group at the MIT Media Lab and incoming assistant professor of Management and Organizations at Northwestern University’s Kellogg School of Management. His research examines human-AI collaboration with a focus on misinformation, medical diagnosis, and empathy. Before MIT, Matt cofounded a startup and worked as a data scientist at multiple startups, a non-profit, the World Bank, and DARPA. Matt has a bachelor’s degree from Middlebury College where he majored in economics and minored in Arabic and mathematics and a master’s degree from MIT in Media Arts and Sciences. He likes to ride his bicycle.
- Talk Title: Diagnostic Accuracy across Light and Dark Skin by Specialists, Generalists, and Physician-Machine Partnerships
- Abstract: Recent advances in deep learning systems (DLS) for image-based medical diagnosis demonstrate the potential to augment clinical decision-making, but the effectiveness of physician-machine partnerships remains an open question because physicians and algorithms are prone to systematic errors on underrepresented populations. We present results from a large-scale digital experiment (N=390 board-certified dermatologists and N=460 primary care physicians from 39 countries) to evaluate the accuracy of physicians submitting up to 4 differential diagnoses on 364 images of 46 skin conditions in a store-and-forward teledermatology simulation. We find specialists achieve 38% accuracy, specialists and generalists alike are 4 percentage points less accurate on images of dark skin than light skin, and fair DLS decision support reduces accuracy disparities and improves specialists’ leading diagnosis accuracy by 33%. These results reveal image-based diagnosis is challenging especially for underrepresented populations but well-designed physician-machine partnerships can enhance physician performance and reduce disparities across skin color.
- Max Goplerud
- Bio: Max Goplerud is an Assistant Professor of Political Science. His primary research creates new methods to facilitate political science research by leveraging the intersection of Bayesian methods and machine learning. These methods are focused on topics such as heterogeneous effects, hierarchical models, and ideal point estimation. He also is interested in understanding legislative behavior using text-as-data in a comparative context including studies on Europe, the United States, and Japan. He received his PhD from the Department of Government at Harvard University in 2020.
- Talk Title: Visualizing Democracy Using Covariate-Adjusted Principal Curves
- Abstract: A common task in data analysis in social science is to quantify and visually present the potentially non-linear relationship between a set of variables that lack a clear causal or temporal ordering. Principal curves provide one solution by drawing a single smooth curve that runs through the middle of the entire distribution of the data. However, it is often the case that the relationship between the variables differs based on known moderators. Therefore, allowing the curve to vary based on those covariates is crucial to accurately representing the underlying relationship. We propose “covariate-adjusted principal curves” to allow for an arbitrary number of moderators to affect the shape of the curve. Our method extends classic methods for fitting principal curves (e.g. splines) to their covariate-adjusted analogues (e.g. varying-coefficient models). Our motivating application is the empirical study of democracy. Using an influential existing study, we demonstrate that the relationship between two major dimensions of democracy (contestation and participation) has changed dramatically over the past two centuries, even when adjusting for other moderators such as colonial history and national income. Existing techniques for quantifying the changing relationship, e.g. correlations based on splitting the data into discrete time periods, fall short of illustrating this shift.
- Mike Colaresi
- Bio: Michael Colaresi is the Associate Vice Provost for Data Science, Director of the Pitt Disinformation Lab, and Research and Academic Director of the Institute for Cyber Law, Policy, and Security. He also co-directs the Computational Social Science BS degree and is the William S. Dietrich II Professor of Political Science at the University of Pittsburgh. His research is focused at the intersection of international relations and computational social science where he studies the application of Bayesian computation and machine learning to topics including how human rights are changing over time, where and when to predict political violence at high spatial and temporal resolution, and why disinformation poses both international and local challenges to democratic communities. At Pitt, he teaches Scientific Computation for Social Scientists, Text as Data, Bayesian Computation, The Domestic Politics of International Conflict in the Digital Age, and Introduction to Computational Social Science.
- Talk Title: Move It or Lose It: Introducing pseudo-Earth Mover Divergence as a Context-sensitive Metric for Evaluating and Improving Forecasting and Prediction Systems
- Abstract: Prediction systems necessarily utilize a suite of metrics to evaluate and compare models and ultimately guide inferences about the world. However, conventional measures of model performance, such as accuracy, Brier scores, F1 (and its components), AUROC, and AUPR, solely utilize observation-by-observation, also known as bin-by-bin, comparisons between the distribution of predictions and the actual values. Any meaningful connections, be they social or physical, between observations, what we can think of as the data settings where predictions will be used, are ignored in these calculations. Thus, a system that consistently produced very close misses, for example in time and/or space, would not be preferred to a system that consistently emitted more distant but otherwise identical mistaken predictions. This bin-by-bin default approach is not assumption-free regarding the settings in which predictions are utilized: it explicitly encodes the judgment that a prediction system’s output is being used within a completely disconnected network structure across observations. This assumption might be reasonable in some cases, but it is particularly problematic in computational social science where there are both enduring (e.g. time serial and panel) as well as emerging (e.g. high resolution spatial-temporal, and network) data settings where the connections between observations have enormous implications for the overall costs and benefits of incorporating model-based predictions into decisions. Forecasting political violence, famines, extreme weather events, and cybersecurity exploits and intrusions are all examples where the distribution of predictions versus the actual values can be the difference between life and death or security and insecurity. Herein, we propose a new network-based context-sensitive performance metric, pseudo-Earth Mover Divergence (pEMDiv), to accelerate progress and discoveries in these, and related, areas. We explain how pEMDiv expands on Earth Mover Distance (EMD), how it can be used for binary prediction problems with hard or soft predictions, point predictions on a continuous scale, as well as probabilistic predictions over a discrete ordered state space.
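
For readers curious about the contrast drawn in this last abstract, here is a minimal sketch of why a bin-by-bin score cannot tell a near miss from a distant one. It uses the plain one-dimensional Earth Mover Distance over unit-spaced time bins, not pEMDiv itself, whose definition is not given in the abstract. The function names and the toy data are hypothetical and chosen only for illustration.

```python
from itertools import accumulate


def brier_score(pred, actual):
    """Bin-by-bin score: mean squared difference, each bin judged in isolation."""
    return sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual)


def emd_1d(pred, actual):
    """Plain 1-D Earth Mover Distance between two equal-mass histograms.

    Assumes bins are ordered and one unit apart, so the cost is the sum of
    absolute cumulative differences. This is standard EMD, not pEMDiv.
    """
    if abs(sum(pred) - sum(actual)) > 1e-9:
        raise ValueError("plain EMD requires equal total mass")
    diffs = [p - a for p, a in zip(pred, actual)]
    return sum(abs(c) for c in accumulate(diffs))


# Toy example: one event occurs in time bin 3 (0-indexed).
actual = [0, 0, 0, 1, 0, 0, 0, 0]
near_miss = [0, 0, 1, 0, 0, 0, 0, 0]  # predicted one bin too early
far_miss = [0, 0, 0, 0, 0, 0, 0, 1]   # predicted four bins too late

# The Brier score rates both mistakes identically.
print(brier_score(near_miss, actual), brier_score(far_miss, actual))

# The transport-based distance prefers the near miss.
print(emd_1d(near_miss, actual), emd_1d(far_miss, actual))
```

In this toy setting both forecasts get the same Brier score, while the transport-based distance grows with how far the predicted mass must move to match the observed event.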
As we bid adieu to Spring 2023, we express our deepest gratitude to these speakers for sharing their knowledge and unique perspectives with the PittCSS community.