We’re excited to be holding our first workshop reporting on progress in the funded projects. We’re delighted that Drew Houston will be joining us and will open the workshop with a short Q&A.

Workshop Details

Thursday April 4, 2024
10:00 am–12:30 pm

MIT Campus, Stata Center
Star Conference Room, 32-D463

Schedule

  • 10:05 am – Welcome
  • 10:10 am – Q&A with Drew Houston, CEO of Dropbox
    Moderator: Daniel Jackson
  • 10:30 am – Short project presentations
  • 11:40 am – Discussion
  • 12:00 pm – Lunch, discussion continues
  • 12:30 pm – Adjourn

Presentations

John Horton of the MIT Sloan School of Management and Jacob Andreas of EECS and CSAIL

We present an approach for automatically generating and testing, in silico, social scientific hypotheses. This automation is made possible by recent advances in large language models (LLMs), but the key feature of the approach is the use of structural causal models. Structural causal models provide a language to state hypotheses, a blueprint for constructing LLM-based agents, an experimental design, and a plan for data analysis. The fitted structural causal model becomes an object available for prediction or the planning of follow-on experiments. We demonstrate the approach with several scenarios: a negotiation, a bail hearing, a job interview, and an auction.
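To make the role of the structural causal model concrete, here is a deliberately simplified sketch of one of the named scenarios, a negotiation. The variables (a buyer's budget, the seller's opening ask, the final price) and the structural equations are illustrative assumptions, not the project's actual model; in the full approach, LLM-based agents would generate the behavior that these hand-written equations stand in for.

```python
import random

# Hypothetical SCM for a negotiation: exogenous budget and ask,
# and a structural equation giving the final price. The interesting
# causal question is the effect of an intervention do(ask = a).

def mean_price(ask, n=10_000, rng=None):
    # Estimate E[price | do(ask)] by forward simulation of the SCM.
    rng = rng or random.Random(0)
    total = 0.0
    for _ in range(n):
        budget = rng.uniform(50, 150)          # exogenous variable
        total += min(budget, (budget + ask) / 2)  # structural equation
    return total / n

low = mean_price(80)
high = mean_price(120)
print(f"mean price | do(ask=80):  {low:.1f}")
print(f"mean price | do(ask=120): {high:.1f}")
```

Once fitted, such a model can answer interventional queries (here, how the expected price responds to the opening ask) and guide the design of follow-on experiments.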

David Atkin and Martin Beraja of the Department of Economics, and Danielle Li of MIT Sloan

Artificial intelligence and other information technologies are thought to augment the capabilities of workers performing cognitive tasks, thereby increasing aggregate productivity. However, these direct productivity gains may also come with a longer-term downside: if firms assign more entry-level tasks to automated systems, novice workers who previously performed these tasks may miss out on valuable training and mentorship opportunities.

Julie Shah of the Department of Aeronautics and Astronautics and CSAIL; Retsef Levi of MIT Sloan and the Operations Research Center; Kate Kellogg of MIT Sloan; and Ben Armstrong of the Industrial Performance Center

In recent years, studies have linked a rise in burnout among doctors and nurses in the U.S. with increased administrative burdens associated with electronic health records and other technologies. This project aims to develop a holistic framework to study how generative AI technologies can both increase productivity for organizations and improve job quality for workers in healthcare settings.

Harold Abelson of EECS and CSAIL, Cynthia Breazeal of the Media Lab, and Eric Klopfer of Comparative Media Studies/Writing

Aptly is a no-code/low-code environment that leverages large language models to let anyone, even young kids, build smartphone apps simply by speaking in natural language. It removes language and coding experience as barriers to computational action, enabling anyone to create original apps with meaningful societal impact.

Manish Raghavan of MIT Sloan and EECS, and Devavrat Shah of EECS and the Laboratory for Information and Decision Systems

We introduce a framework to combine human expert and machine predictions. We show that even when machines appear to outperform human experts on average, there may be substantial and identifiable heterogeneity across examples: humans may (predictably) outperform machines on some instances. We illustrate this phenomenon in a few medical settings.
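The kind of predictable heterogeneity described above can be illustrated with a small sketch. The setup here is an assumption for illustration only: on a validation set we know both the human's and the machine's predictions along with the truth, and each case carries a discrete feature (say, "typical" vs. "atypical") along which relative accuracy varies.

```python
import statistics

def fit_router(validation):
    # For each feature value, record which predictor had the lower
    # mean absolute error on the validation cases in that group.
    by_group = {}
    for group, human, machine, truth in validation:
        errs = by_group.setdefault(group, {"human": [], "machine": []})
        errs["human"].append(abs(human - truth))
        errs["machine"].append(abs(machine - truth))
    return {
        g: min(errs, key=lambda k: statistics.mean(errs[k]))
        for g, errs in by_group.items()
    }

def route(router, group, human_pred, machine_pred):
    # Defer to whichever predictor won on this group in validation;
    # fall back to the machine for unseen groups.
    return human_pred if router.get(group, "machine") == "human" else machine_pred

# Toy data (group, human prediction, machine prediction, truth):
# the machine is better on typical cases, the human on atypical ones,
# so the machine wins on average but loses on a predictable subset.
validation = [
    ("typical", 1.2, 1.0, 1.0), ("typical", 0.8, 1.0, 1.0),
    ("atypical", 2.0, 3.0, 2.1), ("atypical", 1.9, 0.5, 2.0),
]
router = fit_router(validation)
print(router)  # → {'typical': 'machine', 'atypical': 'human'}
```

The combined predictor then beats either one alone: it uses the machine on typical cases and defers to the human on the subset where the human is predictably better.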

Tim Kraska of EECS and CSAIL, and Christoph Paus of the Department of Physics

The open-source framework A2rchi connects to a variety of large language models, both commercial (e.g., GPT-4) and open source (e.g., Llama 2), and offers a highly configurable tuning setup. We present an outline of A2rchi’s design and some of its applications, including support for expert help-desk systems.

Pattie Maes of the Media Lab and David Karger of EECS and CSAIL

AI is not just an engineering problem. For AI deployments to be successful, we have to consider human design factors: how does the design of the human-AI interaction affect how people use and respond to AI systems in their daily lives and work? I will discuss a series of prototypes and human-subject experiments that reveal some unexpected lessons.