PlanAlyzer: Assessing Threats to the Validity of Online Experiments (SPLASH 2019 - OOPSLA)

Sun 20 - Fri 25 October 2019 Athens, Greece

Who

Emma Tosch, Eytan Bakshy, Emery D. Berger, David Jensen, Eliot Moss

Track

SPLASH 2019 OOPSLA

Time Zone

Times are displayed in the conference time zone, (GMT+03:00) Beirut. The GMT offsets shown reflect the offsets at the moment of the conference.

When

Wed 23 Oct 2019 17:07 - 17:30 at Olympia - Analysis Chair(s): Jan Vitek

Abstract

Online experiments have become a ubiquitous aspect of design and engineering processes within Internet firms. As the scale of experiments has grown, so has the complexity of their design and implementation. In response, firms have developed software frameworks for designing and deploying online experiments. Ensuring that experiments in these frameworks are correctly designed and that their results are trustworthy (a property referred to as internal validity) can be difficult. Currently, verifying internal validity requires manual inspection by someone with substantial expertise in experimental design.

We present the first approach for statically checking the internal validity of online experiments. Our checks are based on well-known problems that arise in experimental design and causal inference. Our analyses target PlanOut, a widely deployed, open-source experimentation framework that uses a domain-specific language to specify and run complex experiments. We have built a tool called PlanAlyzer that checks PlanOut programs for a variety of threats to internal validity, including failures of randomization, treatment assignment, and causal sufficiency. PlanAlyzer uses its analyses to automatically generate contrasts, a key type of information required to perform valid statistical analyses over the results of these experiments. We demonstrate PlanAlyzer's utility on a corpus of PlanOut scripts deployed in production at Facebook, and we evaluate its ability to identify threats to validity on a mutated subset of this corpus. PlanAlyzer has both precision and recall of 92% on the mutated corpus, and 82% of the contrasts it generates match hand-specified data.
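As a rough illustration of the kinds of checks the abstract describes, the following Python sketch flags two toy problems in an experiment specification (a parameter that is not randomly assigned, and a parameter with no variation) and lists the pairwise contrasts that remain. Everything here is an assumption made for illustration: the `SPEC` dictionary format and the `assign`, `check`, and `contrasts` functions are hypothetical, and they are neither PlanOut syntax nor PlanAlyzer's actual analysis.

```python
import hashlib
from itertools import combinations


def assign(unit_id, param, choices, salt="exp1"):
    """Deterministically map a unit to one of `choices` by hashing.

    Hypothetical helper: illustrates hash-based random assignment in general,
    not the API of any particular experimentation framework.
    """
    digest = hashlib.sha1(f"{salt}.{param}.{unit_id}".encode()).hexdigest()
    return choices[int(digest[:15], 16) % len(choices)]


# Toy experiment description (hypothetical format, not PlanOut syntax).
SPEC = {
    "button_color": {"random": True, "choices": ["blue", "green"]},
    "layout": {"random": True, "choices": ["grid"]},           # only one value
    "font_size": {"random": False, "depends_on": "country"},   # not randomized
}


def check(spec):
    """Return warnings for parameters that cannot support a valid comparison."""
    warnings = []
    for name, p in spec.items():
        if not p["random"]:
            warnings.append(f"{name}: not randomly assigned")
        elif len(set(p["choices"])) < 2:
            warnings.append(f"{name}: no variation, so no contrast")
    return warnings


def contrasts(spec):
    """List pairs of values that can be compared for each valid parameter."""
    out = {}
    for name, p in spec.items():
        if p["random"] and len(set(p["choices"])) >= 2:
            out[name] = list(combinations(sorted(set(p["choices"])), 2))
    return out


if __name__ == "__main__":
    print(assign("user42", "button_color", SPEC["button_color"]["choices"]))
    print(check(SPEC))
    print(contrasts(SPEC))
```

In this sketch only `button_color` yields a contrast; the other two parameters are reported as warnings rather than silently included in an analysis.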

DOI

https://doi.org/10.1145/3360608

Emma Tosch

University of Massachusetts Amherst

Eytan Bakshy

Facebook, Inc.

Emery D. Berger

University of Massachusetts Amherst

United States

David Jensen

University of Massachusetts Amherst

Eliot Moss

University of Massachusetts Amherst

Session Program

Wed 23 Oct
Displayed time zone: Beirut

16:00 - 17:30: Analysis (OOPSLA) at Olympia. Chair(s): Jan Vitek (Northeastern University)

16:00, 22m, Talk: Precision-Preserving Yet Fast Object-Sensitive Pointer Analysis with Partial Context Sensitivity (OOPSLA). Jingbo Lu (UNSW Sydney), Jingling Xue (UNSW Sydney)
16:22, 22m, Talk: Precise Reasoning with Structured Time, Structured Heaps, and Collective Operations (OOPSLA). Gregory Essertel (Purdue University), Guannan Wei (Purdue University), Tiark Rompf (Purdue University)
16:45, 22m, Talk: I/O Dependent Idempotence Bugs in Intermittent Systems (OOPSLA). Milijana Surbatovich (Carnegie Mellon University), Limin Jia (Carnegie Mellon University), Brandon Lucia (Carnegie Mellon University)
17:07, 22m, Talk: PlanAlyzer: Assessing Threats to the Validity of Online Experiments (OOPSLA). Emma Tosch (University of Massachusetts Amherst), Eytan Bakshy (Facebook, Inc.), Emery D. Berger (University of Massachusetts Amherst), David Jensen (University of Massachusetts Amherst), Eliot Moss (University of Massachusetts Amherst)