Abstract
In the United States, much research on federal sentencing relies on data made available by the U.S. Sentencing Commission (USSC). As valuable as USSC data is, it also has “blind spots” that limit defendant-level analysis and corresponding insights into key aspects of sentencing, including disparities among codefendants and among judges within a common court. This article describes how we generated a dataset on more than 2,770 defendants without using USSC resources. Instead, we drew from the Public Access to Court Electronic Records (PACER) website and a searchable website maintained by the Bureau of Prisons to generate data on a specific defendant population: foreign nationals detained at sea by the U.S. Coast Guard. We describe the challenges we encountered during this process, including inconsistencies in data-sharing practices across federal courts, the high costs associated with accessing court records in PACER, and ethical concerns regarding access to sensitive information. Nevertheless, we show how our approach can yield vital information that would be unattainable through USSC-based analysis, including detailed data on the nature of the offense, demographic characteristics of defendants, and judge identifiers. We suggest ways for researchers interested in federal sentencing to streamline and automate our approach. We also propose that the USSC consider supporting similar data-gathering initiatives through grant opportunities to enrich understanding of judicial decision making.