To report on how New York state prison officials discipline officers they accuse of abuse, The Marshall Project examined two primary data sets. We received one through public records requests to the state corrections agency. The other we compiled based on thousands of pages of court records released by the state attorney general’s office.
The discipline database
For the first set of data, we asked New York’s Department of Corrections and Community Supervision for its database of discipline cases it brought against employees. These records had been hidden from public view under a decades-old secrecy law, which the Legislature repealed in 2020. The department gave us two PDF files itemizing cases it filed from Jan. 1, 2010, through mid-April of 2022. These included more than 450 pages of spreadsheets. We wrote computer programs to turn those PDFs of messy and inconsistent case information into standardized data we could confidently analyze. This required us to fix misspellings of names of facilities, job titles and punishments, and to group outcomes into categories. We read and re-read thousands of records in this data. We verified name spellings, titles and assignments against state payroll records compiled by the Empire Center for Public Policy. We asked the corrections agency clarifying questions about hundreds of these cases to determine how many individuals were cited, including instances where the employee shared a name with someone else at the same prison.
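A minimal sketch of one way this kind of spelling cleanup can work. It is not the newsroom's actual code. It assumes a hand-built list of correct facility names and uses fuzzy string matching from Python's standard library. The `CANONICAL_FACILITIES` list, the `standardize` function and the `cutoff` value are all hypothetical choices made for illustration.

```python
from difflib import get_close_matches

# Hypothetical, deliberately short list of correct spellings. A real list
# would come from an authoritative source such as payroll records.
CANONICAL_FACILITIES = ["Attica", "Clinton", "Great Meadow", "Sing Sing"]


def standardize(raw, canonical=CANONICAL_FACILITIES, cutoff=0.8):
    """Return the closest canonical spelling, or None if nothing is close.

    The cutoff of 0.8 is an assumption. A lower value catches more typos
    but also produces more wrong matches.
    """
    # Collapse repeated whitespace and normalize capitalization first.
    cleaned = " ".join(raw.split()).title()
    matches = get_close_matches(cleaned, canonical, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

Returning `None` for values that match nothing, rather than guessing, leaves those rows for a person to review by hand.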
Through our queries to the department, and by examining hundreds of lawsuits and reports compiled by independent arbitrators in discipline cases, we found and corrected mistakes in the state data.
At several points in our process, we asked other reporters in our newsroom to review hundreds of rows at random from our cleanup of the data, and we had others check our data processing pipeline and the results of our calculations to ensure our methods and their outputs were sound.
We limited our analysis to cases alleging physical abuse of incarcerated people by front-line security staff: corrections officers, sergeants and lieutenants. We included any case that the department itself categorized as “inmate abuse,” as well as cases whose descriptions indicated a use of physical force on a prisoner that was “excessive,” “unnecessary,” “inappropriate,” “unjustified” or “without authorization.” Using natural language processing methods, we clustered similar incident descriptions to identify cases that the agency did not designate as “inmate abuse” but that had the same description as cases marked that way.
We were also interested in seeing which abuse cases involved allegations of cover-ups. We considered cases that the department categorized as “false documentation,” and used the same techniques to add cases whose descriptions contained allegations of lying to investigators, failing to report incidents or falsifying records.
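The selection rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not the method actually used. It replaces the clustering step with a much simpler check for descriptions that are identical after normalization. The field names `category` and `description`, the requirement that the word "force" appear, and the function names are all hypothetical.

```python
import re

# Terms quoted in the methodology above.
FORCE_TERMS = ("excessive", "unnecessary", "inappropriate",
               "without authorization", "unjustified")
TERM_RE = re.compile("|".join(re.escape(t) for t in FORCE_TERMS),
                     re.IGNORECASE)


def normalize(text):
    """Lowercase, drop punctuation and collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9 ]", " ", text.lower()).split())


def flag_abuse(cases):
    """Return one boolean per case, True if it meets any of three rules.

    cases: list of dicts with 'category' and 'description' keys (assumed).
    """
    # Descriptions of cases the agency itself labeled as abuse.
    labeled = {normalize(c["description"]) for c in cases
               if c["category"].lower() == "inmate abuse"}
    flags = []
    for c in cases:
        desc = c["description"]
        by_category = c["category"].lower() == "inmate abuse"
        # Assumption: a force term only counts if "force" also appears.
        by_keyword = bool(TERM_RE.search(desc)) and "force" in desc.lower()
        # Stand-in for clustering: same text as an agency-labeled case.
        by_match = normalize(desc) in labeled
        flags.append(by_category or by_keyword or by_match)
    return flags
```

Exact matching after normalization will miss descriptions that differ by even one word, which is the gap that clustering similar descriptions is meant to close.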
We wanted to find cases where the state tried to discipline multiple officers for the same incident. We wrote a program to identify groups of cases around the same date at the same prison, with the department classifying at least one case in each group as “inmate abuse.” Many of the related cases involved allegations of either physical abuse or “false documentation,” or sometimes both. We manually reviewed those groupings and verified them through arbitration reports, lawsuits and other reporting. We also asked the agency to confirm that these groupings were connected to the same precipitating incident.
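The grouping step described above might look something like this sketch. It is an illustration, not the program actually written for the analysis. The methodology says only that cases fell "around the same date," so the one-day window is an assumption, as are the field names `facility`, `date` and `category` and the function name `group_incidents`.

```python
from collections import defaultdict
from datetime import date, timedelta  # 'date' is the expected type of c["date"]


def group_incidents(cases, window_days=1):
    """Group cases at the same facility whose dates fall close together.

    cases: list of dicts with 'facility', 'date' (datetime.date) and
    'category' keys (all assumed). Returns only groups that contain more
    than one case and at least one case categorized as "inmate abuse".
    """
    by_facility = defaultdict(list)
    for c in cases:
        by_facility[c["facility"]].append(c)

    groups = []
    for facility_cases in by_facility.values():
        facility_cases.sort(key=lambda c: c["date"])
        current = [facility_cases[0]]
        for c in facility_cases[1:]:
            # Extend the group while the gap to the previous case is small.
            if c["date"] - current[-1]["date"] <= timedelta(days=window_days):
                current.append(c)
            else:
                groups.append(current)
                current = [c]
        groups.append(current)

    return [g for g in groups
            if len(g) > 1
            and any(c["category"] == "inmate abuse" for c in g)]
```

A date window can only suggest that cases are related. Two unrelated incidents can happen at the same prison on the same day, which is why groupings like these still need manual review.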
Some cases are missing from our analysis. Corrections officials provided the discipline records in two batches. In each, the agency redacted any case that had not been concluded. When we received the second batch, officials failed to provide the outcomes of many of the more than 200 cases that hadn’t been resolved when we got the first set of records 18 months earlier. We believe some of those withheld records may also include cases of physical abuse. The department has not yet given them to us.
To understand the trends in the records, and to tell the stories of the people behind the data, we conducted dozens of interviews. We spoke with agency officials, former investigators, arbitrators, union leaders and attorneys, as well as people currently and formerly incarcerated and their families. In a few cases, we reviewed photographic evidence and video, though such material was rare and often difficult to obtain.
Compiling lawsuit information
For our second set of data, we asked the New York attorney general’s office for every lawsuit it settled — or lost — on behalf of the corrections department since 2010. The office provided more than 13,000 pages of court records in eight separate PDF files.
Reporters combed through every page of the files to categorize lawsuits that dealt with allegations of physical abuse by prison staff. Hundreds of court records were repeated in the files. We split the PDFs into individual documents and uploaded them to DocumentCloud, where reporters could examine them again.
We built a spreadsheet of every lawsuit in these records alleging prisoner abuse. We identified and included a handful of cases that the state did not give us because they were not resolved at the time of our request. For each abuse lawsuit, we obtained filings from state and federal courts to examine the underlying claims and findings. We classified the cases, which stemmed from incidents from 2010 through 2020, based on our reporting, including conversations with lawyers, plaintiffs and the corrections department itself.
We checked the lawsuits against the discipline data to see which incidents cited in abuse lawsuits also had led to discipline against accused employees. In some cases, the records were clear that a lawsuit was connected to a discipline case. In many others, the description in the state database was sparse, so we asked the department to verify whether more than 150 cases were linked to the discipline records. We also asked the agency whether any of the officers named in the lawsuits were cited in the redacted discipline cases it did not provide.
We asked other reporters to spot-check more than 100 lawsuits to vet our description and classification of the records.
Finally, we took our findings from both sets of data to the department in an interview and in a lengthy series of written questions to get its response.