Mar 21, 2025

Estimating Reporting Bias in 311 Complaint Data

Kate Boxer

Introduction

Resident-generated data consist of data that are intentionally created by individuals for accessible use by policy-makers and the public, for purposes including improved governance and more responsive public services (Meijer and Potjer, 2018). For example, many cities now have “311” systems through which residents can report on their environments, request information, and request non-emergency municipal services through phone calls, texts, web, or other communication modalities (City of New York, 2021). Increasingly, such resident-generated data are used to help make policy decisions and allocate resources, and can be used as data for machine learning models incorporated into automated tools for policy decision support (City of New Orleans, 2019).

Here we focus on New York City’s 311 system (NYC 311) (City of New York, 2021), which receives more than 8,000,000 resident-generated reports annually (Kontokosta et al., 2017; Zha and Veloso, 2014; Nadeau, 2011). Of these reports, approximately 20% can be considered a “complaint” about city services or conditions, which requires additional follow-up by the relevant city agency (Nadeau, 2011). Our research focuses on residential heating and hot water issues in New York City. NYC’s Department of Housing Preservation and Development (HPD), the agency responsible for monitoring housing conditions in NYC, inspects buildings and subsequently issues violations for inadequate heating and hot water at the building level. Residential building inspections primarily occur after a complaint is filed through NYC’s 311 system, stating that at least one residential unit in the building is experiencing a heating and hot water problem.

Despite recent work to increase the accessibility of NYC 311 (Minkoff, 2016), there is a concern that these data are biased due to systematic differences in residents’ propensity to report a problem (Kontokosta and Hong, 2021; McLafferty, Schneider and Abelt, 2020). Therefore, using resident-generated data at face value to fit decision support tools, without interrogating potential data biases, can result in misallocation of community services and resources, and can further reinforce societal biases which may harm under-served populations. These biases are potentially exacerbated because most of New York City’s building inspections are conducted reactively, in response to 311 resident complaints, and therefore a problem that is not reported to 311 is both less likely to be observed in the data, and less likely to be addressed in a timely manner by the city. Thus, identifying under-reporting areas and subpopulations could assist policy-makers in reaching out to these groups to ensure that their housing issues are addressed.

Heating and hot water problems are serious quality-of-life issues which represent the largest category of NYC 311 complaints (35.9% of total complaints). As such, there is a critical need to understand geographic and demographic disparities both in the frequency of these problems, and the probability that they are reported to the city via 311. However, obtaining an unbiased ground truth of heating and hot water issues would require some form of resident surveys, ubiquitous sensing (e.g., Internet of Things), or inspections throughout all or a large random sample of buildings. These solutions are costly, and they often suffer from common survey sampling errors (Kelly and Swindell, 2002) or are nascent initiatives that are inconsistently deployed and unevenly distributed (Kontokosta, 2016; Zheng et al., 2014). Without reliable “gold standard” data, previous studies such as Minkoff (2016) and White and Trump (2018) identify geographic and demographic variability in 311 complaints but are unable to distinguish whether this variation is due to differences in reporting, differences in the underlying rates of problems, or a combination of the two.

Here we address the challenge of estimating under-reporting in 311 data using two distinct and novel methodological approaches. Our first approach estimates “non-reporting”, in which a building has a heating or hot water problem during the legally mandated heating period, but no resident of the building places a 311 call about that problem. In this case, we have no record in the 311 call log that such a problem has occurred, making it impossible to distinguish whether that particular building failed to report its problem or simply did not have a problem to report. Nevertheless, when aggregating data across many buildings in neighborhoods throughout a city, we can estimate the expected frequency of problems conditional on building characteristics, and use this information to estimate the “non-reporting” rate for different neighborhoods and neighborhood-level demographic characteristics. To accomplish this, we design a latent variable model which we fit with expectation-maximization to estimate the distribution of underlying unreported heating and hot water problems. We then use this model to identify which neighborhood socioeconomic characteristics are associated with non-reporting.
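To make the idea of non-reporting as a latent variable concrete, the following is a minimal sketch and not the model developed in this paper. It assumes a deliberately simplified setup: each building has a problem with probability p that depends only on a discrete building type, and a problem is reported with probability q that depends only on the neighborhood. The function name em_nonreporting, the discrete groupings, and the simulated rates are all hypothetical and chosen for illustration.

```python
import math
import random

def em_nonreporting(y, btype, hood, n_iter=100):
    """Toy EM for y_i = z_i * r_i, where the latent problem indicator
    z_i ~ Bernoulli(p[btype_i]) and reporting r_i ~ Bernoulli(q[hood_i])."""
    n_b, n_h = max(btype) + 1, max(hood) + 1
    p, q = [0.5] * n_b, [0.5] * n_h            # arbitrary starting values
    loglik = []
    for _ in range(n_iter):
        ll = 0.0
        w_b, cnt_b = [0.0] * n_b, [0] * n_b    # expected problems, buildings per type
        w_h, y_h = [0.0] * n_h, [0] * n_h      # expected problems, reports per neighborhood
        for yi, b, h in zip(y, btype, hood):
            pq = p[b] * q[h]                   # P(a report is observed)
            ll += math.log(pq) if yi else math.log1p(-pq)
            # E-step: posterior probability that the building had a problem
            w = 1.0 if yi else p[b] * (1 - q[h]) / (1 - pq)
            w_b[b] += w
            cnt_b[b] += 1
            w_h[h] += w
            y_h[h] += yi
        loglik.append(ll)                      # observed-data log-likelihood
        # M-step: closed-form updates of problem and reporting rates
        p = [w_b[b] / cnt_b[b] for b in range(n_b)]
        q = [y_h[h] / w_h[h] for h in range(n_h)]
    return p, q, loglik

# Simulated data with hypothetical problem and reporting rates.
random.seed(0)
true_p = [0.2, 0.4, 0.6]
true_q = [0.9, 0.7, 0.5, 0.3]
btype = [random.randrange(3) for _ in range(6000)]
hood = [random.randrange(4) for _ in range(6000)]
y = [int(random.random() < true_p[b] and random.random() < true_q[h])
     for b, h in zip(btype, hood)]

p_hat, q_hat, loglik = em_nonreporting(y, btype, hood)
```

In this toy version the data pin down only the products p[b] * q[h], so the individual estimates depend on the starting values and should be read as relative rather than absolute: the sketch can rank neighborhoods by estimated reporting propensity, but it does not by itself recover the true rates.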

Our second methodological approach estimates “less-than-expected” reporting. When a building places at least one 311 call for a given problem in a given heating season, we can both observe the total number of calls placed as well as estimate the duration of that problem based on the timing of those calls. This allows us to ask a different question: conditional on a problem having occurred and at least one 311 call being placed, and controlling for building size (number of residential units) and problem duration, which buildings are placing a lower-than-expected number of calls? To achieve this, we define groupings of buildings of similar size, experiencing a similar length of heating and hot water problems, and rank the normalized call volumes (311 calls per unit per day) within each grouping. We reaggregate these buildings by neighborhood and test whether each neighborhood has a significantly higher than expected proportion of less-than-expected reporting buildings, adjusting for multiple testing and controlling the overall False Discovery Rate. Furthermore, we fit a linear regression model to determine which neighborhood-level socioeconomic characteristics are associated with less-than-expected reporting.
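The grouping, ranking, and testing steps described above can be sketched as follows. This is an illustrative outline under stated assumptions rather than the procedure used in this paper: the size and duration bins, the choice to flag the lowest 25% of each grouping, the one-sided exact binomial test, and the use of the Benjamini-Hochberg procedure to control the False Discovery Rate are all assumptions made for the example, as are the function names and the record fields (id, hood, units, days, calls).

```python
import math
from collections import defaultdict

def binom_sf(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

def benjamini_hochberg(pvals, alpha=0.05):
    """Return the indices of hypotheses rejected at FDR level alpha."""
    order = sorted(range(len(pvals)), key=lambda i: pvals[i])
    m, cutoff = len(pvals), 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / m:
            cutoff = rank                      # largest rank passing the BH threshold
    return set(order[:cutoff])

def flag_low_reporters(buildings, low_frac=0.25):
    """Flag the lowest-ranked buildings within each (size, duration) grouping."""
    groups = defaultdict(list)
    for b in buildings:
        size_bin = min(int(math.log2(b["units"])), 6)    # hypothetical size bins
        dur_bin = min(b["days"] // 7, 4)                 # hypothetical duration bins
        rate = b["calls"] / (b["units"] * b["days"])     # calls per unit per day
        groups[(size_bin, dur_bin)].append((rate, b["id"]))
    flagged = set()
    for members in groups.values():
        members.sort()                                   # ascending normalized call volume
        k = int(low_frac * len(members))
        flagged.update(bid for _, bid in members[:k])
    return flagged

def underreporting_hoods(buildings, low_frac=0.25, alpha=0.05):
    """Neighborhoods with a significantly high share of flagged buildings."""
    flagged = flag_low_reporters(buildings, low_frac)
    n, k = defaultdict(int), defaultdict(int)
    for b in buildings:
        n[b["hood"]] += 1
        k[b["hood"]] += b["id"] in flagged
    hoods = sorted(n)
    # One-sided test: is the flagged share larger than low_frac?
    pvals = [binom_sf(k[h], n[h], low_frac) for h in hoods]
    return {hoods[i] for i in benjamini_hochberg(pvals, alpha)}

# Synthetic example: neighborhood "A" places far fewer calls than the others.
buildings = []
for hood_name, calls in [("A", 1), ("B", 5), ("C", 6), ("D", 7)]:
    for _ in range(40):
        buildings.append({"id": len(buildings), "hood": hood_name,
                          "units": 10, "days": 7, "calls": calls})

result = underreporting_hoods(buildings)
```

Because buildings are compared only within groupings of similar size and problem duration, a neighborhood is singled out here only when its buildings call less often than comparable buildings elsewhere, not merely because its buildings are smaller or its problems shorter.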

We apply these methods for estimating “non-reporting” and “less-than-expected reporting” in conjunction, and observe that they provide highly consistent results regarding which neighborhoods and subpopulations are under-reporting their heating and hot water problems to 311.

The contributions of this work include:

1. We define a taxonomy for under-reporting, and separate analysis approaches for non-reporting and less-than-expected reporting, which can be easily adapted to a variety of other real-world scenarios, such as crime reporting.

2. To our knowledge, we are the first to develop a latent variable model for reporting bias in 311 complaints. This model integrates both attributes which are assumed independent of reporting behaviors (structural building characteristics) and attributes which are associated with the propensity to report (socioeconomic characteristics) to estimate a distribution of the latent variable, i.e., underlying heating and hot water problems.

3. We integrate domain knowledge about the 311 system in our modeling logic to estimate non-reporting and exploit the nuances of the 311 complaint data schema to estimate less-than-expected reporting.

4. Our analyses provide actionable information about NYC neighborhoods with a higher density of under-reporting and neighborhood-level socioeconomic characteristics which are associated with under-reporting, which can help government and advocacy agencies improve accessibility to 311 and reduce the impacts of unreported heating and hot water problems in New York City.

The remainder of the paper proceeds as follows: Section 2 describes the NYC 311 data used for analysis and our methods for estimating non-reporting and less-than-expected reporting. The results of applying these methods to the NYC 311 data are presented in Section 3, and Section 4 contains a concluding discussion.
