Jocelyn E. Strauber, Commissioner of the New York City Department of Investigation (“DOI”), issued
the 2023 Annual Anti-Corruption Report on the topic of data integrity — how agencies ensure the accuracy and
consistency of their data — and agencies’ use of their data to address fraud, waste and corruption risks. Data
integrity is defined as the quality, accuracy, consistency and security of data, the verification of its accuracy and
consistency maintained over time and across formats, and the enforcement of rules and standards that prevent
unauthorized data alteration. Data integrity is crucial because it ensures the trustworthiness and reliability of data,
enabling informed decision-making and efficient operations, and strengthens data security by controlling access
and preventing misuse. To the extent agencies use their data in their anti-corruption efforts, data integrity is one
component of the effectiveness of those efforts.
DOI’s Annual Anti-Corruption Report is mandated by Executive Order 105 (“EO 105”), which consolidated the
Inspector General function within DOI and established the DOI Commissioner as the City’s independent Inspector
General. At the same time, EO 105 gave agency heads primary responsibility for maintaining corruption-free agencies and called upon
DOI to assist in their efforts by preparing this annual Report, which summarizes agency-identified corruption
vulnerabilities and agencies’ remedial strategies. Since 2020, these annual reports have focused on City agencies’
responses to a corruption-related issue. Unlike other DOI Reports, these annual reports rely primarily on information
and analysis supplied to DOI by City agencies, rather than DOI’s own investigative work.
For this Report, DOI analyzed questionnaire responses from 48 agencies. The Report covers the period
October 1, 2022 through September 30, 2023. A copy of the Report is attached to this release and can be found on
DOI’s Reports page.
DOI Commissioner Jocelyn E. Strauber said, “Maintaining accurate and consistent data and using that data in
anti-corruption efforts are significant responsibilities of every New York City agency. Data, and therefore data
integrity, impacts the City’s decision-making, record-keeping, and how agencies serve the public. This Report found
that the majority of agencies use data to combat corruption and have practices or written policies designed to protect
data integrity, but also found areas where agencies could improve. To that end, DOI recommended that City
agencies review their data integrity practices and consider improvements such as memorializing practices in writing,
controlling database access, and testing recovery procedures. I thank all the City agencies that responded to our
questionnaire for their participation in the creation of this Report and for their commitment to maintaining data
integrity.”
This Report relies on agencies’ own assessments to provide a broad overview of the approach City agencies
are taking to address risks of corruption, misconduct, or other criminal activity. In order to promote candid responses
by the participating agencies, the individual responses have been aggregated or anonymized, as appropriate. For
this 2023 Report, DOI developed a questionnaire that probed agencies’ approaches to data integrity by posing
questions in the following categories:
• Identification – agencies were asked to identify and describe all databases that they control, directly
or via contract, for which that agency is the primary user; what type of data each database contains;
who owns the database; whether the data is captured manually or electronically; and whether the data
is stored on-premises or in the cloud;
• Risk Analysis – agencies were asked to identify the five databases they deem most critical, to describe
the data integrity measures in place for those databases, to identify any data integrity risks not
addressed by those measures, and to state whether the agency has written data integrity policies or
procedures; and
• Proactivity – agencies were asked whether they use any of their databases for purposes of identifying,
preventing, or mitigating risks of corruption, fraud, waste, or abuse; whether they use artificial
intelligence for anti-corruption efforts; and whether agencies have staff or units specifically responsible
for analyzing, monitoring, or auditing data contained in those databases.
Data integrity is conceptually distinct from cybersecurity, which was the focus of DOI’s 2021 Anti-Corruption
Report. Cybersecurity focuses on protecting systems and networks against digital attacks arising from both external and
internal threats. Data integrity refers to preserving the validity and accuracy of data, largely against internal threats that
can involve internal bad actors or accidental human error. Inaccurate data can lead to poor decision-making and
thus government waste; unmonitored data may present an opportunity for internal bad actors to commit acts of
corruption and fraud. Proper data integrity policies can help avoid both.
City agencies’ questionnaire responses indicate that most agencies have taken steps to protect the integrity of
their data, but the responses also exposed areas in which agencies could improve. The Report found that the
majority of City agencies utilize some combination of data integrity best practices, including user-based limitations
on access, audit trails, and periodic internal audits. However, agencies self-reported key risks to data integrity.
Those risks included lack of role-based permissions and access, increased employee turnover and retention issues,
as well as the advanced age of certain database platforms and maintenance of those platforms by third-party,
outside vendors.
Thirty agencies indicated that they had written policies or procedures in place governing data integrity, though
five of those agencies appear to rely solely on Citywide policies, rather than agency-specific policies on data
integrity. A few of those thirty agencies submitted audit policies that did not explicitly mention or address data.
Eighteen agencies reported they had no written policies on data integrity, though each of those agencies reported
having practices in place to address data integrity; presumably, those practices are not memorialized in writing.
Thirty-three agencies responded that they do use data to identify, prevent, reduce, or eliminate instances or
risks of corruption, fraud, waste, or abuse. Agencies reported methods including cross-referencing datasets against
each other to flag issues. Six of the 48 agencies responded that they use artificial intelligence to analyze, monitor,
or audit data to prevent corruption, fraud, waste, or abuse. These agencies reported a variety of helpful AI uses,
such as algorithm-based evaluations of transaction characteristics to assign a fraud risk score for credit card
authorization requests.
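As a purely illustrative sketch of the cross-referencing approach agencies described, the short Python example below compares a list of payments against a vendor registry and flags any payment made to a vendor that does not appear in the registry. The record fields, identifiers, and matching rule are hypothetical and are not drawn from any agency's questionnaire response.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Payment:
    # Hypothetical record layout, assumed for illustration only.
    payment_id: str
    vendor_id: str
    amount: float


def flag_unregistered_payments(payments, registered_vendor_ids):
    """Return payments whose vendor is missing from the vendor registry."""
    registered = set(registered_vendor_ids)
    return [p for p in payments if p.vendor_id not in registered]


# Two hypothetical datasets: a payment ledger and a vendor registry.
payments = [
    Payment("P-1", "V-100", 2500.00),
    Payment("P-2", "V-999", 4100.00),  # vendor absent from the registry
    Payment("P-3", "V-101", 80.00),
]
registered_vendors = ["V-100", "V-101"]

flagged = flag_unregistered_payments(payments, registered_vendors)
flagged_ids = [p.payment_id for p in flagged]
print(flagged_ids)
```

A flagged record in a check of this kind is only a prompt for human review; a mismatch may reflect a data entry error rather than misconduct.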
Forty agencies had staff responsible for data integrity efforts; eight agencies reported having no staff or units
responsible for analyzing, monitoring, and/or auditing data contained in any of the identified databases. Of the 40
agencies with staff, headcounts varied but appeared largely proportional to the size of the particular agency. The
experience of those staff members also varied and included a mix of advanced degrees, civil service exams or
certifications, and general expertise in the data system itself or experience in City government.
Based on these findings, the Report recommends that City agencies assess their current data integrity policies
and practices to evaluate whether they adequately promote data integrity and sufficiently utilize data to address
risks of fraud, corruption, and abuse. As part of that assessment, DOI issues the following five recommendations
for agencies to consider in light of their specific needs:
• Ensure that the agency has a written data policy that includes provisions regarding data governance,
such as access control and disaster recovery procedures. The policy should be periodically
reviewed and updated as necessary.
• Appoint a data officer responsible for setting the data policy, determining access, and reviewing
compliance.
• Where possible, phase out the manual entry of data, moving to electronic input only. With respect to
data deletion, limit deletion authority to a small universe of appropriate supervisory staff.
• Control database access based on roles or groups to which individuals are assigned so that access is
consistent across similarly situated staff.
• Test and simulate disaster recovery procedures periodically to ensure that they will work as intended
when actually needed.
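The recommendations on role-based access and limited deletion authority can be illustrated with a minimal Python sketch. The role names, user names, and permission sets below are assumptions made for illustration; they do not describe any City agency's actual configuration.

```python
# Hypothetical roles and the actions each role may perform.
ROLE_PERMISSIONS = {
    "analyst": {"read"},
    "data_entry": {"read", "write"},
    "supervisor": {"read", "write", "delete"},  # deletion limited to supervisors
}

# Hypothetical assignment of users to roles.
USER_ROLES = {
    "alice": "analyst",
    "bob": "supervisor",
}


def is_allowed(user, action):
    """Return True only if the user's assigned role grants the action."""
    role = USER_ROLES.get(user)
    # Unknown users or roles fall back to an empty permission set.
    return action in ROLE_PERMISSIONS.get(role, set())


print(is_allowed("alice", "delete"))
print(is_allowed("bob", "delete"))
```

Because permissions attach to the role rather than to the individual, two staff members assigned the same role receive the same access, which is the consistency the recommendation aims for.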