Building sustainable health data capability in Aotearoa New Zealand: opportunities and challenges highlighted through COVID-19
ABSTRACT
Aotearoa New Zealand should take the opportunity created by national health reforms to learn from experience with COVID-19, creating a world-class health system that utilises data and modelling effectively. For this to happen, we must build upon a foundation of equity, ethics, trust and transparency and ensure we have the right tools and processes in place for our researchers and practitioners to translate insights into better outcomes for all.
Introduction
Aotearoa New Zealand's response to COVID-19, underpinned by data, modelling, and communication, has been recognised as effective (Deckert et al. 2021). Yet, the rapid translation from research to practice faced a number of barriers that could have jeopardised scientific quality and implementation. Amplified by the epidemic response (Kinsella et al. 2020), these challenges are not new in applied research using health data (Lucyk et al. 2017).
Concurrently with the management of COVID-19 and following a comprehensive review,1 the New Zealand health sector's operating model is moving toward greater centralisation and a standardisation of technology and management resources. These changes could catalyse solutions to systemic data challenges benefiting routine health services and enabling more dynamic and innovative responses to future epidemics.
Here we outline how lessons learned from the COVID-19 pandemic can inform the new system, addressing privacy and ethics challenges for health data research in Aotearoa documented by Ballantyne and Style (2017).
Ethics: value, risk, and data access
Our ethical review processes need to become adaptable to real-time decision-making.
Health research in Aotearoa is managed through the Health and Disability Ethics Committees, under the National Ethics Advisory Committee framework.2 This guides researchers to consider ethical challenges around their work, following a principle of risk-limiting through minimisation: researchers' data access is strictly limited in scope as defined at the outset.
For modern data science, as in an emerging pandemic, the exact specification of a research question and of the data necessary to answer it may itself be possible only after data exploration. Currently, under certain scenarios motivating faster action (for example, when data access is needed to reduce serious population health threats), people can access data as part of an initiative to improve the quality of care, without undergoing a full review process. This line between data access for research and for quality improvement processes seems arbitrary, as the two activities are symbiotic. Having a rigorous process on one side of a fine line and inconsistent processes on the other creates an incentive to bypass ethical considerations.
The early 2020 COVID-19 outbreak prompted the need to share data with modelling teams associated with Te Pūnaha Matatini (TPM), a Centre of Research Excellence hosted by the University of Auckland, involving researchers at several organisations. TPM and Statistics New Zealand (StatsNZ) were tasked with providing daily modelling updates, which depended critically on current information on COVID-19 cases, to the National Crisis Management Centre to inform the operational and policy response to the crisis. Data collection for this work evolved from manual interpretation of press releases to, eventually, a formal agreement.
In early May, assisted by the Office of the Privacy Commissioner, Te Pūnaha Matatini undertook a privacy assessment for a data agreement between Statistics New Zealand and the Ministry of Health (MoH). From late May, case data were delivered by the MoH to StatsNZ and then on to TPM, with Privacy Act-compliant precautions in place to manage privacy risks. Statistics New Zealand's involvement as a broker added complexity and risk but alleviated capacity issues at the MoH. This effective, if rather unwieldy, arrangement persisted until early 2021, when StatsNZ withdrew from its role as intermediary. From then, Te Pūnaha Matatini established a direct arrangement for data sharing with the MoH: a more efficient process that avoided the complex statutory obligations of the Statistics Act around data access approvals.
The ideal process for data access should be systematic and simple, adapted to the approaches and timelines required for high-quality research. One model could be to broaden data access conversations to repositories of relevant data (rather than every specific element of that data), coupled with clearer ethical standards and audit/monitoring requirements. A national standard code of conduct and a consistent approval process could assist in transparency for citizens and researchers. Such a process could provide the forum for building a body of precedent for ethically acceptable practice with linked health and social data that could proactively inform research. This should be considered in the context of supporting the creation and maintenance of social and cultural licence, ensuring that benefits to society clearly outweigh risks.
In this context, we are mostly referring to datasets that have been properly aggregated and/or anonymised so as to be robust against violations of privacy (at the individual or group level). However, complete security against re-identification is never achievable. Moreover, excessive stringency about data privacy may hinder the usability of data for health responses. It is therefore important to discuss what remediation and redress mechanisms are needed and to maintain a proactive civil conversation about what levels of data granularity are admissible under different circumstances.
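The trade-off between granularity and privacy can be made concrete with a minimal sketch of aggregation with small-cell suppression. This is an illustration only, not a description of any agency's actual procedure: the function name, the field names, the toy records and the threshold of 5 are all assumptions chosen for the example.

```python
from collections import Counter


def aggregate_with_suppression(records, keys, min_cell_size=5):
    """Count records per combination of `keys`, hiding small cells.

    `min_cell_size` is a hypothetical threshold: any cell with fewer
    records than this is reported as None instead of its true count.
    """
    counts = Counter(tuple(record[k] for k in keys) for record in records)
    return {
        cell: (n if n >= min_cell_size else None)
        for cell, n in counts.items()
    }


# Toy, entirely made-up records (no real data).
records = (
    [{"region": "North", "age_band": "0-39"}] * 6
    + [{"region": "North", "age_band": "40+"}] * 2
    + [{"region": "South", "age_band": "0-39"}] * 5
)

# Finer granularity: one cell falls below the threshold and is suppressed.
fine = aggregate_with_suppression(records, ["region", "age_band"])

# Coarser granularity: every cell is large enough to publish.
coarse = aggregate_with_suppression(records, ["region"])
```

In this toy case the finer table loses one cell to suppression while the coarser table publishes everything but says less. Suppression of this kind reduces, but does not eliminate, the risk of re-identification.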
Equity
Equity needs to be prioritised as an explicit national goal.
Historically, while researchers and healthcare professionals were aware that epidemics would disproportionately impact Māori and Pacific populations, those inequities had been at most the subject of post-epidemic research, not of routine outbreak monitoring (Wilson et al. 2012).
In COVID-19, the earliest reports to MoH about the emerging pandemic mentioned those inequities (Telfar Barnard et al. 2020) and there were efforts to monitor differential impacts with routine testing and health data. Te Pūnaha Matatini applied an equity lens early in its pandemic modelling, identifying Māori and Pacific people to be at higher risk, initially as a result of the mātauranga3 from one of its Advisory Board members, and subsequently using data on the relationship between COVID-19 outcomes and co-morbidities from the United Kingdom (Steyn et al. 2021).
Unfortunately, inconsistent data records and understanding across the epidemic response made it difficult to report on inequities in a way that could inform intervention. For example, early testing numbers often missed ethnicity information and, when collected, were recorded inconsistently with the Statistics New Zealand Ethnicity standard, impeding comparisons with StatsNZ population counts. Moreover, the MoH's ‘prioritised ethnicity’ approach showed its limits. Allocating individuals to a single ethnicity results in Pacific people reporting both Māori and a Pacific ethnicity being classified as Māori; the use of the ‘level 1’ ethnicity categories meant that all Pacific ethnicities were reported as ‘Pacific’ rather than their specific ethnicity or ethnicities. This resulted in an undercount of Pacific observations, aggravated for the younger groups (Pacific Perspectives Limited 2019), and in a lack of social grouping information that would have been crucial in the second outbreak.
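The effect of prioritisation on counts can be illustrated with a minimal sketch that compares it with total-response counting, in which a person is counted once in every group they report. The priority order, function names and toy records below are assumptions made for illustration; they are not the MoH's implementation, and the order should be checked against the current ethnicity data protocols.

```python
from collections import Counter

# Assumed 'level 1' priority order, highest priority first (illustrative only).
PRIORITY = ["Māori", "Pacific", "Asian", "MELAA", "Other", "European"]


def prioritised(ethnicities):
    """Return the single highest-priority group among those reported."""
    for group in PRIORITY:
        if group in ethnicities:
            return group
    return "Unknown"


def prioritised_counts(people):
    """Count each person once, under their prioritised ethnicity."""
    return Counter(prioritised(reported) for reported in people)


def total_response_counts(people):
    """Count each person once in every group they reported."""
    return Counter(group for reported in people for group in set(reported))


# Toy, made-up responses: each set holds one person's reported ethnicities.
people = [
    {"Māori", "Pacific"},
    {"Pacific"},
    {"Pacific", "European"},
    {"European"},
]

by_priority = prioritised_counts(people)
by_total = total_response_counts(people)
```

Here three of the four people report a Pacific ethnicity, but prioritisation counts only two of them as Pacific, because the person who also reports Māori is allocated to Māori. Total-response counts avoid that loss, at the cost of group totals summing to more than the number of people.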
Consent and control
People need to be given authority over their health data.
New Zealand citizens consent to health services by signing an agreement at the point of care. Data produced from health care delivery can be used for audit purposes but could also be useful for secondary, as yet undefined, research. Research generally requires approved de-identification or explicit informed consent from the data donor, and mitigation of potential harms. People are generally happy for their health data to be used if it will benefit others (Dobson et al. 2021), with discomfort around commercial interests, and our health data managers have a critical kaitiaki (guardianship) role in data sharing decisions.
Today, citizens do not know what health data is held about them, how it is used and stored, or how they might contribute to it. Here, there is a critical role for consumer representatives. While the general public may not have time or inclination to consider these important issues, a broad-based consumer engagement approach could allow our policy makers, health professionals and researchers to hear the voice of those who are represented in the data and address their concerns in an open and transparent manner.
In New Zealand's COVID-19 response, expediency was given priority, as in a civil defence emergency. However, pandemic responses can be enduring, so temporary expedient responses can be precedent-setting and become normalised beyond the initial emergency, sidestepping the protective processes that minimise risk of misuse or harm.
Mapping Aotearoa's data landscape
Data collected in New Zealand should be documented transparently for our researchers and citizens.
Relevant data sources for COVID-19 included population information, managed by Statistics New Zealand; healthcare records, managed by GP practices, Primary Healthcare Organisations, pharmacies, District Health Boards, and the Ministry of Health; testing information, managed by ESR; highly aggregated population density data,4 businesses, and aggregated supermarket volume of sales data. Later, as misinformation spread, social media data became valuable to track patterns. While these datasets are not generally public, the fact that agencies and private companies hold the data should be made public. For the modelling team, understanding which agency held which data was the greatest challenge in March and April 2020.
Snapshot catalogues of health data in New Zealand (Atalag et al. 2013; Thurier 2017) need to be maintained. A searchable, publicly accessible catalogue limited to the StatsNZ Integrated Data Infrastructure (IDI) is under construction5 but this represents a small subset of relevant sources. Privacy and anonymity should be safeguarded in this process, according to existing laws.
A maintained mapping of information sources with clear access processes would help timely and safe access during emergency responses. This could be delivered within the health system reforms, with a single system that works for Health NZ, the Māori Health Authority, the Regional Iwi Partnership Boards, and government agencies. The single system would avoid unnecessary duplication and be an important resource in unlocking future research opportunities.
The use of data in population health needs to be grounded in a relationship of trust between researchers and the public, which can never be taken for granted. We believe that improving the transparency of Aotearoa's data landscape (including both where the data is and what it is used for) may help in addressing public suspicion of malpractice, bias, or unfair intrusion.
A research data sandpit
A high-performing, secure data sharing and processing system can deliver value far beyond COVID-19.
The global sharing of research and administrative data relating to COVID-19 has been remarkable. The collective will of the scientific community and the enabling innovation across organisations allowed for benchmarking statistics across different regions. The Research Data Alliance created online networks for the collaborative sharing of epidemic data. Such approaches can go beyond the global pandemic.
Linked data in IDI6 is precious for research and was relevant for the pandemic response. However, the administrative data in the IDI is seldom current, being supplied from organisations external to Statistics New Zealand and integrated 3-4 times per year. Adding new data is costly in time and resources, which precludes the frequent updates of multiple data sets required in a rapidly changing environment. Furthermore, data access is highly restricted due to the Statistics Act requirements and the need for strict risk control. This impedes translational research and operational use in a rapidly changing setting. The aggregation of many diverse administrative data sets makes the IDI a world-leading research repository but the very size and diversity of the data can limit its agility.
An alternative would be to focus the linked data approach on a smaller range of datasets and users, enabling timely updates and application. Scotland used this approach, creating a secure repository for frequently updated health and social data, to monitor ‘the COVID-19 epidemic and to evaluate the effectiveness of therapeutic interventions in approximately 5.4 million individuals’ (Simpson et al. 2020). The repository acted as a trusted research environment with permissioned users and purposes, where standardised linkage and analysis was updated multiple times a day. This more focused secure data environment reduced governance complexity and facilitated tools and code sharing, accelerating the availability of epidemic information to decision-makers.
The lack of a high-performance computing capability in the IDI makes it unsuitable for the computationally intensive calculations involved in scenario and outcome modelling. For example, network modelling could not be operational without support from NeSI (the New Zealand eScience Infrastructure), which is not accessible from the IDI environment. Thus, summary data had to be extracted from the IDI, and models were run on data sets reconstructed from the summaries rather than directly on the microdata. This slowed the process and affected the analytic precision of the models.
To equip researchers and the public, we need improved self-service visualisation and de-identification tools. The ESR Vaccination Modelling dashboard7 uses this approach to enable public access to the results of almost 242,000 modelled scenarios.
Conclusions
Aotearoa can build a world-leading health data system to improve health outcomes equitably and inform responses to changing circumstances.
We make several recommendations for change that will not only improve preparedness for a future pandemic but also enable our health system to adopt the findings of research. The impending changes in the research, statistics and health sectors create a unique opportunity for systemic changes that will create enduring benefits across the health and research sectors. Effecting that change and achieving those benefits will require engagement with and involvement of key people – including the users and donors of the data, not just the producers.