Royal Academy of Sciences New Zealand Open Science

Māori and the Integrated Data Infrastructure: an assessment of the data system and suggestions to realise Māori data aspirations [Te Māori me te Integrated Data Infrastructure: he aromatawai i te pū

ABSTRACT

The Statistics New Zealand Integrated Data Infrastructure (IDI) is a collection of de-identified whole-population administrative datasets. Researchers are increasingly utilising the IDI to answer pressing social and policy research questions. Our work provides an overview of the IDI, associated issues for Māori (the Indigenous peoples of New Zealand), and steps to realise Māori data aspirations. We first introduce the IDI including what it is and how it was developed. We then move to an overview of Māori Data Sovereignty. We consider the main issues with the IDI for Māori including technical issues and problems with ethnic identifiers, deficit-framed work, community involvement, consent, social licence, further data linkage, offshore access, and barriers to access for Māori. We finish with a set of recommendations around how to improve the IDI for Māori, making sure that Māori can get the most out of administrative data for our communities. These include the need to build data researcher capacity and capability for Māori; work with hapori Māori to increase utilisation; change accountability mechanisms, including greater co-governance of data; adequately fund alternatives; or potentially even abolishing the IDI and starting again.

HE WHAKARĀPOPOTOTANGA

Ko te Statistics New Zealand Integrated Data Infrastructure (IDI) tētahi kohinga raraunga whakahaerenga ā-taupori katoa tē-tautohua. Kua nui haere ā ngā kairangahau whakamahi i te IDI kia uruparengia ai ngā pātai rangahau mō te pāpori me te kaupapa here. Mā ā mātou mahi e tiro whānui ai ki te IDI, ngā raru e hāngai ana ki te Māori (te iwi Taketake o Aotearoa), me ngā mahi e puāwai ai ngā raraunga Māori. Ka mātua whakaaturia e mātou te IDI tae rā anō ki ōna āhuatanga me tōna whanaketanga ka tahi. Ka rua, ka hūnuku ki tētahi tirohanga whānui mō ngā Raraunga Rangatiratanga Māori. Hei tā mātou ko ngā raru matua o te IDI ki te Māori tae rā anō ki ngā raru tautuhi me ngā raru o ngā tohu ā-mātāwaka, ngā mahinga ā-takarepa, te nōhanga mai ā-hapori, te whakaaetanga, te āheinga ā-pāpori, te tūhononga ki ngā raraunga kē atu, te whainga āheinga ā-tāwāhi, me ngā tauparenga āheinga ki te Māori. Ka oti ki tētahi kohinga marohitanga mō te ara e whanake ai i te IDI ki te Māori, e oti ai i te Māori te tino whai hua i te nuinga o ngā raraunga whakahaere e pā ana ki ō tātou hapori. E tae rā anō ana ēnei ki te hiahia kia rahi ake ngā kairangahau raraunga me te āheinga ki te Māori; kia mahi tahi ki ngā hapori Māori e whanake ai te whakamahinga; kia panoni i ngā āhuatanga haepapa, tae rā anō ki te kāwanatanga tahitanga ā-raraunga; kia tika te whāngai ā-pūtea i ngā ara kē atu; ki te whakakorenga katoatanga pea rānei o te IDI me te tīmata anō.

Glossary of Māori words: Aotearoa: New Zealand; hapū: kinship group, subtribe relating to common ancestors; hapori: community networks; Iwi: kinship group, tribe relating to common ancestors; kairangahau: researcher; kaupapa Māori: a philosophical approach using Māori knowledge and values; kaupapa Pākehā: Pākehā values and approaches; mana whenua: jurisdictional authority over an area or territory; mātauranga Māori: Māori knowledge(s); Māori: the Indigenous peoples of Aotearoa New Zealand; rōpū: group; taonga: prized treasure, culturally valuable objects; tangata whenua: people of the land; tauiwi: New Zealander of non-Māori descent; Te Ao Māori: the Māori world; Te Tiriti o Waitangi: true founding document of New Zealand, the Māori version of the Treaty; te reo Māori: the Māori language; tikanga Māori: correct procedures or customs from a Māori worldview; tino rangatiratanga: absolute sovereignty; tūrangawaewae: where one has right to stand and belong through kinship and whakapapa; whakapapa: descent, genealogy; waiata: song(s); whānau: extended family or kinship networks; whenua: land.

Introduction

The Statistics New Zealand Integrated Data Infrastructure (IDI) is a collection of de-identified whole-population administrative datasets that researchers are increasingly using to answer pressing social and policy research questions (Milne 2022). IDI data are largely collected via government agencies, the Census, and Statistics New Zealand, but many of these data are not collected directly from Māori (the Indigenous peoples of Aotearoa) with full and informed consent. This is problematic because Māori have very little control over how data are governed, analysed, and reported, which does not reflect Māori rights and interests in data, based in Te Tiriti o Waitangi and Māori data sovereignty principles (Te Mana Raraunga 2018; Paine et al. 2020; Kukutai and Cormack 2021). Statistics New Zealand (Stats NZ), the manager of the IDI, has increased the safeguards on the use of Māori data (Stats NZ 2021). However, there remains considerable risk that the use, analysis, and interpretation of Māori data will be stigmatising, detached from understandings of the broader context, and devoid of a solutions-focus. Data that reinforce stereotypical deficit-focused narratives of Māori contribute to the ways data, science, and research were (and are) used to justify colonial harm. These concerns become more pressing, given moves away from the Census towards administrative data for policymaking, and the issues with the 2018 national Census (Kukutai and Cormack 2019).

Considerable progress is required to understand the appropriate limitations when using IDI data in ways consistent with Indigenous data sovereignty and that seek to improve equity outcomes and realise Māori aspirations. We are writing as a group of kairangahau Māori (researchers), alongside tauiwi allies, who work with or around the IDI. The aims of this paper are to: (1) explore Māori data sovereignty within the IDI context; and (2) provide future-facing suggestions for Māori-centred/-focussed or Kaupapa Māori1 development around administrative data. First, we introduce the IDI for the unfamiliar reader. We then discuss Māori Data Sovereignty (MDS), before covering issues with the IDI with a Māori-centred focus. Lastly, we provide suggestions on how the IDI could be used to help realise Māori aspirations. Ultimately, if Māori do not engage with the IDI it will be used about us, without us. Our goal is to provide a useful, but sometimes critical2 review, alongside Māori-focussed points for constructive development, to empower hapori Māori3, improve Māori data futures, and to work together to help realise desired future states.

What is the IDI?

The Integrated Data Infrastructure (IDI) is a database of potentially-linkable datasets containing individual level information sourced from government agencies, Censuses, and several surveys conducted by Stats NZ. For example, data sources from government agencies include Births, Deaths and Marriages, Hospitalisations, Education, Tax and Income, and Court Charges. Some of the data go back to 1848 (Births, Deaths and Marriages), although most were collected from the 1990s onwards. Data are also sourced from community providers and government-funded NGOs, including client-level data from those who have engaged services (e.g. Auckland City Mission data, a provider of services for homelessness and poverty; Moore 2019). Data are matched across data sources by Stats NZ, through using personal information such as full name and date of birth. Records are then de-identified before they are made available in the IDI. IDI researchers can link records from different data sources using personal unique identifiers that are assigned to individuals by Stats NZ. This allows for research and statistics that consider points of contact with government agencies over time. Generally, IDI data can be used for four types of research: descriptive, analytical, methodological, and evaluation of policies and interventions (see Milne et al. 2019). It is important to note that we focus primarily on issues with the IDI, but many of these problems existed before its establishment: for instance, problems with the collection and framing of administrative data when used for research. These issues persist independent of the IDI, but the IDI is now the vehicle for accessing these data, and the ability to link multiple datasets compounds many existing issues.
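The linkage step described above can be illustrated with a minimal sketch. It assumes two small, made-up extracts that share a `person_id` field standing in for the unique identifier assigned by Stats NZ; the field names, the values, and the `link_by_person` helper are hypothetical and do not reflect the IDI's actual schema or tooling.

```python
from collections import defaultdict

# Hypothetical de-identified extracts. `person_id` stands in for the unique
# identifier assigned by Stats NZ; all field names and values are made up.
hospitalisations = [
    {"person_id": 101, "year": 2015},
    {"person_id": 102, "year": 2016},
    {"person_id": 101, "year": 2018},
]
education = [
    {"person_id": 101, "qualification": "NCEA Level 2"},
    {"person_id": 103, "qualification": "Bachelor's degree"},
]


def link_by_person(**sources):
    """Group the records of each named source under their shared person_id."""
    linked = defaultdict(lambda: {name: [] for name in sources})
    for name, records in sources.items():
        for record in records:
            linked[record["person_id"]][name].append(record)
    return dict(linked)


linked = link_by_person(hospitalisations=hospitalisations, education=education)
```

In this sketch, person 101 ends up with two hospitalisation records and one education record, while person 103 has an education record only. The same mechanism that makes contact with several agencies visible over time is what compounds the issues discussed in this paper.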

Stats NZ aims to keep access to data safe by following a Five Safes framework, which governs data access through five criteria: safe people, safe projects, safe settings, safe data, and safe output. IDI access is authorised for vetted researchers who are committed to safe data use, where projects operate in the interests of the public good (Stats NZ 2021). Once authorised, data are generally4 only accessible through a secure virtual environment in approved facilities (data labs) where data are de-identified and anonymised, and researchers only get access to the datasets approved for their project (Stats NZ 2021). Research outputs and results must be reviewed by Stats NZ to ensure confidentiality before outputs are released. For instance, counts of fewer than 6 must be suppressed to eliminate the risk of identifying individuals (Stats NZ 2020c). Within the data labs researchers use data management software such as SQL and statistical software (e.g. STATA, SAS, R) to conduct research. Researchers must demonstrate through their CV and references that they are experienced in and capable of such analyses before gaining access. Alongside the vetting process, researchers undertake a training session which covers confidentiality and privacy. Afterwards, individuals must complete a Declaration of Secrecy, committing to Stats NZ’s data security rules before they are granted access.
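To make the suppression rule mentioned above concrete, the following is a minimal sketch of withholding small counts before release. The threshold of 6 comes from the rule as described; the function name, the region labels, and the use of `None` to mark a suppressed cell are our own assumptions, and actual output checking is likely to involve further rules not shown here.

```python
SUPPRESSION_THRESHOLD = 6  # counts below this are withheld, per the rule described above


def suppress_small_counts(table, threshold=SUPPRESSION_THRESHOLD):
    """Return a copy of `table` with counts below `threshold` replaced by None."""
    return {
        group: (count if count >= threshold else None)
        for group, count in table.items()
    }


# Illustrative counts only; these are not real IDI outputs.
counts = {"Region A": 42, "Region B": 5, "Region C": 6}
released = suppress_small_counts(counts)
```

Here the count of 5 for Region B is withheld, while the counts of 6 and 42 are released unchanged.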

The IDI was formally established in 2011, born out of work that, at different times, linked separately held datasets on student finance, tax, education, benefits, and employee-employer data (Moses 2020). Cabinet agreement extended the IDI to cross-government data in 2013. Its creation was based on both the Statistics Act 1975 and the Privacy Act 1993, which were interpreted as meaning there was no need for further legislation to establish the IDI (Moses 2020). Consequently, the IDI was established without due democratic process, such as the parliamentary process – and debate – that typically comes with creating a new law. Although more specific and up-to-date guidance is coming through in the Data and Statistics Act 2022 (NZ Parliament 2022), there is a sense that now it has been created, the IDI is probably here to stay. Furthermore, the IDI was not initially purposely designed with users in mind, and it was certainly not established with considerations of use by hapori Māori.

What is Māori data sovereignty?

The term ‘Māori’ is used to refer to a diverse group of Indigenous peoples, who now comprise approximately 16.5% of Aotearoa’s population by ethnicity, or 19.1% by descent (Stats NZ 2019). At its core, Māori data sovereignty (MDS) is about respecting Māori data, and the ability for Māori communities to exercise power over data usage and outputs. Māori data has been defined as ‘digital or digitisable information or knowledge that is about or from Māori people, our language, culture, resources or environments’; this includes data produced by Māori and about Māori (Te Mana Raraunga 2018).

Māori have rights and authority to access and control their data under Te Tiriti o Waitangi, a key founding document of Aotearoa. Although there are two versions of this document, Te Tiriti (written in te reo Māori) and The Treaty (written in English), we refer to Te Tiriti, as it unquestionably affirms Māori rights and sovereignty, and it is widely viewed as the primary treaty (Paine et al. 2020; Kukutai and Cormack 2021). Doing so also aligns with MDS goals and aspirations, as Te Tiriti maintains the right of Māori to hold power in data in ways Māori deem appropriate and acceptable (Reid and Robson 2007). Through the Waitangi Tribunal (an independent commission of inquiry that hears treaty breaches), six Māori claimant groups began the WAI262 claim challenging the Crown’s obligations to allow Māori to exercise tino rangatiratanga over taonga Māori and cultural resources. Within Te Ao Māori data is considered a precious cultural taonga (Hudson et al. 2017). MDS also aligns with global developments in Indigenous Data Sovereignty, where advancements reflect the values and aspirations of Indigenous communities including through the Global Indigenous Data Alliance (GIDA) and the CARE principles of Collective Benefit, Authority to Control, Responsibility, and Ethics (Carroll et al. 2020). This is further solidified through the rights of Indigenous peoples over their aspirations and self-governance under the United Nations Declaration on the Rights of Indigenous Peoples (UNDRIP; Kukutai and Taylor 2016).

There have been numerous organisations, frameworks and agreements developed within Aotearoa in the last decade to work towards the aspiration of MDS. Te Mana Raraunga (TMR), the Māori Data Sovereignty Network, was founded in 2015 by Māori researchers, practitioners, and entrepreneurs from a wide range of sectors and hapori. Their purpose is to ensure that Māori data is used in ethical ways to promote Māori wellbeing, and to advocate for Māori data governance and data sovereignty. TMR devised a set of principles to guide the use of data, including: rangatiratanga (authority), whakapapa (relationships), whanaungatanga (obligations), kotahitanga (collective benefit), manaakitanga (reciprocity), and kaitiakitanga (guardianship; Te Mana Raraunga 2018). TMR continues to advocate for Māori rights and interests in data across a range of sectors, subject areas, and using a range of participatory methods. Additionally, The Mana Ōrite agreement is a partnership document between Stats NZ and the Data Iwi Leaders Group (Data ILG) of the National Iwi Chairs Forum (a collective of Iwi leaders). This Tiriti-driven agreement was the first of its kind and functioned to solidify the authority of Iwi over their own data. Te Kāhui Raraunga was subsequently established to operationalise Mana Ōrite and advance the ambitions of the Data ILG. Te Kāhui Raraunga aims to strengthen MDS with an Iwi-specific approach, increase Māori statistical capability, increase Iwi data accessibility, and to support Māori innovation through empowered data usage (Te Kāhui Raraunga 2020).

In addition, MDS has to some extent been operationalised in IDI procedure. The Ngā Tikanga Paihere framework was introduced by Stats NZ to assess the appropriateness of IDI projects. The framework helps to assess the cultural appropriateness of research for Māori and urges researchers to prioritise Māori participation and community relationships (Stats NZ 2020b). Guided by tikanga Māori, it draws on ten concepts from Te Ao Māori that demonstrate good data usage, establish cultural boundaries, and encourage deeper research insights through cultural connection (Stats NZ 2020b). However, while this framework prompts researchers to consciously engage with Māori who may be impacted by their research, it does not resolve Māori data governance concerns (McBeth 2020).

What are the issues with the IDI for Māori?

Although the organisations, frameworks and agreements described above provide a basis for MDS and governance, issues remain. In this section, we explore a broad array of issues associated with the IDI, paying particular attention to issues for kairangahau and hapori Māori.

First, it is important to note that technical issues exist with the IDI, and while a complete survey of the issues is outside the scope of this paper, some impact Māori data, such as ethnicity indicator issues (Kvalsvig et al. 2019; Stats NZ 2022). Particularly, there are known technical issues regarding ethnicity data, which has important implications for policy and funding decisions concerning Māori (Kukutai 2004). The IDI contains mainly administrative data: data which are collected as part of routine transactions with state agencies, by busy staff who may be untrained, and are seldom users of the data they collect. Many may be unaware of the IDI and that data will later be used for research purposes. These data are collected in different contexts, sometimes in different ways, by different people, and for different systems and purposes, creating significant variation between data sources in the quality and reliability of data. Research was not the intended or original purpose of the data collection, meaning collection processes may not be suitable – and may not reflect robust research processes. While there are standard data collection protocols (e.g. Ministry of Health 2021) these can be hard to apply in operational contexts, especially in sensitive and culturally appropriate (and hence robust) ways. Anecdotally, many of us report having an ethnicity box completed for us (and whānau), in ways inconsistent with the self-definition inherent in the concept of ethnicity. In addition, Māori may identify their ethnicity differently depending on their trust in and relationship with the specific government agency involved (see Greaves et al. 2022).

This flows to a lack of consistency and quality for Māori ethnic identifiers in the IDI, with individuals often having different ethnicities recorded in different sources of data. Stats NZ have previously demonstrated that sole Māori identification had a high (∼85%) correlation with other IDI administrative data sources and the Census (Reid et al. 2016). For those 51% of Māori with multiple ethnic identifications there was only a 10-40% correlation with the Census response in data sources other than births. To work with this variability in ethnic identification, Stats NZ reports individual ethnicity profiles based on information from multiple IDI datasets (Teng et al. 2017). These individual ethnicity profiles contain the ethnicity information from the agency whose ethnicity data is most comparable with the Census ethnicity data. This ‘source ranked’ ethnicity method uses only one source and ranks the Census as the highest priority source, then birth data, then health data.
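The ‘source ranked’ approach described above can be sketched as follows. This is an illustration under stated assumptions rather than Stats NZ’s implementation: it uses only the three sources named in the text, in the order given, and the data structure, source labels, and function name are hypothetical.

```python
# Priority order as described in the text: Census first, then births, then health.
SOURCE_PRIORITY = ["census", "births", "health"]


def source_ranked_ethnicity(records, priority=SOURCE_PRIORITY):
    """Return (source, ethnicities) from the highest-priority source holding data.

    `records` maps a source label to the set of ethnicities recorded there.
    Returns None when none of the ranked sources has any ethnicity recorded.
    """
    for source in priority:
        ethnicities = records.get(source)
        if ethnicities:
            return source, sorted(ethnicities)
    return None


# Illustrative person with no Census response and differing administrative records.
person = {"health": {"European"}, "births": {"Māori", "European"}}
result = source_ranked_ethnicity(person)
```

In this example the birth record is used because no Census response exists, so the person is profiled as both European and Māori, whereas relying on the health record alone would have recorded European only. The sketch shows how much the resulting profile depends on which sources happen to hold data for a given person.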

However, Census 2018 had poor coverage of the Māori population, so the IDI ethnicity profiles for Māori rely on non-Census sources more than for non-Māori. Māori descent information is available in the IDI with almost all of this information coming from the Census. The poor Census coverage in 2018 required the reuse of some Census 2013 descent data to fill in the response gaps, supplemented by imputation and some birth data, as descent has been able to be recorded on birth records since 1 September 1995. As a result, Māori descent data in the IDI are not drawn from a consistent source, collection method, or timeframe. Furthermore, it can be challenging to use the Māori descent variable for research where individuals were not alive at the time of, or did not respond to, the 2013 or 2018 Censuses.

Deficit framed data also functions to problematise IDI use for Māori. Deficit framing is where research invisibilises the historical and institutional drivers of inequities for marginalised groups, therefore placing blame for inequitable outcomes on marginalised individuals and collectives (Reid and Robson 2007; Reid 2011). Deficit framing is still present in some IDI research,5 which may prevent hapori and kairangahau Māori from recognising the utility of the IDI. Avoiding deficit framing is more challenging using the IDI as data are collected through interactions with the state when whānau are particularly vulnerable (e.g. corrections, welfare, hospitalisations). Data collected in these environments are at higher risk of being interpreted in a deficit-focussed manner as they are separated from important contextualising information (West et al. 2020). Similarly, the IDI has been used in predictive risk modelling, which is often problematic for Māori, further builds mistrust, and can lead to a disproportionate focus on interventions on whānau (Blank et al. 2015). Research with a deficit focus, or that features predictive risk modelling, potentially alienates some Māori from the IDI (and quantitative research) due to the outcomes of such work.

While we have discussed Ngā Tikanga Paihere above, there exists a dearth of accountability mechanisms to hapori Māori and Māori data sovereignty. Many Māori have lamented the lack of community engagement and consideration of research impacts by researchers from university and government organisations (Rauika Māngai 2020; Kukutai et al. 2021): IDI research is no different. Across the history of research in Aotearoa, many researchers engaged with quantitative and data projects have marginalised mātauranga Māori and Kaupapa Māori research. There is increasing engagement with Māori through MBIE’s Vision Mātauranga policy, which serves to encourage researchers to engage with hapori Māori. There are also considerable Māori-driven and -led initiatives to increase research sector engagement with Māori (Rauika Māngai 2020; Kukutai et al. 2021). Therefore, while steps have been taken, improvement is needed to ensure accountability to Māori, given Māori rights and interests in data, and to provide a solutions-focus rather than continuing to simply document inequities.

There is a complex body of issues around consent and the IDI, which are even more complicated for Māori. While permitted under current regulations, data are not collected with full, informed consent by individuals. Instead, this ability to continue to collect and link data without explicit consent is said to rely on ‘social licence’. Social licence has been defined as ‘the permission it [Stats NZ] has to make decisions about management and use of the public’s data without sanction’ (Nielsen 2018, p. 3). Overall, social licence includes the trust the public has in Stats NZ to govern the nation’s data and use it appropriately to benefit society. Social licence is important as international experiences have demonstrated that without it, big datasets can be disbanded due to severe public scrutiny (Carter et al. 2015). Despite having what looks like a comprehensive approach to data security and usage, the power to define how this operates sits with the state (i.e. Stats NZ), which minimises the autonomy and control that Māori can exercise regarding Māori data. While research has shown that the public tend to not know enough about Stats NZ to make an assessment on their trustworthiness (Nielsen 2018), there is no research that specifically evaluates Māori perspectives of Stats NZ (Kukutai and Cormack 2019). Additionally, social licence is not a static concept as perceptions of Stats NZ can change over time. Therefore, it is important for the system to have ongoing engagement with communities, and to be aware of any change in attitudes.

Social licence is likely more complex for Māori communities, within Aotearoa’s context of ongoing colonisation and contemporary disparities. For example, use of the IDI has potential concerns for Māori when data may contribute to deficit-focussed narratives about Māori, and if the context in which data is collected is not clearly understood (Gulliver et al. 2018). TMR proposes two additional dimensions to social licence for Māori: cultural licence – considering the ramifications of big data usage on the social contract between Māori and the Crown under Te Tiriti; and Māori data sovereignty – the right of Māori to govern and control their own data (Hudson 2016). In summary, it is clear from existing work on social licence that there are broader questions that need to be asked about the IDI by the Crown, and answered by hapori Māori.

While consent is considered a crucial component in the ethics of collecting and using data, concerns remain surrounding the lack of clear, established consent processes for secondary data use (Sporle et al. 2020). The procedural tendencies of informed consent and secondary data usage do not allow for ongoing dialogue, renegotiation, or reciprocal information exchange (Miller and Bell 2002) as it is unclear what may happen with data once it is collected. There has been pressure on researchers and organisations to include additional datasets in the IDI, including data not collected as official statistics, for example independent survey or NGO/independent provider data (McLeod 2010; Radio NZ 2018). Researchers can apply to integrate their datasets into the IDI after an ethics and privacy assessment, although there is currently a lengthy wait (∼2 years; Stats NZ 2021). Such integration may lead to better data for policy, but data being integrated into the IDI poses ethical concerns, given that the original ethics and consent processes from when data were collected do not cover the IDI, and some data were collected before the establishment of the IDI. Service users have expressed concern about their data being used by the state, and providers have expressed concern that they may need to provide such data to continue being funded (Radio NZ 2017). There are further concerns around the integration of youth data as young people have different abilities to consent. A far greater proportion of the Māori population, compared to Pākehā, are under 18 (Ministry of Health 2018). Even when people do consent to their data being added to the IDI, that consent is only partly informed: because both the state and the IDI are ever-evolving, it is not clear what future use looks like.
Such practices do not align with Māori approaches to knowledge ownership and exchange, and position Māori data within colonial frameworks that do not understand that knowledge can be collectively owned within contexts of mana whenua, Iwi, hapū, and whānau; in opposition to individualised consent approaches (Sporle and Koea 2004). These are important considerations, as consent for individual data usage can have different implications when transferring individual consent to apply to large datasets.

The IDI, as a national-level collection of whole population data, in certain lights can be thought of as a strategic resource. Concerns have arisen when requests have been made by overseas-based researchers to access the IDI through establishing remote data labs. A key part of the Five Safes framework is the secure access to the IDI through data labs, although key individuals have remote access within Aotearoa (Stats NZ 2021). The new Data and Statistics Act includes provisions for access to data from overseas. In making any international access decision, the Government Statistician must account for the laws in the overseas jurisdiction, mechanisms to ensure compliance, and any existing relationship to Stats NZ (New Zealand Parliament 2022). This remains an area of concern for Māori, as once data leaves our jurisdiction, there is less ability to enforce the rules and norms pertaining to it. Furthermore, overseas researchers – who may have never been to Aotearoa – will likely be unable to appropriately attend to the nuance and contexts of Māori data. From an MDS perspective, the inclusion of overseas researchers in NZ-based projects is an unresolved issue – likely involving consideration of multiple contextual factors including the researchers’ relationships to Aotearoa, who is leading the project, and what data and research questions are involved. While the priority should be Māori access to Māori data, realistically, overseas researchers are accessing the IDI. We argue that, at a bare minimum for every project (regardless of where researchers are based), data should only be housed in Aotearoa, existing safeguards such as Ngā Tikanga Paihere should be emphasised, and there should be a focus on and concrete investment in Māori capacity and capability building.

Lastly, the learning curve for research in the IDI is often lamented by researchers seeking to use the IDI for the first time. This paper has discussed both the technical skills and administrative barriers to accessing the IDI. While many of these are important to keep data safe, they also serve to reinforce limited access: where the same people and groups are the only ones that have the institutional and technical knowledge to access and use the data. These processes make it harder for hapori and kairangahau Māori to access data and use it to answer questions that are of use to them as the requisite technical skills are much scarcer. The access processes presume a researcher-led model of research and preclude a collective-led approach which may involve, but is not led by, technical experts.

In summary, there are a range of issues with the IDI, many of which affect Māori differently to other groups. While we have been unable to cover every possible issue, we hope the reader has gained an appreciation of some of the issues that need working through. We now move to a discussion of solutions.

How do we make a better IDI?

As highlighted, barriers remain for hapori and kairangahau Māori to access and benefit from the IDI. This paper is part of a project which draws on 2013 Census data to explore Māori identity across descent, ethnicity, and knowledge of Iwi variables, and then subsequently link identity to health and social service access, to identify needs (see Greaves et al. 2022). A key reason for this paper was the lack of exemplars for our project. Together, with our Māori advisory rōpū, we explored these issues over the project. Many reflected on their experiences with the IDI, with IDI researchers, and the experiences of Māori colleagues, including postgraduate students. It became clear through the research that if Māori do not engage with the IDI, it will be used about us, without us. Given the potential of the data in the IDI to be utilised by Māori to realise community aspirations, we now provide forward-facing suggestions as to how the IDI can work better for Māori.

Build the pipeline, the networks, the capacity and capability

First, all these suggestions rest on Māori capacity and capability. Projects focusing on Māori data need to be led by Māori. There is a small pool of talented Māori quantitative researchers, many of whom combine technical skills with social and cultural knowledge, and even clinical skills. These researchers are spread across many projects – funded and unfunded – and spend a large amount of time on advisory groups. There is clearly a need to grow the number of Māori data researchers. While it is beyond the scope of this paper, several well-documented issues exist for Māori researchers, which some have called the ‘pakaru pipeline’ (Naepi et al. 2020). Issues such as ‘the cultural double shift’ – the need to not only master the skills associated with a general research job, but also culturally-specific skills and tasks – create burnout and exit (Haar and Martin 2021). Others have documented the greater barriers Māori face to attend university, and the discrimination and stereotypes that act as barriers to quantitative subjects, with little access to role models (Theodore et al. 2017). In addition, very few university programmes currently engage with Māori data sovereignty. Universities need to make sure they are actively engaging with safe and appropriate teaching of Indigenous Data Sovereignty throughout their courses and programmes. While structural issues may be addressed over time in policy – including hopefully through the research sector review (MBIE 2021) – there are also practical measures specific to IDI research that could help.

There needs to be a network of Māori IDI users, and more broadly of Māori quantitative researchers. Such networks allow for whakawhanaungatanga, can be protective, and provide pathways and mentoring for junior researchers. Supportive mentoring networks for Māori have been recognised across a range of areas (Martin and Haar 2021). However, such networks cause administrative burden to already-busy researchers. We recommend the establishment of a Māori data researcher network, with noncontingent financial and resource support by organisations (Crown, university, and private sector-based). There also need to be clear and funded pathways for Māori into quantitative research and statistics. These need to start in undergraduate study and include funding such as summer scholarships and research assistant work on a stable (non-casual) level, in a supportive environment and cohort where the student is not asked to give ‘the Māori perspective’ and can instead focus on developing skills. A network and an early career programme would enable Māori to grow the Māori workforce and create culturally safe spaces for the development of projects and initiatives to improve New Zealand’s data infrastructure.

Work with hapori Māori to increase utilisation

There are many ways to reduce the aforementioned IDI learning curve to increase access for hapori. The IDI was set up as a research database and presumes a series of advanced research skills, awareness of what the IDI contains, and knowledge of the administrative systems required to access it. To maintain the credibility and mandate of the IDI, it needs to be useful to a wider audience, including those who are the providers of data. Currently, Māori data flows into the IDI but the access process has not been designed to ensure Māori can access their data or the information it contains. There are numerous barriers for hapori Māori to use such data for their own goals, and insufficient structures to facilitate access to the data resources for hapori-led research. Beyond increasing capacity, capability, and co-governance, structures also need to enable hapori participation. Institutions and universities need to provide funding so that hapori can contribute to research proposals from the outset. There is a power imbalance when a university researcher (on a reasonable salary) approaches a community organisation to develop a proposal that may not be funded. Institutions need to make such development funding available and accessible without too many administrative barriers. Additionally, groups such as the Virtual Health Information Network and projects such as Te Rourou Tātaritanga6 seek to demystify the IDI with courses and workshops, online resources, exemplars, and guides. A further suggestion is for the (funded) implementation of data navigators who work with hapori Māori to help them navigate IDI use and access for their needs. We are hopeful for continuing work in this space over time, including hapori Māori-specific resources.

There is also a need to reduce technical skill and location access barriers. One solution is to invest more in kairangahau Māori, but a broader solution would be to change the IDI. We think that there is a need for iNZight and other point-and-click tools to be made readily available in the data lab. Technical solutions should be possible to allow wider (but safe) use of the IDI, especially for very simple uses (counts, proportions), which comprise most use of the IDI. Geographical access barriers also reinforce the impression that the IDI is for academics and policy researchers, given that data labs are located only in the main centres and where there are universities. Investment needs to be made in providing data labs in varied, accessible locations for communities.

There is also a fundamental question to be asked: what data do hapori need from the IDI? Engagement with hapori and kairangahau Māori needs to happen early and throughout the work. The framing of both the research questions and the results is important. Māori perspectives are needed to co-design any models and to decide which variables to include. It may be a case of adding specific variables to data collection, but also of allowing control of certain variables, and the ability to shape collection and storage. For example, Te Mana Raraunga held an online wānanga on Iwi identifiers, and found that Iwi identifiers are useful for hapori, but that Iwi need control over access to these variables; however, many issues remain around the quality of these identifiers and consistency with Iwi-held datasets (Te Mana Raraunga 2021). Working with hapori Māori to improve data collection, including not only the methods used (Kukutai and Cormack 2019), but also the quality of specific questions asked and their cross-cultural applicability (Greaves et al. 2022), is work that needs to be undertaken and funded by the Crown. Similarly, more work is needed at the data-collection level with clinicians and administrators, who need to know where the data goes, how it is used, and how it changes systems. Data quality is particularly important, given the use of some markers for resource allocation, for monitoring equity, and in the calculation of Māori (and general) electorates (Kukutai 2004).

Increase accountability, include Māori data governance

Researchers applying for IDI access need to satisfy certain requirements, but there is no clear public accountability to Māori. Processes and procedures need to be recreated to include transparent provisions for Māori data governance. Current procedures around Māori engagement emphasise the front and middle of the project, whereas governance is needed throughout. Māori are often engaged around the time of application for funding, ethics, or IDI access, but there is little-to-no accountability at the time the final paper or report is written, or once the work is published. A common solution has been to include a Māori individual, potentially through a postgraduate scholarship, casual or fixed-term research contract, and/or as a co-author. Such engagement is often counterproductive for capacity and capability building, especially given time limitations, limited resourcing, and power imbalances. Indeed, we have observed papers where the work is clearly problematic, but have not wanted to draw attention to them given our limited capacity, and the risk of incidentally giving the paper attention, downloads, and citations.

There is a need to ensure that Māori review the final product of a project, and that accountability is provided by giving Māori the ‘final word’ on the interpretation and content of Māori data (Came et al. 2020). Such processes need proper resourcing to avoid problematic power dynamics around junior researchers, to prevent the other ways in which researchers game systems, and to ensure the review is active over time (Daalder 2022). Māori need mechanisms for challenging research decisions that are not costly for them personally, in time or finances (i.e. the courts are not an appropriate mechanism). There is some opportunity for process change with the incoming Data and Statistics Act, and while these ideas may not be incorporated into the law, the new law attempts to give effect to Te Tiriti, and includes capacity and capability building for both Stats NZ and Māori researchers and communities. This provides a great opportunity operationally for Stats NZ to provide greater transparency for Māori. There is also the potential to make these processes external to research groups. Alternatives such as a Māori Chief Data Steward or an expert advisory panel within co-governance structures have been suggested in the past (Kukutai et al. 2021). There need to be external structures that consist of more than one Māori individual, and that can review outputs and give constructive advice. The design of such processes is under way in partnership between Stats NZ and the Data Iwi Leaders Group (data.govt.nz 2021).

In the case of the IDI, the data are already collected and integrated, and one way to uphold Māori rights and interests in data is through governance processes. Such structures need to include not only Māori individuals, but Māori with the mandate to represent different collectives. These processes need to be transparent; for instance, projects need to include public statements as to how they will uphold Māori data sovereignty and cultural and social responsibilities throughout the life of the project, and on who will be involved. Current databases are hard to navigate to identify work that is relevant to or focused on Māori, limiting our ability to scrutinise existing research. Ultimately, governance needs to equal power sharing and control. Advisory groups certainly assist, but the key word is ‘advisory’. Although ‘veto power’ has been controversial in discussions pertaining to the Māori Health Authority (Tipene-Allen 2021), we would encourage governance structures to include mechanisms for Māori veto over projects. Even if a project is problematic from a Māori perspective, Māori on a given committee could be out-voted, so extra protection is needed. In short, Māori data governance needs to be transparent, public, include veto mechanisms, and have Māori involved who are accountable to communities.

Adequately fund alternatives

Over time the costs associated with research are increasing and funding is not keeping pace, despite pressure on researchers to publish more (MBIE 2021). Response rates to surveys and requests for research participation are decreasing over time (Greaves et al. 2020). Respondent burden in some contexts is high, for example in school-based research, or in seeking the participation of Māori. Limited options for research by both government and university researchers lead to movement towards administrative data. Despite the barriers discussed, access to the IDI is affordable ($500 + GST) and it contains numerous datasets. However, work on administrative data can never truly be the voice of someone given the contexts under which data are collected, and other methods are still needed to supplement the work. The IDI cannot replace other research, and there is a need to make sure that researchers are able to work together to collect data in the face of increasing costs. Solutions also cannot be found in IDI data alone, as they need a different type of research, rather than simply reporting on administrative data. Short of funding increases, there need to be mechanisms and systems for researchers to be able to pool resources and work together – for example to complete a study in schools, or a Māori survey study – to access participants and different types of data, or we risk more researchers moving towards the IDI due to the absence of accessible data.

Start again

Finally, we would like to acknowledge ideas around abolishing the IDI. As mentioned earlier, the IDI was not designed with an eye to Te Tiriti. The data in the IDI are inherently deficit-focussed, given that they were collected through individual interactions with the Crown, often under trying circumstances. For us, questions remain as to whether Māori would ever be able to be culturally safe under the current system. A potential solution is to scrap the IDI and co-design a system with kairangahau and hapori Māori, alongside other end-users. A new system could go through the democratic process, gaining the public’s consent, and could better give effect to Te Tiriti and Māori community needs. Such a system could prioritise collecting the data that matters to Māori, and focus on aspirations, not problems. While this is an ambitious suggestion, we would challenge readers to think of how world-leading and innovative such a co-design project would be, and how it would flow through to policy and to outcomes for all of Aotearoa.

Conclusion

As Māori, if we do not engage with and make use of the IDI, it will be used about us, without us. However, this group seeks to instead envision an infrastructure that is led by Māori, collects data that upholds the aspirations of Māori, is maintained by and accountable to Māori, and is utilised to create change that benefits hapori Māori. In the meantime, we can continue to challenge and improve the current systems, with a vision to Indigenise future data infrastructure systems, and to decrease the barriers that kairangahau and hapori Māori face in accessing data for use by us, with us, for us.

Acknowledgements

The authors would like to thank the project advisory rōpū for their guidance, suggestions, and advice. We would like to thank Frank Gore for his proof-reading and editing work. We would like to acknowledge the Public Policy Institute (PPI) and Te Rourou Tātaritanga for hosting our Māori Data Sovereignty reading group.

Disclosure statement

No potential conflict of interest was reported by the author(s).

This work was funded by a Health Research Council of New Zealand Emerging Researcher First Grant awarded to Lara Greaves.