Data dissemination in the kilometre grid: challenges connected to the protection of statistical confidentiality

Udostępnianie danych spisowych w przekroju siatki kilometrowej – wyzwania związane z ochroną tajemnicy statystycznej

Author(s): Tomasz Klimanek, Tomasz Józefowski, Andrzej Młodak, Amelia Wardzińska-Sharif
Subject(s): Maps / Cartography
Published by: Główny Urząd Statystyczny
Keywords: kilometre grid; statistical disclosure control; non-perturbative methods; census

Summary/Abstract: Providing basic data from the National Population and Housing Census in a kilometre grid is one of the most important ways of disseminating census results, and it also meets the applicable national and international requirements. Because the tiles of the kilometre grid are relatively small (squares with one-kilometre sides), the risk of identifying a specific person and disclosing sensitive information about him or her is significant, so data-protection procedures are necessary. The aim of the paper is to discuss the most important directions in statistical disclosure control, using the example of data collected during the National Population and Housing Census 2021, and to propose applicable methods and tools from this field. These are mainly non-perturbative approaches, i.e. ones that suppress sensitive information. The paper also highlights the most important issues and challenges that depend on the scope of information disclosed and relate to this type of data-protection procedure, as the number and type of variables determine the risk of identifying individuals and influence the selection of suitable protection tools. The article sets out proposals for methodological and technical solutions in the field. The analyses demonstrate that data protection poses a significant challenge in the studied case, especially if several mutually connected databases are to be protected. In such a situation, it is necessary to take into account the logical and mathematical connections between the data sets. The density or hierarchical character of the grid can be an additional risk factor.

  • Issue Year: 69/2024
  • Issue No: 07
  • Page Range: 43-59
  • Page Count: 17
  • Language: English, Polish