Three Metric-Based Method for Data Compatibility Calculation
Author(s): Daniel Vodňanský
Subject(s): ICT Information and Communications Technologies
Published by: Vysoká škola ekonomická v Praze
Keywords: Data metrics; Amount of information; Metadata; Relational database; XML; JSON; RDF; Ontology; Transformation; Structuredness; Hierarchicallity; Normalization; Visualization
Summary/Abstract: This article analyzes ways of calculating characteristics of data and of the most common data structure types so that they can be compared with one another or over time. To this end, it studies the key aspects of the relational database, XML, JSON and RDF structure types. These data structure types are compared with several isolated approaches to measuring data quality and other data characteristics. The goals of the article are the calculation method itself and a storage structure for the calculated values. The article presents a method for characterizing data and data structure types based on three metrics: the amount of structuredness, the amount of hierarchicallity and the amount of information. This triad of metrics allows data sets (objects) to be compared with each other, for example to evaluate the complexity of transforming data from one data object to another, and with the data structure types listed above. Based on the vector of the three metrics, a method for calculating the compatibility between data and a data structure type is proposed. This method can help select the most compatible data format for existing data. The calculated metric values can also reveal non-optimal storage design and classify data transformations. The method was evaluated in a case study, which showed its usability on a demonstration data set. It can be used in data modelling to help select the optimal data structure type, to design a data transformation process and to optimize existing data storage.
Journal: Acta Informatica Pragensia
- Issue Year: 10/2021
- Issue No: 1
- Page Range: 38-60
- Page Count: 23
- Language: English