Profilowanie, oczyszczanie i zapobieganie powstawaniu dirty data
Dirty data – profiling, cleansing and prevention

Author(s): Kamila Migdał-Najman, Krzysztof Najman
Subject(s): Economy
Published by: Wydawnictwo Uniwersytetu Ekonomicznego we Wrocławiu
Keywords: Big Data; dirty data; profiling data; data cleansing; defect prevention

Summary/Abstract: Almost unlimited sources now produce the large streams of information referred to as Big Data. These data raise hopes of a faster, cheaper, more precise and more versatile description of the world around us. At the same time, alongside data of proper quality (clear data), such data sets contain a significant share of false, outdated, noisy, often duplicated, incomplete or incorrect data (dirty data), as well as data of unknown quality or usefulness (dark data). A significant share of dirty data and dark data has a number of negative consequences for the analysis of Big Data sets. The aim of this article is to review and systematize the procedures for minimizing the negative effects of dirty data in the analysis of Big Data. The design of a data collection system includes the most important procedures for data profiling, data cleansing and the prevention of dirty data defects in the process of building and analyzing Big Data sets.

Journal: Prace Naukowe Uniwersytetu Ekonomicznego we Wrocławiu

Issue Year: 2018
Issue No: 508
Page Range: 146-156
Page Count: 11
Language: Polish
