Usuwanie artefaktów w wykrywaniu wzorców użytkowania stron WWW
Denoising as a method of discovering hidden web usage patterns
Author(s): Paweł Weichbroth, Łukasz Mikulski
Subject(s): Economy
Published by: Wydawnictwo Uniwersytetu Ekonomicznego we Wrocławiu
Keywords: web usage mining; data mining; knowledge discovery from databases
Summary/Abstract: The activity of web portal users is recorded in WWW server log files. To reveal and analyse web usage patterns, the raw log data must first be preprocessed. This article reports a two-stage study. In the first stage, all frequent sets were found for an arbitrarily assumed support threshold, and association rules were generated for an arbitrarily assumed confidence threshold. In the second stage, the resulting frequent sets and the association rules generated from them were analysed. Based on this analysis, subjectively chosen sets classified as noise are removed: these are sets that either fall outside the scope of the research data or dominate other elements. If necessary, the denoising steps can be repeated. Frequent sets are then selected from the log file processed in this way, and association rules are extracted from them. These rules reflect the relevant navigation paths which, when adequately aggregated, can be used to identify web usage patterns.
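The two-stage procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`frequent_sets`, `association_rules`, `denoise`), the representation of a session as a set of requested paths, and the level-wise search are all assumptions made for illustration only.

```python
from itertools import combinations


def frequent_sets(sessions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support.

    Hypothetical helper: a simple level-wise search over sessions,
    where each session is a set of requested pages.
    """
    n = len(sessions)
    items = sorted({page for s in sessions for page in s})
    result = {}
    current = [frozenset([i]) for i in items]
    while current:
        # Count in how many sessions each candidate itemset occurs.
        counts = {c: sum(1 for s in sessions if c <= s) for c in current}
        level = {c: k / n for c, k in counts.items() if k / n >= min_support}
        result.update(level)
        # Build candidates one item larger by joining frequent sets of this level.
        keys = list(level)
        current = list({a | b for a, b in combinations(keys, 2)
                        if len(a | b) == len(a) + 1})
    return result


def association_rules(freq, min_confidence):
    """Return (lhs, rhs, confidence) triples derived from the frequent sets."""
    rules = []
    for itemset, supp in freq.items():
        if len(itemset) < 2:
            continue
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                # Every subset of a frequent set is itself frequent, so freq[lhs] exists.
                conf = supp / freq[lhs]
                if conf >= min_confidence:
                    rules.append((lhs, itemset - lhs, conf))
    return rules


def denoise(sessions, noise):
    """Remove pages judged to be noise and drop sessions left empty."""
    cleaned = [s - noise for s in sessions]
    return [s for s in cleaned if s]


if __name__ == "__main__":
    # Made-up sessions, purely for illustration.
    sessions = [
        {"/index", "/products", "/cart"},
        {"/index", "/products"},
        {"/index", "/favicon.ico"},
        {"/index", "/products", "/cart"},
        {"/index", "/contact"},
    ]
    # Stage 1: mine the raw data with arbitrarily chosen thresholds.
    raw = frequent_sets(sessions, min_support=0.5)
    # Stage 2: after inspecting the results, remove subjectively chosen noise
    # (a dominating page and an out-of-scope resource), then mine again.
    cleaned = denoise(sessions, noise={"/index", "/favicon.ico"})
    freq = frequent_sets(cleaned, min_support=0.5)
    for lhs, rhs, conf in association_rules(freq, min_confidence=0.6):
        print(sorted(lhs), "->", sorted(rhs), round(conf, 2))
```

In this toy data the page `/index` occurs in every session and so dominates every frequent set; removing it, along with the out-of-scope `/favicon.ico`, leaves rules that relate only `/products` and `/cart`. Which sets count as noise remains a subjective choice, as the abstract notes, and the denoising step can be repeated if the new results still contain dominating elements.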
Journal: Informatyka Ekonomiczna
- Issue Year: 2011
- Issue No: 22
- Page Range: 254-266
- Page Count: 13
- Language: Polish