ИЗВЛИЧАНЕ НА СЪДЪРЖАТЕЛНА ИНФОРМАЦИЯ ОТ УЕБ ИЗТОЧНИЦИ НА ДАННИ ЗА НЕДВИЖИМИ ИМОТИ - ОПИТА НА НСИ КАТО ЧАСТ ОТ МРЕЖАТА ЗА ИЗУЧАВАНЕ НА УЕБ ПРОСТРАНСТВОТО (WEB INTELLIGENCE NETWORK - WIN)
EXTRACTING VALUABLE INFORMATION FROM WEB SOURCES OF REAL ESTATE DATA - NSI’S EXPERIENCE AS PART OF THE WEB INTELLIGENCE NETWORK (WIN)
Author(s): Galja Stateva, Kostadin Georgiev
Subject(s): Economy, National Economy, ICT Information and Communications Technologies
Published by: Национален статистически институт (National Statistical Institute)
Keywords: web data; web sources; real estate; web scraping; WIH; trusted smart statistics
Summary/Abstract: This article presents the challenges of applying Natural Language Processing (NLP) to web data from online real estate advertisements in order to extract valuable information that can serve as an additional source for official statistics. The study is one of the experimental use cases within the ESSnet WIN project, whose main goal is to explore the potential for creating new statistics and expanding existing ones through the Web Intelligence Hub (WIH).
Journal: Статистика (Statistics)
- Issue Year: 2023
- Issue No: 1
- Page Range: 42-56
- Page Count: 15
- Language: Bulgarian