Designing a Data Warehouse for Collected Data About User Activity in Social Networks Using Elasticsearch

Author(s): Iryna Mysiuk
Subject(s): Media studies, Communication studies, Social Informatics, ICT Information and Communications Technologies
Published by: Altezoro, s. r. o. & Dialog
Keywords: social networks; data warehouse; data analytics; big data processing; system design

Summary/Abstract: This paper designs a data warehouse for storing data collected from social networks. It describes creating indexes and selecting a configuration with an appropriate number of shards and replicas, along with the primary states of the cluster and the possibilities for scaling it. It also describes the features of working with the non-relational Elasticsearch database when handling data on user activity in social network posts. Facebook and Instagram were chosen for analysis. The paper describes the advantages and disadvantages of using such a data store compared to Apache Kafka. Existing data insertion application programming interfaces (APIs) and data visualisation tools integrated with Elasticsearch are analysed. The study describes the use of the Bulk API to insert many records into the database at once. The designed data warehouse uses Kibana, a data visualisation and analytics tool integrated with the selected database. The ability to insert and view logs using Elasticsearch, Logstash, and Kibana (the ELK stack) is also shown. Data ingestion was tested by logging into the database using Beats. The obtained results can help implement a system for analysing user activity in social network data with Elasticsearch as the central component.
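
The two Elasticsearch mechanisms named in the abstract, an index configured with a chosen number of shards and replicas and the Bulk API for inserting many records at once, can be illustrated with a minimal sketch. It only builds the request bodies and does not contact a cluster. The index name `social-posts`, the document fields, and the helper names `index_settings` and `bulk_body` are hypothetical and are not taken from the paper; the shard and replica counts are placeholders, not the configuration the author selected.

```python
import json


def index_settings(shards: int, replicas: int) -> dict:
    """Request body for creating an index (PUT /<index>) with explicit shard and replica counts."""
    return {
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        }
    }


def bulk_body(index: str, docs: list) -> str:
    """Build the newline-delimited JSON body for POST /_bulk.

    Each document contributes two lines: an action line naming the target
    index, then the document source. The body must end with a newline.
    """
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"


# Hypothetical sample records describing user activity on posts.
posts = [
    {"network": "facebook", "post_id": "p1", "likes": 12, "comments": 3},
    {"network": "instagram", "post_id": "p2", "likes": 40, "comments": 7},
]

settings = index_settings(shards=3, replicas=1)  # placeholder values
payload = bulk_body("social-posts", posts)
print(json.dumps(settings))
print(payload, end="")
```

The resulting `payload` would be sent with the `Content-Type: application/x-ndjson` header. Batching documents this way replaces one HTTP request per record with a single request for the whole batch.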

  • Issue Year: 9/2023
  • Issue No: 7
  • Page Range: 4001-4005
  • Page Count: 5
  • Language: English