Scaling Learning Analytics: The Practical Application Of Synthetic Data

Author(s): Alan Berg, Gabor Kismihok, Niall Sclater
Subject(s): Social Sciences, Education, Higher Education
Published by: European Distance and E-Learning Network
Keywords: Distance and e-learning methodology; Learning analytics; New ICT and media applications in learning; Online learning environments and platforms; Quality issues; Stakeholder involvement; Standards and

Summary/Abstract: This case study is based on experience gained from running a two-day data hackathon on large-scale Learning Analytics infrastructure at the LAK16 conference. The main conclusion is that there will be significant demand for realistic synthetic data to support the development of large-scale infrastructures. Synthetic data overcomes ethical barriers to sharing large data sets between different (parts of) organizations. Properly simulated synthetic data can be leveraged to fine-tune algorithms deployed within the field of Learning Analytics. This data-driven approach lowers the risk of accidental disclosure and bypasses limitations rightfully imposed by legal and/or ethical constraints associated with real student data. Applying synthetic data to performance testing allows universities to develop highly scalable infrastructure in parallel with developing central data governance practices. This short paper explores the conformance testing of Learning Record Stores (LRS – secure locations to store and query student digital traces), discusses the implications for universities of a specific set of xAPI recipes (Berg, Scheffel, Drachsler, Ternier, & Specht, 2016), and generalizes practices for accelerating large-scale deployments of LA infrastructure. The authors argue that by applying a standardized set of synthetic data based on a peer-reviewed synthetic data generator, universities will find it easier to develop reliable recipes for digital learner traces. Consistent data storage across university boundaries will subsequently enable the benchmarking of algorithms that consume student digital traces and support the generation of predictive validity evidence across those boundaries. Thus, universities can compare the value of their algorithms with those of other universities and apply algorithms consistently when students transfer.

Details

Journal: European Distance and E-Learning Network (EDEN) Conference Proceedings

Issue Year: 2016
Issue No: 2
Page Range: 264-269
Page Count: 6
Language: English
