17:00 - 18:30
Chair/s:
Anita Chasiotis (ZPID, Trier, Germany)
Decrypting Log Data: A Meta-Analysis on General Online Activity and Learning Outcome within Digital Learning Environments
Thu-02
Presentation time:  
Maria Klose, Diana Steger, Julian Fick, and Cordula Artelt
Leibniz Institute for Educational Trajectories

Background

Until recently, face-to-face teaching and on-site exams were considered the gold standard. Not least because of the COVID-19 pandemic, the need for online learning has increased drastically. However, little is known about how students' learning behavior is linked to learning outcomes. Common online learning environments, so-called Learning Management Systems (LMS), make it easier to obtain information about students' learning behavior by analyzing log data (e.g., McCuaig & Baldwin, 2012). Yet, log data are merely simplified proxies for complex learning behavior, which calls their usefulness for the evaluation of online classes into question. Consequently, there is an ongoing debate about whether log data can be used to predict learning outcomes (Morris et al., 2005), or whether online courses are too heterogeneous to draw a general conclusion (e.g., Conijn et al., 2017).

Objectives

We provide a systematic overview of studies that investigated the relationship between general online activity measured via log data and educational outcomes in LMS to assess the applicability of log data as a proxy of online learning behavior. Furthermore, since online classes vary broadly, we examine to what extent different course characteristics might affect this relationship.

Research questions

Overall, we summarize findings on correlations between general online activity (i.e., total time spent online or login frequency) and learning outcome (i.e., course grade or course score) within LMS. Additionally, we examined the impact of several moderators: First, online classes vary with respect to their format (online vs. blended learning). Courses that use blended learning formats include components of face-to-face learning, which could weaken the relationship. Second, courses vary with respect to the emphasis placed on discussion activity, since the existence of a discussion board might foster deeper learning while being online. Third, we distinguished three forms of participation requirement: participation as a part of grading, shares of mandatory participation, and no requirements. Courses that include requirements are expected to strengthen the relationship between general online activity and learning outcome, as they might increase the importance of online participation for the learner. Lastly, we compare the two operationalizations of general online activity: total duration online and login frequency.

Method

Literature Search and Coding Process. Potentially relevant studies were identified through searches in major scientific databases. We retrieved further studies by conducting a similar search in Google Scholar, screening reference lists, contacting authors, and issuing a call for unpublished papers. Subsequently, studies were included based on a priori inclusion criteria. After applying these criteria, 34 studies remained. All studies were coded by two independent raters.

Statistical Analyses. We used the Pearson product-moment correlation as the effect size measure. We pooled effect sizes using a random-effects model with a maximum likelihood estimator (Viechtbauer, 2010) and used a three-level meta-analysis to account for dependent effect sizes (Cheung, 2014). Correlations were weighted using sample size weights (Brannick et al., 2011). I² statistics were calculated to quantify heterogeneity in observed effect sizes (Higgins et al., 2003). Meta-regression analyses were used to examine moderating effects on the pooled effect size (Harrer et al., 2019).
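The general logic of pooling correlations under a random-effects model can be sketched as follows. Note that this is an illustrative simplification: it uses Fisher's z-transformation and the DerSimonian-Laird estimator of between-study variance, whereas the analysis described above used maximum likelihood estimation, sample-size weights (Brannick et al., 2011), and a three-level model for dependent effect sizes.

```python
import math

def pool_correlations(rs, ns):
    """Random-effects pooling of Pearson correlations via Fisher's z.

    Simplified sketch: two-level model with the DerSimonian-Laird
    tau^2 estimator, not the ML / three-level approach used in the
    meta-analysis itself.
    """
    # Fisher z-transform; sampling variance of z is approx. 1 / (n - 3)
    zs = [0.5 * math.log((1 + r) / (1 - r)) for r in rs]
    vs = [1.0 / (n - 3) for n in ns]

    # Fixed-effect (inverse-variance) weights and Q statistic
    w = [1.0 / v for v in vs]
    z_fe = sum(wi * zi for wi, zi in zip(w, zs)) / sum(w)
    q = sum(wi * (zi - z_fe) ** 2 for wi, zi in zip(w, zs))
    df = len(rs) - 1

    # DerSimonian-Laird estimate of between-study variance tau^2
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights incorporate tau^2
    w_re = [1.0 / (v + tau2) for v in vs]
    z_re = sum(wi * zi for wi, zi in zip(w_re, zs)) / sum(w_re)

    # I^2: proportion of observed variability due to heterogeneity
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0

    # Back-transform the pooled z to a correlation
    rho = math.tanh(z_re)
    return rho, tau2, i2
```

In practice, such analyses are run with the metafor package in R (Viechtbauer, 2010), whose multilevel models additionally partition heterogeneity into between-sample and within-sample components, as reported below.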

Results

Descriptive Statistics. The meta-analysis is based on 88 effect sizes covering 55 independent samples from 34 studies (N = 34,363) that were published between 1997 and 2019, mainly as peer-reviewed articles (79%). The online courses varied in format (27% online vs. 72% blended learning), emphasis on discussion (28% discussion instructed vs. 51% discussion board available vs. 20% none), and requirements (51% participation as part of grading vs. 14% shares of mandatory participation vs. 35% none). In 38% of the cases, general online activity was operationalized as time spent online and in 62% as login frequency.

Meta-Analytic Model. Overall, the three-level random-effects meta-analysis identified a pooled effect of ρ = .22, 95% CI [.06, .38], indicating that students who are more active online also tend to achieve better learning outcomes. The studies showed medium heterogeneity, with moderate differences between samples (I² = .53) and within samples (I² = .41). Furthermore, we conducted a meta-regression to examine the effects of learning format, emphasis on discussion, requirements, and operationalization of online activity on the pooled effect. None of the moderators was significant. The respective meta-regression analysis that we conducted to examine publication bias also indicated no significant difference between effect sizes extracted from published vs. unpublished sources (γ = -0.06, SE = 0.12, p = .62).

Conclusion

In summary, there is a positive, yet comparatively small, association between general online activity measured via log data and learning outcome within LMS. Moreover, the examined moderators cannot explain the ambiguous findings in the literature. Further studies should systematically examine other potential influencing factors, such as different instructions or incentives for participation. We will provide a thorough discussion of when the use of log data as an approximation of learning behavior is reasonable, and when it is not.

References

Brannick, M. T., Yang, L.-Q., & Cafri, G. (2011). Comparison of weights for meta-analysis of r and d under realistic conditions. Organizational Research Methods, 14(4), 587–607. https://doi.org/10.1177/1094428110368725

Cheung, M. W.-L. (2014). Modeling dependent effect sizes with three-level meta-analyses: A structural equation modeling approach. Psychological Methods, 19(2), 211–229. https://doi.org/10.1037/a0032968

Conijn, R., Snijders, C., Kleingeld, A., & Matzat, U. (2017). Predicting student performance from LMS data: A comparison of 17 blended courses using Moodle LMS. IEEE Transactions on Learning Technologies, 10(1), 17–29. https://doi.org/10.1109/TLT.2016.2616312

Harrer, M., Cuijpers, P., Furukawa, T. A., & Ebert, D. D. (2019). Doing meta-analysis in R: A hands-on guide. https://doi.org/10.5281/zenodo.2551803

Higgins, J. P. T., Thompson, S. G., Deeks, J. J., & Altman, D. G. (2003). Measuring inconsistency in meta-analyses. BMJ, 327(7414), 557–560. https://doi.org/10.1136/bmj.327.7414.557

McCuaig, J., & Baldwin, J. (2012). Identifying successful learners from interaction behaviour. Proceedings of the 5th International Conference on Educational Data Mining, 160–163.

Morris, L. V., Finnegan, C., & Wu, S.-S. (2005). Tracking student behavior, persistence, and achievement in online courses. The Internet and Higher Education, 8(3), 221–231. https://doi.org/10.1016/j.iheduc.2005.06.009

Viechtbauer, W. (2010). Conducting meta-analyses in R with the metafor package. Journal of Statistical Software, 36(3), 1–48. https://doi.org/10.18637/jss.v036.i03