Data-Driven Research – Accelerating and Scaling the Science of Learning at CAROL and the Stanford Lytics Lab

December 13, 2017
Stanford VPTL Lytics Lab

As Stanford helped spur a transformation in online learning over the past five years, research undertaken by university-affiliated scholars has been illuminating the fast-changing world of digital instruction and playing a significant role in the development of an entirely new field of educational research.

These efforts have been spearheaded by two research groups, the Stanford Lytics Lab and the Center for Advanced Research through Online Learning, or CAROL.

Scholars associated with the Stanford Lytics Lab, which originated in a gathering of five graduate students from multiple disciplines in September 2012, are advancing the science of learning through large-scale data analyses and the study and development of new educational technologies. To date, Lab-affiliated researchers have published more than 40 peer-reviewed articles in journals including Science, the Journal of Educational Psychology, and PNAS, and in the published proceedings of multiple major learning and technology conferences.

Much of their work would have been impossible without access to the very large data sets and the new data management and analysis tools developed by colleagues at the Center for Advanced Research through Online Learning. Founded contemporaneously with the Lytics Lab, CAROL acts as a central repository for the data generated by the various Stanford online learning platforms that are used by hundreds of thousands of global learners each year. CAROL curates and shares these data with researchers both at Stanford and across the world, and maintains computational tools for their analysis by experts in many fields.

The Stanford Lytics Lab – innovative research with notable results

From the outset, the Stanford Lytics Lab was conceived as an open, interdisciplinary research community. As a result, it has provided an environment within which researchers have been able to pioneer the new science of learning analytics (from which the word “lytics” is derived) and then to build and iterate upon their emerging insights with unusual flexibility and speed.

To date, Lytics Lab research has typically fallen within one of three broad areas of study:

  • Understanding online learners: the characteristics of online learners, their motivations, how they learn, and how to help them succeed.
  • Evaluating digital instruction: data analysis to improve assessment, course design, instructor feedback, and student interaction.
  • Building better learning tools: new approaches to key educational technologies such as learning platforms, peer feedback mechanisms, student assessment tools, and discussion fora.

Understanding online learners

Early Lytics Lab research identified four prototypical course trajectories in online learner behavior and established an enrollment intention metric that can be used to study learner behavior and improve course design. Other studies have revealed significant gender and geographical achievement gaps in MOOCs, shown how highly active “superposters” can contribute to a course’s success, and suggested how course design can encourage forum participation.

A number of analyses identified the positive effects of learning in culturally diverse groups. Researchers have also tested new approaches to peer grading, explored how detecting patterns in assignment completion can aid progress in learning, and created a dropout predictor that can flag 40% to 50% of dropouts while they are still active and another 40% to 45% within 14 days of inactivity.

Evaluating digital instruction

Lytics Lab research into digital instruction has suggested optimal approaches to data-enriched assessment and shown how instructor dashboards can help advance student learning. Additional studies have explored the student experience of blended engineering courses, identified common patterns in very large sets of assignment submissions, and found that prompting explanations of beliefs can help clear up student misconceptions.

Some instructional research results have been counterintuitive, while others reveal the complex heterogeneity of the online learning population. A study of online teams, for example, found that interventions to encourage positive perceptions and behavior can result in learners leaving their teams. And while two-thirds of learners prefer to watch videos showing their instructor's face, one-third prefer not to see it. This finding has led to further lab research into visual attention to the face and the potential for social robots to deliver instructional content.

Building better learning tools

Lab researchers have also studied, and in many cases helped develop, a broad array of new digital tools for learning. Lacuna Stories is a social reading platform that encourages annotation and connecting fragments of texts. PeerStudio improves student performance by enabling rapid peer feedback, and Talkabout lets large online classes create small, globally diverse video discussion groups, resulting in higher student achievement and new opportunities for cross-cultural connection.

The Codewebs search engine can index computer science students’ coding submissions, revealing common mistakes and thereby helping instructors give more effective feedback. In a similar vein, lab research has shown how neural networks can be used to assess student code submissions and then propagate useful comments on a scale that would otherwise be impossible. It has also produced a forum post classifier that guides learners towards relevant content that is otherwise difficult to access.

CAROL, a key information hub for learning about digital learning  

Lytics Lab research has been enabled by the Center for Advanced Research through Online Learning (CAROL), which collects, stores, and manages instructional data from multiple digital instruction platforms, curates these data for the purposes of research, and maintains analytic tools that make the data accessible to researchers from a wide variety of disciplines.

In addition to providing much of the data underlying Lytics Lab research, CAROL has carried out over one hundred data shares with researchers worldwide, including colleagues at 25 other universities, sustaining a truly international and multidisciplinary scientific community.

CAROL also articulates standards for ethical governance of research data derived from digital learning environments. Center members are active participants in ongoing national and global efforts to establish guidelines and best practices for the responsible use of instructional data.

An evolving – and expanding – research focus

Stanford researchers continue to expand upon their investigations of digital learning, both in their long-term fields of interest and in new areas of study. 

They are now building on early lab findings, for example, to develop a body of experimentally validated research that identifies targeted interventions most likely to result in improvements in distance learning, especially for students who are traditionally marginalized in conventional academic settings. Supported by Lab-developed tools and analyses that both track individual progress and place students within specific cultural learning contexts, this research is indicating how interventions known to work in the classroom environment can be applied successfully in the online space at scale.

Researchers are also seeking to identify more precisely the design elements that make for successful computer-based learning environments. These include psychological interventions for struggling students that might address issues such as self-regulation, inclusion, and how culture affects interventions. They also include efforts to better understand how students conceive of their own learning, allowing instructional designers to factor social, emotional, and epistemological considerations into their learning designs.

Other areas of current interest include: using neural networks for knowledge tracing; new tools for teaching theorem proving; teaching choreography and engineering through robotic and interactive simulations; narrative approaches to science instruction; and an infrastructure for collecting data across platforms and analyzing them with alternative processing models.