Wednesday, December 8, 2021

Google Has The Ability To 10x Education, Here’s How

Charlie Sweeting

This article was written after interning at Google in 2019, but before returning there in 2022. It represents my personal views and not those of Alphabet or its associated companies. Although I still believe in the fundamentals of this article, I have since updated my views and believe there are reasons of a structural, legislative and organisational nature that potentially place this kind of vision more firmly in the realm of startups.

TLDR: Efficient learning is a recommendation problem. Putting the right combination of materials in front of an individual at the right time is key to unlocking understanding. Learning data is so highly dimensional and contextual, with such a long time horizon, that it presents an incredibly large cold start problem. Google, through Search and YouTube, is potentially the only company that can overcome that cold start. That would allow Google to extend search and build contextual recommender systems for learning that can 10x, and massively individualise, the process for students globally.

Learning Is A Recommendation Problem

Learning isn’t a continuous process. It occurs in leaps. We sit there, rephrasing how we approach a subject again and again, until the “Ah-ha” and “Oh” moments make it make sense. Asking the right questions, viewing the right learning materials or listening to the right explanation speeds that process up. The converse is also true. Not finding the right materials can slow learning, or completely bring it to a halt.

Teachers, tutors and peers help with that process. They use contextual information and a feedback loop with the student to identify the right explanation for a given topic. The more personalised that feedback loop, the quicker they find the explanation that resonates with the learner.

Driving Learning Efficiency Is Currently A Very Expensive Manual Process

The resources needed to personalise learning in our traditional model are huge. Personal tutors present the most personalised and effective learning recommendations to a student, but their high cost forces a trade-off. Classrooms sacrifice personalised approaches in favour of cost efficiency, presenting a setting where multiple students share a teaching resource.

High cost, manual matching processes are an ideal use case for recommender systems. They scale past, and provide better learning transfer than, what you can achieve with a manual network. However, recommender systems universally suffer from a cold start problem. They need a high volume of relevant and good quality data to train on in order to provide salient recommendations. That becomes progressively harder when data collection depends on the existence of the recommender system, initial data sits in silos which you can’t access and the required data is highly dimensional.
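To make the cold start concrete, here is a minimal sketch of a toy recommender. It is an illustration only, not how any production system works: the function name `recommend_resources`, the `min_history` threshold and the sample data are all hypothetical. With too little history for a learner, the only option is a generic popularity ranking; with enough history, suggestions can be drawn from learners who found the same resources helpful.

```python
from collections import Counter, defaultdict


def recommend_resources(user, interactions, min_history=2, k=2):
    """Toy recommender over (user, resource) pairs marking a helpful resource.

    All names and thresholds here are hypothetical, chosen for illustration.
    """
    by_user = defaultdict(set)
    for u, r in interactions:
        by_user[u].add(r)

    seen = by_user.get(user, set())

    if len(seen) < min_history:
        # Cold start: nothing is known about this learner, so fall back
        # to whatever is most popular overall.
        counts = Counter(r for _, r in interactions if r not in seen)
        return "popular", [r for r, _ in counts.most_common(k)]

    # Enough history: score unseen resources by how much the learners
    # who used them overlap with this learner.
    scores = Counter()
    for other, resources in by_user.items():
        if other == user:
            continue
        overlap = len(seen & resources)
        for r in resources - seen:
            scores[r] += overlap
    return "personalised", [r for r, s in scores.most_common(k) if s > 0]


interactions = [
    ("ana", "video_a"), ("ana", "notes_b"),
    ("ben", "video_a"), ("ben", "notes_b"), ("ben", "quiz_c"),
    ("cy", "quiz_c"), ("cy", "video_a"),
]
print(recommend_resources("ana", interactions))       # learner with history
print(recommend_resources("dee", interactions, k=1))  # brand-new learner
```

The new learner gets the same answer as every other new learner, which is exactly the classroom trade-off described above. Personalisation only appears once enough interaction data exists.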

Learning Is A Very Hard Recommendation Problem To Solve

The cold start for recommender systems in education is particularly severe. You need access to a broad range of educational resources, and to student interactions with those resources, to paint a full picture of what helped a student progress.

A perfect recommender dataset has a huge number of mutually exclusive samples with a clear outcome dependent on a small range of hyper-relevant features. An education dataset doesn’t match that ideal. It’s generally interconnected (not mutually exclusive), highly contextual, fragmented across services and has a time horizon that can span years.

Although the rewards are considerable (truly revolutionising learning), no company has yet been able to build a sufficient knowledge graph for education: a system that identifies what concept you’re trying to grasp and, by exploring the map of how those concepts relate to each other, individually selects the best learning resource to push that understanding.
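As a rough sketch of what such a knowledge graph would do, the toy example below walks a prerequisite map to find the concept a learner should tackle next, then picks the resource with the best past success rate. Everything in it is an assumption made for illustration: the concepts, the resource names, the success rates and the function names `next_concept` and `recommend_next` are invented, and a real system would have to learn these relationships from data rather than hard-code them.

```python
# Hypothetical prerequisite map: concept -> concepts to understand first.
PREREQS = {
    "derivatives": ["limits", "functions"],
    "limits": ["functions"],
    "functions": [],
}

# Hypothetical resources per concept, with an invented success rate
# (the share of past learners who progressed after using them).
RESOURCES = {
    "functions": {"intro_video": 0.8, "textbook_ch1": 0.6},
    "limits": {"worked_examples": 0.7, "lecture_notes": 0.5},
    "derivatives": {"interactive_demo": 0.9},
}


def next_concept(target, mastered, prereqs):
    """Follow unmet prerequisites depth-first until reaching a concept
    whose own prerequisites are all mastered."""
    for p in prereqs.get(target, []):
        if p not in mastered:
            return next_concept(p, mastered, prereqs)
    return target


def recommend_next(target, mastered):
    """Return (concept to study next, best resource for that concept)."""
    concept = next_concept(target, mastered, PREREQS)
    options = RESOURCES[concept]
    best = max(options, key=options.get)
    return concept, best


print(recommend_next("derivatives", set()))
print(recommend_next("derivatives", {"functions"}))
```

The hard part is not the traversal, which is a few lines, but filling in the map and the success rates for every concept and every kind of learner.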

Enter Google

To build that knowledge graph successfully, you’d need three core resources.

[1] Long-term search data (like the majority of search volume dating back to 2004).

[2] An indexed web with semantic models which understand the relationships between content.

[3] Linked contextual information about searchers (such as who they are, where they live, where they go to school etc.).

Google is potentially the only company that has all three. It handles over 90% of search volume and has the world’s most successful web crawler, the most comprehensive knowledge graph for search, the world’s repository for video information (YouTube) and enough services tied to one central identity platform to provide a comprehensive view of who you are.

Google’s comprehensive tracking of your whole digital identity (Search, YouTube, Gmail) and where you go offline (Google Maps) gives them the contextual information to understand your whole learning journey since 2004.

In aggregate, Google have the data, knowledge and capital to segment and map human progression. Although it would be an incredible feat, and a technical challenge equivalent to search, Google have the capability to build a recommender system that could truly 10x education.