The Promising, Eerie Future of Big Data in Public Schools

Alexis Kedo
5 min read · Apr 5, 2021
Photo by Ivan Aleksic on Unsplash; Muzej Jugoslavije, Belgrade, Serbia

To say Data is Big in the NBA is an understatement. Nearly every team now has a front-office analytics department. Cameras stationed in arenas capture the movements of all 10 players and the ball 25 times per second. Last summer (prior to the now-famous Bubble Season) the NBA announced that players would have the option of wearing a biometric device that tracks sleep, respiratory health, and heart rate.

As a former educator (and, admittedly, one who finds the concept of a Bubble Season being deemed “necessary” a bit ridiculous), I can’t help but compare the NBA’s data practices to the means by which society holds itself accountable for, say, the academic and social progress of the average American fourth-grader.

Due to high-stakes testing dating back to the No Child Left Behind era of the early aughts, most American students are tested annually in (at a minimum) reading and math from third through eighth grade and at least once more in high school, mainly so that their state can demonstrate Adequate Yearly Progress and maintain eligibility for federal education dollars. To ensure students are as prepared as possible for these high-stakes tests, many districts and charter networks institute quarterly “benchmark” testing, where, for one-plus weeks every few months, regular classes are put on hiatus and students sit for mock standardized assessments meant to gauge their degree of preparation for the real thing. And, as anyone who grew up attending an American school knows well, teachers tend to fill in the gaps between these tests with smaller, daily assignments and weekly or monthly tests meant to track student progress.

It may be unfair to compare the data practices of a national sports league that generated $8.3 billion in revenue last year to those of the entire American public education system, which is estimated to cost federal, state, and local governments more than $720 billion annually. Yet is there a world where fourth-grade reading performance data is collected, aggregated, and analyzed just as intensely as the blood oxygen levels of an elite athlete?

The idea and application of “big data” and “data analytics” are still fairly new in the education realm. The vast majority of districts rely solely on traditional metrics such as the aforementioned assessment data, school attendance, and classroom-level grades. “Analytics” remains largely a process of collecting individual reports from multiple sources (e.g., teacher gradebooks, state assessment results, attendance records) and cobbling together some sort of description of past trends that will (hopefully) be useful for future decision-making.

However, a moderate shift is already taking place amongst many practitioners: from a focus on retroactive analysis of past data to “predictive and prescriptive” methods, i.e., building models that use past data to forecast future scenarios. The pandemic has spurred many districts to refine and ramp up their use of early-warning systems that use hand-picked indicators (for example, behavioral data, report card grades, even third-grade reading ability) to predict a student’s likelihood of dropping out of high school. Furthermore, implementing more robust analytics solutions is a priority for 63% of school administrators who responded to a survey by the Center for Digital Education. Many agencies, including the federal Department of Education, are creating new C-suites and appointing “Chief Data Officers.” Since at least 2014, the state of Florida has administered the Florida Assessment for Instruction in Reading (FAIR), whose stated objectives include an intention “to predict students’ literacy success.”
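To make the early-warning idea concrete, here is a minimal sketch in Python. It is not any district's actual system. The field names, the cutoffs, and the two sample students are all made up for illustration. It is also a simple rule-based flag counter rather than a fitted predictive model, which a real system would more likely use.

```python
from dataclasses import dataclass


@dataclass
class StudentRecord:
    """Hypothetical per-student indicators; field names are illustrative."""
    student_id: str
    behavior_incidents: int         # disciplinary referrals this school year
    gpa: float                      # report card average on a 0.0-4.0 scale
    met_grade3_reading: bool        # met the third-grade reading benchmark


# Illustrative cutoffs only (assumptions, not research-backed values).
MAX_INCIDENTS = 2
MIN_GPA = 2.0


def risk_flags(record: StudentRecord) -> list[str]:
    """Return the names of the indicators on which this student is flagged."""
    flags = []
    if record.behavior_incidents > MAX_INCIDENTS:
        flags.append("behavior")
    if record.gpa < MIN_GPA:
        flags.append("grades")
    if not record.met_grade3_reading:
        flags.append("third_grade_reading")
    return flags


def needs_follow_up(record: StudentRecord, min_flags: int = 2) -> bool:
    """Flag a student for outreach when enough indicators fire at once."""
    return len(risk_flags(record)) >= min_flags


# Made-up records, purely to show how the functions are called.
students = [
    StudentRecord("A", behavior_incidents=0, gpa=3.4, met_grade3_reading=True),
    StudentRecord("B", behavior_incidents=4, gpa=1.8, met_grade3_reading=False),
]

for s in students:
    print(s.student_id, risk_flags(s), needs_follow_up(s))
```

Counting flags keeps the logic legible to teachers and parents, at the cost of the accuracy a model calibrated on local historical data might offer.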

Amongst those who are taking the first steps towards making sophisticated use of large-scale education datasets are a subset of education reformers who have plunged into NBA-style efforts. One such plunger, recently at the forefront, was AltSchool, a Silicon Valley startup headed by self-described “serial entrepreneur” Max Ventilla and backed by almost $200 million in venture capital from investors including Mark Zuckerberg. Until 2019, AltSchool ran half a dozen schools in the Bay Area, where they pioneered what one higher-up, in a 2016 article by EdWeek, defined as “automated metadata production.” This manifested itself in, among other things, motion-sensing cameras installed in every AltSchool classroom. In the same EdWeek article, AltSchool explained that “For now, the footage is mostly used by teachers and administrators on an ad hoc basis,” for example when someone wants to review a particularly fruitful interaction with a student. However, AltSchool also confirmed plans to eventually apply “facial recognition software and complex algorithms […] to search for signs of student learning and engagement.”

These kinds of efforts spark concerns, long-present in other industries, regarding data collection integrity, privacy, and the ethics tied to conducting research on children, not to mention a kind of Big Brother-type surveillance culture that would most likely set off alarm bells for teachers and parents alike. For what it’s worth, AltSchool’s experiment did not last long. The startup closed its schools and re-branded itself as Altitude Learning in 2019. In May 2020, Ventilla left Altitude Learning for an executive role at Dropbox. Altitude Learning’s website opaquely states that their focus is now on “providing the adaptive and technical support required to deliver a truly learner-centered school experience.” They say little about any current forays into big data, though they do mention that their experience operating schools “allowed us to develop a clear perspective on the role technology should and shouldn’t play in the classroom.”

The public-ed sweet spot for data scientists must lie somewhere between two extremes: on one hand, a reliance on the traditional large-scale annual assessments imposed by most states, which have been shown to yield little to no meaningful academic gains for large swaths of the American student population; on the other, methods that capture every classroom cough and squirm, akin to those pioneered by AltSchool, which carry thorny ethical implications.

Indeed, there are many problems (from predicting which teacher candidates would have the largest impact on student achievement to using population data to determine the best locations for new school buildings) that can be solved through smarter uses of existing data. Stopping short of motion-sensing cameras, there’s still much meaningful work to do. Just this month, a city audit cited the Washington, D.C. public school system’s failure to take advantage of $35 million in federal funds meant for a longitudinal data system, saying that, in the midst of a pandemic, the District is now “put[ting] the most vulnerable children at even greater risk academically.”

If those in the education field continue to think about how they can harness the potential of data science for public education, they could realize the type of systems-level change that has already transformed the stock market, consumer technology — and yes, professional sports.
