Beating AI in the Classroom
Last Monday, the faculty of Princeton University, where I teach, voted to reinstitute the proctoring of exams after more than 130 years of reliance on an honor code to prevent cheating. The move was inevitable. Between handheld electronic devices, artificial intelligence, and ever greater (if largely self-imposed) pressure on undergraduates to get high grades, the honor code simply could not function effectively anymore. Both the temptation to cheat, and the ease of cheating, are now just too great. The cheaters are learning little or nothing, while the students who refrain from cheating find themselves at a grossly unfair disadvantage.
Princeton’s decision, however, is best seen only as a good start. The educational landscape has shifted so massively in the past few years that we need fundamentally to rethink how we evaluate college students. Unfortunately, the direction we should move in will involve far more costly pedagogical labor, and is diametrically opposed to the direction most universities are currently moving in.
I did not need to read this op-ed by Stanford senior Theo Baker to learn how deeply broken the current system has become. When I assign traditional term papers, it’s blindingly obvious that many students have had recourse to AI in one form or another. Some have done so in an acceptable way, using AI as a bibliographical tool to find useful sources. Some have gone further, using AI to suggest arguments. Some have let AI clean up their prose. Some have asked it to improve a draft. And some, of course, have simply given AI a prompt and let it write the paper. Proving AI use is difficult, and the technology itself is evolving so rapidly I find it hard to believe that detection systems will catch up any time soon.
Is there a way to outsmart AI and keep traditional term papers in the curriculum? A year ago, as I prepared to teach my fall upper-level seminar on the Enlightenment, I tried to devise essay prompts that were so specific to the reading, and so trickily phrased, that they would defeat AI. Each time I gave one of these prompts to ChatGPT, it returned a perfectly acceptable B+ paper. A colleague of mine has hit on a different solution. He requires students to compose their papers in Google Docs, and to use the features that allow him to track their writing process, keystroke by keystroke. Of course, there is no way to prevent the students from composing a paper on a different device and then copying it into the app. And as he himself admits, this degree of surveillance is nothing if not creepy.
In my fall seminar, I ended up eliminating the short paper assignment, and instead had students do three in-class writing exercises over the course of the semester, with pens, in blue books. The students knew perfectly well why I had made the change, and didn’t like the lack of trust it implied, but I felt I had no choice. In fact, the writing exercises improved the course, because they forced the students to review material they had read and discussed. The discussion of new, related material in subsequent weeks benefitted.
Meanwhile, some colleagues and I have started experimenting with a different form of evaluation: oral exams. In a couple of large lecture courses, at the end of the semester I have given students long lists of terms drawn from the reading and lectures. Each student then had a fifteen-minute appointment during which I or a teaching assistant would bring up terms at random, and ask him or her to identify them, and discuss their significance. The quality of the responses varied enormously and gave us a very good sense of who had actually learned the course material. One colleague has been doing oral exams with more involved, complicated questions, and then grading the students’ responses.
In the future, in-class and oral exercises of this sort may be the only way to evaluate students thoroughly and fairly, at least in humanities and social science courses. While it would be a big mistake to eliminate term papers and take-home exams entirely, it makes sense to reduce their importance, and to place greater weight on exercises in which students cannot use AI (unless, conceivably, they get responses on earbuds concealed under their hair…).
Will the shift come at a price? Of course. Writing skills will deteriorate if students do fewer extended writing exercises (or none, if they just use AI). But students will still need some level of expository skill in order to write the blue book assignments. And there might also be beneficial trade-offs. A shift towards evaluations that place a heavy weight on oral fluency is not such a bad thing. American schools and universities have always been notoriously poor at teaching oral skills—just compare the off-the-cuff oratory of the average American and the average British or French politician. In a world of podcasts and speech-to-text applications, making students better speakers will have serious benefits. For that matter, selective American universities might consider doing what Oxford and Cambridge have long done and admit students in part based on real interviews by faculty.
But the changes will also come at a different sort of price: a financial one. A move towards oral exams is a move towards individual exams, and properly administering individual exams in a large class takes an enormous amount of time. Evaluating them properly can only be done by instructors with genuine expertise and experience: multiple choice this isn’t. When I taught a large lecture course this past spring, I decided to forgo the oral exam, because there were simply too many students for me to examine myself, and I thought it unfair to ask my teaching assistants to devote what would have been more than a day each to the exercise without extra compensation.
In short, moving towards more individual instruction and evaluation means—or should mean—hiring more instructors. Which is exactly what most US universities have been moving away from. Facing both financial and political pressures, they have cut teaching positions, programs, and entire departments. One friend and colleague, a prominent, tenured scholar in our field, was recently laid off at age 59. Another was informed that she would have, simultaneously, a 14 percent pay cut and a 33 percent increase in her course load. New positions remain scarce, and many superb doctoral students, who would have easily secured tenure-track positions twenty years ago (when the job market was already poor), now fail to find them.
The last thing most universities want to do at present is to hire more instructors in the humanities and social sciences, even if doing so is what the ongoing radical shifts in the educational landscape caused by the rise of AI may demand. Instead, they are rushing to integrate AI into their operations in countless ways, funneling more and more of their precious funds to the same tech companies that are so deeply undercutting their principal mission.
You may already see, dear reader, where this whole absurd process could well end up. I can already imagine some bright, ambitious sub-dean somewhere proposing the idea. To defeat AI, why not introduce comprehensive oral examinations administered by… AI?
At that point, I quit.


As someone who has been working on the problem of transactional attitudes towards school since before ChatGPT appeared, I'm deeply sympathetic to these challenges. We know that students need to do the actual stuff or nothing is going to be learned.
But as you intuit, things like policing LLM use, oral exams, or entirely in-person writing all run up against downsides that, really, are unacceptable. In my view, the only route towards a solution is to move away from the framework of "schooling" and toward a root-level examination of the experiences of "learning." Because I am old and have written and taught writing for decades, I "know" that the struggle of learning is the point, but students at elite universities especially have been working inside a system that privileges achievement and optimization. The hard part is that we can't force the struggle on students through simply being more punitive. They have to opt in, but to do so they have to be helped to understand what the struggle entails, how it's meaningful, how to manage it, and perhaps most importantly, they have to be given experiences that they believe are worth doing - rather than outsourcing - and then we have to assess them in ways that genuinely value the struggle.
I've done something like 60 talks and workshops exploring this challenge over the last 2+ years at institutions all across the country and up and down the ladder of selectivity/prestige, and I can report that there is great progress to be made, but also that the challenge appears to be hardest at places like Princeton because of the way prestige and achievement drive the culture.
One of the best skills you develop in the humanities is the ability to write. Researching and writing longer papers was extremely valuable in my later career as a lawyer. Moreover, reading quality writing from others teaches you how to write better yourself.
I understand the need to stamp out cheating, but would hate to see all term papers eliminated.
University administrators need to place severe consequences on those who cheat. A few expulsions might send the right message.