Giving University Exams in the Age of Chatbots
by Ploum on 2026-01-19

What I like most about teaching "Open Source Strategies" at École Polytechnique de Louvain is how much I learn from my students, especially during the exam.

I dislike exams. I still have nightmares about them. That's why I try to subvert this stressful moment and turn it into a learning opportunity: I know that adrenaline dramatically increases memorization. I make sure to explain to each student what I expect and to be helpful.

Here are the rules:

1. You can use all the resources you want (including a laptop connected to the Internet).
2. There is no formal time limit (but if you stay too long, it is a symptom of a deeper problem).
3. Students may discuss among themselves, as long as it is on topic (in reality, they never do it spontaneously until I force two students with a similar problem to talk to each other).
4. You can prepare and bring your own exam question if you want (something fewer than 10% of the students do).
5. Come dressed for the exam you dream of taking!

This last rule is awesome. Over the years, I have had a lot of fun with traditional folkloric clothing from different countries, students in pajamas, a banana and, this year's champion, my Studentausorus Rex!
An inflatable Tyrannosaurus Rex taking my exam in 2026
My all-time favourite is still a fully clothed Minnie Mouse, who did an awesome exam with full face make-up, big ears, big shoes, and huge gloves. I still regret not taking a picture, but she was the very first student to take at face value what I had meant as a joke, and she started a tradition that has lasted over the years.

Giving Students the Choice of Chatbots

Rule N°1 implies having all the resources you want. But what about chatbots? I didn't want to test how ChatGPT answered my questions; I wanted to help my students better understand what Open Source means. Before the exam, I copy/pasted my questions into some LLMs and, yes, the results were interesting enough. So I came up with the following solution: I would let the students choose whether to use an LLM or not. This was an experiment.

The questionnaire contained the following:

# Use of Chatbots

Tell the professor if you usually use chatbots (ChatGPT/LLM/whatever) when doing research and investigating a subject. You have the choice to use them or not during the exam, but you must decide in advance and inform the professor.

Option A: I will not use any chatbot, only traditional web searches. Any use of them will be considered cheating.

Option B: I may use a chatbot as it is part of my toolbox. I will then respect the following rules:
1. I will inform the professor each time information comes from a chatbot.
2. When explaining my answers, I will share the prompts I used so the professor understands how I use the tool.
3. I will identify mistakes in the chatbot's answers and explain why they are mistakes.

Not following those rules will be considered cheating. Mistakes made by chatbots will be considered more serious than honest human mistakes, resulting in the loss of more points. If you use chatbots, you will be held accountable for their output.

I thought this was fair: you can use chatbots, but you will be held accountable for them.

Most Students Don't Want to Use Chatbots

This January, I saw 60 students and interacted with each of them for a mean time of 26 minutes. This is a tiring but really rewarding process. Of those 60 students, 57 decided not to use any chatbots. I managed to ask 30 of them to explain their choice; for the others, I unfortunately did not have the time. After the exam, I grouped their justifications into four clusters, without looking at their grades.

The first group is the "personal preference" group. They prefer not to use chatbots and use them only as a last resort, in very special cases or for very specific subjects. Some even made it a matter of personal pride. Two students told me explicitly: "For this course, I want to be proud of myself." Another explained: "If I need to verify what an LLM said, it will take more time!"

The second group was the "never use" one. They don't use LLMs at all. Some are even very angry at them, not for philosophical reasons, but mainly because they hate the interactions. One student told me: "Can I summarize this for you? No, shut up! I can read it by myself, you stupid bot."

The third group was the "pragmatic" group. They reasoned that this was the kind of exam where a chatbot would not be needed.

The last and fourth group was the "heavy user" group. They told me they heavily use chatbots but, in this case, were afraid of the constraints: afraid of having to justify a chatbot's output or of missing one of its mistakes.

After doing that clustering, I wrote each student's grade next to their cluster and was shocked by how coherent it was.
Note: grades are between 0 and 20, with 10 being the minimum grade to pass the class.

The "personal preference" students all scored between 15 and 19, which makes them very good students, without exception! The "proud" students were all above 17! The "never use" group was composed of middle-ground students around 13, with one outlier below 10. The pragmatics were in the same vein but a bit better: all between 12 and 16, without exception. The heavy users were, by far, the worst: all between 8 and 11, with only one exception at 16.

This is, of course, not an unbiased scientific experiment. I expected nothing and I will draw no conclusions. I only share the observation.

But Some Do

Of 60 students, only 3 decided to use chatbots. This is not very representative, but I still learned a lot, because part of the constraints was to show me how they used chatbots. I hoped to learn more about their process.

The first chatbot student forgot to use it. He did the whole exam and then, at the end, told me he hadn't thought about using chatbots. I guess this puts him in the "pragmatic" group.

The second chatbot student asked only a couple of short questions to make sure he clearly understood some concepts. This was a smart and minimal use of LLMs. The resulting exam was good. I'm sure he could have done it without a chatbot: the questions he asked were mostly a matter of improving his confidence in his own reasoning. This reminded me of a previous-year student who told me he used chatbots to study. When I asked how, he told me he would tell the chatbot to act as the professor and ask exam questions. As a student, this allowed him to know whether he understood enough. I found the idea smart but not groundbreaking (my generation simply used previous years' questions).

The third chatbot-using student had a very complex setup in which he would use one LLM, then ask another, unrelated LLM for confirmation. He had walls of text that were barely readable. Glancing at his screen, I immediately spotted a mistake (a chatbot explaining that "Sepia Search is a compass for the whole Fediverse"). I asked if he understood the problem with that specific sentence. He did not. Then I asked him questions for which I had seen the solution printed in his LLM output. He could not answer, even though the answer was on his screen. But once we began a chatbot-less discussion, I discovered that his understanding of the whole matter was okay-ish. So, in this case, chatbots did him a heavy disservice. He was totally lost in his own setup. He had LLMs generate walls of text he could not read. Instead of trying to think for himself, he tried to have chatbots pass the exam for him, which was doomed to fail because I was examining him, not the chatbots. He passed, but would probably have fared better without chatbots.

Can chatbots help? Yes, if you know how to use them. But if you do, chances are you don't need chatbots.

A Generational Fear of Cheating

One clear conclusion is that the vast majority of students do not trust chatbots. If they are explicitly made accountable for what a chatbot says, they immediately choose not to use it at all. One obvious bias is that students want to please the teacher, and I guess they know where I stand on this spectrum. One even told me: "I think you do not like chatbots very much, so I will pass the exam without them" (very pragmatic of him). But I had also underestimated one important generational bias: the fear of cheating.
When I was a student, being caught cheating meant a straight zero for the exam. You could, in theory, be expelled from university for aggravated cheating, whatever "aggravated" might mean.

During the exam, a good number of students called me over in a panic because Google was forcing autogenerated answers on them and they could not disable them. They were very worried I would consider this cheating. First, I realized that, like GitHub, Google has a de facto 100% market share, to the point that students don't even consider using something else a possibility. I should work on that next year. Second, I learned that cheating, however lightly, is now considered a major crime. It might result in the student being banned from any university in the country for three years. Discussing an exam with someone who has yet to take it might be considered cheating. Students have very strict rules about this on their Discord servers.

I was completely flabbergasted because, to me, discussing "What questions did you get?" was always part of the collaboration between students. I remember one specific exam where we gathered in an empty room and helped each other before taking it. When a student finished her exam, she would come back to the room and tell the remaining students what questions she had been asked and how she had solved them. We never considered that "cheating", and, as a professor, I always design my exams hoping that the good students (who usually choose to take the exam early) will help the remaining crowd. Every learning opportunity is worth taking!

I realized that my students are so afraid of cheating that they mostly don't collaborate before their exams, at least not as much as we used to. In retrospect, my instructions were probably too harsh and discouraged some students from using chatbots.

Stream of Consciousness
My 2025 banana student!
Another innovation I introduced in the 2026 exam was the stream of consciousness. I asked the students to open an empty text file and keep a stream of consciousness going during the exam. The rules were the following:

In this file, please write all your questions and all your answers as a "stream of consciousness". This means the following rules:
1. Don't delete anything.
2. Don't correct anything.
3. Never go backward to retouch anything.
4. Write as thoughts come.
5. No copy/pasting allowed (only exception: URLs).
6. Rule 5 implies no chatbot for this exercise. This is your own stream of consciousness.

Don't worry, you won't be judged on that file. It is a tool to help you during the exam. You can swear, you can write wrong things. Just keep writing without deleting. If you are lost, write why you are lost. Be honest with yourself. This file will only be used to try to get you more points, but only if it is clear that the rules have been followed.

I asked them to send me the file within 24 hours after the exam. Out of 60 students, I received 55 files (the remaining 5 were not penalized). There was also a bonus point for sending it to the exam git repository using git-send-email, something 24 students managed to do correctly.

The results were incredible. I did not read them all, but this tool gave me a glimpse inside the minds of my students. One said: "I should have used AI, this is the kind of question perfect for AI" (he did very well without it). For others, I realized how much stress they had been hiding. I was touched by one stream of consciousness that started with "I'm stressed, this doesn't make any sense. Why can't we correct what we write in this file?" and then, 15 lines later: "it is funny how writing the questions in my own words made the problem much clearer and how the stress starts to fade away".

And yes, I read all the failed students' files and managed to save a bunch of them when it was clear that they did, in fact, understand the matter but could not articulate it well in front of me because of the stress. Unfortunately, not everybody could be saved.

Conclusion

My main takeaway is that I will keep this method next year. I believe it confronts students with their own use of chatbots, and I learn how they use them. I am delighted to read their thought processes through the stream of consciousness.

Like every generation of students, there are good students, bad students and very brilliant students. It will always be the case, and people evolve (I was, myself, not a very good student). Chatbots don't change anything about that. Like every new technology, smart young people are very critical and, by definition, smart about how they use it. The problem is not the young generation. The problem is the older generation destroying critical infrastructure out of fear of missing out on the new shiny thing from a big corporation's marketing department.

Most of my students don't like email. An awful lot of them learned only from me that Git is not the GitHub command-line tool. It turns out that by imposing Outlook, with mandatory subscription to useless academic emails, we make sure that students hate email (Microsoft is on a mission to destroy email with the worst possible user experience). I will never forgive the people who decided to migrate the university mail servers to Outlook. This was both incompetence and malice on a terrifying level, because there were enough warnings and opposition from very competent people at the time.
Yet they decided to destroy one of the university's core infrastructures and historical foundations (UCLouvain is listed by Peter Salus as the very first European university to have a mail server, and there were famous pioneers in the department). By using Outlook, they continue to degrade the email experience. Out of 55 streams of consciousness, 15 ended up in my spam folder, and all had their links mangled by Outlook. And the university keeps sending so many useless emails to everyone. One of my students told me that they refer to their university email as "La boîte à spams du recteur" (the Chancellor's spam inbox). And I dare to ask why they use Discord?

Another student asked me why it took four years of computer engineering studies before a teacher explained to them that Git was not GitHub and that GitHub was part of Microsoft. He had a distressed look: "How could I have known? GitHub was imposed on us for so many exercises!"
How GitHub monopoly is destroying the open source ecosystem (ploum.net)
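For the curious: the git-send-email bonus mentioned earlier is also a nice illustration of this very point, because Git needs neither GitHub nor a web browser. Here is a minimal sketch of what such a submission can look like; the SMTP server, username, file name and recipient address below are placeholders, not the actual exam setup:

  # One-time SMTP configuration (all values are placeholders)
  git config --global sendemail.smtpServer smtp.example.org
  git config --global sendemail.smtpUser jdoe
  git config --global sendemail.smtpEncryption tls
  git config --global sendemail.smtpServerPort 587

  # Commit the stream-of-consciousness file, then mail the patch
  # for that single commit to the recipient
  git add stream-of-consciousness.txt
  git commit -m "Stream of consciousness, exam 2026"
  git send-email --to=professor@example.org -1

Nothing in there touches GitHub: plain Git plus plain email is enough to send a contribution to a repository, which is still how projects like the Linux kernel operate.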
Each year, I tell my students the following: it took me 20 years after university to learn what I know today about computers, and I have only one reason to be here in front of you: to make sure you are faster than me. Be sure that you do it better and deeper than I did. If you don't manage to outsmart me, I will have failed. Because that's what progress is about. Progress is each generation going further than the previous one while learning from its elders' mistakes. I'm here to tell you about my own mistakes and the mistakes of my generation. I know that most of you are only here to get a diploma with the minimum required effort. Fair enough, that's part of the game. Challenge accepted. I will try to make you think even if you don't intend to.

In earnest, I have a lot of fun teaching, even during the exam. For my students, the mileage may vary. But for the second time in my life, a student gave me the best possible compliment:

"You know, yours is the only course for which I wake up at 8AM."

To which I responded:

"It's reciprocal. I hate waking up early, except to teach in front of you."

About the author

I'm Ploum, a writer and an engineer. I like to explore how technology impacts society. You can subscribe by email or by RSS. I value privacy and never share your address. I write science-fiction novels in French. For Bikepunk, my new post-apocalyptic cyclist novel, my publisher is looking for contacts in other countries to distribute it in languages other than French. If you can help, contact me!
Permalinks: https://ploum.net/2026-01-19-exam-with-chatbots.html gemini://ploum.net/2026-01-19-exam-with-chatbots.gmi