MIT writing faculty comment on GPT and other AI assisted writing in education by Chris Peterson SM '13
I’m briefly emerging from hard work in the application mines to share with you some thoughts from MIT writing faculty on AI-assisted writing.
You’ve probably seen some of the recent news about GPT-3 and ChatGPT, technologies from OpenAI that (among other things) produce impressive natural language text in response to prompts. There has been a tremendous amount of commentary in the popular press (and Discourse™ on social media) about these technologies, and especially if and how they will complicate or obviate the college (or even admissions application) essay, to say nothing of broader supposed implications for all forms of human text generation.
For reasons that will soon become evident, I think most of this is at best overstated. However, it certainly does pose interesting questions for pedagogy, especially for those who are focused on writing and teaching.
So I was pleased to see that last week, the Comparative Media Studies/Writing (CMS/W) program collected some Advice and responses from faculty on ChatGPT and A.I.-assisted writing. CMS/W is where I did my grad degree, and where I also occasionally teach CMS.614: Critical Internet Studies, a Communication Intensive course designed to help fulfill our Communication Requirement, which was “developed out of the belief that MIT students, regardless of their field of study, should learn to write prose that is clear, organized, and effective, and to marshal facts and ideas into convincing written and oral presentations.” CMS/W is also the departmental home for the MIT Writing and Communication Center (WCC), where accomplished scholars and writers help students strategize about all types of academic, creative, job-related, and professional writing as well as about all aspects of oral presentations. These programs (and the people affiliated with them) have long been influential thought leaders in computational writing, going back at least to when <a href="https://en.wikipedia.org/wiki/Les_Perelman">Les Perelman</a> was demonstrating how the SAT Writing Section could be gamed, and I am happy to see that trend continue.
First, Professors Nick Montfort and Ed Schiappa wrote Advice Concerning the Increase in AI-Assisted Writing, which I read with great interest while working on my syllabus this spring, and which they gave me permission to cross-post here:
Advice Concerning the Increase in AI-Assisted Writing
There has been a noticeable increase in student use of AI assistance for writing recently. Instructors have expressed concerns and have been seeking guidance on how to deal with systems such as GPT-3, which is the basis for the very recent ChatGPT. The following thoughts on this topic are advisory from the two of us: They have no official standing within even our department, and certainly not within the Institute. Nonetheless, we hope you find them useful.
Newly available systems go well beyond grammar and style checking to produce nontrivial amounts of text. There are potentially positive uses of these systems; for instance, to stimulate thinking or to provide a variety of ideas about how to continue from which students may choose. In some cases, however, the generated text can be used without a great deal of critical thought to constitute almost all of a typical college essay. Our four main suggestions for those teaching a writing subject are as follows:
* explore these technologies yourself and read what has been written about them in peer-reviewed and other publications,
* understand how these systems relate to your learning goals,
* construct your assignments to align with learning goals and the availability of these systems, and
* include an explicit policy regarding AI/LLM assistance in your syllabus.
Exploring AI and LLMs
LLMs (Large Language Models) such as those in the GPT series have many uses, for instance in machine translation and speech recognition, but their main implications for writing education have to do with natural language generation. A language model is a probability distribution over sequences of words; ones that are “large” have been trained on massive corpora of texts. This allows the model to complete many sorts of sentences in cohesive, highly plausible ways that are sometimes semantically correct. An LLM can determine, for instance, that the most probable completion of the word sequence “World War I was triggered by” is “the assassination of Archduke Franz Ferdinand” and can continue from there. While impressive in many ways, these models also have several limitations. We are not currently seeking to provide a detailed critique of LLMs, but advise that instructors read about the capabilities and limitations of AI and LLMs.
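To make the idea of "a probability distribution over sequences of words" concrete, here is a minimal sketch in Python. It is not how GPT-3 or ChatGPT work internally: those are neural networks trained on massive corpora, while this toy only counts which word follows which in three made-up sentences. The corpus and the function names (`train_bigram_model`, `next_word_probabilities`, `complete`) are assumptions invented for illustration.

```python
from collections import Counter, defaultdict

# A tiny made-up corpus; a real LLM is trained on vastly more text.
TOY_CORPUS = [
    "world war i was triggered by the assassination of archduke franz ferdinand",
    "historians say the war was triggered by the assassination of archduke franz ferdinand",
    "the panic was triggered by a rumor",
]


def train_bigram_model(corpus):
    """Count, for every word, how often each other word directly follows it."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def next_word_probabilities(counts, word):
    """Turn the follower counts for `word` into a probability distribution."""
    followers = counts.get(word, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}
    return {w: c / total for w, c in followers.items()}


def complete(counts, prompt, max_words=10):
    """Greedily extend the prompt with the most probable next word each step."""
    words = prompt.lower().split()
    for _ in range(max_words):
        probs = next_word_probabilities(counts, words[-1])
        if not probs:  # the model never saw anything follow this word
            break
        words.append(max(probs, key=probs.get))
    return " ".join(words)


model = train_bigram_model(TOY_CORPUS)
print(next_word_probabilities(model, "by"))
print(complete(model, "World War I was triggered by"))
```

With this corpus, "the" follows "by" two times out of three, and the greedy completion reproduces the Franz Ferdinand sentence (in lowercase) only because that wording dominates the toy data. The model has no notion of whether the statement is true, which is one way to see why such completions are plausible but only sometimes semantically correct.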
To understand more about such systems, it is worth spending some time with those that are freely available. The one attracting the most attention is ChatGPT. The TextSynth Playground also provides access to several free/open-source LLMs, including the formidable GPT-NeoX-20B. ChatGPT uses other AI technologies and is presented in the form of a chatbot, while GPT-NeoX-20B is a pure LLM that allows users to change parameters in addition to providing prompts.
We do not provide a full bibliography here, but there is considerable peer-reviewed literature on LLMs and their implications. We suggest “GPT-3: What’s it good for?” by Robert Dale and “GPT-3: Its Nature, Scope, Limits, and Consequences” by Luciano Floridi & Massimo Chiriatti. These papers are from 2020 and refer to GPT-3; their insights about LLMs remain relevant. Because ChatGPT was released in late November 2022, the peer-reviewed research about it is scant. One recent article, “Collaborating With ChatGPT: Considering the Implications of Generative Artificial Intelligence for Journalism and Media Education,” offers a short human-authored introduction and conclusion, presenting sample text generated by ChatGPT between these.
Understanding the Relationship of AI and LLMs to Learning Goals
The advisability of any technology or writing practice depends on context, including the pedagogical goals of each class.
It may be that the use of a system like ChatGPT is not only acceptable to you but is integrated into the subject, and should be required. One of us taught a course dealing with digital media and writing last semester in which students were assigned to computer-generate a paper using such a freely available LLM. Students were also assigned to reflect on their experience afterwards, briefly, in their own writing. The group discussed its process and insights in class, learning about the abilities and limitations of these models. The assignment also prompted students to think about human writing in new ways.
There are, however, reasons to question the practice of AI and LLM text generation in college writing courses.
First, if the use of such systems is not agreed upon and acknowledged, the practice is analogous to plagiarism. Students will be presenting writing as their own that they did not produce. To be sure, there are practices of ghost writing and of appropriation writing (including parody) which, despite their similarity to plagiarism, are considered acceptable in particular contexts. But in an educational context, when writing of this sort is not authorized or acknowledged, it does not advance learning goals and makes the evaluation of student achievement difficult or impossible.
Second, and relatedly, current AI and LLM technologies provide assistance that is opaque. Even a grammar checker will explain the grammatical principle that is being violated. A writing instructor should offer much better explanatory help to a student. But current AI systems just provide a continuation of a prompt.
Third, the Institute’s Communication Requirement was created (in part) in response to alumni reporting that writing and speaking skills were essential for their professional success, and that they did not feel their undergraduate education adequately prepared them to be effective communicators. It may be, in the fullness of time, that learning how to use AI/LLM technologies to assist writing will be an important or even essential skill. But we are not at this juncture yet, and the core rhetorical skills involved in written and oral communication (invention, style, grammar, reasoning and argument construction, and research) are ones every MIT undergraduate still needs to learn.
For these reasons, we suggest that you begin by considering what the objectives are for your subject. If you are aiming to help students critique digital media systems or understand the implications of new technologies for education, you may find the use of AI and LLMs not only acceptable but important. If your subject is Communication Intensive, however, an important goal for your course is to develop and enhance your students’ independent writing and speaking ability. For most CI subjects, therefore, the use of AI-assisted writing should be at best carefully considered. It is conceivable at some point that it will become standard procedure to teach most or all students how to write with AI assistance, but in our opinion, we have not reached that point. The cognitive and communicative skills taught in CI subjects require that students do their own writing, at least at the beginning of 2023.
Constructing Assignments in Light of Learning Goals and AI/LLMs
Assigning students to use AI and LLMs is more straightforward, so we focus on the practical steps that can be taken to minimize use when the use of these systems does not align with learning goals. In general, the more detailed and specific a writing assignment, the better, as the prose generated by ChatGPT (for example) tends to be fairly generic and plain.
Furthermore, instructors are encouraged to consult MIT’s writing and communication resources to seek specific advice as to how current assignments can be used while minimizing the opportunities for student use of AI assistance. These resources include the Writing, Rhetoric, and Professional Communication program, the MIT Writing & Communication Center, and the English Language Studies program. It is our understanding that MIT’s Teaching + Learning Lab will be offering advice and resources as well.
Other possible approaches include:
• In-class writing assignments
• Reaction papers that require a personalized response or discussion
• Research papers requiring quotations and evidence from appropriate sources
• Oral presentations based on notes rather than a script
• Assignments requiring a response to current events, e.g., from the past week
The last of these approaches is possible because LLMs are trained using a fixed corpus and there is a cutoff date for the inclusion of documents.
Providing an Explicit Policy
Announcing a policy clearly is important in every case. If your policy involves the prohibition of AI/LLM assistance, we suggest you have an honest and open conversation with students about it. It is appropriate to explain why AI assistance is counter to the pedagogical goals of the subject. Some instructors may want to go into more detail by exploring apps like ChatGPT in class and illustrating to students the relative strengths and weaknesses of AI-generated text.
In any case, what this policy should be depends on the kinds and topics of writing in your subject. If you do not wish your students to use AI-assisted writing technologies, you should state so explicitly in your syllabus. If the use of this assistance is allowable within bounds, or even required because students are studying these technologies, that should be stated as well.
In the case of prohibition, you could simply state: “The use of AI software or apps to write or paraphrase text for your paper is not allowed.” Stronger wording could be: “The use of AI software or apps to write or paraphrase text for your paper constitutes plagiarism, as you did not author these words or ideas.”
There are automated methods (such as Turnitin and GPTZero) that can search student papers for AI-generated text and indicate, according to their own models of language, how probable it is that some text was generated rather than human-written. We do not, however, know of any precedent for disciplinary measures (such as failing an assignment) being instituted based on probabilistic evidence from such automated methods.
The use of AI/LLM text generation is here to stay. Those of us involved in writing instruction will need to be thoughtful about how it impacts our pedagogy. We are confident that the Institute’s Subcommittee on the Communication Requirement, the Writing, Rhetoric, and Professional Communication program, the MIT Writing & Communication Center, and the English Language Studies program will provide appropriate resources and counsel as needed. We also believe students are here at MIT to learn, and will be willing to follow thoughtfully advanced policies so they can learn to become better communicators. To that end, we hope that what we have offered here will help to open an ongoing conversation.
Second, Professor Eric Klopfer, head of CMS/W and Literature and director of the Scheller Teacher Education Program, and Associate Professor Justin Reich, director of the Teaching Systems Lab, wrote a response, which I also am crossposting, called Calculating the Future of Writing in the Face of AI.
When the AI text generation platform ChatGPT was released, it made a lot of people take notice of the potential impact of the coming wave of Large Language Models (and even proclaim the end of the college essay). The uproar over ChatGPT prompted mathematics educators to declare, “Ha, writing people, now it’s your turn!” Or, as math teacher Michael Pershan framed the issue, in an effort to capture the mood of ChatGPT headline writers, “I FED CASIO-FX1239x MY MATH TEST AND WAS ASTONISHED BY THE RESULTS.”
There now exists a computing device which, when given open-ended queries similar to those typically offered by instructors, responds instantly and automatically with a range of plausibly correct results. A reliance on this device could deprive students of the opportunities to develop for themselves the foundational skills necessary for the human resolution of said queries. Or it could usher in a new era of human inquiry and expression assisted by technology. Or both. The humble calculator has been the perfect rehearsal for the challenges we face today.
As calculators became cheap and ubiquitous, they threatened the teaching of rote mathematics (Per Urlaub and Eva Dessein draw a similar comparison to the related issue of automated language translation). At that inflection point, teachers typically pursued any of three choices: 1) they could allow calculators and risk trivializing existing learning experiences, 2) ban calculators in the hopes of preserving the pre-Casio status quo, or 3) adapt and create some learning environments where students could not use calculators (prominent today in the memorization of math facts like times tables) but in most places shift instruction to challenge students to think about what they need to enter into the calculators.
The first two tactics were largely unsuccessful, though they persist to some extent. Our challenge is to figure out where to preserve the best of older habits, while shifting to embrace new technologies for how, like calculators, they can augment human capacity in our disciplines.
Writing instructors already have much experience with these challenges. Spellcheck software was banned in many classrooms because of its potential negative effects on student spelling. Today, a student who didn’t spellcheck their paper is labeled as careless. Wikipedia, research sources on the internet, and automated translation services (for our colleagues teaching writing in second languages) have all bedeviled writing instructors in recent years.
We should be cautious about technological determinism; we should be cautious about the creeping belief that we simply have to accept and adapt to ChatGPT. Maybe ChatGPT is terrible in ways we don’t yet understand, and bans will prove to be a viable and important option. But the history of these recent aids to human cognition and communication suggests that they are pretty helpful; the best of their contributions can be integrated into our writing practices, and the worst of their shortcuts can be mitigated by walling them off from targeted areas of our curriculum–by teaching bits and pieces here and there where ChatGPT and future tools are banned, like calculators are banned when students memorize math facts. If ChatGPT proves to be an outlier from this historical trend, by all means let’s ban it, but until then let’s proceed cautiously but open to possibilities. The proclamations of the end of the essay have come alongside both popular press (such as in The New York Times and TIME) and scholarly work (such as “New Modes of Learning Enabled by AI Chatbots: Three Methods and Assignments” by Ethan R. Mollick and Lilach Mollick) with a more hopeful outlook – one that sees both short-term benefits and positive long-term shifts.
In terms of practical advice for instructors who rely on written papers as a form of assessment, the most important advice is to try some of these tools out yourself to see what they can and cannot do. After doing so you may decide that as they stand now, you should prohibit their use entirely. But you might instead choose to limit use to certain cases that would need to be documented. For example, you might allow students to generate ideas for an initial draft but not allow any prose generated by the models. There are several ways in which these tools could be used right away:
* They can be used to help struggling writers who don’t know where to begin with their assignments.
* They can be used as a thought partner by students as they bounce around ideas that can be further developed.
* They can help students revise their ideas through reframing or restating their prose.
The presence of disruptive technologies like ChatGPT should cause us to reflect on our learning goals, outcomes, and assessments, and determine whether we should change them in response, wall off the technology, or change the technology itself to better support our intended outcomes. We should not let our existing approaches remain simply because this is the way we have always done things. We should also not ruin what is great about writing in the name of preventing cheating. We could make students hand-write papers in class and lose the opportunity to think creatively and edit for clarity. Instead, we should double down on our goal of making sure that students become effective written and oral communicators both here at MIT and beyond. For better or worse, these technologies are part of the future of our world. We need to teach our students whether it is appropriate to use them, and, if so, how and when to work alongside them effectively.
In the 19th century, a group of secondary school educators believed urgently and fervently that sentence diagramming was an essential part of writing education. They fought tooth and nail in education journals, meetings, and curriculum materials to preserve their beloved practice. They were wrong. We’re fine without them. Maybe we will be fine without the short answer questions that call for the recounting of factual information, the sorts of homework questions where ChatGPT currently seems to do pretty well (though, just like our students, it sometimes makes up incorrect but plausible-sounding responses).
Assuredly, we will need to think deeply about the new affordances of ChatGPT and the kinds of thinking and writing we want students to do in the age of AI, and change our teaching and assessment to serve them better. We will have to have the kinds of conversations and negotiations with students that we had upon the arrival of Wikipedia, other Internet research sources, grammar and spell checkers, and other aids to research and writing. They took up a good bit of class time upon their arrival, and then we settled into more efficient and productive stances.
We will also need to think about whether these technologies cause us to define new outcomes and skills that we will need to find new ways to teach. At MIT, an institution which is at the forefront of AI and one that prides itself on both its teaching in the humanities and consideration of the human implications of technology, we should lead the way in studying and defining this path forward.
Ultimately, Michael Pershan from his math classroom sends us writing teachers his good wishes: “If the lives of math teachers — who have a head start on this calculator business — are any example, it’ll change things slightly and give everyone something fun to fight about.” We look forward to the ensuing debates.
I really appreciate both of these thoughtful, hype-free, helpful takes on the future of education and AI-assisted writing from some of MIT’s experts in the field, and wanted to share them on our blogs since I know they are on the minds of so many of our readers today.