Embracing Generative AI (GAI) in Education: Some Personal Reflections
We are currently interacting with what are likely the most basic GAI programs we will see in our lifetime. The inevitable advancement of these technologies will necessitate a thorough reevaluation of our pedagogical approaches and assessment criteria.
I’ve recently finished my second semester of incorporating GAI tools like ChatGPT, Bing, and Claude into my teaching. I remain firmly convinced that it’s crucial to train our students in these rapidly evolving technologies. For more on the importance of GAI in education, read ‘Why I’m Encouraging My Students to Use Generative AI’ (Strauss, 2023) [link].
So far, students coming into my classes have shown varied experience with GAI (e.g., some already subscribe to ChatGPT-4 and incorporate it into their workflow, while others have occasionally used the free versions but don’t make regular use of them). As in the early days of word processors and spreadsheets, some students arrive in class already having access to these new tools, while for others it is a new experience. Those earlier tools initially demanded new skills and adaptations in teaching methods (e.g., course assignments designed for typewriters and calculators needed to be rethought for word processors and spreadsheets). But what was once innovative (such as Excel proficiency) has become a basic professional requirement. Likewise, GAI experience will become commonplace, but we are now in a transitional phase, which requires us all to rethink teaching and learning with these new tools.
From a pedagogical point of view, the first questions to address are which GAI tool(s) should be used in a course, and whether all students should be required to use the same one(s). I’ve chosen to require ChatGPT-4 for my courses, for its advanced capabilities and its reasonable cost ($60 for a 3-month semester, which is similar to the price of a textbook). This standardization simplifies student support. Other instructors may prefer to leave the choice up to the students, but in my experience, that can lead to extra time spent addressing differences between the various GAIs. The educational landscape is also changing, with universities potentially securing licenses for specific GAI tools. For instance, Microsoft has integrated GAI into its Office suite (which presumably is available to universities on an enterprise license), and Arizona State University has partnered with OpenAI to provide ChatGPT-4 access to its community.
So far, I’ve found it necessary to dedicate a portion of classroom time to what I’d call GAI “housekeeping.” This includes critical topics such as personal accountability (students are graded on what they submit; they can’t blame the “AI”), algorithmic bias, the propensity of GAI to hallucinate, issues of alignment, the ethical dilemmas posed by AI, replicability concerns, and more. It’s essential that students be not just superficially aware of, but have a basic understanding of, the challenges and limitations inherent in GAI technology.
In particular, the concept of replicability in GAI is a nuanced topic that, in my experience, many students have not considered. Responses from a GAI have an element of randomness (or, if you prefer, creativity). One consequence of this trait is that GAI results are often not replicable. To illustrate this point, I conduct an exercise in which each student submits the same prompt to ChatGPT-4 and then shares the response on the class discussion board. The responses to identical prompts are broadly similar but never identical, an enlightening revelation for students about the unpredictability and individuality of GAI responses. A related point is the black-box nature of current AIs. You can debug a spreadsheet to see where things went wrong, but a GAI cannot explain its own reasoning or why it produced a particular output. If you ask it for an explanation, it will give you a plausible-sounding answer that is itself made up.
This behavior contrasts starkly with deterministic software like Excel, which produces predictable, consistent results. ChatGPT-4, along with other GAI tools, inherently incorporates a degree of randomness into its design (for a comprehensive discussion of the use of randomness in GPTs, see Wolfram (2023)). Students accustomed to the predictability of deterministic software are often surprised by the variability of ChatGPT-4’s outputs.
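The classroom demonstration above can also be sketched in code. The toy example below is not how ChatGPT actually works internally; it simply illustrates, with hypothetical next-token probabilities, how a “temperature” setting turns deterministic selection into randomized sampling, so identical prompts can yield different outputs.

```python
import random

def sample_next_token(probs, temperature, rng):
    """Pick a token from a probability distribution, softened by temperature.

    temperature == 0 always returns the most likely token (deterministic,
    like a spreadsheet formula); higher temperatures flatten the
    distribution, so less likely tokens are chosen more often.
    This is an illustrative toy, not ChatGPT's real sampling procedure.
    """
    if temperature == 0:
        return max(probs, key=probs.get)
    # Re-weight each probability by p ** (1/T), then sample proportionally.
    weights = {tok: p ** (1.0 / temperature) for tok, p in probs.items()}
    total = sum(weights.values())
    r = rng.random() * total
    cumulative = 0.0
    for tok, w in weights.items():
        cumulative += w
        if r <= cumulative:
            return tok
    return tok  # fallback for floating-point rounding at the boundary

# Hypothetical probabilities for the word following "The capital of France is"
probs = {"Paris": 0.85, "a": 0.10, "the": 0.05}

rng = random.Random()  # unseeded: each run of the program differs
print(sample_next_token(probs, temperature=0, rng=rng))    # always "Paris"
print([sample_next_token(probs, temperature=1.5, rng=rng)
       for _ in range(20)])  # mostly "Paris", occasionally "a" or "the"
```

Running the sampler many times at a nonzero temperature mirrors what the class sees on the discussion board: answers that cluster around the most likely response but are never guaranteed to be identical.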
So far, the quality of GAI responses (such as from ChatGPT-4) for what I teach has been moderate, typically rating a C+/B-. It’s more of a teaching or writing aid, helping students refine ideas, brainstorm, prepare a draft, or understand concepts. For example, I’ll sometimes ask students to get a first-draft answer from ChatGPT-4 and then improve it, so that they learn ChatGPT-4’s strengths and limits. Alternatively, I might reverse the exercise, asking them to submit their own first draft to ChatGPT-4 and then evaluate its feedback. However, I anticipate continuous improvement in GAI response quality as the technology advances.
Microsoft has introduced an added dimension of complexity to this discussion by integrating (as previously mentioned) Copilot (its version of GAI, based on ChatGPT) into the Microsoft Office suite. This integration greatly simplifies and automates the creation of documents, essays, and PowerPoint presentations for student assignments. With Copilot combined with Office, it becomes hypothetically feasible for a student to instruct the software to compose their final paper autonomously, bypassing traditional research and reading processes. Professor Ethan Mollick at Wharton has effectively demonstrated the capabilities of Copilot in a series of short videos (accessible at these links: [link] [link]). Consequently, in my opinion, we are going to need to rethink our approaches to grading and assignments.
For example, I’m placing increased emphasis on class participation within the overall grading schema. This change will involve more emphasis on students presenting and discussing their written work in class, and probably incorporating more strategic and intentional cold calling. Additionally, I’m contemplating the inclusion of low-stakes, in-class quizzes. This shift in assessment methods is predicated on the premise that, in the GAI era, submitted written work might not be an accurate reflection of a student’s genuine grasp of the course material.
From my experience, integrating GAI into educational practices hasn’t instantly enhanced productivity — for either myself or my students. Instead, it has called for a substantial initial effort from both educators and students. As an instructor, I’ve had to reassess and modify the homework assignments, which may involve elevating their complexity, varying the types of questions, guiding students on effectively applying GAI to their work, and rethinking my class teaching plans (as discussed above).
For the students, integrating GAI into their work entails additional effort, since learning to use GAI as a tool will be in addition to their basic course workload. However, I anticipate that as students and instructors become more adept with GAI, these frictional transition costs will decrease and we’ll see productivity benefits.
We’re currently interacting with what are likely the most basic GAI programs we’ll see in our lifetimes. The inevitable advancement of these technologies will necessitate a thorough reevaluation of our pedagogical approaches and assessment criteria. This shift will require greater effort and flexibility from everyone involved in the realm of education.
Note: The observations shared in this note are anecdotal, based on the graduate professional students I teach at Princeton SPIA. So please approach the comments and conclusions in this note with a degree of caution. And, these are strictly my opinions and aren’t intended to represent the views of Princeton University, SPIA or anyone other than myself.