This is the first of a Prose and Processors series on the explicit and implicit pedagogies embedded in various generative AI technologies. As I wrote in my overview of the series, I aim to consider what these platforms are trying to teach us, especially about writing, via their theoretical, practical, material, embodied, ethical, and ideological dimensions.
A commonplace argument in favor of generative AI in higher education is one of access: basically, students can use AI assistants to help them learn at the moment of need (i.e., when faculty aren’t in the office and tutoring services are closed). They’re always on, nonjudgmental, and eminently patient with failure. This is more or less the idea behind Packback’s Writing Lab. In a press release, they explain:
In its simplest form, Writing Lab is a personalized AI writing assistant available to students across the country, giving them the opportunity to learn how to write and develop their own unique voice. Writing Lab incorporates structured GPT-powered feedback while maintaining its foundations in student-centered pedagogy, student privacy, and academic honesty.
Without even examining the platform, we can see several pedagogical promises in this brief statement.
Personalization is a long sought-after goal in many pedagogies. The basic idea is that the more teaching is individualized to the learner, the more successful that learner will be. Aside from office hours, one-on-one tutoring (as in a writing center) is probably about as close as most students get to this kind of experience, at least in higher education.
Wide availability speaks to the dream of equitable access to education via education technologies. We can probably dismiss this promise out of hand: without an institutional license, writers have to pay $10/month for Writing Lab Pro—not exorbitant, but not necessarily worth it for someone on a budget. For the purposes of this post, I signed up for the two-week free trial of Writing Lab Pro. I plan to cancel it.
Unique voice is a bit of a vexed concept in writing pedagogy. Some writers might find it empowering to develop a voice they can use to express their ideas and experiences, especially if they are not pressured to adopt the voice of communities and cultures not their own. For some, the experience could be liberatory. For educators, assessing unique voice is thorny. How do you even name the features of “voice,” let alone assess them? And what about fields in which a unique voice isn’t really a priority? For better or worse, most STEM fields are far more interested in the science than in the unique voice of the scientist.
Student-centered pedagogy, in my mind, typically means active learning with the instructor as a “guide on the side” rather than a “sage on the stage.” In practice, Packback’s other tools scale up student writing by using peer response and AI to automate assessment, thus keeping the instructor’s labor lighter. This is not necessarily bad, since other contextual factors matter, but it does suggest that Packback’s vision of student-centered pedagogy means students do the vast majority of the labor in writing-intensive classrooms.
Student privacy and academic honesty are ethical considerations. Packback addresses the former through its privacy policy, which, on my read, is pretty banal: they use data for marketing, for maintaining the platform’s core functions, and for grading. As we shall see, they address the latter by limiting AI interactions so that the AI agent will not produce text for the writer. Both appear to be articulated specifically to contrast with more open platforms, like ChatGPT, in educational settings: those platforms are not necessarily FERPA compliant, and they are built as text generators whose output users can copy and paste as they please.
But, of course, these are just pedagogical promises, not pedagogical realities. How well does the Writing Lab live up to its promises? What other pedagogical assumptions does it include in practice?
To put Writing Lab to the test, I decided to see how it would respond to a draft style guide I was developing as part of a consulting gig for a local manufacturer. In the screenshots below, I have redacted company information to protect their confidentiality.
Limited Genre Awareness
A couple weeks ago, I wrote about ChatGPT’s struggles with genre. Those struggles are due mainly to the rich complexity of genres as rhetorical artifacts. A limited vision of genre is baked into ChatGPT.
After beginning a new project in Writing Lab, users are asked to select which kind of document they are writing, choosing among four main categories (essays, reports, correspondence, and creative writing), within which are more specific genres. On a theoretical level, this teaches users that genres are neatly distinct, which simply is not true. For example, what is the difference between a persuasive essay and a research report? It depends on who you ask, the assignment, and the rhetorical situation at hand. Furthermore, in practice, Writing Lab assumes students know what kind of document they are writing. More troubling, any genre that does not fall into one of those four “buckets” is labeled “other”: style guides, case studies, ePortfolios, press releases, infographics, scripts, and more. If part of ideology entails beliefs about what exists (“the real”), then these genres barely register a blip on Packback’s radar.
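To make the ideological point concrete, here is a minimal sketch of what a four-bucket genre picker with a catch-all looks like as a data structure. This is my own reconstruction, not Packback’s code, and all of the names are hypothetical:

```python
# A hypothetical reconstruction of Writing Lab's genre picker as a lookup
# table. The four "buckets" and their sub-genres are treated as discrete,
# non-overlapping categories; anything else collapses into "Other."
GENRE_BUCKETS = {
    "Essays": ["Persuasive Essay", "Narrative Essay", "Expository Essay"],
    "Reports": ["Research Report", "Lab Report"],
    "Correspondence": ["Email", "Cover Letter", "Memo"],
    "Creative Writing": ["Short Story", "Poem"],
}

def classify(document_type: str) -> str:
    """Return the bucket for a document type, or 'Other' if none matches."""
    for bucket, genres in GENRE_BUCKETS.items():
        if document_type in genres:
            return bucket
    return "Other"

# Style guides, case studies, ePortfolios, press releases, infographics,
# and scripts all flatten into the same undifferentiated label:
print(classify("Style Guide"))    # -> "Other"
print(classify("Press Release"))  # -> "Other"
```

Whatever the actual implementation, the interface behaves as if genre selection were this kind of closed lookup: a genre either has a named slot or it effectively does not exist.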
Instead, the platform puts a pedagogical premium on “the essay.” Although I had selected “other,” and although Writing Lab itself allows for other genres, several of the suggested options for interacting with the AI assistant name “the essay” as the genre of choice.
A screenshot from Packback’s Writing Lab. Notice the prompt on the right: “How can I improve the uniqueness of my essay’s style and voice?” But I had not selected “essay.” I had selected “Other” because this is a style guide.
That said, to the credit of Writing Lab, it did give me some genre-specific feedback when I selected “Get overall feedback on how your writing could be improved.” Here is an excerpt from my chat:
Writing Lab (WL): It looks like you've provided a detailed style guide for laboratory reports and white papers at [Company]. This is a great start! However, it's important to identify the specific type of essay or document you are creating this style guide for. Is it for a laboratory report, a white paper, or both? Once we identify the type of document, I can provide more specific and actionable feedback.
Me: This is a style guide for laboratory reports
WL: Great, thank you for clarifying that this style guide is specifically for laboratory reports. Since this is a style guide, it's important to ensure that it provides clear and actionable guidelines for the writers. Here are a few areas where you might consider providing more specific and actionable feedback in your style guide:
1. Clarity and Consistency: Ensure that the guidelines are clear and consistent throughout the document. For example, you might want to provide specific examples of how to structure the introduction, definitions, and data collection/results sections. This will help writers understand exactly what is expected of them.
2. Formatting and Layout: Provide specific instructions for formatting, such as font size, spacing, and margin requirements. Additionally, consider including examples of properly formatted tables, figures, and headings to guide writers in creating visually appealing and organized reports.
3. Language and Tone: Offer specific guidance on the use of language and tone in laboratory reports. For instance, you could provide examples of how to use non-evaluative language and passive voice effectively, as well as how to maintain a formal and precise tone throughout the document.
4. Visual Design and Accessibility: Include specific recommendations for visual design, such as color schemes, font choices, and visual hierarchy. This will help writers create reports that are visually appealing and easy to navigate.
By providing more specific and actionable guidance in these areas, your style guide can better support writers in producing high-quality laboratory reports. If you have any specific questions or need further assistance in developing these guidelines, feel free to ask!
Although I had included specific guidance and examples about language and tone in my style guide, I had not yet included detailed guidance on the other areas because I was waiting to see if the company already had a template in place that would cover formatting, layout, visual design, and so on. Had I not already had such elements in mind, this would have been valuable feedback.
However, its utility as a writing assistant plummeted from there because of its uneven rhetorical responsivity.
Uneven Rhetorical Awareness
The default view when composing in Writing Lab includes a word processor on the left (which can toggle to Research Notes and References) and a Feedback Summary on the right. The Feedback Summary presents a hierarchy of concerns that prioritizes grammar, followed by style and content. From an ideological viewpoint, the platform thus implies an order of importance: genre first (understood as category), followed by grammar, then style, and finally content, the element it is probably least positioned to support.
Furthermore, in practice, these elements remain mostly separate from one another, leading to some arhetorical, generic writing feedback. The grammar checker gives a numerical count of “grammar faults” based on the “style” I selected (APA) but devoid of rhetorical awareness. For example, because I had copied and pasted from a Word document, it was unable to distinguish a numbered list from any other use of a number in writing; one of my first “faults” was with list item 1, where it told me, “Numbers less than 10 should be spelled out, replace with ‘one.’” Clearly this AI assistant struggles to understand symbols in context.
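For anyone curious about how such a misfire happens mechanically, here is a minimal sketch of a context-blind numeral rule. This is my own illustration of the failure mode, not Packback’s actual implementation; once pasted text loses its list markup, a rule like this cannot tell a list marker from a numeral in prose:

```python
import re

# A deliberately naive, context-blind rule in the spirit of APA's
# "numbers less than 10 should be spelled out." My illustration only.
SPELLED_OUT = {
    "1": "one", "2": "two", "3": "three", "4": "four", "5": "five",
    "6": "six", "7": "seven", "8": "eight", "9": "nine",
}

def flag_small_numerals(text: str) -> list[str]:
    """Flag every standalone digit 1-9, with no sense of context."""
    faults = []
    for match in re.finditer(r"\b([1-9])\b", text):
        numeral = match.group(1)
        faults.append(
            f"Numbers less than 10 should be spelled out, "
            f"replace with '{SPELLED_OUT[numeral]}'."
        )
    return faults

# Pasting from Word flattens list markup, so the list marker "1." reads
# as just another numeral in prose and gets flagged:
pasted = "1. Use 2 examples per guideline."
for fault in flag_small_numerals(pasted):
    print(fault)
```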
Lest this seem like a “gotcha” moment, the pedagogical ramifications matter: like many grammar checkers, the tool teaches a standardized version of written English that does not apply across all rhetorical situations. This grammar pedagogy might also conflict with the tool’s professed emphasis on supporting the “unique voices” of student writers if those writers choose to compose in an English dialect historically deemed “nonstandard.”
However, the platform did display some rhetorical awareness on the topic of voice. As I mentioned above, one of the default chat options is “How can I improve the uniqueness of my essay's style and voice?” I was intrigued to see how Writing Lab would respond when an individual author’s voice is not a focal point of the genre, as in a style guide. Again, to its credit, it focused instead on the company’s ethos: “One way to do this is by incorporating the company's values and mission into the language and tone of the guide. How does [Company] differentiate itself from others in the industry? What values does it prioritize? Reflecting these aspects in the language and tone of the style guide can help to establish a unique voice for [Company].” Such questions of ethos can certainly shape a company style guide without putting undue emphasis on the individual voice of the style guide’s author.
Writing Lab also demonstrated some rhetorical savvy when I asked whether my examples were successful illustrations of the guidelines in the style guide. Rather than give me a pat “yes” or “no,” it suggested, “What methods have you used to gather feedback on the clarity and effectiveness of the examples in conveying the guidelines? Have you considered conducting a survey or gathering feedback through a focus group to assess the understanding and usefulness of the examples?” This is great advice, and a commonplace technique when writing technical documentation for an organization.
A Chatbot Is Not Inherently Responsive
I want to pause here and note that I had to be a bit nomadic to gather these few bright spots of helpful feedback, moving back and forth across different parts of the Writing Lab platform. Anyone expecting this “assistant” to be anything like a writing center consultation would be mistaken. Feedback, even on a single topic like genre or voice, did not occur chronologically in a single chat.
A big part of the problem is that conversations with the Writing Lab assistant quickly devolve after a few turns. This is the most consequential instance of the tool’s uneven rhetorical responsivity, because conversational feedback is supposed to be Writing Lab’s biggest selling point. Let me explain.
Immediately after Writing Lab suggested I query potential users of the style guide about their experiences of it, I asked if it could help me develop survey or focus group questions. Here is another excerpt from our chat:
WL: It's great that you're looking to gather feedback on the clarity and effectiveness of the examples in your style guide. However, my expertise lies in providing actionable feedback on writing, so I can help you develop survey or focus group questions related to the content of your laboratory reports style guide. If you have specific questions or areas of the style guide that you'd like to gather feedback on, feel free to share them, and I can assist you in developing relevant survey or focus group questions.
Me: I would like you to help me develop survey or focus group questions about:
1. The level of detail in the style guide. What else should I include to make it more useful?
2. The examples. Do they successfully illustrate the relevant guidelines?

WL: What are the key elements you want to capture feedback on in the style guide? How can you structure questions to gather detailed feedback on the examples provided in the style guide? Consider asking about specific instances where the examples effectively demonstrate the guidelines and where they may fall short. Additionally, think about how you can encourage respondents to provide open-ended feedback and suggestions for improvement.
Me: Will you suggest some specific questions for me?
WL: How can you design questions that effectively gather feedback on the level of detail and the effectiveness of examples in your style guide for laboratory reports? What are some open-ended questions you could include to allow respondents to provide more detailed feedback and suggestions for improvement?
Me: Here is an example: Do the examples in the style guide provide successful illustrations of the guidelines? Always, most of the time, sometimes, rarely.
WL: What specific questions could you ask to gather feedback on the level of detail in the style guide and the effectiveness of the examples? Consider including open-ended questions to allow respondents to provide more detailed feedback and suggestions for improvement.
At this point in the chat, I felt like I hit a brick wall. Now, I think I understand the problem, and it comes back to the ethical value of academic integrity: Writing Lab was refusing to write for me. Instead, it was using a nondirective approach in an effort to ensure that my writing was “my own.” We teach nondirective tutoring in our writing center as well. However, writing center scholars have argued that sometimes we need to be more directive. Withholding knowledge or experience in the interest of a decontextualized “rule” about nondirective tutoring can also be unethical. Even the most nondirective tutor will probably be willing to model a survey question before encouraging a writer to try for themselves. In contrast, Writing Lab simply spun its wheels, repeating the same basic question, even when I gave it an example. Even the most novice human tutor is unlikely to respond this way.
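My guess, and it is only a guess, is that a guardrail along these lines sits between the model and the student. The sketch below is a hypothetical reconstruction of how an overzealous “never write text for the student” filter could produce exactly the wheel-spinning I encountered; none of these names or rules come from Packback:

```python
# A hypothetical reconstruction of an overzealous "never write for the
# student" guardrail; my own illustration, not Packback's implementation.

CANNED_QUESTION = (
    "What specific questions could you ask to gather feedback on the "
    "level of detail in the style guide and the effectiveness of the "
    "examples? Consider including open-ended questions to allow "
    "respondents to provide more detailed feedback."
)

def looks_directive(draft_reply: str) -> bool:
    """Naive integrity check: treat any reply containing quoted text the
    student could copy verbatim as 'writing for the student.'"""
    return '"' in draft_reply or "For example:" in draft_reply

def guardrail(draft_reply: str) -> str:
    """Discard directive drafts and substitute the canned Socratic prompt,
    no matter how many turns the student has already spent asking."""
    return CANNED_QUESTION if looks_directive(draft_reply) else draft_reply

# The underlying model drafts a genuinely helpful, directive reply...
draft = 'For example: "Do the examples illustrate the guidelines? Always / sometimes / rarely."'
# ...but the guardrail flattens it into the same nondirective question,
# turn after turn: the brick wall I hit above.
print(guardrail(draft))
```

A filter this blunt can never distinguish between ghostwriting a student’s essay and modeling a single survey question, which is exactly the distinction a human tutor makes routinely.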
The Pedagogical Bottom Line
All this suggests to me a few pedagogical assumptions built into the platform:
AI feedback must be nondirective at all costs. The nondirective feedback imperative is called “student-centered,” but it actually appears to protect students, and Packback, from accusations of academic dishonesty. As a corollary, writing and feedback, according to Writing Lab, must remain separate unless the feedback concerns grammar and style. Indeed, writing and substantive feedback are physically separate on the screen, and all my attempts to direct the chatbot to comment on specific parts of my text failed; it defaulted to nondirective questioning.
Writing is probably happening in a composition course that assigns essays. This may well be true for a majority of Packback Writing Lab users, but it renders an enormous number of genres in courses across disciplines basically invisible.
Genre, grammar, style, and content are mostly separate focus areas in writing. This may seem unproblematic because we do have separate names for them, and it’s not necessarily a problem to pay attention to them to different degrees at different times in writing classrooms. However, in practice they are inseparable. Grammatical principles and stylistic choices can vary widely from genre to genre, and genres are in part ways of naming how ideas and evidence (“content”) are wielded to address a given rhetorical situation. Writing Lab, in contrast, risks teaching them as separate, requiring users to jump across various tabs and tools to assemble a more unified, rhetorically rich picture of writing.
The labor of classroom writing rests on the shoulders of individual students who can interact with algorithmic tools for feedback and guidance on writing. Of course, the “truth” of this statement depends on how the platform is deployed in context. But if we understand Writing Lab as teaching us something about writing by virtue of its interface and features, then it may teach us that writing is ultimately a lone enterprise.
I’m a bit worried that this first post has come off sounding like a review of Writing Lab. If it were, my final word would be unimpressed. This tool should not be seen as a replacement for human tutors. (To be fair, Packback doesn’t quite market it that way.) But I want to step back and end with a larger implication. My experience with Packback’s Writing Lab leads me to think we need to watch out for “disintegrative” conceptions of writing that treat genre as a container for content and grammar and style as mere polish. These problematic notions of writing have haunted writing instruction for decades. They appear poised to arise, reanimated, in AI-augmented platforms, if we are not vigilant about continuing to teach writing as a richly layered process of crafting symbols in recognizable-but-unique ways to build relationships with audiences, and potentially move them to one sort of action or another.
Fantastically helpful exploration and unpacking. I'm struck by the same thing I've seen in my own explorations, and in watching other people experiment with AI writing tutor/feedback apps: for these tools to be truly useful to the actual process, you need to already have metacognitive awareness of your own process and of the ultimate goal of the piece you're writing. I'm trying to imagine how a student would use this in a way that would allow them to develop those things, and I'm not really seeing it. Your last paragraph, on how the technology threatens to sand away the notion of writing as "a richly layered process of crafting symbols in recognizable-but-unique ways to build relationships with audiences...," has been the core of my worries about how ChatGPT is being viewed as a "tool" for teaching writing.
It was the same problem when I was a grad student trying to teach discrete rhetorical modes (informative, narrative, expository): it was like my students were literally resetting to zero each time, because aside from process elements like brainstorming, drafting, revision, and editing, I wasn't giving them anything to latch onto that allowed them to build their overall writing practices. It was just a series of discrete assignments that they dutifully completed, learning little. We know so much more now about the kinds of experiences and feedback that help students learn to write; it would be a shame to go backwards to accommodate what the AI is capable of.
This is such an interesting and helpful review--I'm grateful to read it! I'm comparing the experience with my own experiences with MyEssayFeedback.ai, and I'm curious whether you would like to see software with more flexibility about the feedback prompt, so that we writing instructors could design our own.