I’ve been testing ChatGPT over the last couple of days. (If you don’t know what this chatbot is, here’s a good NYT article about ChatGPT and others currently in development.)
The avowed purpose of ChatGPT is to produce believable dialogue. It does this by drawing on the vast amount of web text it was trained on to respond to simple prompts.
By “simple,” I mean sometimes “horribly complicated,” of course. And sometimes a little ridiculous.
As has been pointed out, chatbots only generate text based on what they have been fed, i.e., “garbage in / garbage out.” So if you push the programs hard enough, they will generate racist, sexist, homophobic, and otherwise awful stuff, because unfortunately that kind of sick and twisted garbage is still out there, somewhere online in a troll’s paradise.
So far, I have asked the program to:
- Write a haiku about winter without using the word “winter”
- Write a limerick about an Irish baseball player
- Write a dialogue between God and Nietzsche (I just had to…)
- Imagine what Jean-Paul Sartre and Immanuel Kant would say to each other (see above), but using US ’50s slang
- Have Thomas Aquinas and John Locke argue about the existence of God (that one was fun)
- Write a 300 word cause-effect essay about climate change
- Write a 300 word compare and contrast essay about the US and Japan
- Write a 1000 word short science fiction story based on Mars
- Write a 1500 word short science fiction story about robots in the style of Philip K. Dick
OK, and the verdict is:
It didn’t do too badly with the haiku. However, its vocabulary range was limited, and when I generated a second and third haiku with the same directions (i.e., don’t use “winter” in the poem, because haiku are not supposed to name the season directly)…
…the program repeatedly used the word “winter,” even though I told it not to. It kept using words like “snowflakes” and “ice,” which naturally do bring “winter” to mind, but it couldn’t seem to figure out how else to describe the season.
The limerick was much worse…
“Faugh a ballagh!” Honestly. Let’s be a little more stereotypical, shall we?
I was hoping the program would substitute an actual Irish player’s name, but it simply repeated the phrase “Irish baseball player.” In subsequent tries, it identified the player as playing for a baseball team in Cork.
So I suppose that means that it didn’t have access to the various historical writings by SABR (Society for American Baseball Research) members about early Irish players in the US (yes, actually born in Ireland, and there were many, many, many).
Now, this is supposed to be the raison d’être of the program, and indeed, it is amazing how fast this thing can create dialogues in flawless English (more on that later).
But naturally there are anachronisms. Such as, ah, Thomas Aquinas being familiar with the theory of evolution.
(I forgot to copy the one between God and Nietzsche, but it disappointingly did not include the expected phrase by Nietzsche: “Huh. I thought you were dead?”)
As with the dialogue between Sartre and Kant, the exchange sounded more like a first-person college essay than an actual conversation. Real people don’t speak in perfect prose. They make grammar mistakes, use the wrong word, repeat themselves, backtrack and go off on tangents…all sorts of chaos.
Chatbots simply can’t do that, because they are not true AI (yet, TG). They rely on existing text and string together words based on their programming. I suppose one could argue that that is what people do, but our brain is not a computer. It’s much more complicated.
We would certainly never have Aquinas — a 13th-century Italian priest and religious scholar whose “five ways” to prove God’s existence are well known even to non-Christians — end a dialogue by saying “Ultimately, each person must come to their own conclusions about the existence of God.”
I didn’t even bother taking a photo of the fiction prose the program spat out, because it was incredibly descriptive and horribly boring at the same time.
In two attempts, the chatbot wrote exactly one line of dialogue — “Are you okay?”
The rest was prose. That’s it. Even the attempt to get it to write like PKD about robots was a total failure. It simply wrote descriptions, even though it finally did give one character a name (Rachael, not surprisingly).
But nothing happened. There was no plot, no dialogue, no character development, no tension or conflict to be resolved.
Fiction writers, your careers are safe. At least until we can figure out how to make a better prompt (more like, direct the program to publicly available texts, such as the Bible — I have to admit, the biblical story of removing a sandwich from the VCR is pretty funny).
Here is where the program shined. Scarily so.
I copied the four essays written by the chatbot and pasted them into MS Word to check.
As usual, MS Word (a program designed to torture business office workers around the world) identified “reflected back” and “in order to” as being redundant (although we do use these phrases and they sound perfectly natural).
In other essays, it insisted that a comma be inserted before each and every conjunction (and, but), regardless of whether there was an SVO sentence before and after the word (i.e., it would insist on a comma prior to the previous “and” in this sentence!). And it didn’t like phrases such as “the majority of people in Japan” (MS Word thinks “most” or “many” is a “more precise” phrase…uh…no, they are exactly the opposite, MUCH less precise).
So the chatbot writes essays even better than what MS Word is programmed to judge. Scary.
In the three versions of the “compare US to Japan” essay, the program used three different points of comparison; the only point that remained the same across all three was “collectivism” versus “individualism,” which is indeed a typical comparison.
However, the program also failed to use any hedging modifiers such as “tend to” or “are more likely to,” which places the essays firmly in the “Essentialism” camp — i.e., claiming that all people in a certain group behave and think exactly alike.
It also never came close to 300 words, topping out at around 220 in its third attempt. The conclusion paragraph was adequate, though short, and the introduction paragraph was incredibly short — a single sentence in one case.
So, even though the chatbot has its issues — obviously, it didn’t bother with references, since it would have to cite virtually the entire internet — it can produce reasonably well-written essays in borderline academic prose.
And that is what has some people up in arms. Even newspaper columnists are musing whether their jobs are safe. After all, couldn’t we just tell a chatbot to write an editorial or opinion column, given the right prompt? And college professors worry that “the college essay is dead” — while others laugh at the idea of professors using a program to automatically judge papers that are automatically generated.
For me, as an EFL (English as a Foreign Language) writing instructor, it has become progressively more difficult to determine which essays are written entirely by the student and which are written using machine translation. But I’ve never had to deal with an entire essay being generated by the machine. And that is very likely to happen once my Japanese students figure this out (i.e., once the news stories about chatbots are translated — by DeepL — into Japanese).
So I figure: there’s very little I can do about it. It’s not going away and will likely only get better, so the focus should be on students learning something they want to learn, rather than on how many vocabulary or grammar mistakes they make.
Which is to say, continue to teach writing as I always have. As a means of personal expression and growth and communication with one’s peers.
And naturally I will continue to write my own fiction the old-fashioned way…