Evaluating ChatGPT


Over the last few weeks and months, like so many other people, I had some fun with ChatGPT. Its conversational abilities are quite impressive. It can also translate rather well, into and from more languages than Google Translate and DeepL, including some fairly obscure ones, like Interlingua, Volapük and Kotava.

But ChatGPT has one big problem: it is quite unreliable with facts. This may be due to my trick questions, but too often ChatGPT came up with statements that are just not true. It attributes half-true facts to the wrong people, it mixes up oceans, and it invents stories about people that may even be regarded as defamatory or slanderous.

Perhaps if the makers of ChatGPT could supplement it with a component that works in a different way and checks facts against reliable sources, ChatGPT could become a useful tool for tasks other than just post-edited machine translation (PEMT). As it is, in my opinion it is unsuitable for anything other than PEMT, and I dare say ChatGPT is even dangerous, because people may be inclined to trust it far more than it deserves.

In this directory I publish parts of some of my adventures with ChatGPT. Where I leave out boring or irrelevant parts, I’ll try to be honest and avoid misrepresenting ChatGPT by omitting context.

My prompts will be displayed like this, and what ChatGPT generated in response will look like this. In between, I will sometimes add clarifying comments, in the same typeface as this text, that were not part of the conversations with ChatGPT.