{"id":140,"date":"2024-02-07T12:09:44","date_gmt":"2024-02-07T11:09:44","guid":{"rendered":"https:\/\/blog.zhaw.ch\/artificial-intelligence\/?p=140"},"modified":"2024-02-07T22:22:06","modified_gmt":"2024-02-07T21:22:06","slug":"understanding-language-the-human-vs-chatgpt-perspective","status":"publish","type":"post","link":"https:\/\/blog.zhaw.ch\/artificial-intelligence\/2024\/02\/07\/understanding-language-the-human-vs-chatgpt-perspective\/","title":{"rendered":"Understanding Language: The Human vs. ChatGPT Perspective"},"content":{"rendered":"\n<h2 class=\"wp-block-heading\" id=\"h-what-is-the-difference-between-human-and-chatgpt-s-understanding-of-language-let-s-ask-linguistics\">What is the difference between human and ChatGPT\u2019s understanding of language? Let\u2019s ask linguistics.<\/h2>\n\n\n\n<p>It\u2019s over a year since ChatGPT\u2019s release to the world. There is still a lot of hype going around regarding whether ChatGPT and its later versions, including GPT-4, have a \u201ctrue\u201d understanding of human language and whether we are now close to an Artificial General Intelligence (AGI) that is as intelligent as (or even more intelligent than) humans and could possibly enslave humanity. In this article, we will explore the fabric of what ChatGPT learns about human language from a linguistic vantage point, uncovering the similarities and vast disparities to our human understanding, which will hopefully help to put some of the claims made by AGI evangelists into perspective and alleviate fears of AI domination of the world.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-how-does-chatgpt-learn-human-language\"><strong>How does ChatGPT \u201clearn\u201d human language?<\/strong><\/h3>\n\n\n\n<p>It\u2019s shockingly simple. Large Language Models (LLMs) like ChatGPT are trained with a task called \u201cnext token prediction\u201d. This simply means that the model is presented with a few words of a human written sentence (e.g. 
taken from Wikipedia) and is asked to predict the next word. The model prediction is compared to the actual word that follows in the sentence. If the prediction is incorrect, the model (which is a neural network, i.e. lots of matrices stacked onto each other) is updated so that the next time it sees the sentence (or a similar one), it is more likely that the actual next token is predicted. That\u2019s it, really.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large is-resized\"><img decoding=\"async\" src=\"https:\/\/jalammar.github.io\/images\/gpt3\/03-gpt3-training-step-back-prop.gif\" alt=\"\" style=\"aspect-ratio:1.7797029702970297;width:768px;height:auto\" \/><figcaption class=\"wp-element-caption\">Source: Alammar, Jay. 2020.&nbsp;<a href=\"https:\/\/jalammar.github.io\/how-gpt3-works-visualizations-animations\/\">How GPT3 Works &#8211; Visualizations and Animations<\/a>.<\/figcaption><\/figure>\n<\/div>\n\n\n<p>Now, the next shocking thing is that this actually works quite well, and we discovered the \u201cscaling laws\u201d: The more you scale up the neural network size (more matrix stacking) and the more sentences you give the model to train on, the more capable the model becomes. Google Research has made a nice analysis and visualization of the scaling laws:<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large is-resized\"><img decoding=\"async\" src=\"https:\/\/blogger.googleusercontent.com\/img\/b\/R29vZ2xl\/AVvXsEgLXCWMlipdu0gFF6hsiJHbxg1zSaEkdDWfl-8RakQuW__8RPvlOS9KGIScNCytxT4jz9isnx0GLMwbS1G0Q4WdXzT42GszgfwIIAVX1H3J-43lVWWqcb--q9cPsxCsJFFz2dRfpKgEmLe-xfIyBqQuPq1BPYcK9CtAK1_xnhgvgAAx0GeZmODJxGNMYQ\/s16000\/image8.gif\" alt=\"\" style=\"aspect-ratio:2.7444253859348198;width:686px;height:auto\" \/><figcaption class=\"wp-element-caption\">Source: Narang, S. and Chowdhery, A. 2022. 
<a href=\"https:\/\/blog.research.google\/2022\/04\/pathways-language-model-palm-scaling-to.html\">Pathways Language Model (PaLM): Scaling to 540 Billion Parameters for Breakthrough Performance<\/a>.<\/figcaption><\/figure>\n<\/div>\n\n\n<p>They found that a model with 8 billion parameters (8 billion numbers in those stacked matrices) can do some question answering quite well, but it isn\u2019t any good at, for example, translation or summarization. When the model size is increased to 540 billion parameters (while the architecture and training procedure stay the same), the capability to, for instance, translate and summarize text emerges in the model. Surprisingly, we can\u2019t really explain <em>why<\/em> this happens, but it\u2019s something that we observe and exploit by making models bigger and bigger.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-alright-that-s-how-chatgpt-learns-language-why-does-it-know-so-much\"><strong>Alright, that\u2019s how ChatGPT learns language. Why does it know so much?<\/strong><\/h3>\n\n\n\n<p>ChatGPT and its siblings have ingested billions of words in their training &#8211; a lot more than any human reads in their upbringing. 
To give you an impression of how much text 300 billion \u201ctokens\u201d (ChatGPT&#8217;s version of words)<sup data-fn=\"68457223-88d3-48cc-8182-e9de558f9366\" class=\"fn\"><a href=\"#68457223-88d3-48cc-8182-e9de558f9366\" id=\"68457223-88d3-48cc-8182-e9de558f9366-link\">1<\/a><\/sup> &#8211; the training data size of GPT-3 &#8211; amount to: If you print out 300 billion tokens on A4 paper (500 words per page) and stack the pages, you get a tower that is roughly 60 kilometers high &#8211; above the top of the stratosphere, or the height of more than 70 Burj Khalifas stacked on top of each other.\u00a0<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"784\" height=\"483\" src=\"https:\/\/blog.zhaw.ch\/artificial-intelligence\/files\/2024\/02\/Outer_Space.png\" alt=\"\" class=\"wp-image-143\" style=\"aspect-ratio:1.6231884057971016;width:578px;height:auto\" srcset=\"https:\/\/blog.zhaw.ch\/artificial-intelligence\/files\/2024\/02\/Outer_Space.png 784w, https:\/\/blog.zhaw.ch\/artificial-intelligence\/files\/2024\/02\/Outer_Space-300x185.png 300w, https:\/\/blog.zhaw.ch\/artificial-intelligence\/files\/2024\/02\/Outer_Space-768x473.png 768w\" sizes=\"auto, (max-width: 784px) 100vw, 784px\" \/><figcaption class=\"wp-element-caption\">Source: Wikipedia. <a href=\"https:\/\/en.wikipedia.org\/wiki\/Outer_space\">Outer space<\/a>.<\/figcaption><\/figure>\n<\/div>\n\n\n<p>Clearly, that\u2019s a lot more than any human has ever read &#8211; you could only manage it in about 5,700 years if you read 200 words per minute for 12 hours a day. When it comes to reading history and volume, ChatGPT clearly has the upper hand.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-okay-so-chatgpt-has-ingested-a-lot-more-texts-than-i-can-ever-read-what-s-the-difference-then-between-my-language-understanding-and-chatgpt-s\"><strong>Okay, so ChatGPT has ingested a lot more texts than I can ever read. 
What\u2019s the difference then between my language understanding and ChatGPT\u2019s?<\/strong><\/h3>\n\n\n\n<p>Indeed, isn\u2019t ChatGPT simply more powerful than you because of how much it has read? Well, let\u2019s take a look at how we humans understand and represent the meaning of a word that we have read, let\u2019s say the word \u201cpipe\u201d (as in smoking pipe). There is a famous painting by the painter Ren\u00e9 Magritte:<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"378\" height=\"264\" src=\"https:\/\/blog.zhaw.ch\/artificial-intelligence\/files\/2024\/02\/image.jpeg\" alt=\"\" class=\"wp-image-163\" style=\"aspect-ratio:1.4318181818181819;width:320px;height:auto\" srcset=\"https:\/\/blog.zhaw.ch\/artificial-intelligence\/files\/2024\/02\/image.jpeg 378w, https:\/\/blog.zhaw.ch\/artificial-intelligence\/files\/2024\/02\/image-300x210.jpeg 300w\" sizes=\"auto, (max-width: 378px) 100vw, 378px\" \/><figcaption class=\"wp-element-caption\">Source: Wikipedia. <a href=\"https:\/\/en.wikipedia.org\/wiki\/The_Treachery_of_Images\">The Treachery of Images<\/a>.<\/figcaption><\/figure>\n<\/div>\n\n\n<p>The painting contains a written text that says \u201cThis is not a pipe\u201d, while there is clearly a pipe visible in the painting. But wait&#8230; is there really a pipe in the painting? Of course not, there is only a <em>painting<\/em> of a pipe. This funny and ingenious painting demonstrates an intuitive but not always obvious distinction in a simple way: There is a difference between <em>the things we refer to<\/em> and <em>the things we use to refer to them<\/em>. 
Trivially, a painting of a pipe is not a \u201creal\u201d pipe; it can be seen as a <em>symbol<\/em> that references a (real or imaginary) pipe.<\/p>\n\n\n\n<p>You might be wondering by now why we are talking about paintings of pipes when we are trying to drill down on the difference between our understanding of language and that of large language models. Bear with me, because through this door, we have entered the domain of semiotics, the study of signs and symbols.&nbsp;We can apply this distinction between real-world things and the symbols that we use to reference them seamlessly to language and words: The Swiss linguist Ferdinand de Saussure presented a well-known model that consists of \u201cthe signified\u201d (<em>signifi\u00e9<\/em>; i.e. a real pipe) and \u201cthe signifier\u201d (<em>signifiant<\/em>; i.e. the word <em>pipe<\/em> or Magritte\u2019s painting of a pipe).<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large is-resized\"><img decoding=\"async\" src=\"https:\/\/image1.slideserve.com\/3113968\/saussure-s-sign-definition1-l.jpg\" alt=\"\" style=\"aspect-ratio:1.3333333333333333;width:526px;height:auto\" \/><figcaption class=\"wp-element-caption\">Source: <a href=\"https:\/\/www.slideserve.com\/gypsy\/ferdinand-de-saussure-charles-sanders-peirce-semiology-based-on-m-jgan-b-y-kta-work\">FERDINAND DE SAUSSURE CHARLES SANDERS PEIRCE SEMIOLOGY Based on M\u00fcjgan B\u00fcy\u00fckta\u015f \u2019 work<\/a>.<\/figcaption><\/figure>\n<\/div>\n\n\n<p>While this distinction seems trivial and obvious in hindsight, it forms the basis for a more refined model of how we humans use symbols to communicate with each other &#8211; a model we can use to illustrate how humans and large language models learn and represent the meaning of words differently: The Semiotic Triangle.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large is-resized\"><img 
loading=\"lazy\" decoding=\"async\" width=\"1995\" height=\"1563\" src=\"https:\/\/blog.zhaw.ch\/artificial-intelligence\/files\/2024\/02\/image.png\" alt=\"\" class=\"wp-image-164\" style=\"aspect-ratio:1.276391554702495;width:514px;height:auto\" srcset=\"https:\/\/blog.zhaw.ch\/artificial-intelligence\/files\/2024\/02\/image.png 1995w, https:\/\/blog.zhaw.ch\/artificial-intelligence\/files\/2024\/02\/image-300x235.png 300w, https:\/\/blog.zhaw.ch\/artificial-intelligence\/files\/2024\/02\/image-1024x802.png 1024w, https:\/\/blog.zhaw.ch\/artificial-intelligence\/files\/2024\/02\/image-768x602.png 768w, https:\/\/blog.zhaw.ch\/artificial-intelligence\/files\/2024\/02\/image-1536x1203.png 1536w\" sizes=\"auto, (max-width: 1995px) 100vw, 1995px\" \/><figcaption class=\"wp-element-caption\">Source: <a href=\"https:\/\/en.wikipedia.org\/wiki\/Triangle_of_reference\">Triangle of reference<\/a>.<\/figcaption><\/figure>\n<\/div>\n\n\n<p>While the semiotic triangle includes the symbol (signifier) and the referent (signified) of de Saussure\u2019s model, it adds a third, crucial component &#8211; the \u201creference\u201d. The reference is your internal mental representation of the real-world objects around you &#8211; your mental image of a pipe, for example. I can talk to you about a pipe without one being in the room with us, because I (and you) have built an internal mental representation of a pipe: the <em>reference<\/em> in the semiotic triangle. I can use a \u201csymbol\u201d (i.e. the word <em>pipe<\/em>) to verbalize this reference for communication with you, and, since you also have a <em>reference<\/em> of a pipe, we can talk about it, although there is no pipe before us that we can point to (and we can also talk about Magritte\u2019s painting of a pipe, i.e. 
we use a textual symbol to reference a visual symbol of a non-existent pipe &#8211; pretty wild!).&nbsp;<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-symbols-real-world-objects-and-our-internal-reference-to-them-got-it-we-use-symbols-to-verbalize-our-mental-references-of-things-to-talk-about-the-things-the-semiotic-triangle-neat-now-how-does-this-help-me-to-understand-the-difference-to-how-chatgpt-understands-language\"><strong>Symbols, real-world objects, and our internal reference to them, got it. We use symbols to verbalize our mental references of things to talk about the things &#8211; the semiotic triangle, neat. Now how does this help me understand how ChatGPT\u2019s understanding of language differs?<\/strong><\/h3>\n\n\n\n<p>One crucial aspect of this model is the mental representation, the reference. We humans build and shape our mental representation of things as multi-sensory beings, in an organic structure (the brain) that is directly coupled to our senses. Our representation of things (the references) is an amalgam of bio-chemical and bio-electrical processes and experiences of our interactions with the real world and its effects on us. Let\u2019s take an emotionally loaded word like \u201clove\u201d, for example: If we think about the word, it can trigger all sorts of memories and associations &#8211; our personal experiences, songs, poems, movies; the activation of the representation can even evoke physical pain or exaltation, i.e. 
the activation of the reference that in turn re-activates certain bio-chemical processes.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"782\" height=\"415\" src=\"https:\/\/blog.zhaw.ch\/artificial-intelligence\/files\/2024\/02\/Love.png\" alt=\"\" class=\"wp-image-145\" style=\"aspect-ratio:1.8843373493975903;width:546px;height:auto\" srcset=\"https:\/\/blog.zhaw.ch\/artificial-intelligence\/files\/2024\/02\/Love.png 782w, https:\/\/blog.zhaw.ch\/artificial-intelligence\/files\/2024\/02\/Love-300x159.png 300w, https:\/\/blog.zhaw.ch\/artificial-intelligence\/files\/2024\/02\/Love-768x408.png 768w\" sizes=\"auto, (max-width: 782px) 100vw, 782px\" \/><figcaption class=\"wp-element-caption\">Source: Author.<\/figcaption><\/figure>\n<\/div>\n\n\n<p>Now, an interesting property of the relation between the symbol and the reference (or referent in de Saussure\u2019s model) is that it is <strong>arbitrary<\/strong>. There is no system, algorithm, or rule that helps you figure out why a tree is called <em>tree<\/em>. Nothing, none of the properties of a real-world tree gives you any hint why you should call it <em>tree<\/em>; the sequence of characters <em>tree<\/em> does not have any obvious connection to an actual tree. You could use any other (made up) word to refer to a tree, so long as all other speakers agree to use this word to refer to trees. That is, words can be seen as arbitrary artifacts that we use to communicate about our representations of things in the world. 
We learn the assignment of words to real-world entities throughout our childhood, through demonstration and experience (your parents point to a tree and say <em>This is a tree<\/em>).<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-i-sense-that-you-ll-be-saying-that-a-computer-is-not-a-bio-chemical-system-and-that-s-the-difference-but-chatgpt-is-an-artificial-neural-network-isn-t-that-somehow-like-the-human-brain\"><strong>I sense that you\u2019ll be saying that a computer is not a bio-chemical system and that\u2019s the difference. But ChatGPT is an artificial neural network, isn\u2019t that somehow like the human brain?<\/strong><\/h3>\n\n\n\n<p>Sorry for taking so long &#8211; we are arriving at the crux! Yes, artificial neural networks are inspired by the human brain, but that\u2019s not the important part now. Let\u2019s now (finally) put the semiotic triangle to good use to see what differs between human word representation and word representations in large language models. The main differences are simply:&nbsp;<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>ChatGPT never has direct access to the referents<\/strong>, i.e. the real-world objects. It has no memory of real-world interaction with, for example, a tree. It has no sensory apparatus to experience a tree.<\/li>\n\n\n\n<li>Therefore, it <strong>cannot build a reference<\/strong> of objects in the way we humans do.<\/li>\n\n\n\n<li>There is nothing that the symbols in the triangle can be assigned to!<\/li>\n<\/ol>\n\n\n\n<p>So ChatGPT is missing two out of three components that our language understanding encompasses in the Semiotic Triangle!<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-but-then-how-can-chatgpt-learn-anything-about-language-at-all-it-clearly-does\"><strong>But then how can ChatGPT learn anything about language at all? It clearly does!<\/strong><\/h3>\n\n\n\n<p>Good point! 
What actually happens during the training of LLMs is that they do build a kind of <em>reference<\/em>, or better, <em>representation<\/em> &#8211; but of the <em>symbols<\/em>, i.e. the words, <em>not of the referents<\/em> (the models have no access to the real-world objects). These LLM representations are simply numbers in matrices that are optimized during the training procedure. As we said at the beginning of the article, the task used to optimize these numbers is next word prediction: Given, for example, an incomplete sentence, what is the next (missing) word in the sentence? That is, LLMs learn to predict which words fit into which contexts, and they do so by updating a randomly initialized matrix representation of words which they use for the predictions. That means <strong><em>everything that an LLM ever \u201csees\u201d in the semiotic triangle is symbols<\/em><\/strong>. Through optimizing the word prediction task, the underlying neural network (the word representations) in LLMs picks up patterns and regularities in the word sequences. These representations can be thought of as representing the meaning of words, similar to how the reference in the human brain represents the meaning of a word. <strong>Crucially, this meaning representation in LLMs is formed only by observing symbols and the relations between them.<\/strong> And this works surprisingly well!&nbsp;<br>There is an actual theory of meaning in linguistics that describes this kind of meaning acquisition through symbol observation alone: <a href=\"https:\/\/en.wikipedia.org\/wiki\/Distributional_semantics\">Distributional Semantics<\/a>. This is, in fact, the linguistic theory on which modern NLP has relied since the inception of word embeddings like Word2Vec (and even before that)! So what does it say? A colloquial phrasing of distributional semantics is: \u201cKnow a word by the company it keeps\u201d, or \u201cThe meaning of a word is derived from the context it occurs in\u201d. 
This observation is the basis for, for example, Cloze tests in language learning, where learners have to fill missing words into a given sentence. And this is exactly the task that LLMs learn to solve!<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-so-what-does-this-difference-mean-for-the-language-understanding-of-chatgpt\"><strong>So what does this difference mean for the language understanding of ChatGPT?<\/strong><\/h3>\n\n\n\n<p>Or, put differently: Does it really matter that LLMs have a different mechanism for learning meaning representations? The answer is: it depends. We all know that we can achieve amazing things with LLMs that took way more effort before November 2022 or weren\u2019t even thought possible back then. There are two things to consider:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>One has to be aware of the limitations of the meaning representations of <strong>LLMs<\/strong>: Because they lack a human reference system, they <strong>cannot infer and reason in the same way we do<\/strong>. For creating a spam classifier this is probably less crucial than for trying to automate essay grading, judicial decisions, or governmental policies.&nbsp;<\/li>\n\n\n\n<li>The <strong>discussion of the potential emergence of Artificial General Intelligence or human-like intelligence in LLMs is rarely informed by the linguistic aspects of language learning and processing<\/strong>. Having a linguistic understanding of what happens when we communicate certainly helps clear up some expectations and fears in that regard!<\/li>\n<\/ol>\n\n\n\n<p>I hope this article helped you gain a (better) understanding of what ChatGPT does with language. If you\u2019d like to read on, there are two bonus questions below! <\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-dots\" \/>\n\n\n<ol class=\"wp-block-footnotes\"><li id=\"68457223-88d3-48cc-8182-e9de558f9366\">Tokens are ChatGPT&#8217;s version of words. 
A token is usually a shorter, simplified version of a word. For example, the word \u201cplaying\u201d is divided into two tokens, \u201cplay\u201d and \u201c###ing\u201d. Similarly, German compound words are split into their constituents, e.g. \u201cRindfleisch\u201d becomes \u201cRind\u201d and \u201c###fleisch\u201d. This reduces the size of the vocabulary (the set of tokens) that ChatGPT has to learn. <a href=\"#68457223-88d3-48cc-8182-e9de558f9366-link\" aria-label=\"Jump to footnote reference 1\">\u21a9\ufe0e<\/a><\/li><\/ol>\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-bonus-1-i-don-t-like-this-circular-definition-of-word-meaning-that-chatgpt-uses-can-t-we-do-something-else-to-teach-machines-human-language\">Bonus 1: <strong>I don\u2019t like this circular definition of word meaning that ChatGPT uses. Can\u2019t we do something else to teach machines human language?<\/strong><\/h3>\n\n\n\n<p>Indeed, it is quite surprising how brittle modern language models can sometimes seem when they fail at the simplest tasks. One explanation for these failures is precisely that they do not have any experience of the real world and are trapped in circles of symbols. Cognitive scientist Stevan Harnad called this the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Symbol_grounding_problem\">symbol grounding problem<\/a>: How can machines acquire the meaning of symbols without ever interacting with the referents in the physical world? The concept of embodied cognition stipulates that our cognition, communication, and language are strongly driven by our biological shape, its features, and its sensory capacities. So do you need a human-like body to understand human language and to develop general intelligence? 
This idea and many failures in classical AI have led to a partial paradigm shift in research towards <a href=\"https:\/\/link.springer.com\/chapter\/10.1007\/978-3-540-27833-7_1\">embodied AI<\/a>.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-bonus-2-but-if-chatgpt-doesn-t-understand-language-as-i-do-why-does-it-get-what-i-ask-it-to-do-so-well\">Bonus 2: <strong>But if ChatGPT doesn&#8217;t &#8220;understand&#8221; language as I do, why does it \u201cget\u201d what I ask it to do so well?<\/strong><\/h3>\n\n\n\n<p>That&#8217;s a good question. How is it possible that a token-prediction machine understands my instructions? The trick that teaches language models to follow your instructions so well is showing them a large number (tens of thousands) of instructions and their completions. These examples of instruction following are (in most cases) hand-crafted and are the last batch of input data that such a language model sees in the token-prediction training stage.&nbsp;<\/p>\n\n\n\n<p>What does this hand-crafted instruction-following data look like? The <a href=\"https:\/\/open-assistant.io\/\">OpenAssistant project<\/a> collected and curated a large amount of such instructions and made them freely available to anyone. 
Let\u2019s look at an example:<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"806\" height=\"206\" src=\"https:\/\/blog.zhaw.ch\/artificial-intelligence\/files\/2024\/02\/Datasets-at-Hugging-Face.png\" alt=\"\" class=\"wp-image-147\" srcset=\"https:\/\/blog.zhaw.ch\/artificial-intelligence\/files\/2024\/02\/Datasets-at-Hugging-Face.png 806w, https:\/\/blog.zhaw.ch\/artificial-intelligence\/files\/2024\/02\/Datasets-at-Hugging-Face-300x77.png 300w, https:\/\/blog.zhaw.ch\/artificial-intelligence\/files\/2024\/02\/Datasets-at-Hugging-Face-768x196.png 768w\" sizes=\"auto, (max-width: 806px) 100vw, 806px\" \/><figcaption class=\"wp-element-caption\">Source: <a href=\"https:\/\/huggingface.co\/datasets\/OpenAssistant\/oasst_top1_2023-08-25?row=9\">OpenAssistant TOP-1 Conversation Threads<\/a>.<\/figcaption><\/figure>\n<\/div>\n\n\n<p>The dataset contains special tokens in angled brackets that signify turn-taking in the conversation, and the first token of a turn indicates who is speaking. This dataset contains over 13,000 such examples. Many more such datasets are available under open licences.&nbsp;<\/p>\n\n\n\n<p>What effect does instruction tuning then have on a language model? 
Let&#8217;s look at an example where we use the same (arbitrary) writing prompt for GPT-3 (not instruction fine-tuned) and ChatGPT.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"905\" height=\"387\" src=\"https:\/\/blog.zhaw.ch\/artificial-intelligence\/files\/2024\/02\/Curie.png\" alt=\"\" class=\"wp-image-148\" srcset=\"https:\/\/blog.zhaw.ch\/artificial-intelligence\/files\/2024\/02\/Curie.png 905w, https:\/\/blog.zhaw.ch\/artificial-intelligence\/files\/2024\/02\/Curie-300x128.png 300w, https:\/\/blog.zhaw.ch\/artificial-intelligence\/files\/2024\/02\/Curie-768x328.png 768w\" sizes=\"auto, (max-width: 905px) 100vw, 905px\" \/><figcaption class=\"wp-element-caption\">Source: GPT-3 (curie)<\/figcaption><\/figure>\n<\/div>\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"653\" height=\"545\" src=\"https:\/\/blog.zhaw.ch\/artificial-intelligence\/files\/2024\/02\/ChatGPT.png\" alt=\"\" class=\"wp-image-149\" srcset=\"https:\/\/blog.zhaw.ch\/artificial-intelligence\/files\/2024\/02\/ChatGPT.png 653w, https:\/\/blog.zhaw.ch\/artificial-intelligence\/files\/2024\/02\/ChatGPT-300x250.png 300w\" sizes=\"auto, (max-width: 653px) 100vw, 653px\" \/><figcaption class=\"wp-element-caption\">Source: ChatGPT 3.5<\/figcaption><\/figure>\n<\/div>\n\n\n<p>As we can see, there is a big difference between the two outputs. The Curie GPT-3 model, which is not instruction fine-tuned, picks up the style and pattern of the prompt and reproduces it in variations. It doesn\u2019t recognize and react to the <em>pragmatic intent<\/em> of the instruction in the prompt. ChatGPT, by contrast, \u201cgets\u201d what the prompt asks for and delivers the desired text. 
That is, through examples of instruction following, ChatGPT learns to process the pragmatics of human language to some extent.<\/p>\n<div class=\"pt-sm\">Schlagw\u00f6rter: <a href=\"https:\/\/blog.zhaw.ch\/artificial-intelligence\/tag\/chatgpt\/\">chatgpt<\/a>, <a href=\"https:\/\/blog.zhaw.ch\/artificial-intelligence\/tag\/generative-ai\/\">generative AI<\/a>, <a href=\"https:\/\/blog.zhaw.ch\/artificial-intelligence\/tag\/language-models\/\">language models<\/a><br><\/div>","protected":false},"excerpt":{"rendered":"<p>What is the difference between human and ChatGPT\u2019s understanding of language? Let\u2019s ask linguistics. It\u2019s over a year since ChatGPT\u2019s release to the world. There is still a lot of hype going around regarding whether ChatGPT and its later versions, including GPT-4, have a \u201ctrue\u201d understanding of human language and whether we are now close [&hellip;]<\/p>\n","protected":false},"author":719,"featured_media":162,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"ngg_post_thumbnail":0,"footnotes":"[{\"id\":\"68457223-88d3-48cc-8182-e9de558f9366\",\"content\":\"Tokens are ChatGPT's version of words. A token is usually a bit shorter and a simplified version of a word. For example, the word \\u201cplaying\\u201d is divided into two tokens, \\u201cplay\\u201d and \\u201c###ing\\u201d. Similarly, German compound words are split into their constituents e.g. \\u201cRindfleisch\\u201d becomes \\u201cRind\\u201d \\u201c###fleisch\\u201d etc. 
<p>This reduces the size of the vocabulary (the set of tokens) that ChatGPT has to learn.</p>
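<p>The effect of subword tokenization on vocabulary size can be illustrated with a small sketch. This is a hypothetical toy example, not ChatGPT&#8217;s actual tokenizer (which uses a learned byte-pair-encoding vocabulary): with just a handful of subword pieces, many full words can be composed, so the model never needs a separate vocabulary entry for each surface word.</p>

```python
# Toy illustration: a few subword pieces cover many full words,
# so the vocabulary stays small. (Hypothetical example; ChatGPT's
# real tokenizer is a learned byte-pair encoding with ~100k tokens.)
subwords = ["un", "break", "able", "think", "ing"]

def greedy_tokenize(word, pieces):
    """Greedily split a word into known subword pieces, longest match first."""
    tokens = []
    i = 0
    while i < len(word):
        for p in sorted(pieces, key=len, reverse=True):
            if word.startswith(p, i):
                tokens.append(p)
                i += len(p)
                break
        else:
            raise ValueError(f"no known piece matches {word!r} at position {i}")
    return tokens

print(greedy_tokenize("unbreakable", subwords))  # ['un', 'break', 'able']
print(greedy_tokenize("unthinkable", subwords))  # ['un', 'think', 'able']
```

<p>Two unseen words are covered by reusing five pieces; scaled up, a vocabulary of some tens of thousands of subwords can represent essentially any text, including rare words and typos, which is what makes next-token prediction over a fixed token set feasible.</p>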