Thoughts on ChatGPT

I recently finished DeepLearning.AI's new course on ChatGPT and wanted to write down a few thoughts and observations about this powerful large language model (LLM). I have also experimented with the model through OpenAI's user-friendly web interface.

1.) The model likes structure and specificity. If you give it vague or ambiguous prompts, it will do its best to respond appropriately, but the more detail you provide about what you are looking for (tone, purpose, role, etc.), the more effective and relevant its response will be. How you present the input matters too. For example, ChatGPT struggles to reverse the letters of a word like "lollipop" because it processes text as tokens (chunks of characters, reminiscent of breaking words into roots, prefixes, and suffixes) rather than as individual letters. On the other hand, if you ask ChatGPT to reverse "l-o-l-l-i-p-o-p", essentially telling it to treat the word as a series of letters, it is able to do so!
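
To make the "lollipop" trick concrete, here is a minimal Python sketch of how one might spell a word out before putting it in a prompt. The function names and the prompt wording are my own illustration, not anything taken from the course or from OpenAI, and the script only builds the prompt text and the answer we would hope to get back; it does not call the model.

```python
def spell_out(word: str, sep: str = "-") -> str:
    """Join the letters of a word with a separator, e.g. 'cat' -> 'c-a-t'."""
    return sep.join(word)


def build_reverse_prompt(word: str) -> str:
    """Build a prompt asking for a letter-by-letter reversal.

    The wording is a hypothetical example, not an official template.
    """
    return f'Reverse the order of the letters in "{spell_out(word)}".'


def expected_reversal(word: str, sep: str = "-") -> str:
    """Compute the reversal locally so a model's reply can be checked against it."""
    return sep.join(reversed(word))


if __name__ == "__main__":
    print(build_reverse_prompt("lollipop"))
    print(expected_reversal("lollipop"))
```

Computing the expected answer locally is handy because it gives you something to compare the model's reply against when trying different phrasings.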

2.) The version freely available to the public (GPT-3.5) has only been trained on data through 2021. This raises the question of how to most efficiently retrain and update LLMs as events occur and the information available on the Internet grows every second. Is there an efficient and reliable way to keep training the model continuously? Or would discrete updates be needed, say once a month, akin to updates and patches for apps on your phone?

3.) The model is configured with a "temperature" parameter, essentially a stochastic element between 0 and 2, where higher temperatures correspond to more random responses. This is definitely an important parameter to be aware of. Depending on the use case, one may want to keep this value near 0 to get predictable responses and reduce the chance of inappropriate ones. Larger values around 1 may be desirable to produce more creative, less predictable responses. However, I would definitely be wary of values closer to 2: I noticed that ChatGPT's responses became incoherent for temperatures above 1.5 or so.
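
For intuition on why temperature has this effect, here is a small sketch of temperature-scaled softmax sampling, the textbook way temperature is usually described for language models. I am assuming ChatGPT does something along these lines; the course does not spell out the internals, and the scores below are made-up numbers rather than real model output.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, dividing by the temperature first.

    A temperature of exactly 0 would divide by zero; in practice that case
    is usually handled by simply picking the highest-scoring token.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    # Hypothetical scores for three candidate next tokens.
    logits = [2.0, 1.0, 0.1]
    for t in (0.2, 1.0, 2.0):
        probs = softmax_with_temperature(logits, t)
        print(t, [round(p, 3) for p in probs])
```

At low temperatures nearly all of the probability lands on the top-scoring token, while at high temperatures the distribution flattens out and unlikely tokens get picked more often, which fits with the incoherent output I saw near 2.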

4.) LLMs like ChatGPT certainly appear to possess something resembling human critical thinking and verbal skills, but ultimately they are re-expressing the data on which they have been trained. That is to say, it is unclear to what extent present LLMs can go beyond their training data and generate novel ideas. Take, for example, powerful search engines such as Google, which let users search for information on the Internet by typing queries into the browser. In some sense, ChatGPT provides a similar service: its output is informed by the vast amounts of Internet data on which it was trained. But rather than providing links to sources relevant to the user's prompt, it returns a re-processed remix of that information as a reasonably concise passage that mirrors human language.

5.) An important topic in AI ethics is combatting the biases that can be imparted to models by their training data. I was both impressed and disappointed to find that ChatGPT can produce output in my family's native language of Telugu: on closer inspection, the actual words being used were not quite right, despite heading in the right direction. The differences in ChatGPT's awareness of different languages and cultures are interesting and something to be monitored. When we allow our common tools to prioritize certain cultures, intentionally or unintentionally, we end up marginalizing others.

6.) Prompt-based machine learning is a significant abstraction of coding. You essentially spell out how you want the LLM to behave in clear human language rather than going through an extended programming process. My sense is that AI and machine learning have become increasingly abstracted and user-friendly over the years. TensorFlow was at one point considered a conveniently abstracted Pythonic framework that did not require touching low-level code like C directly. Then libraries like Keras allowed one to build neural networks in terms of layers rather than mathematical objects like tensors. Now there are still higher-level tools that require little to no coding at all.


References:

- Try ChatGPT for free using OpenAI's sleek online platform https://chat.openai.com

- Check out Andrew Ng's new short course on ChatGPT Prompt Engineering for Developers. A great and quick way to learn a bit more of how ChatGPT ticks under the hood. https://learn.deeplearning.ai/chatgpt-prompt-eng

