Getting to Know Generative AI 

By Zi Yuan Wong on August 14, 2024

Until recently, the tried-and-true ‘modern’ method for solving a specific problem was to search, or ‘Google’, for answers online, hoping to find one properly explained by a subject matter expert. Nowadays, the go-to method is to ask ChatGPT or another AI chatbot. It is interesting, and a little strange on reflection, that generative AI has become deeply ingrained in our lives seemingly overnight, when most of us knew next to nothing about it not too long ago. So, what exactly is generative AI, and what is the reason for its current popularity?

Generative Artificial Intelligence (AI) refers to the use of AI to create new content such as text, images, music, and videos, typically in response to user prompts. The underlying models are trained on complex subject matter like natural languages, art, and biology, and then draw on that training to form answers and solve problems. News about generative AI has boomed since the launch of large language models like GPT-4, and a survey has shown that adoption of AI has more than doubled since 2017. ChatGPT is one of many applications of generative AI that use trained large language models (LLMs) to generate new answers for their users. These models are the cornerstone of generative AI: they are trained to recognize patterns in vast amounts of information and then generate new content based on what they have learned.
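To make the idea of "learning patterns, then generating new content" a little more concrete, here is a deliberately tiny sketch in Python. It is not how ChatGPT or any real LLM works internally; it is a toy word-pair (bigram) model, and the function names, the sample sentence, and the parameters are all made up for illustration. It simply records which words follow which in a short training text, then produces new text by sampling from those recorded patterns.

```python
import random
from collections import defaultdict


def train_bigram_model(text):
    """Record, for each word, the words observed to follow it in the text."""
    words = text.split()
    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return dict(followers)


def generate(model, start, max_words=8, seed=0):
    """Build new text by repeatedly sampling a word that followed the last one."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same text
    output = [start]
    while len(output) < max_words:
        candidates = model.get(output[-1])
        if not candidates:  # the last word was never followed by anything
            break
        output.append(rng.choice(candidates))
    return " ".join(output)


# Hypothetical training text, chosen only for illustration.
corpus = "the cat sat on the mat and the dog sat on the rug"
model = train_bigram_model(corpus)
print(generate(model, "the"))
```

Real large language models replace this lookup table with neural networks trained on vastly more data, but the loop has the same overall shape: learn patterns from examples, then generate output one piece at a time.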

The following are some generative AI interfaces that generate answers or new content for their users based on their underlying models.

Types of GenAI Interfaces 

1. DALL-E 

DALL-E is an AI image generator first released by OpenAI in 2021 and named after the artist Salvador Dalí and the lovable robot WALL-E. It generates images from text prompts written by the user, using models trained on a large dataset of images paired with their descriptions. DALL-E has been applied to visual aids for education, initial artwork drafts for designers, and marketing ad campaigns.

DALL-E 3 is the latest release. Compared to its predecessor, DALL-E 2, it produces higher-quality images and can replicate specific artistic styles. Within ChatGPT, it is available through a ChatGPT Plus subscription, which costs $20 per month.

2. ChatGPT 

OpenAI’s claim to fame, ChatGPT, is a chatbot that interacts with its users conversationally through a dialogue format, which allows the model to answer follow-up questions and admit its mistakes. It was first released by OpenAI in 2022 with the GPT-3.5 large language model (LLM) as its foundation. GPT-4, released in 2023, surpasses GPT-3.5 with more advanced reasoning capabilities and fewer hallucinations, which are inaccurate results generated by AI models due to insufficient training, incorrect assumptions, or biases in the training data.

ChatGPT is free for anyone to use, but additional functionality and more up-to-date information are included in the ChatGPT Plus subscription, which costs $20 per month as mentioned above. In addition, ChatGPT’s popularity since its initial release prompted further investment from Microsoft, which bodes well for the future of OpenAI.

3. Gemini (Formerly Bard) 

Gemini is Google’s answer to OpenAI’s ChatGPT: a public-facing chatbot that answers user queries using Google’s own family of large language models. It was released in 2023 as Bard, originally built on the LaMDA model, before being rebranded under its current name the following year. Unlike ChatGPT, Gemini can draw on Google Search for real-time information and provides the sources for its information when prompted. Gemini can be accessed through its website, its mobile app, or other products within Google’s ecosystem, such as its electronic devices.

Gemini Advanced, a version with extras like additional storage and the ability to upload documents, is available for a subscription of $20 per month. After Bard’s botched launch and lukewarm reception, which also saw Google’s stock price dip, Google moved the chatbot to PaLM (Pathways Language Model) 2 and later to its Gemini models. Only time will tell whether the latest release will be warmly received by users and reach the same heights of popularity as ChatGPT.