In the world of artificial intelligence, a seismic shift is unfolding. This shift is the rise of generative AI, and it’s redefining the limits of machine creativity. Yet this technological marvel has a double-edged nature. Is generative AI a beacon of creativity… or will it bring overwhelming job loss and disruption? The answer is: probably both.
In this article, we’ll explain what generative AI is, how it works, and what you should know about its implications.
What is Generative AI?
Generative AI (GAI) is a type of artificial intelligence that can generate new content, such as images, videos, audio, text, and 3D models. It first learns patterns from a large dataset in an open-ended manner. Then, it can apply those learnings to generate new, original outputs.
What makes generative AI especially attention-grabbing is that it can also learn “styles” from the source data. If you train a generative AI model on a massive media library, you can ask it to write in the style of famous authors.
Unlike traditional AI that focuses on analyzing data, generative models are meant to create new content. They can generate realistic images, write text, compose music, and more.
Key Terminology
Generative AI models are a subset of AI models. Yet generative AI’s impact could be as large as that of all other AI combined. That’s because it gives AI a “voice,” allowing it to communicate in ways never before possible.
Here’s how the AI landscape shakes out.
Artificial Intelligence (AI)
Artificial intelligence is about programming machines to imitate human intelligence.
Machine Learning (ML)
Machine learning is an approach to AI that has taken over in popularity since the 2000s. The idea is to show the computer many examples, then let it learn patterns from the data. These patterns combine to form a “model.”
In the past, there were other approaches to AI, such as big decision trees of IF-THEN conditions. Those approaches have largely become obsolete.
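To make the “learn patterns from examples” idea concrete, here’s a tiny, hypothetical sketch using Python and scikit-learn. The data is made up purely for illustration:

```python
from sklearn.linear_model import LogisticRegression

# Made-up examples: [hours studied, hours slept] -> passed the exam (1) or not (0)
X = [[1, 4], [2, 8], [6, 5], [8, 7], [3, 6], [9, 8]]
y = [0, 0, 1, 1, 0, 1]

model = LogisticRegression().fit(X, y)  # the "model" learns a pattern from the examples
print(model.predict([[7, 6]]))          # then applies it to a new, unseen case
```

Instead of hand-writing rules, we simply hand the computer labeled examples and let it work out the pattern itself.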
Deep Learning
Deep learning is an approach to ML that has taken over in popularity since the 2010s. These models are inspired by how the human brain works. They consist of deep neural networks with many layers of artificial “neurons.”
Other ML approaches, such as regression or tree-based methods, are still used today. But deep learning is much more effective for large amounts of unstructured data.
Foundation Models
Foundation models are deep learning models trained on vast amounts of unstructured data. Instead of training for a certain task (e.g. predicting a stock price), these models are open-ended. Their goal is to learn as many patterns and subtle intricacies about the data as possible.
For example, GPT-4 is a language model that learns how humans write based on a vast amount of Internet text. DALL·E 3 is a visual model that learns how to create images based on 400 million image-caption pairs. These are both foundation models.
Foundation models can be used for many tasks out-of-the-box. But they can also be fine-tuned to specific use cases. For example, a language model can be fine-tuned to answer customer support questions as a chatbot.
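As a rough, hypothetical sketch of what fine-tuning can look like in practice, here’s how a small pretrained language model might be further trained on a handful of support dialogues using the Hugging Face transformers library. The model name, example texts, and training settings below are placeholders, not a recipe from any particular product:

```python
from torch.utils.data import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments

# Placeholder fine-tuning data; a real chatbot would use thousands of real dialogues.
dialogues = [
    "Customer: My order hasn't arrived. Agent: Sorry to hear that! Let me check the tracking details.",
    "Customer: How do I reset my password? Agent: Click 'Forgot password' on the login page.",
]

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no padding token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

class DialogueDataset(Dataset):
    """Tokenizes each dialogue; for causal LM fine-tuning, the labels are the inputs themselves."""
    def __init__(self, texts):
        self.items = [tokenizer(t, truncation=True, max_length=128,
                                padding="max_length", return_tensors="pt") for t in texts]
    def __len__(self):
        return len(self.items)
    def __getitem__(self, i):
        ids = self.items[i]["input_ids"].squeeze(0)
        mask = self.items[i]["attention_mask"].squeeze(0)
        labels = ids.clone()
        labels[mask == 0] = -100  # don't compute loss on padding tokens
        return {"input_ids": ids, "attention_mask": mask, "labels": labels}

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="support-bot", num_train_epochs=1, per_device_train_batch_size=2),
    train_dataset=DialogueDataset(dialogues),
)
trainer.train()  # continues training the foundation model on the new, task-specific data
```

The key point is that we start from a model that already “knows” language in general and only nudge it toward the specific task, which takes far less data and compute than training from scratch.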
Large Language Model (LLM)
LLMs are foundation models designed to learn the relationships between words (“tokens”). For example, GPT-4 is the LLM behind ChatGPT, and LaMDA is the LLM behind Bard.
Generative AI (GAI)
Generative AI refers to creative AI built on foundation models. For example, ChatGPT is the generative AI built using the GPT-4 foundation model.
How Does Generative AI Work?
Conceptually, generative AI works in a very simple way. Let’s use LLMs as an example; they’re the easiest to understand, and other generative models follow the same basic idea. Here’s how an LLM works:
Given a sequence of words (or “tokens”), predict the most likely next word.
That’s it. A trader might look at a stock chart and try to predict where its price will go next. A meteorologist might look at a hurricane’s trajectory and predict where it will hit.
Likewise, an LLM looks at a sequence of words and predicts what should come next. For example, maybe you have the sequence:
The quick brown fox jumped over the ____
The LLM’s prediction table might look something like this:
| Token | Probability |
|---|---|
| fence | 80% |
| ditch | 10% |
| log | 5% |
| [other…] | 5% |
Thus, a generative AI built on that LLM would print out the token “fence.” Its next decision might look like this:
The quick brown fox jumped over the fence ____
| Token | Probability |
|---|---|
| . | 75% |
| , | 15% |
| ! | 5% |
| quickly | 3% |
| [other…] | 2% |
Next, it prints out the period token (The quick brown fox jumped over the fence.), and so on.
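Here’s a minimal, purely illustrative sketch of that generation loop in Python. The probability tables are hard-coded stand-ins for what a real LLM computes with a neural network; the point is only the “predict a token, append it, repeat” cycle:

```python
import random

# Toy "model": maps a context string to a next-token probability table.
# A real LLM computes these probabilities with a neural network over its whole vocabulary.
TOY_MODEL = {
    "The quick brown fox jumped over the": {"fence": 0.80, "ditch": 0.10, "log": 0.05, "wall": 0.05},
    "The quick brown fox jumped over the fence": {".": 0.75, ",": 0.15, "!": 0.05, "quickly": 0.05},
}

def predict_next(context: str) -> str:
    """Sample the next token according to the model's probability table."""
    table = TOY_MODEL.get(context, {".": 1.0})  # unknown context: just end the sentence
    tokens, weights = zip(*table.items())
    return random.choices(tokens, weights=weights, k=1)[0]

def generate(prompt: str, max_tokens: int = 10) -> str:
    text = prompt
    for _ in range(max_tokens):
        token = predict_next(text)
        # Punctuation attaches directly; words get a leading space.
        text += token if token in {".", ",", "!"} else " " + token
        if token in {".", "!"}:
            break
    return text

print(generate("The quick brown fox jumped over the"))
# e.g. "The quick brown fox jumped over the fence."
```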
Training Data
When you talk to a generative AI chatbot, you can ask about its day. You can ask for answers to math questions. You can even ask how it’s feeling. It will appear knowledgeable, considerate, and even thoughtful at times.
All that is an illusion. In reality, it’s a prediction machine that churns out text based on what it thinks you want to see. How natural it sounds, how many topics it “knows” about, and how accurate it is will depend on the size of its training data.
It’s critical to remember that generative AI, in its current state, is not really intelligent in the same way humans are… even though it’s superb at giving that kind of illusion. Instead, these models read in huge corpora of text or other data, remix it, and spit it back out in new ways.
Input Token Limit
How contextually relevant a generative AI’s outputs are depends on its input token limit, often called its context window. If it has a limit of a few hundred tokens, you can ask it to summarize a paragraph with no problem. But to summarize an entire novel in one pass, it will need a context window on the order of 100,000 tokens.
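As a rough illustration, a tokenizer library such as OpenAI’s tiktoken can count how many tokens a piece of text will consume before you send it to a model. This is a sketch: the encoding name and the limit below are examples, and the right values depend on the model you’re actually using.

```python
import tiktoken

# cl100k_base is the encoding used by several recent OpenAI chat models;
# pick whichever encoding matches your target model.
enc = tiktoken.get_encoding("cl100k_base")

text = "The quick brown fox jumped over the fence."
tokens = enc.encode(text)
print(f"{len(tokens)} tokens")

CONTEXT_LIMIT = 8_192  # example limit; varies widely by model
if len(tokens) > CONTEXT_LIMIT:
    print("Too long: the input must be truncated, chunked, or summarized in stages.")
```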
Why Should You Care?
Picture traditional AI as a skilled but rigid craftsman, following blueprints to the letter. While impressively precise, it’s confined to its rules and instructions. It excels in structured tasks—think of a chess-playing robot, unbeatable within the 64 squares but clueless beyond. These AIs operate in a world of black and white.
Enter generative AI, a vibrant and versatile “artist” compared to its predecessor. Generative AI learns from a vast array of data, absorbing knowledge like a sponge.
This AI is not just a student of its programming; it’s a learner of the world. It’s as if it has read millions of books, viewed countless artworks, and listened to a myriad of music pieces, all to create something entirely new.
Natural Language
Generative AI models are able to write stories, translate languages, and answer questions. The most prolific writers of our time will not be human authors, but rather GAI models.
Of course, the quality of that writing is often still quite shoddy. After you read enough AI generated content, it becomes fairly easy to spot, as it tends to repeat ideas and use peculiar expressions from time to time.
That said, remember that we’re only on the fourth or fifth generation of some of these models. We’re basically at the Model T stage of generative AI. After a few more iterations, the writing output will be all but indistinguishable from human writing.
Arts and Visual Media
In the field of art, generative AI has revolutionized the creation of visual content. Models like DALL-E and Artbreeder can generate realistic images from simple text descriptions. These models can also capture complex artistic styles. They can mimic the works of renowned artists while also producing original creations.
Science and Research
Generative AI’s impact extends to science and research too. In drug discovery, generative AI can identify molecule candidates with desired properties. It can also help in developing new advanced materials with tailored properties.
The creative potential of generative AI is still being explored, and its impact is likely to grow exponentially in the years to come. As these models continue to evolve, they will transform some industries, eliminate others, and give birth to new ones as well.
The Dual Nature of Generative AI
Generative AI is a powerful tool that can be used for both good and evil. On the one hand, it has the potential to create new jobs, industries, and forms of art. On the other hand, it could also lead to job losses through automation and the spread of misinformation.
The Creative Potential of GAI
GAI has the ability to create new and original content, including art, music, and literature. It can also be used to automate tasks that are currently done by humans, such as customer service and data entry. While that automation will displace some work, it could also create new jobs in areas such as AI development, training, and maintenance.
GAI could also lead to new industries, such as AI-powered healthcare, education, and transportation. These industries could create millions of new jobs and improve the quality of life for everyone.
The Destructive Potential of GAI
GAI could also lead to job losses, as machines become capable of doing more and more of the work that is currently done by humans. This could lead to widespread unemployment and social unrest.
GAI could also be used to automate the spread of misinformation and propaganda. This could undermine democracy and erode public trust.
Deepfakes: Unethical Digital Clones
Deepfakes are realistic-looking videos or audio recordings of people saying or doing things they never did. This is done by using GAI models to manipulate existing footage or audio recordings. Deepfakes can be used for a variety of nefarious purposes, like spreading misinformation, damaging reputations, or committing fraud.
Note: Not to be confused with digital twins.
The risk of deepfakes is that they can be very difficult to detect. Even experts can have trouble telling the difference between a real video or audio recording and a deepfake. This makes it easy for deepfakes to spread online and deceive people.
For example, a scammer could use a deepfake to create a video of a company’s CEO authorizing a large financial transaction. This video could then be used to trick the company’s bank into transferring the money to the scammer.
The use of deepfakes is a serious concern, and it is important to be aware of the risks. There are a number of things that can be done to mitigate the risk of deepfakes, including:
- Developing better detection techniques for deepfakes.
- Educating the public about the risks of deepfakes.
- Creating laws that make it illegal to create or distribute deepfakes without the consent of the person in the video or audio recording.
The Evolution of Generative AI
Early Foundations (1950s-1990s)
The roots of generative AI can be traced back to the 1950s, when Markov chains, statistical models developed decades earlier, were first used to generate new sequences of data based on the patterns in an input.
In the late 1950s and 1960s, early neural networks such as the perceptron emerged, inspired by the structure of the human brain. Together, Markov chains and neural networks set the groundwork for modern generative AI models.
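To get a feel for how even this early idea “generates,” here’s a tiny, illustrative Markov-chain text generator in Python. The sample corpus is made up; the same mechanism applied to real text produces plausible-sounding gibberish:

```python
import random
from collections import defaultdict

corpus = ("the quick brown fox jumped over the lazy dog "
          "and the quick cat jumped over the fence")

# Learn which words follow each word (and how often) from the corpus.
transitions = defaultdict(list)
words = corpus.split()
for current_word, next_word in zip(words, words[1:]):
    transitions[current_word].append(next_word)

def generate(start: str, length: int = 8) -> str:
    word, output = start, [start]
    for _ in range(length):
        followers = transitions.get(word)
        if not followers:                # dead end: nothing ever followed this word
            break
        word = random.choice(followers)  # sampling respects observed frequencies
        output.append(word)
    return " ".join(output)

print(generate("the"))  # e.g. "the quick cat jumped over the lazy dog and"
```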
The Rise of Deep Learning (2000s-2010s)
The 2000s marked a turning point in the field of deep learning. In 2009, AI researchers discovered that GPUs could be used to speed up deep learning training by roughly 100-fold, a moment sometimes called the “big bang” of deep learning. It made training complex neural networks on large datasets feasible.
In 2014, generative adversarial networks (GANs) were introduced. GANs pit two neural networks against each other. The “generator” creates new data and the “discriminator” tries to distinguish between real and generated data. This adversarial process leads to increasingly realistic and high-quality outputs.
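Here’s a deliberately tiny sketch of that adversarial setup, assuming PyTorch and one-dimensional “data” (numbers drawn from a normal distribution centered at 4). Real GANs use deep convolutional networks and image datasets, but the generator-versus-discriminator loop is the same:

```python
import torch
import torch.nn as nn

generator = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
discriminator = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())

g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(2000):
    real = torch.randn(32, 1) + 4.0       # "real" samples cluster around 4
    fake = generator(torch.randn(32, 8))  # generated samples start out random

    # Discriminator: learn to label real data 1 and generated data 0.
    d_loss = bce(discriminator(real), torch.ones(32, 1)) + \
             bce(discriminator(fake.detach()), torch.zeros(32, 1))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator: learn to make the discriminator label fakes as real.
    g_loss = bce(discriminator(fake), torch.ones(32, 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()

print(generator(torch.randn(5, 8)).detach().flatten())  # outputs should now cluster near 4
```

As training goes on, the generator’s outputs drift toward the real distribution precisely because fooling the discriminator is the only way to lower its loss.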
Explosive Growth and Applications (2010s-Present)
The 2010s and onwards have witnessed an explosion of advancements in generative AI. Many new models emerged, including variational autoencoders (VAEs), diffusion models, and transformers.
Generative AI has gained a foothold in a wide range of domains, including art, music, and natural language processing. GAI models can now generate photorealistic images, compose music, and even write creatively.
The Future of Generative AI
The future of generative AI is a topic brimming with potential and possibilities, marked by both optimism and caution. Here’s a glimpse into what we could expect:
Enhanced Creativity and Artistic Collaboration
Generative AI will likely become an indispensable tool for artists, designers, and creators. It will enable new forms of artistic expression, blending human creativity with AI’s capabilities. This collaboration between human and machine could lead to the emergence of new art forms and creative genres.
Revolution in Content Creation
In fields like marketing, advertising, and entertainment, generative AI will streamline content creation. It could generate content personalized to the reader, write scripts for games, or even create virtual influencers. Generative AI could also be a game changer for the growth of the metaverse, serving as a tool to quickly populate virtual worlds.
Advancements in Life Sciences and Healthcare
In the life sciences, generative AI could play a pivotal role in drug discovery and personalized medicine. It could help find new treatment methods and customize healthcare to individual genetic profiles.
Ethical and Societal Implications
As generative AI becomes more prevalent, ethical considerations will become increasingly important. Issues like data privacy, bias, intellectual property, and the potential for misuse (such as deepfakes) will be front and center. They’ll require robust legal frameworks and ethical guidelines to manage.
Impact on Employment and Skill Requirements
Generative AI will change the job landscape, automating tasks and displacing jobs. Previously, white-collar jobs were considered “safe” from automation. But with the rise of GAI, that may no longer be true. That said, GAI will also create new roles, especially in AI management, oversight, and collaboration.
Personalization in Everyday Life
Generative AI could make personalized experiences the norm. Imagine everything from tailored educational resources to customized entertainment and shopping experiences. GAI could adapt these experiences to individual preferences and needs.
The future of generative AI is certain to be transformative across multiple sectors. The unique blend of creativity and efficiency will serve as a huge output multiplier. Balancing innovation with ethics will be key to harnessing its full potential in a way where everybody wins.