The Current AI Arms Race: ChatGPT, Generative AI, and the Future of AI

Episode #104
Duration: 33:29 Mins
Release Date: 27/03/2023

This episode gives a bird's-eye view of how this arms race came about, what's happening in the race today, who the main players are, what the technologies can be applied to, and offers some thoughts on where we might be headed with this undeniably powerful and exciting technology… and yes, much of this will have been written, or at least influenced, by ChatGPT.

Timecodes:

00:00:56 – Podcast Begins

00:03:06 – History of Generative AI

00:06:02 – Current AI Arms Race

00:11:29 – Text to Image

00:14:42 – Text to video

00:16:19 – The power of GPT-4

00:17:48 – GPT-4 Applications

00:21:11 – The Future Impacts of generative AI

00:22:21 – Job Loss

00:23:27 – AI Safety, Ethics, Bias

00:24:18 – Will AI increase cultural polarization?

00:26:04 – Will generative AI create an Existential Crisis?

00:30:24 – Why we should be hopeful for the future

00:31:35 – ChatGPT’s summary of the AI Arms Race

Transcript

Welcome to the Future Tech and Foresight podcast. I've been really looking forward to doing another episode on AI, especially with all the exciting new releases coming out of OpenAI and other large organizations, creating what many have been calling a new AI arms race. By now you've hopefully all tried out ChatGPT to help with writing reports, essays, media content, desk research and so on. And maybe a number of you have tried the various image generators online, like Midjourney or Stable Diffusion, to create strange, unique, and sometimes beautiful images for blogs, websites, or just for fun.

What I want to try and do today is give a bird's-eye view of how this arms race came about, what's happening in the race today, who the main players are, what the technologies can be applied to, and give some thoughts on where we might be headed with this undeniably powerful and exciting technology… and yes, much of this will have been written, or at least influenced, by ChatGPT. How could I not?

So, before diving into how ChatGPT has blown the floodgates open, and what the current state of generative AI is, let’s briefly look at how we got here. 

History of Generative AI

Generative AI has evolved significantly over the past several decades, from simple rule-based systems to complex deep learning models. For a deeper breakdown I'll have an extended blog post on this in the next week. And remember, generative AI is not to be confused with deep learning itself, or with the past excitement around systems like DeepMind's AlphaFold. Deep learning techniques are often employed in generative AI models to achieve their goal of creating new, realistic data. While deep learning is a broader field encompassing many tasks and applications, generative AI is a specific area within AI that leverages deep learning techniques to generate novel content.

  • Early AI research in the 1950s and 1960s laid the foundation for the field, focusing on basic problem-solving and reasoning.
  • In the late 1950s, neural networks emerged with the invention of the Perceptron, which later evolved into deep learning with multi-layer networks.
  • Simple programs like ELIZA (1964) simulated natural language conversation with a limited set of rules, allowing for basic chatbot-like interactions.
  • Rule-based systems in the 1970s and 1980s used pre-programmed rules for text generation. Systems like Racter (1983) used pre-defined rules to generate text or stories, while expert systems applied rules for tasks like medical diagnosis and decision-making. These highlighted the challenges of creating more flexible AI.
  • As multi-layer neural networks (deep learning) matured, they were applied to a wide range of tasks, including image classification, object detection, and speech recognition.
  • Hidden Markov Models in the late 1980s and 1990s helped model sequences of data, improving natural language processing, speech recognition (e.g., IBM's ViaVoice), part-of-speech tagging, and bioinformatics tasks like gene prediction.
  • Recurrent Neural Networks (RNNs) in the 1990s allowed AI to handle time-dependent data better, making them suitable for tasks like language modeling and machine translation (e.g., Google’s Neural Machine Translation), sentiment analysis, and music generation (e.g., Magenta by Google). 
  • Generative Adversarial Networks (GANs) in 2014 revolutionized generative AI, enabling the creation of far more realistic images, as with NVIDIA's StyleGAN.
  • Variational Autoencoders (VAEs) offered another generative approach, learning probabilistic mappings between data and a latent space for tasks like image synthesis (e.g., generating new images of faces) and text generation (e.g., creating natural language sentences).
  • Transformers in 2017 replaced RNNs with self-attention mechanisms, leading to state-of-the-art large language models such as BERT (sentiment analysis and other language understanding tasks), the GPT series (text generation), and Google's T5 (summarization). A minimal text-generation sketch follows just below.
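To make the "text generation with transformer language models" idea concrete, here is a minimal, hedged sketch using the open source Hugging Face transformers library and the small, publicly available GPT-2 model. The library choice, model name, and prompt are my own illustrative assumptions, not something used in the episode.

```python
# Minimal sketch: text generation with a small transformer language model (GPT-2)
# via the Hugging Face "transformers" library. Assumes `pip install transformers torch`.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # small, openly available GPT-series model

prompt = "Generative AI has evolved from simple rule-based systems to"
outputs = generator(prompt, max_new_tokens=40, num_return_sequences=1)

# The pipeline returns a list of dicts; print the continuation it produced.
print(outputs[0]["generated_text"])
```

Today's much larger models work on the same underlying principle of predicting the next token; what changes is the scale of the model, the training data, and the quality of the continuations.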

These applications showcase the versatility and potential of generative AI techniques across various domains, from text and image generation to natural language understanding and decision-making. But what's so exciting about today's generative AIs?

The Current AI Arms Race

With ChatGPT blowing open the floodgates of generative AI and becoming the fastest growing consumer platform of all time, with over 1 million users signing up in its first five days, organizations that were already researching or developing AI have been launching their own versions to the public over the last few months. So let's look at what these releases entail, and what each of the AI systems actually does.

ChatGPT has been seen as the lightning rod moment that drew international attention to generative AI, bringing an infusion of investment, applications, and content into the space. However, it can be argued that the arms race was already quietly happening, and that OpenAI simply released ChatGPT before many of its competitors released their own AI systems, even though there was internal skepticism about the chatbot's prospects.

Months earlier, an AI chatbot that Meta had released, BlenderBot, had flopped, and another Meta AI project, Galactica, was pulled down after just three days. And a few years previously, Microsoft's AI Twitter bot, Tay, was corrupted by Twitter users in less than 24 hours, began responding with openly racist and misogynistic views, and was promptly removed.

But with ChatGPT’s simple interface, impressive responses, and OpenAI’s quick ability to limit and enforce rules that stopped it from going down the path of previous systems, the floodgates were opened… 

Chatbots

  • Microsoft's Bing – One of the first 'alternatives' came from Microsoft. Off the back of its $10 billion investment in OpenAI, Microsoft incorporated GPT-4 (the improved version of the OpenAI language model that formed the base of ChatGPT) into the Bing search engine and Edge browser. With this integration and Bing's connection to the internet, Bing Chat can return current, up-to-date information, a clear improvement over ChatGPT, whose training data had a September 2021 cutoff. You still need to join a waitlist to access Bing Chat's full power, but this not only started to make Bing a rival to Google's search, it also prompted the internet giant to release its own chatbot alternative.
  • Google's Bard, powered by Google's own LaMDA model rather than OpenAI's GPT series, was perhaps the most anticipated rival but arrived with the most disappointment. During its first demo, Bard answered a question about the James Webb Space Telescope incorrectly, which sent Google's share price down around 9% and wiped roughly $100 billion off the company's market value. After a round of closed testing by experts, Bard is now available to members of the public who have joined a waitlist. What is exciting about Bard is that Google plans to integrate it across the Google product suite, which fits into the future reality of these AI systems, but more on that in a bit.
  • A number of other AI chatbots are being integrated into search engines just like Bing and Google. Baidu (China's biggest search engine) recently released Ernie. Though some say its accuracy is greater than ChatGPT's, it seems to avoid politically sensitive responses, and there are concerns about its training data due to China's internet controls and limitations. Opera and Brave have both recently released integrated chatbots as well, and there are numerous other alternatives to ChatGPT, such as DeepMind's Sparrow, Anthropic's Claude, and Jasper Chat.
  • Meta released LLaMA, a smaller, more open model made available to researchers, so those with less computing power can access and use it.
  • Others like Alibaba, Tencent, Amazon, Apple, and even the UK government have also signalled that they are joining the new arms race by creating their own chatbots. Interestingly, Elon Musk originally invested $100 million in OpenAI with the goal of creating open and free AI systems, but he has recently and openly criticized OpenAI for shifting towards a profit-based model. According to the news website The Information, Musk has approached researchers in recent weeks about forming a new research lab that would rival OpenAI.

Image Generation

Another battle is being waged alongside the text generation fight: that of AI image generation.

Similar to the chatbot battle, several competitors have already produced systems that allow users to simply write a description of what they want the AI to create; within a few seconds, various pictures pop up approximating that description. The results range from highly detailed images faithful to the written prompt, to far-out and even grotesque renditions, but they have been improving continually over the last year or so.

  • The app that really pushed image-generation AI into the public's consciousness was Midjourney, after a piece of AI-created art won first place in an art competition back in September of 2022. Though this event sparked debate and controversy, it shed significant light on the generative AI space and brought attention to many of the other AI platforms already available.
  • For example, OpenAI's DALL-E and DALL-E 2 were among the first systems used to create images for users, with DALL-E 2 bringing improved clarity and resolution.
  • Shutterstock, which many people said was ripe for obsolescence due to the growing use of image generation, has now incorporated OpenAI's DALL-E 2 to let people create their own images… for a price.
  • I personally use Nightcafe and starry.ai which can render your prompted images with different artistic styles built into the generation with a simple click of a button. 
  • And perhaps the system most geared towards open source availability is Stability AI's Stable Diffusion, released in August of 2022 (see the short usage sketch after this list). Despite its more open nature, Stability AI was caught up in one of the first class-action lawsuits, brought by artists who claim that the art generators scraped their work without consent for use in training datasets.
  • And finally, in the past two weeks, Midjourney v5 was released. This is important because it can produce images with photorealistic quality. I'll have a few pictures in the show notes for you to see, but essentially these AI-generated pictures are pretty much indistinguishable from photos taken by professional photographers, and the model now renders hands correctly, which was always one of the immediate uncanny-valley 'tells' that an AI had made the images.
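To give a feel for how the more open text-to-image systems are actually driven, here is a minimal, hedged sketch of generating an image with Stable Diffusion through the open source diffusers library. The model ID, prompt, and GPU assumption are illustrative choices on my part, not something from the episode.

```python
# Minimal sketch: text-to-image with Stable Diffusion via the "diffusers" library.
# Assumes `pip install diffusers transformers torch` and a CUDA-capable GPU;
# the model ID and prompt below are illustrative.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # a publicly hosted Stable Diffusion checkpoint
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

prompt = "a watercolor painting of a lighthouse at dawn, soft light"
image = pipe(prompt).images[0]  # runs the diffusion process and returns a PIL image
image.save("lighthouse.png")
```

Hosted services such as Midjourney or NightCafe wrap this kind of pipeline behind a chat or web interface, layering their own styles and safety filters on top.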

Generative AI Video

One of the early interviews I did on generative AI was with the CEO of Synthesia, Victor Riparbelli. His company uses deepfake, or synthetic media, technology to make it appear that people can speak different languages, as in their viral video of David Beckham seemingly speaking nine different languages. During the interview he warned that within a few years we would be able to generate video indistinguishable from footage of real actors. Though he may have been a little off with his prediction timeline, we are now seeing the first products that allow video to be generated from text alone. As an example, last week Runway released its Gen-2 AI video model, which allows an initial three-second clip to be made entirely from text, with plans to extend the duration. Soon to be available for public use, the videos won't look realistic to start with, but as we've seen with the text-to-image tools, realistic video generation probably won't take long.

So to summarize, generative AI has so far given us text, image, and the early phases of video generation from basic text prompts. Beyond standalone platforms like ChatGPT, we are starting to see text generation integrated into browsers in the form of chatbots, and image generation added to legacy online platforms like Shutterstock in order to remain relevant. But what's next?

GPT-4

Not even two weeks ago, OpenAI shattered the internet again with the release of its new model, the multimodal GPT-4, making it instantly available to anyone using the premium version of ChatGPT. GPT-4 is reportedly around 10x more capable than its GPT-3.5 predecessor, and 100x more powerful than the original GPT-3, making it more accurate, better at understanding nuance and context, and able to provide overall better responses. It has already passed numerous high school and college-level exams, including the bar exam, where it scored around the 90th percentile. The discipline areas include law, history, psychology, statistics, environmental science, physics, economics, and many others.

It can accurately assess images (demonstrated through a roundabout connection to Discord), and has already been implemented in a few pilot programs with well-known organizations like Khan Academy, Morgan Stanley, and the Government of Iceland.

But the exciting part of GPT-4 lies beyond the chatbot interface.      

  1. With the new OpenAI plugin 'app store', plugins can access specific databases and translate them into natural language. Most notably, GPT-4 has been connected to Zapier's automation software suite, essentially enabling a phrase of text to turn into a set of actions that the AI will carry out (e.g. "It can write an email, then send it for you. Or find contacts in a CRM, then update them directly. Or add rows to a spreadsheet, then send them as a Slack message. The possibilities are endless.").
  2. Real-time insights into company data. Because GPT-4 can take substantially more text as input, businesses can ask it to analyze internal information like financial data, assess their options, get recommendations, and adjust their actions so the organization stays competitive (a hedged sketch of this kind of API call follows this list).
  3. You can now link GPT-4 to the live internet, allowing it to search and scrape up-to-date information, then analyze and summarize whatever you want to research. This can essentially replace most desk research. This is why the arms race is so competitive: if people become more accustomed to using ChatGPT, and it performs better than a Google search, Google's main search function will become severely disrupted, if not obsolete.
  4. There are several more, such as assessing and describing images, analyzing Ethereum contracts to find vulnerabilities, writing code for programs, and much, much more.
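As a concrete illustration of the "real-time insights into company data" idea above, here is a hedged sketch of what a call to the GPT-4 API might look like using OpenAI's Python package as it existed in early 2023. The figures, prompt, and placeholder API key are invented for illustration and are not from the episode.

```python
# Hedged sketch: asking GPT-4 to analyze internal business data via the OpenAI
# chat completions API (openai Python package, pre-1.0 interface from early 2023).
# The data, prompt, and API key below are illustrative placeholders.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder; in practice, load from an environment variable

quarterly_figures = (
    "Q1 revenue: 1.2M EUR, Q2 revenue: 0.9M EUR, "
    "Q2 churn up 4%, support tickets up 18%."
)

response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a concise financial analyst."},
        {"role": "user", "content": f"Summarize the key risks and recommend three actions:\n{quarterly_figures}"},
    ],
)

print(response["choices"][0]["message"]["content"])
```

Plugins and tools such as the Zapier integration essentially sit on top of calls like this, turning the model's text output into concrete actions.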

To finalize this thought, GPT-4 was released less than a month ago, and in its first 2 weeks, hundreds of new applications for it were created. Significantly more are on the way as developers and users explore how it can be integrated into new tasks and forms of work, even writing large sections of articles like this one… And while this is happening, GPT-5 is expected to be released by around 2025, perhaps earlier, which will cause another wave of disruptions, opportunity, and a flurry of excitement as people try to figure out what new things it can be applied to. So with all of this understanding, what does the future of Generative AI and this current AI Arms Race hold for society?

What does the future of Generative AI look like?

  • Firstly, discussions of job loss have been reignited. I've covered this topic extensively in the first 100 episodes of the podcast, but this new wave of generative AI does accelerate some of the ideas my guests and I have previously discussed. Currently there is a large focus on the shifting nature of tasks within organizations that are using just the base ChatGPT model. But the ability to automate a significant number of tasks with the GPT-4 API is being explored as we speak, and will be a continuous source of focus for this topic. Funnily enough, ChatGPT was even asked which jobs it thinks it might replace, and responded with a more or less logical list: data entry and data processing, customer service and support roles (e.g., answering frequently asked questions), translation tasks, and report writing and content generation. But I'm sure this list will grow in time.
  • Safety, ethics, and AI bias have always been at the center of concerns about the future of AI systems. We have already seen some users have their conversation titles leaked to other users due to a glitch, others had their credit card details and personal account information exposed by another bug, and a number of tests have surfaced political bias for one candidate over another. Though OpenAI has acknowledged and is addressing these issues, they are typical of problems identified in other AI systems in the past, and it is more than likely that they will persist into the future. How this will disrupt society remains to be seen, but many fear that AI might not just perpetuate the polarization of our cultures but widen the gap between people's perspectives at an accelerated rate. For this reason many are calling for regulation of the space to be ramped up.
  • Then there are developments that weren't expected to materialize for years already appearing. One of the large constraints on the generative AI space is the cost of training and maintaining these models, due to the vast amount of data they need to train on to become as powerful as they are. Enter Alpaca, an open source ChatGPT-style model built for less than $600. Stanford researchers used Meta's open LLaMA model to create a chatbot with similar functions to ChatGPT for a fraction of the cost. In fact, ARK Investment Management predicted that a 99% reduction in AI costs (for GPT-3-like performance) would take until 2030; Alpaca appears to have achieved it within five weeks of that prediction being published. The work is still being peer reviewed, but if it holds up, there might be another massive wave of disruption coming before this first wave has even settled. If we can create our own AI models practically on our smartphones, where's the incentive for large players like Amazon or Google to pour billions of dollars into the space?
  • Then there are the deeper, perhaps more existential problems that these AI systems bring about. Peter Nixey, a founder and programmer, recently wrote about Stack Overflow, the main repository for programming Q&A, where 100 million users ask questions, receive answers from other developers, and collectively save years of time because those answers can be seen by the millions of people with the same question. GPT-4, however, might bring this to an end. As Peter writes, "What happens when we stop pooling our knowledge with each other & instead pour it straight into The Machine? Where will our libraries be? How can we avoid total dependency on The Machine?" And as programmers will now be asking GPT-4 questions in private, the next versions will have less and less data to train on.
    • This raises a more profound question. If this pattern replicates elsewhere, and the direction of our collective knowledge shifts from outward, towards humanity, to inward, into the machine, then we become dependent on it in a way that supersedes all of our prior machine dependencies.
    • Whether or not it “wants” to take over, the change in the nature of where information goes will mean that it takes over by default.
    • Like a fast-growing Covid variant, AI will become the dominant source of knowledge simply by virtue of growth. If we take the example of Stack Overflow, that pool of human knowledge that used to belong to us may be reduced to a mere weighting inside the transformer.
    • Or, perhaps even more alarmingly, if we trust that the current GPT doesn't learn from its inputs, it may be lost altogether. Because if it doesn't remember what we talk about, and we don't share it, then where does the knowledge even go? Peter doesn't have an answer to this. Some people responded by alluding to the printing press and books, and to how people feared losing valuable 'human' traits with those technologies too, suggesting the fears around GPT-4 are similar. But perhaps this time really is different?
  • The same could be true for the capacity of large language models and similar AI tools to shape the creation of cultural products: fiction, art, music, and so on. In this context, much depends on our appetite for consuming AI-generated content once its novelty wears off. Do we want to read novels or social media posts with no human author? Will we be happy with procedurally generated playlists, or will we prefer the human touch (no matter how imperfect or slow) to the spotlessness of an AI? Will cheap, instantaneous, and 'good enough' content make up 99% of our needs, with the artisanal, time-consuming, and expensive products of human labour filling the remaining 1%? Will 'humanmade' be to cultural products what 'handmade' is to physical products? I don't know.
  • This idea is reinforced by Yuval Noah Harari's recent New York Times op-ed on his concerns about AI eating culture.
    • Essentially “A.I. could rapidly eat the whole of human culture—everything we have produced over thousands of years—digest it, and begin to gush out a flood of new cultural artifacts.” 
    • These chatbots are our second contact with AI; the first was the AI that has curated our social media feeds for over a decade. That first contact brought a massive wave of political and cultural polarization, not to mention severe psychological problems, as discussed in detail by modern thinkers like Jonathan Haidt.
    • Harari and his co-authors write that we cannot afford to lose again. AI is dangerous because it now has a mastery of language, which means it can "hack and manipulate the operating system of civilization", and governments, regulators, and developers must take the necessary steps to respond properly.
  • But there is hope. In a recent long-form podcast appearance with Lex Fridman, OpenAI's CEO Sam Altman discussed how hopeful he was for the future when Lex described some of the responses ChatGPT had given him. The responses, though elaborate and accurate, were also nuanced, giving both sides of an argument and often simply laying out the pros and cons of each side's thinking rather than offering an opinion. The hope is that, if social media removed the space for nuance in conversation, then as these chatbots and their more powerful future replacements become integrated across all our communication platforms, search browsers, and digital work tools, a renaissance of nuance and deeper understanding may be down the line for humanity. This may bridge the cultural gap, and slow down or perhaps even end the polarization we have seen over the last several years. Then again, the same hopes were voiced at the emergence of the Internet…

Conclusion

In conclusion, as generative AI models like ChatGPT continue to advance, we find ourselves entering a brave new world of human-AI interaction. These cutting-edge models are transforming the way we communicate, collaborate, and create, offering unprecedented capabilities in generating human-like text and images. As we harness the power of generative AI, we must navigate the challenges and opportunities it presents, striking a balance between the immense potential benefits and the ethical considerations of AI-driven technologies. This exciting era holds the promise of reshaping industries, revolutionizing human-computer collaboration, and unlocking new possibilities for creativity, problem-solving, and knowledge discovery.
