How to make money from generative AI
By Michael Stothard
For the past 50 years the internet has had one core function: moving content from one place to another. In the past year, that core function has been radically upended. This is due to rapid advances in large language models, otherwise known as Generative AI, which allow people to use the internet to create content rather than just move it.
We are just now getting to grips with what this might mean for society, business and indeed humanity. The atmospheric portrait of the woman above is not a photo, but an image generated by an AI in 2.1 seconds from a simple one-sentence written prompt. Some of this report was written by ChatGPT, a model from OpenAI. Right now people are using generative AI to write a marketing campaign, build an app, contest a parking fine and (probably) cheat a little on their homework.
Imagine a world where all art, science, architecture, poetry, legal document writing, marketing, advertising, sales, gaming, graphic design, financial modelling, data analysis, therapy, product design, code, university essays and pretty much everything else that involves human creative endeavour can be either augmented or completed by a machine. What would that mean?
If you’re an entrepreneur, be an entrepreneur in this [space]
Emad Mostaque, the London-based founder of Stability AI, recently said that, fundamentally, generative AI means individuals or small companies will be able to compete with large enterprises like never before - that the technology will be a transformational, democratising force. "This is like the example of the steel mill [in the 1980s]," he says. "There were big vertically integrated steel mills that were out-competed by lots of little steel mills, micro mills. The big corporations, the big programmes, the big things will be out-competed by just individuals and small groups building on top of this technology."
"If you’re an entrepreneur, be an entrepreneur in this [space],” he adds, “If you are an artist, you will become the most efficient artist in the world.”
To be clear, generative AI (an umbrella term for a number of machine learning methods, including Large Language Models and Generative Adversarial Networks) has been around for some time (Grammarly and DeepMind have used these methods well). What has changed recently is the models' increased power and speed, as well as their lower cost - opening up access to a much wider community. Today the public is being stunned by the products that can now be built, such as ChatGPT, a conversational chatbot from OpenAI, which hit 1m users in 5 days (a feat that took Twitter 24 months, Facebook 10 months and Instagram 2.5 months).
Generative AI looks set to be a core driver of the next decade of enterprise and consumer products - much in the way that mobile and cloud were over the past decade. Imagine the economic value of making all knowledge workers just 10% more efficient. It's no wonder that venture capital money has been pouring into this category ($100bn in 2022) and that companies have been raising huge rounds (e.g. Jasper, the content creation tool, raised at a $1.5bn valuation). And the models that form the basis for this technology are only getting more powerful.
This report is a deep dive into the opportunity to make money from generative AI, already a crowded space. It is informed by conversations firstminute has had with dozens of companies in the space in recent months, with unique takes on generative gaming, synthetic data, augmented sales tools, enterprise search and creator-economy tooling.
It sets out our view of what is coming just around the corner, drawing on insights from the firstminute portfolio and some of the leading figures in the field, and includes a market map and an overview of some of the 210+ startups in the space.
The first part of the report covers the Generative AI tech stack, where we argue that building higher up the stack (e.g. creating large foundational models) is a dangerous, expensive and probably fruitless game best left to big tech. In the second part, we dig into the application layer (where we argue most of the value in GenAI will be created for startups) and look at what the key categories are and who is building what today.
We're trying to answer the question: where will the value go? And what is our advice to startups building right now?
The Generative AI tech stack
We at firstminute are most excited about the application and tooling layers of generative AI. But before we get into applications, we want to take a step back and take a (hopefully helpful) look at the roughly four parts of the tech stack that founders could be attacking. These are:
Foundational general models created by the likes of OpenAI, Cohere, Anthropic and AI21 Labs. Their businesses will all be based on an API model, with the underlying models fine-tuned for different use cases. Big money, big costs and likely big competition.
Fine-tuned models are trained on narrower data sets for more specific purposes (e.g. spotting fake news). Even more specific are hyper-local models, which use hyper-local proprietary data sets to build even better models. Both are interesting areas. Verticals here include Copilot for programmers (GitHub and Microsoft), LLMs for chatbots (e.g. Character AI, Inflection AI), LLMs for robotic process automation (e.g. Adept) and LLMs for search (e.g. Neeva).
The operating systems and tooling companies, the picks and shovels of this particular gold rush, will also do well. These are companies such as API orchestrators that tie different models together for a single application, as well as useful tools for labelling (e.g. Scale AI or Snorkel AI), training infrastructure (e.g. MosaicML, Strong Compute) and inference infrastructure (e.g. GooseAI for NLP).
The application layer itself is the products and services that people actually interact with (e.g. a chatbot or a piece of legal-writing software). One way to think of these companies is as "prompt engineering layers" on top of the foundation model players, and most will look like souped-up versions of the enterprise SaaS companies of the last decade with an "AI intelligence layer" added. The good ones will build great product and a data moat. Big companies will be built!
First idea: don't build a big model
While every part of the Generative AI tech stack is exciting, the best idea for a founder today is probably to look at the bottom of this tech stack (the application layer) rather than the top.
That's because what has been notable about these big algorithmic developments is that the commercial advantage for the creator now lasts a very short time - in part because everyone publishes what they've done.
This means that within weeks or months of the latest development, we typically have a free open-source version which can be run on your laptop. For example, EleutherAI's GPT-NeoX-20B launched in February 2022 as the open-source alternative to OpenAI's GPT-3 for text generation. Stability AI's Stable Diffusion, launched in August 2022, is the open-source alternative to OpenAI's DALL-E 2 for images and video.
This is particularly striking given how resource-intensive these models are to make. GPT-3, for example, was initially trained on 45 terabytes of data and employs 175 billion parameters, or coefficients, to make its predictions; a single training run for GPT-3 cost $12m. Wu Dao 2.0, a Chinese model, has 1.75 trillion parameters.
The latest big model to be released is ChatGPT, and we would expect a similar open-source version of that to come out soon. That is not to say there is no value there at all: the company behind it, OpenAI, is reportedly about to get a $10bn investment from Microsoft at a $29bn valuation. But over the long term there is an issue that Sam Altman, the co-founder of OpenAI, talks about often - which is that the marginal cost of intelligence is heading towards zero.
What does that mean? In a short amount of time, everyone will have access to the best models. It's cool to be first, but it won't be much of an advantage for very long.
So if everyone is effectively being gifted these breakthrough models by big tech companies, and nobody has much of an advantage - where does the commercial advantage lie? What should startups be building?
Better to fine-tune a specific model instead
For startups, much more interesting than building large language models from scratch is the growth of fine-tuning these models.
Fine-tuning means taking a model that has been trained (perhaps expensively) to represent a general domain (for example, by predicting missing words in Wikipedia) and adapting it - or using its output as input to a much smaller model - so that it is trained for a specific task (for example, classifying fake news).
The resulting smaller model is typically not only much cheaper in compute time to train but also needs less data. Interestingly, ChatGPT is an example of this strategy. Its improvement over GPT-3 is the result of fine-tuning with reinforcement learning from human feedback: human trainers interacted with it and told it what they would and would not like to get back. A fine-tuned model 100 times *smaller* than GPT-3 can nonetheless perform much better.
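To make this concrete, here is a minimal sketch of what that fine-tuning workflow can look like in code, using the open-source Hugging Face transformers and datasets libraries. The base model, the fake_news.csv file and its text/label columns are illustrative assumptions on our part, not a reference to any particular company's pipeline.

```python
# A minimal sketch of fine-tuning a small pretrained model for a specific task
# (here, a hypothetical fake-news classifier). Model name, dataset file and
# column names are assumptions for illustration.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

base_model = "distilbert-base-uncased"  # a small, general-purpose model
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForSequenceClassification.from_pretrained(base_model, num_labels=2)

# A labelled CSV with "text" and "label" columns (0 = real, 1 = fake).
dataset = load_dataset("csv", data_files="fake_news.csv")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length")

tokenized = dataset.map(tokenize, batched=True)

# Fine-tuning is orders of magnitude cheaper than pretraining: a few epochs
# on a small labelled set rather than weeks of training on web-scale text.
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="fake-news-classifier", num_train_epochs=3),
    train_dataset=tokenized["train"],
)
trainer.train()
```

The point is less the specific libraries than the economics: the expensive general training has already been paid for by someone else, and the task-specific step can run on a single GPU.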
Other specialised models include variants of BERT for biomedical content (BioBERT), legal content (LegalBERT) and French text (CamemBERT), as well as versions of GPT-3 adapted for a wide variety of specific purposes. NVIDIA's BioNeMo is a framework for training, building and deploying large language models at supercomputing scale for generative chemistry, proteomics, and DNA/RNA.
That creates a lot of opportunity in verticals: fine-tune one of the big models to your domain (e.g. from a general vision model to one that estimates insurance claims based on photos of car damage). This trend has been going on for a while but will accelerate. Chamath Palihapitiya calls this new area "models-as-a-service", replacing "software-as-a-service".
Imagine a chat interface like ChatGPT but customised to a use case like editing (e.g. verb.ai). You could call this 'co-pilot for everything' - a rough sketch of the idea is below. Models that can be fine-tuned to existing enterprise data stores will be particularly interesting.
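As a toy illustration of the 'co-pilot for everything' idea, here is what a thin editing assistant wrapped around a general chat model might look like, using OpenAI's Python library (pre-1.0 interface). The model name, system prompt and temperature are our own assumptions; a real product would add its own fine-tuning, domain data and interface.

```python
# Hypothetical "co-pilot for editing": a thin prompt-engineering layer on top
# of a general chat model. All names and parameters here are illustrative.
import openai

openai.api_key = "YOUR_API_KEY"  # assumed to be supplied by the developer

EDITOR_SYSTEM_PROMPT = (
    "You are a professional copy editor. Rewrite the user's text so it is "
    "clear and concise, keep the author's voice, and list the changes you made."
)

def edit(text: str) -> str:
    # Send the user's draft plus the standing editing instructions to the model.
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",   # illustrative model choice
        messages=[
            {"role": "system", "content": EDITOR_SYSTEM_PROMPT},
            {"role": "user", "content": text},
        ],
        temperature=0.2,         # low temperature keeps the edit faithful
    )
    return response["choices"][0]["message"]["content"]

print(edit("Their going to announce the new product on tuesday, probably."))
```

The defensibility, as discussed below, comes not from this wrapper but from the product built around it and the proprietary data it accumulates.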
Even these fine-tuned models, though, are unlikely to remain best in class for long, or to offer the kind of defensibility needed to build generational, billion-dollar businesses on their own. The winners will be those that focus on product (ChatGPT is an amazing product, as well as a model) and on collecting proprietary data that others struggle to access. Some will get a flywheel: a tweaked, hyper-local model gets better than the competition because people use it more, which in turn makes it better and better again.
We are just entering an AI-powered golden age of writing, art, music, software, and science. It's going to be glorious. World historical. (Marc Andreessen)
Build today in the application layer
There will be a lot of money to be made in tooling. As mentioned above, this is a core part of the tech stack.
But we at firstminute are probably most excited by the application layer, which will be won by people who can build great product and a data moat - fundamentally, by focusing on product and customers.
This is tough, of course, because big tech is working hard to incorporate this technology into its own products. There will be areas, though, where big tech is too sleepy or too cumbersome to move fast enough - the equivalent of a Twitch, something YouTube could have integrated into its model but didn't.
There are a few major applications for this technology. Many are summed up in this market map below, but here are some broad categories that we are excited about. If you are building in any of these, let us know.
Part 2: Some exciting applications for Generative AI
There are some great resources out there on Generative AI. Our market map is just one - there are others here and here and here. There are lists of companies here and here, as well as great work by Dealroom here. There are some great pieces about the topic here and here.
Ultimately, we agree with Ciarán O'Mara, the co-founder of ML vision business and firstminute portfolio company Protex, that "Generative AI will become part of everyone’s toolbox, and learning how to leverage and harness its capabilities is no different than learning how to Google effectively." Below, though, is how we look at the landscape and the areas where we see people building the fastest.
It’s hard to categorise what is being built, particularly as many of the most interesting companies are working across mediums, but as a (hopefully) helpful heuristic we have split the application layer into eight categories: text, code, images, audio, video, data, gaming / 3D and biology. The other way to have done this would be by sector (e.g. generative AI for health, generative AI for the creator economy, and so on) - but maybe that's for the next report!
1) Text
Some of the earliest applications of Generative AI have been to augment or replace human writing. There is a whole world of opportunity for Generative AI in automated content generation (articles, blog posts or social media posts); in improving content quality (AI models are able to learn fast what's good); in creating more varied content (AI models can generate a variety of content types); and in creating personalised content (imagine AI models that generate personalised writing - books, advertising copy, screenplays - based on the preferences of individual users). This could help big companies run more efficiently - or empower ordinary people to compete with those companies.
As Dominik Angerer, the co-founder of the content management software company Storyblok, puts it: "generative AI will impact the way people brainstorm and develop content ideas.... Why should an article only be available in long-form when AI can provide a summary? Think of AI as an easy way to generate multiple headlines, images you can use right away without buying assets, and even a full page of content at the press of a button that a human can then iterate on."
The future for text is probably in vertical-specific writing assistants, where general writing models (e.g. ChatGPT) get fine-tuned (see above) for very specific use cases. There are potentially hundreds of these use cases, but some that are flourishing are:
Copywriting & Writing: Writing SEO content on websites and writing advertising copy is a perfect use case for generative ai, given the short nature of the text. Big companies have already been built in this area. Examples: Jasper.ai, Compose AI, Mutiny, Anyword, Pepertype.ai, regie.ai, Copy.ai, Otherside AI, Copy Monkey, Conto, Text.Cortex, Mentum.
Sales and customer relations. One exploding area is companies using Generative AI to help salespeople write better cold outbound emails or generally deal better with customers. The idea is that if a trained model can reduce the time taken to write a good outbound email from 10 minutes to 2 minutes, that's a huge saving. Examples: Outplay, Lavender, Twain, Sampled, Creatext, PolyAI.
Knowledge and research. One of the biggest topics in generative AI is its impact on search (Google faces a huge innovator's dilemma here: does it disrupt itself?). There are reports that Microsoft is planning to add OpenAI's generative AI-powered ChatGPT to its Bing search engine. There are already startups trying to transform search, such as Andi, You and Metaphor (all of which are very cool). There are also more general knowledge-organisation systems, such as Mem, a self-organising workspace (a generative AI version of Notion), and Glean, which offers AI-powered search across a company's internal knowledge. Pragma is another, aiming to centralise your whole organisation's knowledge base for easy reference. Examples: Andi, You, Metaphor, Mem, Seek AI, Pragma and Glean.
Conversational AI / chatbots. LLMs are increasingly being used at the core of conversational AI and chatbots. Spend 5 minutes playing with ChatGPT and you can see why that's powerful - a huge leap from the V1 of chatbots that was a startup craze a few years ago. These can be used in customer service, clearly, but also potentially in health (communicating with doctors or therapists); education (teaching people online); HR (corporate training or onboarding); or in travel (coming up with a travel itinerary). Examples: Ada, ASAPP, Observe.ai, Cresta, Woebot Health, Forethought, Kasisto, PolyAI, Balto, Ushur, Mavenoid, EliseAI
Legal support or writing. Legal documents are painful and expensive to put together, so it's easy to imagine a future where documents are drafted (either partly or completely) by fine-tuned legal language models. Already there is PatentPal, which automates part of the writing in patent applications. There are also companies such as Darrow, which uses AI to discover legal violations, and Do Not Pay, which is using generative AI to fight for consumer rights. Maximilian Vocke, a lawyer at the law firm Osborne Clarke, says that one of the lawyer's core tasks, drafting documents, could become “secondary” thanks to Generative AI, at least for simpler, more standardised cases. He adds that “with data banks of case law and commercial registries becoming digitalised and many widely used legal templates being provided online,” there is more material than ever for AI to work with and use, for example, in legal due diligence. Examples: PatentPal, Darrow, Do Not Pay
2) Code
Some coders have been (perhaps unwisely!) showing how ChatGPT can do their jobs for them with ease. See one example of it working from Amjad Masad below. Maybe generative AI will not replace coders just yet, but it's certainly making them more efficient: GitHub Copilot is now generating nearly 40% of code in the projects where it is installed.
So Generative AI could go a long way to making coders better and faster. Another opportunity may be opening up access to coding for consumers - allowing them to learn to prompt rather than learn to code.
Maximilian Eber, the founder of Taktile, a company that helps fintech companies test and deploy decision-making models, says: "Code generation (Codex/GitHub Copilot) will be one of the first applications of Generative AI with a substantial enterprise footprint because making developers more productive has such obvious value for businesses.”
Examples of coding startups/products: GitHub Copilot, Replit, Warp, Tabnine, Codacy, Bloop, Codiga, aiXcoder, Maya, MutableAI, Amazon CodeWhisperer, Quickchat, Drafter, Smartly.ai
3) Images
Generative AI first truly captured the public’s imagination last spring, when OpenAI unveiled a system called DALL-E, which lets people generate photo-realistic images simply by describing what they want to see. It was startling to play with (and a lot of fun). So as well as disrupting writing and coding, generative AI is also disrupting the visual world. This is, of course, a huge opportunity for startups. Here are some of the key areas that are being attacked:
Design: Imagine if you could type in prompts such as “next-generation Nike trainers” or “more durable car bonnet” and a programme would build you sketches and 3D design models. High-fidelity renderings from rough sketches and prompts are already a reality, and getting better all the time. This could work for physical products, but also for digital products such as apps (e.g. “design me a dating app”). It can extend to areas such as kitchen design, architecture and complex engineering projects as well. Examples: Diagram, Vizcom, Aitister, Uizard, TestFit, Swapp, Cala
Image generation: Related to the above, there is also an enormous number of startups simply generating images from text. Some of the biggest are Midjourney, DALL-E and Stability AI (mentioned above). Interesting here is their impact on the creator economy and ordinary artists, as well as how these image models get integrated into other products (e.g. image generation + text generation to make beautifully designed books or content). There is also the disruption of stock photo libraries like Getty (why not just generate your own?) and of image software like the Adobe suite. And there is a marketing application here too, very similar to the one described above for text. A minimal sketch of how accessible text-to-image generation has become follows below. Examples: Stable Diffusion, Lightricks, PhotoRoom, Facet, PixelVibe, Prisma Labs, Imagen, Let's Enhance, KREA, Depix
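To show just how low the barrier now is, here is a minimal text-to-image sketch using the open-source Stable Diffusion weights through Hugging Face's diffusers library. The specific checkpoint, the prompt and the assumption of a CUDA-capable GPU are illustrative.

```python
# Minimal text-to-image sketch with open-source Stable Diffusion weights via
# the diffusers library. Checkpoint name and GPU availability are assumptions.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # illustrative public checkpoint
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")  # assumes a CUDA-capable GPU is available

prompt = "an atmospheric portrait of a woman, soft window light, 35mm film"
image = pipe(prompt).images[0]  # a PIL image, generated in seconds on a GPU
image.save("portrait.png")
```

A few lines like these are what sit behind many of the image products listed above; the differentiation comes from workflow, editing tools and data rather than from the generation call itself.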
4) Audio
The top-10 pop music charts are pretty formulaic. Why can’t generative AI write them? Why can’t we get Stephen Fry to narrate ALL our audiobooks at the click of a button? We are not so far away from any of this. In the recent documentary The Andy Warhol Diaries, filmmakers used generative technology to recreate the late artist's voice (with access to only 3 minutes of audio of him speaking). New software allows producers to change a singer’s tone of voice and even change a lyric via text at the press of a button. Which are the startups going after the audio market? And what are the sub-categories?
Music generation: One core category is the creation of music from scratch, allowing for infinite copyright-free music for commercial use. This might be background music - but it might also be the next global hit song! Examples: Mubert, MuseNet, Boomy, Loudly, Endel.
Speech generation: There is also a host of startups that can create speech from scratch, either mimicking others or building sophisticated text-to-voice programs. Examples: Resemble, WellSaid, Coqui, PolyAI, Voicemod, DeepBrain AI
5) Video
If you can make images, then you can also make videos. It is easy to see how “make me an oil painting of a man made out of fruit in the style of Rembrandt” can become “make me a Pixar-style movie about a little boy who does not want to grow up” or “make me a safety training video for my oil refinery”.
Text-to-video converters: Runway, Tavus, Fliki and others are text-to-video converters that allow you to create video (or audio) content using AI voices. Some of the core use cases here are corporate training videos, promising to cut the time it takes to make a video from weeks to minutes. Examples: Tavus, Fliki, Colossyan, Synthesia, Runway, Rephrase, Opus, Munch, Hour One.
Making a movie. Meta and Google have both announced software that converts text prompts into short videos; another tool, Phenaki, can do whole scenes. None of these video generators has been released to the public yet, but the company D-ID offers an AI app that can make people in still photos blink and read from a script, and some have been using it to animate characters created by Midjourney. Belfast-based composer Glenn Marshall's AI short “The Crow” foretells a future where entire feature films are produced by text-to-video systems.
6) Data
Synthetic data is information that is artificially generated rather than produced by real-world events. It’s powerful because many fields need enormous quantities of data (not least for training AI models!) which can be difficult and expensive to get. Synthetic data generation is powered by deep generative algorithms, which take real data samples as training data and learn their correlations, statistical properties and structure. Once trained, the idea is that the algorithm can generate data that is statistically and structurally similar to the original training data. All this is arguably not really part of the application layer so much as the model layer, but it is still an exciting area for progress. A toy sketch of the underlying idea is below. Examples: Gretel, Tonic, Aindo, MostlyAI, MDClone, Mindtech, Hazy, DataGen
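As a toy illustration of that idea (real products use far richer deep generative models), here is a sketch that "learns" the statistical properties of a small table and then samples brand-new rows with the same structure. The column meanings and all numbers are made up.

```python
# Toy illustration of synthetic data generation: learn the statistics of a
# real table, then sample new rows that match them but describe no real person.
# Real systems use deep generative models; a multivariate Gaussian stands in
# for the concept here. All numbers are made up.
import numpy as np

rng = np.random.default_rng(0)

# Pretend this is a sensitive real-world table (rows = customers,
# columns = age and annual income).
real_data = rng.normal(loc=[35.0, 52_000.0], scale=[8.0, 15_000.0], size=(1_000, 2))

# "Training": estimate the column means and the correlations between columns.
mean = real_data.mean(axis=0)
cov = np.cov(real_data, rowvar=False)

# "Generation": draw new rows that are statistically similar to the original
# data but correspond to no real individual.
synthetic_data = rng.multivariate_normal(mean, cov, size=1_000)

print(synthetic_data[:3])
```

The commercial products listed above essentially scale this concept up to messy, high-dimensional, privacy-sensitive data where simple statistical models are not enough.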
7) Gaming & 3D
A longstanding dream in generative AI has been to create dynamic digital 3D environments – whole worlds of objects, spaces, and characters that can interact together. This could be used to create synthetic worlds and enormous amounts of synthetic data, but also new content for games and metaverses as well as digital twins of the physical world.
Already in gaming, there are big advances here, with generative models able to produce textures and non-playable characters. As yet, there are no text-to-game engines (games are complicated), but there are startups edging this way, such as Versed. It's a huge prize: it takes tens of thousands of hours of manpower to make one hour of gameplay, which is tough for even big studios. Generative AI could lead to a new era of creativity in gaming, a trend that Roblox and Unity have already started.
Examples: Masterpiece Studio, Replica Studios, Latitude.io, AI Dungeon, Character.AI, CSM, Spellbrush, Hidden Door, Ponzu, Moatboat, Plask, The Simulation (a metaverse filled only with AIs).
8) Biology & Science
Companies such as Insilico and DeepMind have been using machine learning to speed up science with the help of generative AI. Will we end up in a world where we can say: "computer, design me a harder type of concrete" or "computer, look at my cancer and design me the best chemo drug to fight it"? Examples: Insilico, Aqemia, Valence, Cradle, Neuro, X1MDM
9) Other ideas
Ecommerce: there is an opportunity to integrate generative AI into the whole ecommerce stack. For example: "Hey AmazonGPT, can you make me some custom wallpaper for my hall based on this image?". Also, what happens when Shopify stores come pre-loaded with generative AI tools?
Multi-modal models. This ecommerce point speaks to another key trend - the growth of multi-modal models: those that can understand several domains (e.g. text and image) based on a shared model.
A new type of job: prompt engineering. As AIs become more and more powerful, the ability to get the most out of them (to communicate with them?) becomes a job in itself. There is already at least one great business here, PromptBase, which is a marketplace for great prompts. But does prompting become the ultimate high-level coding language?
Disrupting email. There is a lot of talk about search being disrupted (see above), but Sam Lessin makes the point that the real problem with Gmail is that it's hard to search. What could an LLM-focused email client do to help people access, use and categorise decades' worth of data about themselves?
Ethics. Companies will spring up fighting back against generative AI, not just in text but also in video, protecting IP and warning about deepfakes. Ciarán O'Mara, the co-founder of the ML vision company Protex, says: "I think there will be big winners in the ethics space. This could be solving the problem of Generative AI outputs being biased or controversial, or its potential to controversially disrupt the art industry. Talented artists are having their unique style used to train a model that outputs a body of work that they get no credit for."
Generative AI and blockchain. If anyone can create art in any style at the touch of a button, what does that mean for authenticity? Is this an opportunity for Web3-powered art and NFTs? Isaac Kamlish, the co-founder of NFT primary minting platform Fair.xyz, says: "We are consistently seeing visual art being democratised through generative AI. We therefore anticipate artist provenance to play an even more valuable role. A Damien Hirst will only feel like a Damien Hirst if it’s created by him (even if it looks like his style!). People will lean more into the who, what and the why rather than judging art purely on aesthetics. We believe that blockchain will be the vehicle to drive this change."
Generative AI and content management systems. Dominik Angerer, the co-founder of the content management software and firstminute portfolio company Storyblok, sees generative AI becoming important for the infrastructure layer of content too, changing how people brainstorm, summarise and assemble pages at the press of a button (see his quote in the Text section above).
Conclusion:
There is extraordinary hype around generative AI. VCs like us are obsessed with it. Some people are chiming in with notes of caution as well, comparing it to the web3 bonanza of the past few years that led to some not-so-great outcomes (e.g. FTX).
François Chollet at Google says: “The current climate in AI has so many parallels to 2021 web3 it's making me uncomfortable. Narratives based on zero data are accepted as self-evident. Everyone is expecting as a sure thing "civilization-altering" impact (& 100x returns on investment) in the next 2-3 years”
“The bull case is that generative AI becomes a widespread UX paradigm for interacting with most tech products… Near-future iterations of current AI models become our interface to the world's information.”
“The bear case is the continuation of the GPT-3 trajectory, which is that LLMs only find limited commercial success in SEO, marketing, and copywriting niches, while image generation (much more successful) peaks as a XB/y industry circa 2024. LLMs will have been a complete bubble.”
There are also big questions about how to build defensibility in this area when, as discussed above, models become commoditised and data moats can be hard to maintain. Finally, there is a big question, brilliantly laid out here by ex-Google and ex-Twitter investor Elad Gil, about how much of the value will go to incumbents and big tech companies.
Still, with that note of caution, it’s impossible to ignore a new area of technology that is so wide-reaching and touches so much of the economy (and what makes us human). We at firstminute believe this is the time to build - and if you want to build in AI with us, do get in touch at michael@firstminute.capital, steve@firstminute.capital or deals@firstminute.capital.
Thanks most of all to Steve Crossan for help with this report. But also to all those featured, and to the people I have been reading and quoting: Dominik Angerer, Ciarán O'Mara, Isaac Kamlish, François Chollet, Elad Gil, Sam Lessin, Amjad Masad and Emad Mostaque.