AI systems on the rise, but at what cost?

ChatGPT and other generative AI models have been gaining popularity lately, but the cost of using them can be prohibitive. That became clear when the small startup Latitude grew in popularity with AI Dungeon, a game that generates fantastical stories based on user input. CEO Nick Walton found that the cost of running the text-based role-playing game kept rising as more users played it. AI Dungeon's text generation was built on GPT, the language technology developed by Microsoft-backed OpenAI. Adding to the strain, content marketers used the game to generate ad copy, which further inflated Latitude's AI bill.


According to Walton, in 2021 the company was spending nearly $200,000 a month on OpenAI's generative AI software and Amazon Web Services to keep up with the millions of user requests it needed to process every day. "We joked that we had human workers and AI workers, and we spent about the same on each of them," Walton said. "We spent hundreds of thousands of dollars on AI every month, and we're not a big startup, so it was a very high cost."

In late 2021, to reduce costs, Latitude switched to a cheaper but still capable language model from AI21 Labs and integrated free, open-source language models into its services. The company's monthly AI bill has since dropped to under $100,000. Latitude now charges players a monthly subscription for advanced AI features to help cover costs.

Latitude's expensive AI bills show that the cost of developing and maintaining generative AI can be prohibitively high, both for the companies building the underlying models and for those using AI to power their own software. This is an uncomfortable reality for the industry, as giants like Microsoft, Meta, and Google use their capital to build a technological lead that smaller challengers cannot match. And if high computing costs leave AI applications with permanently thinner margins than earlier Software-as-a-Service (SaaS) businesses enjoyed, the current boom could lose steam.

The high cost of training and "inference" (actually running a trained model) distinguishes large language models structurally from previous computing booms. Even once the software is built and trained, serving a large language model still requires enormous computing power, since it performs billions of calculations every time it returns an answer to a prompt. By comparison, deploying and maintaining conventional software has historically been cheap.
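To see why inference is a structural cost rather than a one-time expense, consider a minimal back-of-the-envelope model, sketched in Python below. All numbers are hypothetical placeholders for illustration (the $4 million is only loosely inspired by the training estimates discussed later), not figures from this article:

```python
# Back-of-the-envelope cost model: training is paid once, inference
# is paid on every request. All numbers are hypothetical placeholders.

TRAINING_COST_USD = 4_000_000      # one-time, GPT-3-class training run
COST_PER_REQUEST_USD = 0.005       # assumed per-prompt inference cost

def total_cost(requests_per_day: int, days: int) -> float:
    """Cumulative spend after `days` of serving traffic."""
    return TRAINING_COST_USD + COST_PER_REQUEST_USD * requests_per_day * days

# At 2 million requests/day, inference runs $10,000/day and matches
# the entire training bill after about 400 days:
for days in (30, 365, 400):
    print(f"day {days}: ${total_cost(2_000_000, days):,.0f}")
```

The point of the sketch is the shape of the curve, not the exact numbers: the training bill is fixed, while the inference bill grows linearly with usage and eventually dominates.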

Despite these challenges, demand for generative AI remains high, since it can help companies build new products and work more efficiently. The industry is working to bring down the cost of using AI by developing more energy-efficient hardware and new algorithms and architectures that need less computing power. The open-source community is also helping by providing free models and tools.

Overall, AI technology is still in the development phase, and the high cost is a challenge that companies and developers must overcome to successfully deploy the technology. Costs are expected to decrease over time as the industry continues to innovate and develop more efficient solutions.

AI Training Models: The Cost of Training


Analysts and technologists estimate that training large language models like OpenAI's GPT-3 could cost more than $4 million. More advanced models could even cost upwards of "high single-digit millions" to train, according to Rowan Curran, a Forrester analyst specializing in AI and machine learning.

Meta recently released its largest LLaMA model, which was trained on 2,048 Nvidia A100 GPUs over 1.4 trillion tokens (about 1,000 tokens works out to roughly 750 words). Training took about 21 days and consumed roughly 1 million GPU-hours; at AWS dedicated pricing, that would cost over $2.4 million. And although the 65-billion-parameter model is smaller than OpenAI's current GPT models, such as GPT-3 with its 175 billion parameters, it is still an expensive undertaking.
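The roughly 1 million GPU-hours follow directly from the fleet size and duration. The per-GPU-hour rate in the sketch below is an assumption, chosen only so the total lands near the article's "over $2.4 million" figure:

```python
# Reproducing the LLaMA training cost estimate from the article's inputs.
gpus = 2048                  # Nvidia A100s
days = 21                    # reported training duration
gpu_hours = gpus * days * 24
print(f"{gpu_hours:,} GPU-hours")   # 1,032,192, i.e. about 1 million

# Assumed per-A100-hour rate (hypothetical; picked so the total
# lands near the article's "over $2.4 million").
usd_per_gpu_hour = 2.40
print(f"${gpu_hours * usd_per_gpu_hour:,.0f}")  # ~$2,477,261
```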

According to Clement Delangue, the CEO of AI startup Hugging Face, training the company's large Bloom language model took more than two and a half months and required access to a supercomputer that was "about the equivalent of 500 GPUs". Organizations that build large language models must be careful when retraining them to improve their capabilities, he stressed, because it is very expensive.

Delangue noted that these models are not retrained constantly, say every day, which is why some of them are unaware of recent events; ChatGPT's knowledge, he pointed out, ends in 2021.

Hugging Face is currently training the second version of Bloom, which will cost no more than $10 million. But, Delangue said, the company does not want to run training jobs like that every week.

Inference: An Expensive Process When Using AI Text Generators


When engineers use a trained machine learning model to make predictions or generate text, the process is called "inference". Inference can end up costing significantly more than training, because a popular product may need to run the model millions of times. For a product as popular as ChatGPT, which had an estimated 100 million monthly active users in January, Forrester's Curran reckons OpenAI may have spent $40 million processing the prompts it received over the course of a month.
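Dividing one figure by the other gives a sense of scale. This is a rough calculation that assumes the $40 million and the 100 million users refer to the same month:

```python
# Implied average serving cost per user, from the two figures above.
monthly_inference_usd = 40_000_000    # Curran's estimate for one month
monthly_active_users = 100_000_000    # January estimate for ChatGPT

# Assumes both figures cover the same month and all traffic
# comes from these users.
print(f"${monthly_inference_usd / monthly_active_users:.2f} per user per month")
# -> $0.40 per user per month
```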

The costs increase dramatically when these tools are used billions of times a day. Financial analysts estimate that Microsoft's Bing AI chatbot, which is based on an OpenAI ChatGPT model, will require at least $4 billion in infrastructure to serve answers to all Bing users.

Latitude, a startup that accesses an OpenAI language model, didn't have to pay to train it, but it did have to budget for inference, which a company spokesman put at about "half a cent per call" across "a couple of million requests per day". Curran, for his part, described his own estimates as conservative.
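Those two figures from the spokesman are consistent with the monthly bills described at the top of the article. A quick check:

```python
# Sanity-checking the spokesman's figures against the monthly bills
# quoted earlier in the article.
cost_per_call_usd = 0.005       # "half a cent per call"
calls_per_day = 2_000_000       # "a couple of million requests per day"

daily = cost_per_call_usd * calls_per_day
print(f"${daily:,.0f}/day, roughly ${daily * 30:,.0f}/month")
# -> $10,000/day, roughly $300,000/month: the same order of magnitude
# as the "hundreds of thousands of dollars" Walton described.
```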

To fuel the current AI boom, venture capitalists and tech giants are pouring billions of dollars into startups specializing in generative AI. In January, for example, Microsoft reportedly invested as much as $10 billion in OpenAI, the developer of GPT. Salesforce's venture capital arm, Salesforce Ventures, recently launched a $250 million fund to back generative AI startups.

Many entrepreneurs see risks in relying on potentially subsidized AI models that they don't control and only pay for per use. Suman Kanuganti, founder of personal.ai, a chatbot currently in beta, advises entrepreneurs not to depend solely on big language models like OpenAI's GPT. Companies such as enterprise tech firm Conversica are exploring how to use the technology through Microsoft's Azure cloud service at its currently discounted price. Conversica CEO Jim Kaskade declined to comment on the startup's costs, but stressed that the subsidized prices are welcome while the company works out how language models can be used effectively.

The Future of AI Development: Challenges and Opportunities


It is unclear whether the industry's AI development costs will remain this high. Foundation model companies, semiconductor manufacturers, and startups all see business opportunities in lowering the cost of owning and running AI software.

Nvidia, which holds about 95% of the AI chip market, continues to develop more powerful versions specifically designed for machine learning. However, chip performance improvements in the industry have slowed in recent years.

Still, Nvidia CEO Jensen Huang believes AI will be "a million times" more efficient in 10 years, with improvements coming not only from chips but also from software and other computer components. "Moore's Law, in its prime, would have delivered 100x in a decade," Huang said at an earnings conference last month. "By developing new processors, systems, interconnects, frameworks and algorithms, and collaborating with data scientists and AI researchers to develop new models, we have accelerated the processing of large language models millions of times."

Some startups have turned the high cost of AI into a business opportunity. D-Matrix is building a system designed to cut inference costs by doing more of the processing in the computer's memory rather than on a GPU; its founders argue that GPUs are expensive and were never built for this kind of workload. Delangue, the Hugging Face CEO, believes many companies would be better served by smaller, task-specific models, which are cheaper to train and run than large language models.

OpenAI cut the cost of accessing its GPT models last month; it now charges a fifth of a cent for about 750 words of output. The lower prices have caught the attention of AI Dungeon maker Latitude, whose CEO Nick Walton said the cut will let the company give even more users access to its AI-generated stories.
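At that rate, per-story costs become easy to estimate. Here is a small sketch using the article's pricing; the 2,000-word story length is a hypothetical example:

```python
# What the new price means per piece of output.
usd_per_unit = 0.002            # a fifth of a cent...
words_per_unit = 750            # ...for about 750 words of output

def output_cost(words: int) -> float:
    """Approximate cost in USD to generate `words` words."""
    return words / words_per_unit * usd_per_unit

# A hypothetical 2,000-word AI Dungeon story would cost about half a cent:
print(f"${output_cost(2_000):.4f}")   # -> $0.0053
```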

Overall, the future of AI development will depend on many factors, including cost, availability of skilled workers, advances in technology, and regulatory frameworks. However, it remains clear that AI development will play a crucial role in many industries in the coming years and that companies investing in this technology early could have a decisive competitive advantage.

 
