Introducing the GenAI models you haven’t heard of yet

Ever since OpenAI’s ChatGPT set adoption records last winter, companies of all sizes have been trying to figure out how to put some of that sweet generative AI magic to use.

In fact, according to Lucidworks’ global generative AI benchmark study released August 10, 96% of executives and managers involved in AI decision processes are actively prioritizing generative AI investments, and 93% of companies plan to increase their AI spend in the coming year.

Many, if not most, enterprises deploying generative AI are starting with OpenAI, typically via a private cloud on Microsoft Azure. The Azure deployment gives companies a private instance of the chatbot, meaning they don’t have to worry about corporate data leaking out into the AI’s training data set. And many organizations already have a relationship with Microsoft, and are comfortable with the security, manageability, and enterprise support they get from the company.

For example, software vendor Nerdio uses generative AI to generate PowerShell scripts for its customers, convert installer code from one language to another, and create a custom support chatbot.

ChatGPT is capable of many of these tasks, but the custom support chatbot uses text-embedding-ada-002, another generative AI model from OpenAI, designed to produce embeddings: numerical vector representations of text that can be stored in a database and used to feed relevant data into a large language model (LLM). Common storage approaches include vector databases and graph databases.

“We’re building a vector database with all our scripts and escalation tickets, and providing it to our AI instance,” says Stefan Georgiev, Nerdio’s senior technical product manager.

Using embeddings allows a company to create what is, in effect, a custom AI without having to train an LLM from scratch.
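As a rough sketch of how this pattern works, the snippet below builds a toy in-memory vector store. The embedding function is a deliberately simplified stand-in (a bag of words) for a real embedding model like text-embedding-ada-002, and the example documents are invented; a real deployment would index actual scripts and tickets, then prepend the best match to the LLM prompt.

```python
import math
from collections import Counter

def toy_embed(text: str) -> Counter:
    """Stand-in for a real embedding model: a bag of lowercase words.
    A real model would return a dense numeric vector instead."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class VectorStore:
    """Minimal in-memory vector store: index documents, retrieve by similarity."""
    def __init__(self):
        self.docs = []

    def add(self, text: str):
        self.docs.append((toy_embed(text), text))

    def search(self, query: str, k: int = 1):
        q = toy_embed(query)
        ranked = sorted(self.docs, key=lambda d: cosine(q, d[0]), reverse=True)
        return [text for _, text in ranked[:k]]

# Index internal knowledge (hypothetical tickets), then retrieve the most
# relevant snippet to prepend to the LLM prompt as grounding context.
store = VectorStore()
store.add("Ticket 101: reset a user session with Restart-UserSession")
store.add("Ticket 102: resize a VM host pool via Set-HostPoolSize")

context = store.search("how do I reset a session?")[0]
prompt = f"Answer using only this context:\n{context}\nQuestion: how do I reset a session?"
```

The key point is that the base model is never retrained; specialization comes entirely from what gets retrieved and placed in the prompt.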

“It would be very difficult for us to get the amount of data needed to train a generative AI model ourselves,” says Georgiev. “We’d have to build data pipelines to collect and aggregate all our data and usage patterns before being able to build our own model, suited for our space. But we didn’t and we don’t plan to because there are already pretty good generative AI models out there. All we need to do is specialize them for our needs.”

But although OpenAI was the first company out of the starting gate, it’s no longer the only game in town. Companies are looking at alternatives such as Google’s Bard, Anthropic’s Claude, Databricks’ Dolly, Amazon’s Titan, and IBM’s watsonx, as well as open source models like Meta’s Llama 2.

Open source models are also getting easier to deploy. Microsoft has already announced it will support Llama 2 on its Azure cloud, and AWS supports a number of LLMs through its Amazon Bedrock service, including models from Anthropic, Stability AI, AI21 Labs, and Meta’s Llama 2.

S&P Global Market Intelligence is looking at them all.

“We use Microsoft, Google, Amazon, and also open source models from Hugging Face,” says Alain Biem, head of data science for the global financial information company.

For example, S&P Global uses the OpenAI API via Azure, but it’s just one of many AI APIs the company can call on.

“We’re extremely agnostic about the large language models,” he says. “We select the LLM based on the use case. Our philosophy is not to be tied to a model, and the way we develop our products is so we can update the models or change from one vendor to another.”

The company also keeps a close eye on the Hugging Face leaderboard, he says, which, at press time, is dominated by Llama 2 and its variants.

Meta released Llama 2 in July, and it stands out among other open source generative AI projects because of its size and capability, and also because of its license; companies can use it for free, even commercially. The main restriction is that companies with more than 700 million monthly active users must obtain a special license from Meta.

S&P Global is testing Llama 2, Biem says, as well as other open source models on the Hugging Face platform.

Many companies start out with OpenAI, says Sreekar Krishna, managing director for data and analytics at KPMG. But they don’t necessarily stop there.

“Most of the institutions I’m working with are not taking a single vendor strategy,” he says. “They’re all very aware that even if you just start with OpenAI, it’s just a starting gate.”

Most often, he sees companies look at Google’s Bard next, especially if they already use Google Cloud or other Google platforms.

Another common choice is Databricks, a popular data pipeline platform for enterprise data science teams. The company introduced Dolly, its open source LLM, in April, licensed for both research and commercial use, and in July added support for Llama 2 as well.

“The Databricks platform is capable of consuming large volumes of data and is already one of the most widely used open source platforms in enterprises,” says Krishna.

The Dolly model, as well as Llama 2 and the open source models from Hugging Face, will also become available on Microsoft’s platform, Krishna says.

“It’s such a fast-evolving landscape,” he says. “We feel that every hyperscaler will have open source generative AI models quickly.”

But given how fast the space is evolving, he says, companies should focus less on what model is the best, and spend more time thinking about building flexible architectures.

“If you build a good architecture,” he says, “your LLM model is just plug-and-play; you can quickly plug in more of them. That’s what we’re doing.”
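One way to read that plug-and-play advice is as a thin abstraction layer between application code and any particular vendor. The sketch below, with hypothetical stub clients standing in for real SDK calls, shows the shape of such an architecture: the application depends only on a small interface, so swapping models is a one-line change.

```python
from typing import Protocol

class LLMClient(Protocol):
    """The minimal interface the rest of the application codes against."""
    def complete(self, prompt: str) -> str: ...

class StubOpenAIClient:
    # Hypothetical placeholder: a real version would call the OpenAI API here.
    def complete(self, prompt: str) -> str:
        return f"[openai] {prompt}"

class StubLlamaClient:
    # Hypothetical placeholder for a self-hosted Llama 2 endpoint.
    def complete(self, prompt: str) -> str:
        return f"[llama2] {prompt}"

def answer(client: LLMClient, question: str) -> str:
    # Application logic depends only on the interface, never on the vendor.
    return client.complete(question)

# Swapping vendors touches only the client passed in at the call site.
print(answer(StubOpenAIClient(), "Summarize the Q3 escalation tickets"))
print(answer(StubLlamaClient(), "Summarize the Q3 escalation tickets"))
```

The design choice is that vendor churn, which this article suggests is constant, stays confined to the client classes rather than spreading through the codebase.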

KPMG is also experimenting with building systems that can use OpenAI, Dolly, Claude, and Bard, he says. But Databricks isn’t the only data platform with its own LLM.

John Carey, managing director of the technology solutions group at global consulting firm AArete, uses Document AI, a new model in early release from Snowflake that lets people ask questions about unstructured documents. Most important, it allows AArete to maintain security for its enterprise clients.

“They trust you with their data that might have customer information,” says Carey. “You’re directly obligated to protect their privacy.”

Snowflake’s Document AI is an LLM that runs within a secure, private environment, he says, without any risk that private data would be shipped off to an outside service or wind up being used to train the vendor’s model.

“We need to secure this data, and make sure it has access controls and all the standard data governance,” he says.

Beyond large foundation models

Using large foundation models and then customizing them for business use by fine-tuning or embedding is one way enterprises are deploying generative AI. But another path some companies are taking is to look for narrow, specialized models.

“We’ve been seeing domain-specific models emerging in the market,” says Gartner analyst Arun Chandrasekaran. “They also tend to be less complex and less expensive.”

Databricks, IBM, and AWS all have offerings in this category, he says.

There are models specifically designed to generate computer code, models that can describe images, and those that perform specialized scientific tasks. There are probably a hundred other models, says Chandrasekaran, and several different ways companies can use them.

Companies can use public versions of generative AI models, like ChatGPT, Bard, or Claude, when there are no privacy or security issues, or run the models in private clouds, like Azure. They can access the models via APIs, augment them with embeddings, or develop a custom model by fine-tuning an existing one on new data, which is the most complex approach, according to Chandrasekaran.

“You have to get your data and annotate it,” he says. “So you now own the model and have to pay for inference and hosting costs. As a result, we’re not seeing a lot of fine-tuning at this point.”

But that will probably change, he says, as smaller models emerge that are easier and cheaper for companies to fine-tune and deploy.

There’s one other option for companies, he adds.

“That’s where you build your own model from scratch,” he says. “That’s not something a lot of enterprises are going to do, unless you’re a Fortune 50 company, and even then, only for very specific use cases.”

For many companies, using off-the-shelf models and adding embeddings will be the way to go. Plus, using embeddings has an extra benefit, he says.

“If you’re using the right architecture, like a vector database, the AI can include references with its answers,” he says. “And you can actually tune these models not to provide a response if they don’t have reference data.”

That’s not usually the case with public chatbots like ChatGPT.

“Humility is not a virtue of the online chatbots,” says Chandrasekaran. “But with the enterprise chatbots, it would say, ‘I don’t know the answer.’”
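A minimal sketch of how that refusal behavior might be wired up, assuming a hypothetical retrieval function that returns scored passages from a vector database: the chatbot answers, with a citation, only when a passage clears a relevance threshold, and otherwise says it doesn't know.

```python
# A hypothetical knowledge base: (topic, reference text, source id).
KB = [("password reset", "Use the self-service portal.", "KB-42")]

def retrieve(question):
    """Stand-in for a vector database lookup; returns (score, text, source).
    A real version would score by embedding similarity."""
    return [(1.0 if topic in question else 0.0, text, src)
            for topic, text, src in KB]

def generate(prompt):
    """Stand-in for an LLM call; a real version would query a model."""
    return "Use the self-service portal."

def grounded_answer(question, min_score=0.3):
    """Answer only when retrieval finds supporting reference data."""
    supported = [p for p in retrieve(question) if p[0] >= min_score]
    if not supported:
        return "I don't know the answer."   # refuse: no reference data found
    score, text, source = max(supported)    # take the best-scoring passage
    answer = generate(f"Answer using only this context: {text}\nQ: {question}")
    return f"{answer} [source: {source}]"   # cite the reference with the answer
```

The threshold and the stub functions are illustrative only; the point is that humility is enforced by the architecture rather than by the model.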

Going small

Smaller models aren’t just easier to fine-tune; they can also run in a wider variety of environments, including on desktop computers or even mobile phones.

“The days of six-plus months of training and billions of parameters are gone,” says Bradley Shimmin, chief analyst for AI platforms, analytics, and data management at tech research and advisory group Omdia. “It now takes just hours to train a model. You can iterate rapidly and improve that model, fine-tune it, and optimize it to run on less hardware or more efficiently.”

A company can take open source code for a model such as Llama 2—which comes in three different sizes—and customize it to do exactly what it wants.

“That’s going to cost me phenomenally less than using GPT-4’s API,” says Shimmin.

The smaller models also make it possible for companies to experiment, even when they don’t know much about AI when they’re starting out.

“You can stumble around without having a lot of money,” he says, “And stumble into success very rapidly.”

Take Gorilla, for example. It’s an LLM based on Llama, fine-tuned on 1,600 APIs.

“It’s built to learn how to navigate APIs,” Shimmin adds. “Use cases include data integration in the enterprise. You’ll no longer have to maintain a pipeline, and it can do root cause analysis, self-heal, build new integrations rapidly—your jaw will drop.”

The challenge, he says, is to figure out which model to use where, and to navigate all the different license terms and compliance requirements. Plus, there’s still a lot of work to do when it comes to operationalizing LLMs.

Gen AI isn’t just about language

Language models are getting most of the attention in the corporate world because they can write code, answer questions, summarize documents, and generate marketing emails. But there’s more to generative AI than text.

Several months before ChatGPT hit the headlines, another generative AI tool made waves: Midjourney. Image generators evolved quickly, to the point where the images produced were indistinguishable from human work, even winning art and photography awards.

DeadLizard, a boutique creative agency that counts Disney among its clients, uses not only Midjourney but several other image tools, including Stable Diffusion and ClipDrop for image editing, and Runway for adding motion.

The images are used in the company’s own branded social media content, but also as part of the idea-generation and creative development process.

“By adding an open generative AI toolset, it’s the equivalent of opening an entire Internet worth of brains and perspectives,” says DeadLizard co-founder Todd Reinhart. “This helps accelerate ideation.”

Even weird or illogical suggestions can be helpful at this stage, he says, since they can inspire solutions outside the usual comfort zones. In addition, new generative AI tools can dramatically improve photo editing capabilities. Previously, the company had to do custom shoots, which are usually prohibitively expensive for all but the biggest projects, or use stock photography and Photoshop.

“We find entirely new workflows and toolsets coming to light on nearly a weekly basis,” he says.
