Shibabrata Mondal, Founder & CEO - Wizergos, discusses the applications and benefits of Generative AI for enterprises, detailing LLMs, their differences from older AI systems, and their potential use cases such as content creation, verification, and enhancing document style and tone.
Here's a quick background for you!
Here at Wizergos, our focus is on creating and managing business applications for medium-to-large companies in sectors such as BFSI, Manufacturing, Retail, Pharmaceuticals, and more.
Generative AI tools such as ChatGPT, released by OpenAI about a year and a half ago, have stirred widespread public interest. However, most of the development and use of this technology has so far been within large tech firms and startups. Now, enterprises are keen to understand how they can benefit from it and avoid being left behind.
Given our focus on building bespoke business applications, we’ve been studying Large Language Models (LLMs) and Generative AI with a focus on enterprise use cases. We’ve updated various features and functionalities in the Wizergos Low Code Platform based on our research.
This blog captures my current understanding of where and how enterprises can effectively use and benefit from Generative AI.
Introducing Generative AI
Before diving into the use cases of Generative AI, let’s create a mental model of what it is. In this blog, we will focus mainly on LLMs.
Let’s start by looking into what LLMs are and what they aren’t, how they differ from older AI architectures, and how they differ from Information Retrieval Systems.
Information Retrieval Systems
IR (Information Retrieval) systems help find information based on queries. An example of a natural-language-based information retrieval system is Google Web Search, where, based on some search text, the system finds documents on the web that are good matches for the search text.
Note here that the results are actual documents, typically written by humans. So, as long as we trust the sources and make sure the documents we search over are accurate, the results will always be factually correct.
Artificial Intelligence Systems
Older AI systems were mostly predictive systems that were trained on human-labeled data. We can discuss two examples of these in the Natural Language Processing space:
Sentiment detection: a system is trained on large amounts of text that are labeled by humans with different sentiments. Now, when we feed a new text to the system, it will predict the sentiment of the new text.
Chatbots: a system is trained on large amounts of text, again labeled by humans, this time with intents. When we feed new text to the system, it predicts the intent of that text, and based on the predicted intent, the system starts a predefined workflow.
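To make the contrast concrete, here is a toy illustration (not a production classifier) of how a label-supervised predictor works. The point is structural: the model can only ever predict labels that humans have already assigned in the training data.

```python
from collections import Counter, defaultdict

# Toy labeled training set: every example needs a human-assigned label.
labeled_data = [
    ("i love this product", "positive"),
    ("great service and fast delivery", "positive"),
    ("i hate the new update", "negative"),
    ("terrible support very slow", "negative"),
]

# Count how often each word appears under each label.
word_counts = defaultdict(Counter)
for text, label in labeled_data:
    word_counts[label].update(text.split())

def predict_sentiment(text):
    """Score each known label by how many of the text's words it has seen."""
    scores = {
        label: sum(counts[w] for w in text.split())
        for label, counts in word_counts.items()
    }
    return max(scores, key=scores.get)

print(predict_sentiment("i love the fast delivery"))  # "positive"
```

Real systems use far more sophisticated models, but the dependency is the same: no labeled data, no predictions.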
LLM Systems
LLM architecture differs from both earlier AI systems and IR systems:
Unlike IR systems, which retrieve existing documents, LLMs create new text content that likely never existed before.
Unlike earlier AI systems, LLMs do not require labeled data for training, so essentially all freely available digital text can be used to train them.
During training, we simply feed in the texts we have, and the system learns the sequence: at the end of the learning process, it gets good at predicting the next words in a sentence given an initial partial phrase. So the content generated by LLMs is new content, not text retrieved verbatim from an existing source. This makes LLMs creative, but unlike IR systems, they offer no guarantee of factual correctness.
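As a toy illustration of learning from unlabeled text, a simple bigram model counts which word follows which in raw text and then predicts the most likely next word. Real LLMs use transformer networks over tokens, but the "no labels needed" principle is the same.

```python
from collections import Counter, defaultdict

# Raw, unlabeled text is the only input: no human annotation required.
corpus = "the cat sat on the mat . the cat ran on the grass ."
tokens = corpus.split()

# Count which word follows which (a bigram model).
next_word = defaultdict(Counter)
for current, following in zip(tokens, tokens[1:]):
    next_word[current][following] += 1

def predict_next(word):
    """Return the most frequently observed next word."""
    return next_word[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat" — the most common follower of "the"
```

Note that the predicted continuation is generated from learned statistics, not looked up in a trusted document, which is exactly why fluency and factual correctness come apart.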
Language as a Sequence of Words
For any NLP (Natural Language Processing) system, the key is to capture the meaning of words. Computers can only deal with numbers (surprise! surprise!). A natural choice was to represent words as numbers and try to capture their meaning using dictionaries. However, this did not take us very far, because semantic and syntactic meaning was hard to represent this way. Then, around a decade back, a breakthrough came, inspired by the 20th-century British linguist John Rupert Firth's dictum, "You shall know a word by the company it keeps". This led to research on representing words as vectors in a high-dimensional space (typically around 300 dimensions). Training a neural network to predict the words around each word, using a large corpus of human-written text, turned out to work very well.
It was found that these word vector representations captured both the syntactic and semantic meanings of words very well, and that synonyms clustered closely together in the vector space. Interestingly, lexical analogies such as man : woman :: king : X could be solved with simple vector arithmetic: X = king + woman − man lands close to the vector for queen.
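The analogy arithmetic can be sketched with hand-crafted toy vectors. To be clear, these 3-dimensional values are made up purely to illustrate the math; real word2vec/GloVe embeddings are learned from text and have around 300 dimensions.

```python
import math

# Hand-crafted toy vectors (illustrative only, not learned embeddings).
vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.2, 0.8],
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# X = king + woman - man
x = [k + w - m for k, w, m in zip(vectors["king"], vectors["woman"], vectors["man"])]

# Find the word whose vector is nearest to X by cosine similarity.
nearest = max(vectors, key=lambda word: cosine(vectors[word], x))
print(nearest)  # "queen"
```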
One of the disappointing results of this word vector architecture was that words with exactly opposite meanings often appear in exactly the same contexts in human-written text:
I hate ice cream
I love ice cream
It was found that word vectors were not good at distinguishing antonyms. Because of this issue, chatbots sometimes made the exact opposite inference.
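A quick sketch of why this happens: in a distributional model, a word's vector is driven by its surrounding context words, and in the two sentences above the contexts of "love" and "hate" are identical, so training pushes their vectors close together.

```python
sentences = ["i love ice cream", "i hate ice cream"]

def contexts(word, sents, window=1):
    """Collect the words appearing within `window` positions of `word`."""
    ctx = set()
    for s in sents:
        tokens = s.split()
        for i, t in enumerate(tokens):
            if t == word:
                ctx.update(tokens[max(0, i - window):i])
                ctx.update(tokens[i + 1:i + 1 + window])
    return ctx

# The two antonyms share exactly the same context words.
print(contexts("love", sentences) == contexts("hate", sentences))  # True
```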
Form or Style vs. Content or Facts
The first introduction of ChatGPT impressed everyone. One reason was that it was genuinely capable; the other was that it generated text that was excellent in form and style. For us mere mortals with expertise in some field, it is relatively easy to create factually correct content, but difficult to create content with great style (writers, poets, and journalists excepted). It is easy to mistake superior style for superior expertise.
Use cases of Generative AI in the Enterprise
Based on this understanding that Generative AI generates content in a very good style but may be factually incorrect, I can think of three broad categories of enterprise use cases for Generative AI:
Use cases where the content does not depend on deep reasoning or verifiable facts. These include social media posts, high-level marketing content, and email campaigns, where style and form are of utmost importance and the content is often not verifiably factual in nature.
Use cases where an automated expert system or human experts can verify the facts, and it is efficient to generate content with LLMs and then verify it. Code generation and extracting structured information from large unstructured documents fall into this category.
Use cases where factual content is rewritten to improve its style and tone. Examples include report writing, or bots that first create templated content using traditional AI methods and then adapt the style to the context and use case.
Facts of the Enterprise
Foundation LLMs are trained on openly available digital text. For enterprise use cases, this is not good enough: the use cases need to draw on the knowledge held in databases and documents inside the enterprise. Training a bespoke foundation model is expensive and rarely a good choice. To solve this, an enterprise has three options:
Fine-tuning: a foundation model is further trained on the enterprise's own documents, updating the model's parameters. This approach suits use cases that demand knowledge from a large number of internal documents that do not change very frequently.
In-context learning, popularly known as Prompt Engineering: the relevant information is placed directly in the prompt. This is a good choice when information needs to be extracted from a single document, or a document needs to be rewritten.
RAG (Retrieval Augmented Generation): used when the task is to extract information from a large number of documents that will not fit in the context window for in-context learning, and the document set changes too frequently for fine-tuning to be feasible. This option essentially combines an IR system with a Generative AI system. Most commonly, a vector database serves as the IR system, but other methods of combining an IR system and a generative AI system work as well.
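A minimal sketch of the RAG pattern, covering both the retrieval step and the prompt construction used for in-context learning. Here a toy word-overlap scorer stands in for a vector database, and `call_llm` is a hypothetical placeholder for whatever LLM API or in-house model the enterprise uses.

```python
# Toy "enterprise documents" (the facts of the enterprise).
documents = [
    "Travel policy: economy class for flights under 6 hours.",
    "Leave policy: 24 days of paid leave per calendar year.",
    "Expense policy: meals reimbursed up to 1500 INR per day.",
]

def retrieve(query, docs, k=1):
    """IR step: rank documents by word overlap with the query.
    A real system would embed docs and query and search a vector database."""
    q = set(query.lower().split())
    return sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)[:k]

def build_prompt(query, docs):
    """Generation step: put the retrieved context directly into the prompt."""
    context = "\n".join(retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

prompt = build_prompt("how many days of paid leave do I get", documents)
# The prompt would then go to the generative model, e.g.:
# answer = call_llm(prompt)   # hypothetical LLM API call
print(prompt)
```

Because the model is asked to answer from the retrieved context rather than from its training data alone, the enterprise keeps control over which facts the answer is grounded in.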
API or Run In-House
Lastly, for each use case, an enterprise needs to choose between two broad options:
Should we use APIs from providers such as OpenAI, Google, or Microsoft?
Should we run the models in-house? (Companies like Meta and Mistral have open-sourced foundation models that you can download and run.)
This choice should be based on cost-benefit and data privacy considerations.
Hope this was a helpful read for you to get a sense of how to solve real-world problems using Generative AI. Happy building!
Author
Shibabrata Mondal
Founder & CEO, Wizergos
For a conversation on how you can begin your GenAI Journey, reach out to:
Sandeep Sharma
Vice President- Business Development
+91-7045403638
Article originally published on AJNVJ Media.