Large language models like GPT-3 aren’t good enough for pharma and finance

Natural language processing (NLP) is one of the most exciting subsets of machine learning. It allows us to talk to computers as if they were human, and vice versa. Siri, Google Translate, and the handy chatbots on your bank’s website all rely on this kind of AI, but not all NLP systems are created equal.
In today’s AI landscape, smaller, more targeted models trained on data that matters are often better suited for business initiatives. There are, however, large-scale NLP systems with incredible communication abilities. Called large language models (LLMs), these can answer plain-language queries and generate novel text. Unfortunately, most of them are novelties, unsuitable for the specialized work that most professional organizations require from their AI systems.
OpenAI’s GPT-3, one of the most popular LLMs, is a feat of engineering. However, it also tends to output text that is subjective, inaccurate, or meaningless. That makes these huge, popular models unsuitable for industries where accuracy is critical.
A favorable outlook
There are no surefire bets in the STEM world, but the outlook for NLP technology in Europe is bright for the foreseeable future. The global NLP market is currently estimated at around $13.5 billion, and experts say the market in Europe alone will grow to over $21 billion by 2030.
This presents a wide-open market for new startups to form, alongside established industry players such as Dataiku and Arria NLG. The former was initially founded in Paris but has performed very well on the global fundraising stage and now has offices around the world. The latter is essentially a spin-out of the University of Aberdeen and has expanded well beyond its Scottish origins. Both companies have built successful natural language processing solutions by focusing on data-centric approaches that produce verifiable, accurate results for enterprise, pharmaceutical, and government clients.
One of the reasons these particular outfits have been so successful is that it is very difficult to train and build trustworthy AI models. LLMs trained on massive datasets, for example, tend to output “fake news” in the form of random statements. That may be fine if you are looking for ideas and inspiration, but it is totally unacceptable when accuracy and factual output matter.
I spoke with Emmanuel Walckenaer, CEO of one such company, Yseop. His Paris-based AI startup specializes in using NLP for natural language generation (NLG) in regulated industries such as pharmaceuticals and finance. According to him, there is no room for error when building AI for these domains. “It has to be perfect,” he told TNW.
Problems with LLMs
In 2022, GPT-3 and Google’s LaMDA were darlings among AI journalists. For the first time in history, people could ‘talk’ to machines, and the results made for fun and compelling articles. Not to mention the fact that these models have become very good at imitating humans; some experts even think they’re becoming sentient.
These systems are impressive but, as mentioned above, usually completely unreliable: they are fragile, inconsistent, and prone to making things up. In layman’s terms, they are stupid liars. The reason lies in how they are trained.
LLMs are an amazing marriage of mathematics and linguistics, but most fundamentally they rely on their training data. You can’t train an AI on a corpus of Reddit posts, for example, and expect it to be factually consistent. As the old saying goes, garbage in, garbage out.
If you trained an LLM on a dataset full of cooking recipes, for example, you could develop a system that generates new recipes on demand. You could even ask it to generate novel recipes for things not in its database, say, gummy bear curry.
Just as a human chef would need to draw on their culinary background to figure out how to work gummy bears into something akin to a curry dish, an AI will try to put together new recipes based on what it has been trained on. If it had been trained on a database of curry recipes, there is a good chance it would output something at least close to what a human given the same task would come up with.
But if the team training the AI used a huge dataset filled with billions or trillions of internet files that had nothing to do with curry, you never know what the machine will spew out. You might get a great recipe, or you might get a random critique of NBA superstar Stephen Curry.
This is kind of the fun part of operating a giant LLM: you have absolutely no idea what you’re going to get when you run a query. But medical, financial, and business intelligence reporting leave no room for such uncertainty.
Reining in human knowledge for machine use
Companies developing AI solutions for regulated industries cannot afford to brute-force train huge models on huge databases just to see what they are capable of. Output from their systems is typically submitted for review by regulatory authorities such as the US FDA and global financial regulators. For this reason, these organizations must pay close attention to the types of data they train their models on.
According to Walckenaer, Yseop’s top priority is ensuring that the data it uses to train its systems is accurate and ethically sourced. That means using only relevant data and anonymizing it to remove personally identifiable information, so that no individual’s privacy is compromised.
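Anonymization is a concrete engineering step rather than just a policy. As a rough illustration only (not Yseop’s actual pipeline), a minimal pass over training records might mask obvious identifiers with regular expressions before the text is ever used for training; the ID format below is an invented example, and a real system would lean on trained named-entity recognition rather than regexes alone.

```python
import re

# Hypothetical patterns for obvious identifiers; a real pipeline would use
# trained named-entity recognition, not regexes alone.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
    "PATIENT_ID": re.compile(r"\bPT-\d{6}\b"),  # assumed in-house ID format
}

def anonymize(text: str) -> str:
    """Replace obvious personally identifiable information with placeholders."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

record = "Contact patient PT-104233 at jane.doe@example.com or +33 1 23 45 67 89."
print(anonymize(record))
# -> Contact patient [PATIENT_ID] at [EMAIL] or [PHONE].
```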
Next, the company must ensure that its machine learning systems are free from bias, omissions, and hallucination. Yes, that’s right: black-box AI systems have a tendency to hallucinate, and that is a big problem when you are trying to output 100% accurate information.
To overcome the problem of hallucination, Yseop relies on keeping humans in the loop at every stage. The company’s algorithms and neural networks are co-developed by mathematicians, linguists, and AI developers. Its databases consist of data sourced directly from the researchers and companies its products serve. And most of its services are delivered via SaaS and designed to “augment” human experts rather than replace them.
With humans involved at every stage, checks ensure that the AI is not taking the data it is given and “hallucinating” new falsehoods from it. This prevents, for example, the system from using real patient data as a template and outputting false data about non-existent patients.
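The article doesn’t spell out Yseop’s internal tooling, but the basic idea of such a check can be sketched simply: any figure in the generated text that can’t be traced back to the structured source data gets routed to a human reviewer instead of being published. The field names and the crude number-matching below are purely illustrative assumptions.

```python
import re

def numbers_in(text: str) -> set[str]:
    """Extract numeric tokens (a crude stand-in for real claim extraction)."""
    return set(re.findall(r"\d+(?:\.\d+)?", text))

def review_gate(generated: str, source_values: dict) -> tuple[bool, set[str]]:
    """Flag any number in the generated text that isn't present in the source data."""
    allowed = {str(v) for v in source_values.values()}
    unsupported = numbers_in(generated) - allowed
    return (len(unsupported) == 0, unsupported)

source = {"patients_enrolled": 412, "dropout_rate_pct": 6.8}
draft = "Of the 412 patients enrolled, 9.1% dropped out of the study."

ok, unsupported = review_gate(draft, source)
if not ok:
    # In a human-in-the-loop workflow this draft goes to a medical writer, not out the door.
    print("Needs human review, unsupported figures:", unsupported)
```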
The next problem developers have to overcome with language processing is omission. This happens when the AI model skips relevant or important parts of the database when outputting information.
Large LLMs like GPT-3 don’t really suffer from omission problems; with these “anything goes” systems, you never quite know what to expect anyway. But targeted models designed to help professionals and companies parse finite datasets are only helpful if they can be constrained in such a way that they surface all of the relevant information.
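Omission is actually easier to test for than hallucination when the input is structured. As a toy sketch (the field names are an assumed schema, not anything Yseop has published), a generated report can be checked against a list of required sections so that anything the model silently skips is caught before the report is filed.

```python
REQUIRED_FIELDS = ["adverse_events", "dosage", "efficacy_endpoint"]  # assumed schema

def find_omissions(report_sections: dict[str, str]) -> list[str]:
    """Return required fields that the generated report left empty or missing."""
    return [field for field in REQUIRED_FIELDS
            if not report_sections.get(field, "").strip()]

report = {
    "dosage": "Participants received 20 mg once daily.",
    "efficacy_endpoint": "Primary endpoint met at week 12.",
    # 'adverse_events' was silently skipped by the generator
}

missing = find_omissions(report)
if missing:
    print("Report rejected, missing sections:", missing)
```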
The final major hurdle that giant LLMs typically fail to clear is bias. One of the most common forms is technical bias, which happens when a system is designed such that the output it produces does not follow the scientific method.
A classic example of technical bias is teaching machines to “predict” a person’s sexuality. There is no scientific basis for this kind of AI (see our article on why “gaydar” is just hogwash and snake oil); such systems can only produce their fabricated output by employing pure technical bias.
Other common biases that can creep into NLP and NLG models include human bias (which occurs when humans label data inappropriately due to cultural or intentional misunderstanding) and institutional biases.
The latter can be a big problem for organizations that rely on accurate data and outputs to make important decisions. In regulated industries such as pharmaceuticals and finance, this kind of bias can lead to poor patient outcomes and financial ruin. Bias is one of the biggest problems in AI, and LLMs like GPT-3 are essentially as biased as the databases they were trained on.
While it can be difficult to eliminate bias completely, using only the highest-quality, hand-checked data and carefully setting the “parameters” of the system (essentially, virtual dials and knobs that allow developers to fine-tune the AI’s output) can help reduce it.
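Those “dials and knobs” are concrete settings in practice. Here is a hedged example using the Hugging Face transformers library with a placeholder model; the specific values are illustrative, not anything Yseop documents. Lowering the sampling temperature and capping output length are among the simplest ways a developer narrows what a model is allowed to produce.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model, purely for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Summary of trial results:"
inputs = tokenizer(prompt, return_tensors="pt")

# Conservative decoding settings: low temperature and nucleus sampling narrow
# the model's choices, trading creativity for predictability.
outputs = model.generate(
    **inputs,
    max_new_tokens=60,
    do_sample=True,
    temperature=0.3,
    top_p=0.85,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```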
GPT-3 and similar models are capable of astounding feats of prose, sometimes even fooling experts. However, they are completely unsuitable for regulated industries where accuracy and accountability are paramount.
Why use AI
Adopting LLMs or NLP/NLG systems when the stakes are high may seem like a bad idea. In the pharmaceutical industry, for example, biases and omissions can greatly affect the accuracy of clinical reporting. And who would want to trust their financial future to a machine that hallucinates?
Luckily, companies like Yseop aren’t using open-ended datasets full of unchecked information. Admittedly, you’re unlikely to get Yseop’s pharma model to write a song or come up with a proper curry recipe (on its current dataset), but because the data and parameters that govern its output are carefully selected and scrutinized, it can be trusted on the tasks it was built for.
But the question remains: why use AI at all? So far, we’ve managed with non-automated software solutions.
Walckenaer told me we might soon run out of options. Human labor simply has not kept up with demand, at least in the pharmaceutical industry, he said.
“The need for medical writers will triple in the next ten years,” said Walckenaer, adding that Yseop’s system could deliver a 50% efficiency improvement for the industry. That’s a game changer. And there’s good news for those who fear being replaced by machines: he was adamant that Yseop’s system is meant to augment a skilled human workforce, not replace it.
In other regulated industries, such as finance and business intelligence, NLP and NLG help minimize or eliminate human error. That may not be as exciting as an LLM that can pretend to chat with you as a famous historical figure or generate fake news at the touch of a button, but it is already saving thousands of businesses time and money.