
Salesforce is betting that its own content can bring more trust to generative AI


It has become apparent in recent weeks that generative AI has the potential to transform how we interact with software, allowing us to describe what we want instead of clicking or tapping. That shift could have a profound impact on enterprise software. At the Salesforce World Tour NYC event last week, that vision was on full display.

Consider that during the 67-minute main keynote, it took less than five minutes for Salesforce CMO Sarah Franklin to introduce the subject of ChatGPT. The company then spent the next 40 minutes and several speakers talking about generative AI and the impact it would have across the entire platform. The final speaker talked about Data Cloud, an adjacent technology. It’s fair to say that other than a few minutes of introduction, it was all the company talked about.

That included discussions of EinsteinGPT, a tool for asking questions about Salesforce content, and SlackGPT, a tool for asking Slack questions about its content. In addition, the company talked about the ability to create landing pages on the fly, write sales emails (if that’s what you want) and write Apex code (Salesforce’s programming language) to programmatically trigger certain actions in a workflow, among other things.

Generative AI wasn’t really something people were talking about until OpenAI released ChatGPT at the end of last year, and events like this take months of planning, so the company probably had to switch gears recently to focus its presentation so completely on this single subject.

Salesforce isn’t alone in its new focus on applying generative AI to its existing products and services. Over the past several months, we’ve seen many enterprise software companies announce plans to incorporate this technology into their stacks, even if overall most of these new tools are still a work in progress.

Just last week we had announcements from Zoho, Box and ServiceNow, while other companies too numerous to mention individually have made similar announcements in recent months.

A year after we saw the crypto and metaverse hype machines come crashing down, it’s fair to ask if these companies are moving too fast, chasing the next big shiny thing without considering some of the technology’s limitations, especially its well-documented hallucination problem. For this post, we are going to concentrate on Salesforce’s view of things and how it hopes to overcome some of those known issues when it comes to incorporating generative AI into the platform.

Got 99 problems, but data ain’t one

Perhaps it’s unfair to put generative AI in the same category as other hyped technologies because we are only now seeing the direct impact of this approach. It took decades of research, development and technological shifts to get us to this point, said Juan Perez, Salesforce’s CIO, who is in charge of the company’s technology strategies.

“This is different, actually. First of all, it’s more real, and AI is not new. We’ve had decades and decades of advancement in AI,” Perez said. And he pointed out that it’s not new for Salesforce, either. It introduced its AI layer, Einstein, back in 2016, and has been refining it ever since.

Perez told TechCrunch+ that he actually uses Einstein AI to help generate reports to do his work, and the developments we are seeing with generative AI will only make the process easier. “With the advances of generative AI, with the compute power, the large-scale systems that can support these large language models, the game is entirely different,” he said.

One theme that Salesforce kept coming back to at the event was trust, and the notion that building AI solutions on top of Salesforce data could help develop more trusted AI. A more trustworthy underlying dataset could in turn help limit hallucinations, where the AI doesn’t actually know with certainty what the response should be and essentially makes one up.

But the company is working hard to make sure that the AI is giving the best answers possible with the understanding that nobody can guarantee that the generative AI won’t hallucinate answers at this point, according to Silvio Savarese, the company’s EVP and chief scientist.

“Good quality data is key for generating good quality outputs. Training or fine-tuning models using curated high-quality CRM data allows you to build trusted generative capabilities. However, even with high-quality data, LLMs can still generate hallucinations,” he said. It’s important to understand that as you implement the technology at your company.

Salesforce is working to mitigate the problem on several fronts, he said. By building its own models, the company can control for some factors that can cause the model to hallucinate. “We have full control of the learning procedure … can inject additional labeling/instruction capabilities and embed constitutional AI methods to mitigate hallucinations,” he said.

In addition, training can be ongoing rather than training once and deploying, as is sometimes the case with LLMs today, he said. “This is especially vital in the world of CRM, where data is constantly changing and freshness is mission critical. By keeping LLMs trained on the most up-to-date information, a common source of mistakes can be minimized.” It’s worth noting, however, that as customers build or bring their own LLMs, Salesforce will still supply the data but have less control over how it gets incorporated, managed and used in external models.

A matter of trust

Salesforce is operating on the theory that feeding its LLMs a more constrained set of data, drawn from a source like Salesforce itself, will limit the hallucination problem. Vishal Sikka, CEO and founder at Vianai Systems, an MLOps startup, told TechCrunch+ in a recent interview that it’s imperative to solve the hallucination issue before generative AI can be used in mission-critical applications in enterprise settings.

“The first part is the safety issue because in the current state of the art, the scientists who have built this transformer technology don’t know how to make it produce good answers and not produce bad ones. They don’t know if it is even possible that it can be done,” he said.

That means that if you have a problem that requires a precise answer, you need total certainty, and we don’t have that yet.

But Ray Wang, founder and principal analyst at Constellation Research, told TechCrunch+ that there are business cases where you don’t need total accuracy to be useful.

“Generative AI ultimately requires massive amounts of data for high precision,” he said. “This requires removing false positives and false negatives with training and human augmentation. Areas where we need 100% accuracy will be hard to achieve, but if we can live with 70% or 80% accuracy, many tasks such as self-service customer care, or sales lead scoring, or campaign automation will become easier.”

Brent Hayward, CEO at Salesforce subsidiary MuleSoft, thinks that putting humans who understand the data into the process could help tell the model when it’s right and when it’s not, an approach he calls “tuning for true.” That could help correct the AI when it’s wrong and help improve models along the way.

“If the generative AI is helping create a workflow and generating code to help, the source of that code really matters,” Hayward said. “If the dataset we’ve trained the model on is all of our APIs, you can say the trust is quite high.”

He sees the possibility of developing a trust score based on where the data is coming from and how much we can rely on the answers from a given set of data, an approach he thinks will be increasingly important.

People in fact remain a key part of Salesforce’s AI vision, Savarese said. “By enabling human-in-the-loop capabilities, users can verify the quality of the output of generative AI and intervene to fix hallucinations or other factual errors. This is both a powerful safety feature and an example of our core value at Salesforce AI, which is augmenting human talent rather than attempting to replace it,” he said.

Perez anticipates that part of his job, and that of all CIOs moving forward, will be ensuring that the company’s LLMs are using trusted data. “Remember the evolution of the CIO in the areas of security and privacy. We have had to really take a much stronger stance as CIOs to ensure that security is a priority, that privacy is priority. Well, now with generative AI, I think CIOs are going to have to also be like the guards of the castle and will have to ensure that there’s trusted data in support of AI,” he said.

It’s more than hallucinations

The hallucination issue is just one of the problems associated with generative AI. Another issue will be making sure that the generative AI doesn’t supply confidential company information or other sensitive data to people who aren’t supposed to see it.

Patrick Stokes, EVP and GM of platform at Salesforce, thinks that there will be limits put on what types of data can be put in the models to prevent this from happening. “Businesses and organizations like Salesforce are going to have to start to figure out what some of those swim lanes look like,” he said.

In practice, that would mean hiding certain fields from the model if they include data you don’t want unauthorized people to see, but that’s still something that companies like Salesforce need to work out.

There’s also the issue of data ownership. For example, if you are creating a landing page on the fly, do you have permission to use the photos on that landing page (or the source of generated images)? These kinds of legal issues could slow enterprise enthusiasm for generative AI until there are clearer answers.

It’s going to be imperative to solve all of these problems, and others that are sure to arise, as we insert generative AI into more of our software. But of all the issues, limiting hallucinations is going to be paramount because everyone using the generative AI capabilities in Salesforce (and all enterprise software) is going to need to trust that the answers they are getting from the system are true and accurate and not putting the company at risk.

Salesforce is making a big bet that using its own data in LLMs will be the key to doing this. Time will tell if this is right, or at least, if it can help limit the problem.
