
LlamaIndex adds private data to large language models


Image Credits: Yuichiro Chino / Getty Images

Last fall, after playing around with OpenAI’s GPT-3 text-generating AI model (the predecessor to GPT-4), former Uber research scientist Jerry Liu discovered what he describes as “limitations” around the model’s ability to work with private data (e.g., personal files). To address this, he launched an open source project, LlamaIndex, designed to unlock the capabilities and use cases of large language models (LLMs) like GPT-3 and GPT-4.

“LLMs offer incredible capabilities for knowledge extraction and reasoning — they can perform question-answering, summarization and insight extraction and even sequential decision making with an external environment,” Liu told TechCrunch in an email interview. “But LLMs have limits.”

As the project grew in popularity (to the tune of 200,000 monthly downloads), Liu joined forces with Simon Suo, one of his old colleagues at Uber, to turn LlamaIndex into a fully fledged company. Today, LlamaIndex (the company) offers a framework that helps developers apply LLMs to their personal or organizational data.

“LlamaIndex [helps] developers manage their data for LLM applications,” Liu said. “Our toolkit contains the most depth in this aspect, and we make it easy to integrate with other tools the developer is using.”

Image Credits: LlamaIndex

The LlamaIndex framework allows developers to connect data from files like PDFs and PowerPoints, apps such as Notion and Slack, and databases like Postgres and MongoDB to LLMs. The framework includes connectors to ingest data sources and data formats, as well as ways to structure data so that it can be easily used with LLMs.

In addition, LlamaIndex features a data retrieval and query interface that lets developers feed in any LLM input prompt to get back — as Liu describes it — “context and knowledge-augmented” output.
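To make the idea concrete, here is a rough, hypothetical sketch of the general pattern described above: ingest private documents, index them, retrieve the most relevant one for a question and prepend it to the prompt. It is a toy illustration using only Python's standard library, not LlamaIndex's actual API; every function name and both sample documents are assumptions invented for this example, and a real system would likely use embeddings rather than word overlap.

```python
# Illustrative sketch only: a toy version of the ingest -> index -> retrieve -> prompt
# pattern. None of these names come from LlamaIndex's API.
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each (hypothetical) document id to a bag-of-words Counter."""
    return {doc_id: Counter(tokenize(text)) for doc_id, text in documents.items()}


def retrieve(index, query, top_k=1):
    """Return the ids of the top_k documents sharing the most query words."""
    query_tokens = set(tokenize(query))
    scores = {
        doc_id: sum(counts[token] for token in query_tokens)
        for doc_id, counts in index.items()
    }
    ranked = sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)
    # Drop documents that share no words with the query at all.
    return [doc_id for doc_id in ranked[:top_k] if scores[doc_id] > 0]


def augment_prompt(documents, index, question, top_k=1):
    """Prepend the retrieved private context to the question before it goes to an LLM."""
    hits = retrieve(index, question, top_k)
    context = "\n".join(documents[doc_id] for doc_id in hits)
    return f"Context:\n{context}\n\nQuestion: {question}"


# Made-up private data standing in for files, app exports or database rows.
documents = {
    "notes.txt": "The quarterly budget review is scheduled for March 14.",
    "wiki.md": "Deployment runs through the staging cluster before production.",
}
index = build_index(documents)
prompt = augment_prompt(documents, index, "When is the budget review?")
print(prompt)
```

The resulting string is what would be sent to an LLM in place of the bare question, which is one plausible reading of what “context and knowledge-augmented” means in practice.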

“There are other LLM application frameworks out there that offer basic building blocks for LLM applications and agents,” Liu said. “What’s specific to LlamaIndex is that we focus on connecting your data sources with LLMs, and we have extensive tools around data ingestion, data management and indexing and data retrieval with respect to LLM applications.”

The prospect of augmenting LLMs in this way wooed investors, who pledged $8.5 million toward LlamaIndex in a recently closed seed funding round. Greylock led with participation from angel investors, including Jack Altman, Lenny Rachitsky and Charles Xie.

So what will LlamaIndex spend the money on? Liu says that it’ll be used to build an “enterprise solution,” set to launch later this year, atop the open source LlamaIndex project. One capability will allow customers to use “protection-grade” data connectors to parse and transport large volumes of data, while another, related capability will let them index “domain-specific” data.

“LlamaIndex is not tied to a specific piece of technology, so that we can continue to be used with LLMs as the technology evolves,” Liu said. “The AI industry is moving so quickly that any initial stacks that are emerging will likely change in the course of the next few months.”
