Featured Article

How engineers fought the CAP theorem in the global war on latency

CockroachDB EC-1 Part 2: Technical design


Image Credits: Nigel Sussman

CockroachDB was intended to be a global database from the beginning. The founders of Cockroach Labs wanted to ensure that data written in one location would be viewable immediately in another location 10,000 miles away. The use case was simple, but the work needed to make it happen was herculean.

The company is betting the farm that it can solve one of the largest challenges for web-scale applications. The approach it’s taking is clever, but it’s a bit complicated, particularly for the non-technical reader. Given its history and engineering talent, the company is in the process of pulling it off and making a big impact on the database market, making it a technology well worth understanding. In short, there’s value in digging into the details.

In part 1 of this EC-1, I provided a general overview and a look at the origins of Cockroach Labs. In this installment, I’m going to cover the technical details of the technology with an eye to the non-technical reader. I’m going to describe the CockroachDB technology through three questions:

  1. What makes reading and writing data over a global geography so hard?
  2. How does CockroachDB address the problem?
  3. What does it all mean for those using CockroachDB?

What makes reading and writing data over a global geography so hard?

Spencer Kimball, CEO and co-founder of Cockroach Labs, describes the situation this way:

There’s lots of other stuff you need to consider when building global applications, particularly around data management. Take, for example, the question and answer website Quora. Let’s say you live in Australia. You have an account and you store the particulars of your Quora user identity on a database partition in Australia.

But when you post a question, you actually don’t want that data to just be posted in Australia. You want that data to be posted everywhere so that all the answers to all the questions are the same for everybody, anywhere. You don’t want to have a situation where you answer a question in Sydney and then you can see it in Hong Kong, but you can’t see it in the EU. When that’s the case, you end up getting different answers depending where you are. That’s a huge problem.

Reading and writing data over a global geography is challenging for pretty much the same reason that it’s faster to get a pizza delivered from across the street than from across the city. The essential constraints of time and space apply. Whether it’s digital data or a pepperoni pizza, the further away you are from the source, the longer stuff takes to get to you.

The essential objective of data architecture is bringing data as close as possible to the consuming party. One approach is just to have an offline copy on your own device. This is why Amazon downloads an e-book onto your Kindle. There’s nothing between you and the data — no network, no firewall, nothing. You can’t get much closer to the data than holding it in your hand.

While a single download works for data that doesn’t change often, like an e-book, rapidly changing data such as the price of a stock requires far more sophisticated technology. This is where databases come into play.

A database is intended to be a central location that handles massive amounts of reading and writing of data that occur quickly, often within a fraction of a second. While often only one author will write a Kindle book, a database is designed to allow millions, maybe billions, of “authors” to write data to it and then read that new data once it’s written. It’s a very powerful technology, but it’s not magical. The constraints of time and space are always present.

Image Credits: Andriy Onufriyenko / Getty Images

Let’s go back to the Quora example Kimball describes above. A New York-based user writing the answer to a question to a database hosted in NYC is going to see that answer well before users in Sydney. Depending on the condition of the network, that lag could be anywhere from half a second to a few seconds. As Kimball says, “That’s a huge problem.”

Global applications address this problem by setting up database servers all over the world. When a user makes a change in one nearby database server, it’s replicated to all the other database servers around the planet. Then when it comes time to read the data, users access the database server closest to them.

Figure 1. Under the right conditions, replicating data across the globe can allow users to read the latest write to a database. Image Credits: Bob Reselman with Bryce Durbin

Replication makes it so users don’t have to take a trip across the planet to read data, but it creates another problem: Synchronization. Again, let’s look at the Quora scenario, which is depicted in Figure 1.

Let’s say a user in NYC asks a question, “Who is the greatest guitarist of all time?” That user uploads the question to the NYC database server, and users in NYC will see the question immediately. Users in Sydney won’t see it for a second or two. Thus, the database server in Sydney is out of sync with the database server in NYC. Sydney will eventually get the new data as part of the worldwide replication process, but there will be lag. Once all the database servers catch up, if there’s no more activity against the guitarist question, things are in sync.

But then, somebody in Paris provides an answer: “Eric Clapton is the greatest guitarist of all time.” The user in Paris answers the question by making a write to a database server in Western Europe. At that point, once again, all the other replicating database servers are out of sync. People in Paris know the Eric Clapton answer, but users in NYC don’t, and neither do the users in Sydney. It’s going to take time for the replication to propagate so all the database servers know about Eric Clapton.

In a way, we’re back to where we started: The immutable law of time and space that says the closer you are to the data, the faster you can read it. Only in this case, instead of the distance being defined as the distance between user and database server, it’s the distance between the database servers themselves.

This problem of data inconsistency has been around for a long time, and in fact, it has a formal definition called the CAP theorem.

According to the CAP theorem (which stands for consistency, availability, partition tolerance), when you spread data out over many database servers or partitions (P), you must make a sacrifice: you can have fully consistent data (C) eventually, or you can have older, possibly inconsistent data available (A) immediately. What you can’t have is the most current data everywhere, immediately.

This problem is what makes reading and writing data over a global geography hard, and it’s what CockroachDB is attempting to solve.

How does CockroachDB address the problem?

Cockroach Labs staff. Image Credits: Cockroach Labs

CockroachDB addresses the problem of writing and reading data as fast as possible on a global basis by placing geographic optimization front and center in its database design while also automating a lot of the detailed work around performance. According to Andy Woods, group product manager at Cockroach Labs, “This is about making it easy for you to have reliable, low-latency applications on a global scale.”

To this end, Cockroach Labs introduces a feature in database design called multiregion database architecture. Let’s dive into the details.

Understanding multiregion databases

To handle the scale of most datasets, database technologies often segment data using a logical strategy called sharding, in which certain data is assigned to certain database servers.

Traditionally, shards are based on some standardized feature of the dataset. For instance, the full range of zip codes could be subdivided into 10 categories based on the first digit of the zip code. Each shard would be stored on a distinct computer, hard drive or even a data center, according to the given sharding logic.

The benefit of sharding is that it makes retrieving data easy. Whenever a user executes a query against data in a database, the database engine scours the database to retrieve and assemble the data according to criteria of the query. For example, “give me all the users who live in a range of zip codes between 50000 and 50009” would be directed to the database server that hosts that particular range of zip codes.

In a relational database, this means going through each table’s rows to find the data of interest. This can take a lot of time in a large database that might have millions of rows of data. However, when the rows are grouped together in close proximity on a computer’s hard disk or even within the same regional data center according to a logical rule, the amount of time to execute the query decreases, in many cases dramatically as depicted in Figure 2. Sharding data alleviates some of the burden imposed by the CAP theorem.

Figure 2. Sharding data according to region makes data retrieval faster. Image Credits: Bob Reselman with Bryce Durbin
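To make the zip code example concrete, here is a sketch of what traditional sharding can look like in PostgreSQL-style SQL. The table, column and partition names are illustrative, not taken from any real system:

```sql
-- A users table partitioned by zip code range.
CREATE TABLE users (
    id    INT,
    name  TEXT,
    zip   TEXT
) PARTITION BY RANGE (zip);

-- One partition per leading digit; the 50000-59999 range
-- could live on its own server, disk or data center.
CREATE TABLE users_zip_5 PARTITION OF users
    FOR VALUES FROM ('50000') TO ('60000');
```

A query for zip codes 50000 through 50009 would then touch only the `users_zip_5` partition, rather than scanning the whole table.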

The downside of sharding is that it takes a lot of technical know-how and ongoing administration to get it to work reliably. Doing something as simple as adding a new column to a table or changing the sharding logic are risky, time-consuming tasks that require a good deal of technical expertise. One wrong move and the company’s database can be in a world of pain.

Most sharding logic is based on attributes of the data itself, but there is a different approach: grouping data according to the physical geography of the users who access it. In a multiregion database, data is segmented and spread across a cluster of physical database servers according to a policy defined by geography. CockroachDB uses a multiregion database architecture to make much of the pain of sharding go away.

With CockroachDB, a database engineer will declare a logical database to be multiregion. Then the logical database or even individual tables will be assigned to one or many regions throughout the planet.

Figure 3 illustrates a CockroachDB database named MyDB that’s bound to database servers in the US EAST and US WEST regions. Notice that three tables in the MyDB database are bound to the US WEST region and one table is bound to the US EAST region. This is an example of using the multiregion feature to segment tables in a CockroachDB to specific geographic locations.

Figure 3. CockroachDB’s multiregion feature abstracts away data partitioning and replication according to region. Image Credits: Bob Reselman with Bryce Durbin
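In CockroachDB’s SQL dialect, a setup like the one in Figure 3 comes down to a few declarations. This is a sketch; the region names and the table name are illustrative, and actual region names depend on how a given cluster is deployed:

```sql
-- Declare MyDB to be multiregion with two regions.
ALTER DATABASE mydb SET PRIMARY REGION "us-west1";
ALTER DATABASE mydb ADD REGION "us-east1";

-- Pin an individual table to the US EAST region.
ALTER TABLE mydb.orders SET LOCALITY REGIONAL BY TABLE IN "us-east1";
```

The database engine takes it from there, placing and replicating the data without the engineer writing any sharding logic by hand.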

Using CockroachDB’s multiregion feature to segment data according to geographic proximity fulfills Cockroach Labs’ primary directive: To get data as close to the user as possible.

Understanding row-level partitioning

There’s more to this. Cockroach Labs takes the concept of geographical data segmentation even further by providing the ability to assign individual rows of a table to a region. This is called row-level geopartitioning, and in the world of database geekiness, it’s a very, very big deal. So much so, in fact, that Cockroach Labs offers it only to paying enterprise customers.

Take a look at Figure 4, which shows a simple table of customers listed by last name, first name, city, state and zip code.

Figure 4. CockroachDB allows developers to assign rows in a table to specific geographic regions. Image Credits: Bob Reselman with Bryce Durbin

Notice that the first two rows of the table have customers with zip codes that correspond to the Western United States while the last row has a customer with a zip code that corresponds to the East Coast. When CockroachDB’s row-level partitioning feature is enabled, it automatically puts the data for the West Coast customers in the US WEST region and the East Coast customers in the US EAST region. The benefit is that applications running in the West Coast will access West Coast customer data lightning fast.
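In CockroachDB SQL, switching a table over to this behavior is a single statement. The table name here is illustrative:

```sql
-- Ask CockroachDB to home each row in its own region.
-- The engine adds a hidden crdb_region column that records
-- which region each row belongs to.
ALTER TABLE customers SET LOCALITY REGIONAL BY ROW;
```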

In a way, the company is trying to defy the essential principle of the CAP theorem. Remember, in a partitioned database, you can have consistent data eventually or inconsistent data immediately, but you can’t have consistent data now. CockroachDB is cheating the theorem by focusing on the “P”: partitioning the data in a way that lets most users have both consistent and available data most, if not all, of the time.

Row-level geopartitioning is a very big deal. It’s very hard to pull off. But it is well worth the effort because it provides the benefits of fine-grain sharding without the huge operational headaches. CockroachDB wants to do all the work.

What does it all mean for those using CockroachDB?

CockroachDB has a lot to offer companies that need to manage data on a global scale. Partitioning, whether at the database, table or row level, is handled automatically by the CockroachDB server engine. It’s a powerful addition to database architecture.

But it does incur a learning curve. Developers will need to take some time to learn the enhancements that Cockroach Labs implemented in its version of SQL, the standard query language used to program databases.

CockroachDB has another feature, called a Survival Goal, that provides greater control over fault tolerance. Fault tolerance is the ability of a database to recover from system failure. When you have a worldwide database infrastructure that can span hundreds, if not thousands, of physical database servers, losing a server to machine or network failure is commonplace.

A Survival Goal is a configuration setting by which a CockroachDB database administrator (DBA) declares how the logical database will replicate data across a cluster of database servers. Survival Goals loom large in Cockroach Labs’ approach to its database architecture. Developers and DBAs will need to take the time to learn how to work with Survival Goals, as it is a concept baked into the CockroachDB way of doing things.
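In CockroachDB SQL, a Survival Goal is also set with a single statement. This is a sketch; the database name is illustrative:

```sql
-- Tolerate the loss of an availability zone within a region:
ALTER DATABASE mydb SURVIVE ZONE FAILURE;

-- Or tolerate the loss of an entire region, at the cost of
-- more replication and higher write latency:
ALTER DATABASE mydb SURVIVE REGION FAILURE;
```

The tradeoff is the CAP theorem again in miniature: surviving a bigger failure means keeping more copies of the data farther apart, which costs write speed.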

Finally, a real-world impact that companies will encounter when deciding upon CockroachDB is accepting the degree of commitment that goes with adopting the technology. Deciding on a database is a big decision. When adopted, a database typically will be with a company for years, if not decades. Once a company chooses to use CockroachDB, it can be a career-enhancing or a career-ending event for those making the decision. It all depends on the success of the adoption. Careful planning about how the database fits with the architecture of an application is critical.

While it can be used for any application, CockroachDB is best suited for enterprises that want their users across the world to have lightning-fast access to global data in an accurate, reliable manner. There are simpler databases for applications with fewer requirements and less need for performance.

All that said, CockroachDB offers something special and unique to very valuable customers, and that makes its business incredibly interesting. That business is the topic of the next part of this EC-1.


