3 questions to ask before adopting microservice architecture

Madison Friedman

Contributor

Madison Friedman is an investor intern at Vertex Ventures US and an MBA candidate at the Wharton School of Business.

As a product manager, I’m a true believer that you can solve any problem with the right product and process, even one as gnarly as the multiheaded hydra that is microservice overhead.

Working for Vertex Ventures US this summer was my chance to put this to the test. After interviewing 30+ industry experts from a diverse set of companies — Facebook, Fannie Mae, Confluent, Salesforce and more — and hosting a webinar with the co-founders of PagerDuty, LaunchDarkly and OpsLevel, we were able to answer three main questions:

  1. How do teams adopt microservices?
  2. What are the main challenges organizations face?
  3. Which strategies, processes and tools do companies use to overcome these challenges?

How do teams adopt microservices?

Out of dozens of companies we spoke with, only two had not yet started their journey to microservices, but both were actively considering it. Industry trends mirror this as well. In an O’Reilly survey of 1,500+ respondents, more than 75% had started to adopt microservices.

It’s rare for companies to start building with microservices from the ground up. Of the companies we spoke with, only one had done so. Some startups, such as LaunchDarkly, planned to build their infrastructure using microservices, but turned to a monolith once they realized the high cost of overhead.

“We were spending more time effectively building and operating a system for distributed systems versus actually building our own services so we pulled back hard,” said John Kodumal, CTO and co-founder of LaunchDarkly.

“As an example, the things we were trying to do in mesosphere, they were impossible,” he said. “We couldn’t do any logging. Zero downtime deploys were impossible. There were so many bugs in the infrastructure and we were spending so much time debugging the basic things that we weren’t building our own service.”

As a result, it’s more common for companies to start with a monolith and move to microservices to scale their infrastructure with their organization. Once they reach ~30 developers, most companies begin decentralizing control by moving to a microservice architecture.

Large companies with established monoliths are keen to move to microservices, but costs are high and the transition can take years. Atlassian’s platform infrastructure is in microservices, but legacy monoliths in Jira and Confluence persist despite ongoing decomposition efforts. Large companies often get stuck in this transition. However, a strong top-down strategy combined with bottom-up dev team support can help companies, such as Freddie Mac, make substantial progress.

Some startups, like Instacart, first shifted to a modular monolith that allows the code to reside in a single repository while beginning the process of distributing ownership of discrete code functions to relevant teams. This enables them to mitigate the overhead associated with a microservice architecture by balancing the visibility of having a centralized repository and release pipeline with the flexibility of discrete ownership over portions of the codebase.

What challenges do teams face?

Teams may take different routes to arrive at a microservice architecture, but they tend to face a common set of challenges once they get there. John Laban, CEO and co-founder of OpsLevel, which helps teams build and manage microservices, told us that “with a distributed or microservices based architecture your teams benefit from being able to move independently from each other, but there are some gotchas to look out for.”

Indeed, the O’Reilly chart shows that each of the top 10 challenges organizations face when adopting microservices is shared by 25%+ of respondents. While we discussed some of the adoption blockers above, feedback from our interviews highlighted issues around managing complexity.

The lack of a coherent definition for a service can cause teams to generate unnecessary overhead by creating too many similar services or spreading related services across different groups. One company we spoke with went down the path of decomposing their monolith and took it too far. Their service definitions were too narrow, and by the time decomposition was complete, they were left with 4,000+ microservices to manage. They then had to backtrack and consolidate down to a more manageable number.

Defining too many services creates unnecessary organizational and technical silos while increasing complexity and overhead. Logging and monitoring must be present on each service, but with ownership spread across different teams, a lack of standardized tooling can create observability headaches. It’s challenging for teams to get a single-pane-of-glass view with too many different interacting systems and services that span the entire architecture.

For example, years ago, one company had 10 different monitoring systems, three different logging systems and additional third-party vendors throwing their own data into the mix. The sprawl of observability tooling creates issues that flow downstream, making critical operations like incident response far more difficult.

“It’s important to get the number of services right,” said Andrew Miklas, co-founder of PagerDuty. “There’s a basic sanity check for the number of services in your organization. If each dev supports three services, you’re probably creating too many. On the other hand, if a full team of 10 devs supports one service, it may be time to break it apart.”
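Miklas’ rule of thumb can be written down as a quick check. The sketch below is only an illustration: the function name and the exact cutoffs (three services per dev, 10 devs per service) are assumptions taken from the numbers in the quote, not a formula PagerDuty publishes.

```python
def service_count_check(num_services, num_devs):
    """Rough sanity check on the ratio of services to developers.

    Thresholds are illustrative assumptions based on the quote above:
    three or more services per dev suggests too many services, and ten
    or more devs per service suggests a service worth breaking apart.
    """
    if num_services <= 0 or num_devs <= 0:
        raise ValueError("num_services and num_devs must be positive")
    if num_services >= 3 * num_devs:
        return "too many services"
    if num_devs >= 10 * num_services:
        return "consider splitting"
    return "ok"
```

For example, 30 services owned by 10 devs would be flagged as too many, while a single service owned by a team of 10 would be flagged as a candidate for splitting.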

Managing a microservice architecture also implies managing multiple codebases, each with its own set of dependencies. Each codebase also needs to be hooked up to its own release pipeline. As the practice of CI/CD matures and companies can deploy multiple times per day, if something breaks, it becomes increasingly difficult to determine which change from which codebase caused the issue.

Problems are plentiful, but our interviewees also found creative solutions to overcome challenges with microservices.

Which strategies and tools help companies overcome these challenges?

Speaking with developers and engineering managers helped us identify essential strategies and tools to help companies manage their microservice pain points:

1. Embracing the silos that form in your organization. The key is to make it easy for teams to break out of their silo when needed. Establishing best practices and standardized formats for cross-team contracts helps break down silos when teams need to work together. For example, Laban said he has “seen a lot of companies lay out high level guidelines and say ‘if you follow these best practices then you get a license to operate autonomously.’”

On the tooling side, Kodumal says using new technologies like feature flagging can help teams embrace siloed teams and services: “Once you can decouple deploying a change from exposing it to end users you have a greater flexibility in how you roll things out.”
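To make the idea of decoupling a deploy from a release concrete, here is a minimal sketch of a feature flag check. It is a toy in-memory store, not LaunchDarkly’s SDK; the class, the function and the flag name `new-checkout` are all hypothetical.

```python
class FlagStore:
    """In-memory stand-in for a feature flag service (hypothetical, not a vendor SDK)."""

    def __init__(self):
        self._enabled = {}  # flag name -> set of user ids the flag is on for

    def enable_for(self, flag, user_id):
        self._enabled.setdefault(flag, set()).add(user_id)

    def is_enabled(self, flag, user_id):
        return user_id in self._enabled.get(flag, set())


def render_checkout(flags, user_id):
    """Both code paths are deployed; the flag decides which one a user sees."""
    if flags.is_enabled("new-checkout", user_id):
        return "new checkout"
    return "legacy checkout"


flags = FlagStore()
flags.enable_for("new-checkout", "internal-tester")
```

With this setup, the new checkout code ships to production but only `internal-tester` sees it. Everyone else stays on the legacy path until the owning team widens the rollout.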

2. Balancing your team with generalists and specialists can also help overcome the organizational drawbacks of distributed architecture. Every team has specialists that know their codebase inside and out. But adding in generalists can help facilitate connections between teams for features that span multiple codebases, share best practices across the company and help educate the team on how all the codebases work together. Sourcing internal generalist candidates from platform teams and setting up rotational programs between groups can set up teams for long-term success.

3. Standardization helps simplify microservices. Independent service ownership gives teams the flexibility to choose the best technologies that fit their requirements, but too much autonomy breeds too much complexity.

Laban recommended that “you should only introduce new technologies if they have a large impact. Don’t look for a 10% improvement, look for a 2x improvement.” Before switching to microservices, specify a preferred set of technologies that engineering teams can standardize on.

“When making technical decisions, try to keep the needs of the broader company in mind and don’t just focus on what works best for the piece of software you’re writing today,” said Miklas. This makes it easier to hire devs, share knowledge and focus development on the most useful tech stacks.

4. The most exciting solution I’ve seen that helps increase visibility across the microservice architecture is the service catalog. The service catalog’s goal is to reduce the pain of managing the associated overhead of microservices by having one place for developers to get all the information they require about their infrastructure. From costing, to observability, to team ownership, a successful service catalog helps developers understand how infrastructure maps to their company’s organizational structure.

Companies like OpsLevel, Cortex and Effx help dev teams build better software by painting a picture of how their services fit together and how they map to their organizational structure. John Laban laid out his vision for how teams should manage their microservice architectures and the challenges that follow: “The ideal end state is where the development teams have full ownership of their software end to end both in design and operation. This works well with a distributed architecture where people get autonomy and independence and they can move a lot faster. But this independence comes with responsibility in reliability, security and compliance, which takes a lot of added effort.”

The best products do far more than just track service SLOs; they help teams collaborate to build better services. Products like OpsLevel help teams understand what language a given service is written in, who to contact, how to contact them if something goes wrong and what changes are coming down the pipeline.

These tools can also solve practical headaches such as finding and eliminating orphaned services, cataloging services for comprehensive legal and security audits, resolving incidents and driving standardization of technologies across infrastructure.

For example, let’s say your organization needs to adopt a new technology like Kubernetes or move to a newer version of a language like Java or Python. This process might sound simple when you’ve only got a few services to keep track of. But once your organization scales beyond ~30 services, it’s nearly impossible to ensure standardization across the entire stack.

With organizational silos keeping teams focused on only their specific slice of infrastructure, cross-team engineering efforts require a microservice catalog to ensure all teams are on the same page when it comes to understanding who has yet to adopt. Companies like OpsLevel keep an up-to-date list of all your services and their owners with metadata on languages and frameworks to help you reach 100% adoption.
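The kind of question a catalog answers during a migration can be sketched in a few lines. The example below is an assumption-laden toy, not OpsLevel’s data model or API: the `Service` fields, the sample services and the helper names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    owner: str               # owning team
    language: str
    language_version: tuple  # e.g. (3, 8) for Python 3.8


# Hypothetical catalog entries for illustration only.
CATALOG = [
    Service("checkout", "payments", "python", (3, 8)),
    Service("ledger", "payments", "python", (3, 12)),
    Service("recs", "discovery", "python", (3, 9)),
    Service("search", "discovery", "java", (17,)),
]


def yet_to_adopt(catalog, language, minimum_version):
    """Group services still below minimum_version by their owning team."""
    pending = {}
    for svc in catalog:
        if svc.language == language and svc.language_version < minimum_version:
            pending.setdefault(svc.owner, []).append(svc.name)
    return pending


def adoption_rate(catalog, language, minimum_version):
    """Fraction of services in the given language at or above minimum_version."""
    relevant = [s for s in catalog if s.language == language]
    if not relevant:
        return 1.0
    done = sum(1 for s in relevant if s.language_version >= minimum_version)
    return done / len(relevant)
```

Calling `yet_to_adopt(CATALOG, "python", (3, 10))` tells the migration lead which teams to follow up with, and `adoption_rate` tracks progress toward 100%.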

Another instance where microservice catalogs are a must is in incident response. Our interviews highlighted the magnitude of pain teams face when trying to resolve incidents in a microservice architecture. With services spread across so many teams and monitoring data spread across so many siloed products, it’s difficult to get context on the timeline of events leading up to an incident.

Tools like Effx embrace the service catalog to aggregate data across services and data sources to give the complete picture of an incident. For example, let’s say I’m a developer on support rotation. I can use Effx to pull in monitoring data from tools like Datadog to scan my services and identify any incidents quickly. If one pops up, I can build a timeline of events using monitoring data, past deployments, active feature flags and provisioning changes to diagnose the problem.
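A simplified version of that timeline step might look like the sketch below. It does not use Effx’s or Datadog’s APIs; the `Event` shape, the sample events and the one-hour lookback window are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    at: datetime
    source: str   # e.g. "deploy", "feature_flag", "monitoring", "provisioning"
    service: str
    summary: str


def build_timeline(events, incident_at, lookback=timedelta(hours=1)):
    """Return events from all sources in the lookback window, oldest first."""
    window_start = incident_at - lookback
    in_window = [e for e in events if window_start <= e.at <= incident_at]
    return sorted(in_window, key=lambda e: e.at)


# Hypothetical events gathered from separate tools, deliberately unordered.
INCIDENT_AT = datetime(2021, 1, 1, 12, 0)
EVENTS = [
    Event(datetime(2021, 1, 1, 11, 58), "monitoring", "checkout", "error rate alert"),
    Event(datetime(2021, 1, 1, 9, 0), "provisioning", "search", "node pool resized"),
    Event(datetime(2021, 1, 1, 11, 40), "deploy", "checkout", "release deployed"),
    Event(datetime(2021, 1, 1, 11, 55), "feature_flag", "checkout", "flag turned on"),
]
```

Here `build_timeline(EVENTS, INCIDENT_AT)` drops the morning provisioning change and orders the rest as deploy, flag change, then alert. That ordering is the context a developer on support rotation needs.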

Tying everything together

Microservices are now widely accepted as a way to help companies scale their infrastructure with their org structure.

Most teams start with a monolith to lower overhead. As the company scales to naturally develop distinct areas of focus and ownership, microservices help link the right services to the right teams.

Increasing complexity creates challenges in managing this infrastructure. Service sprawl, technical and organizational silos, and dependency management slow development and take the fun out of building software.

But development teams are problem solvers by nature, and they’ve devised strategies to overcome these issues. Standardized cross-team contracts, balancing dev teams, standardization of practices/tooling and service cataloging help teams manage the complexity.

These strategies have helped our interviewees scale their architectures to support some of today’s most popular products. We’re excited to see what new products startups come up with to help teams fully realize the power of microservices. If your company is looking for feedback on how to uplevel microservice infrastructure or if you’re working on a new product to tame microservices, leave a comment below or get in touch.

Vertex Ventures US has a financial interest in LaunchDarkly and OpsLevel.
