
Viso eyes no-code for the future of computer vision and scores funding to scale


Image Credits: Viso

Computer vision has become commonplace across innumerable industries, but creating and controlling these visual AI models isn’t so easy. Viso is building a low/no-code end-to-end platform that lets companies roll their own computer vision stack, and it just pulled in $9.2M to scale up.

There are tons of computer vision models and services out there, of course, but a lot sort of fit the description of “model as API.” Say you want to do person recognition and rate whether they’re standing or sitting, so you can tell how busy a train station or restaurant is.
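To make that example concrete, here is a minimal sketch of the last step of such a "busy detector": turning per-person pose labels into a busyness figure. It is not Viso's implementation or API. The `Detection` class, the `busyness` function, and the `capacity` parameter are hypothetical names invented for illustration, and the sketch assumes some upstream model has already found each person and labeled them as standing or sitting.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One person found in a frame by some upstream model (hypothetical)."""
    pose: str  # assumed to be either "standing" or "sitting"


def busyness(detections, capacity):
    """Count people by pose and report occupancy as a fraction of capacity."""
    standing = sum(1 for d in detections if d.pose == "standing")
    sitting = sum(1 for d in detections if d.pose == "sitting")
    if capacity > 0:
        # Cap at 1.0 so an overcrowded room still reads as "full".
        occupancy = min(1.0, (standing + sitting) / capacity)
    else:
        occupancy = 0.0
    return {"standing": standing, "sitting": sitting, "occupancy": occupancy}


# Example frame: three people detected in a space that holds twelve.
frame = [Detection("standing"), Detection("sitting"), Detection("sitting")]
print(busyness(frame, capacity=12))
```

The counting is the easy part. The detection and pose models feeding it, and the infrastructure around them, are where the expertise and cost described below come in.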

There are fully formed options out there for person and pose recognition, but they may not fit your use case or security model, or they may be too expensive to scale with. Building your own is an option, but the expertise required to train and deploy modern CV models is non-trivial: unless you have the time and money to stand up a real team, it may be out of your reach.

That’s the type of situation that Viso wants to remedy, by providing a platform to create an enterprise-grade CV model of your own without dedicating the kind of time and resources that it often takes.

“Early in the adoption cycle, companies resort to buying/renting pre-made computer vision systems. However, they eventually need to bring all computer vision initiatives together (streamlining), and deeply integrate and customize them, and also ‘own’ them because the data is sensitive and the technology of strategic value. This is why companies across those industries are starting to hire AI engineers,” explained Viso’s co-founder and co-CEO, Gaudenz Boesch.

Examples of Viso-powered computer vision applications.

But unlike for many other enterprise-level needs, computer vision lacks a “specialized infrastructure” to efficiently build and deploy it.

“Companies have to build it from scratch, trying to assemble a plethora of disconnected software and hardware platforms (cameras, servers) across the organization,” he continued. This in turn requires expertise across numerous domains that quickly grows too expensive.

Viso’s approach will likely look familiar to anyone who has used no-code tools in other contexts. It amounts to a series of modules, both pre-built and customizable, that let a user select, train, and deploy computer vision models as needed.

One view of the model creation process.

Of course, you’ll still need some level of expertise – which object recognition model should it run? Where will training data be kept? How is inference handled? But a handful of engineers can do the work of far more, and all in one place rather than scattered across a dozen tools, APIs, and code notebooks.

Viso says it’s end-to-end, and that doesn’t seem to be an exaggeration. Computer vision requires data to start with, and training processes, and then implementation, hosting, compliance work, and so on — and it seems to really be a “soup to nuts” solution that puts all of that in one place:

That’s a big list!

So if you were making that “busy detector” from earlier, you could conceivably come into it with nothing but a hundred hours of footage and come out the other end a week or two later with a complete product. That would include low-level analysis and storage of the raw data, annotation and labeling, training and testing of the base model, product integration, deployment online or offline, analytics, updates and backups, as well as access and security… all without leaving Viso, and probably without touching the semicolon or bracket keys.

Though there are other computer vision platforms out there, Boesch said none were “built to manage highly complex computer vision applications at scale, and maintain them continuously,” instead being more focused on a handful of tasks from the above list. Viso aims to support as many models and methods, hardware, and use cases as possible, while ensuring the customer owns the end result.

Not being a developer myself, I can’t speak to how difficult or easy different use cases might be, but certainly there is a fundamental attraction (as evidenced by the popularity of other low-code and end-to-end tools) to using fewer and more comprehensive platforms rather than stitching together a series of disconnected ones.

Viso’s investors seem to think so, and the company has raised $9.2 million in seed-stage funding, led by Accel and with various angels participating. Interestingly, the company had been bootstrapped since it was founded in 2018 in Switzerland.

Boesch said that exploding demand caused the company to do the raise, which by AI company terms is quite modest compared with the products on offer and existing customers. He said Viso has already been adopted by several large companies, including PricewaterhouseCoopers, DHL, and Orange, and has seen 6x growth in new customers since 2022.

 
