Meet the Private Companies That Have Reached the $100m Revenue Milestone in the Data Engineering and AI Space.
By Ben Lorica and Kenn So.
In 2022, we published our first annual list of data and AI pegacorns – private companies that have reached the $100m revenue milestone. The selection criteria for data pegacorns focused on data engineering and management, while for AI companies, the emphasis was on the use of deep learning to drive the key features of their products. The $100m benchmark was chosen as a measure of a company’s success, rather than relying on venture funding valuations, which have fluctuated significantly in the past 12 months. As we enter 2023, the secondary market has seen valuations of some unicorns tumble by 40-60%, according to PitchBook.
In this post, we welcome new members to the pegacorn list and highlight other promising emerging companies.
Algolia (AI infrastructure) is a search and discovery platform for websites and mobile applications. It provides search, discovery, and recommendation services through APIs to help developers create fast and relevant user experiences.
- Algolia offers search functionality that would be costly for most companies to develop on their own. This lets Algolia fill a gap in the market by serving companies that lack the resources to build their own search engines. Algolia’s APIs are flexible and developer-friendly, and the platform can handle massive amounts of data. That said, Algolia faces competition from a new wave of startups focused on modern tools including semantic search, embeddings, vector databases, and chatbots backed by large language models.
Sigma Computing (Data application) is a cloud-based spreadsheet that enables businesses to analyze, visualize, and share big data in real time. Because most business users know how to use spreadsheets, Sigma makes data accessible.
- As the data landscape shifts towards centralization in data warehouses and lakehouses, many business users find themselves ill-equipped to analyze the data stored within these systems. This is due to the technical proficiency required to effectively use SQL and Python. Sigma Computing provides a more familiar and intuitive interface for data access – the spreadsheet.
Anduril (AI application) is a defense technology company that develops artificial intelligence and robotics systems for use in defense applications. Its technology provides situational awareness and decision-making capabilities for military and security operations in a variety of contexts, including border security and surveillance.
- Many SF Bay Area-based technology companies are reluctant to work with governments and military organizations. Anduril has successfully carved out a niche by targeting this underserved market. The US government spends over $800 billion each year on military purposes. In its role as a prime contractor, Anduril exclusively serves military clients. In contrast to major defense companies, Anduril assumes complete responsibility for all risks associated with research and development. Anduril takes a proactive approach to product development by designing and producing items that it believes will meet the needs of defense departments, rather than relying solely on government contracts. By using an iterative manufacturing process and a streamlined supply chain, the company is able to produce its products effectively and economically.
At-Bay (AI application) is an insurance company that provides coverage against cyber risks. The company covers a variety of cyber threats, including data breaches, cyber attacks, and business interruption caused by cyber incidents.
- Companies are becoming increasingly reliant on technology, and this reliance has exposed them to a growing array of cyber risks. Traditional insurance providers have been slow to respond to this threat, leaving many businesses uninsured for cyber incidents. At-Bay emerged as a pioneer by addressing this market gap. By building cybersecurity capabilities into its insurance policies, the startup made a significant advance in cyber insurance, enabling its clients to reduce their risk levels proactively.
Focusing on the application layer, targeting specific personas and pain points, and setting up feedback loops are key to success
Nearing Pegacorn status
Jasper (AI application) is a content creation platform that assists marketers in producing content for a variety of mediums, ranging from Facebook advertisements to blog posts. The AI model at the core of this technology can generate extensive written works, far beyond mere sentence completions or brief headlines. The primary tool offered is a document editor that uses AI to augment the copywriting process. Additionally, the platform offers a range of templates to guide marketers.
- Businesses are becoming increasingly reliant on search engine optimization (SEO) and social media to drive online visibility and engagement. This has led to demand for content marketing solutions that can help organizations rank higher in Google’s search results and generate more social media clicks. To meet this need, Jasper built a text editor powered by fine-tuned AI models to help marketers create more engaging content. By building for the content marketer persona, Jasper quickly gained widespread adoption within that group. While there are other AI startups focused on similar user groups, Jasper prioritizes feedback loops to continually refine its language models for optimal “customer specificity”.
BigPanda (AI application) transforms IT data into actionable intelligence, enabling incident response teams to increase uptime and efficiency. The product correlates IT events and automates the steps needed to resolve incidents.
- The fragmentation of enterprise infrastructure across multiple clouds, each utilizing specialized tools for specific functions, has led to a proliferation of specialized monitoring solutions. This software sprawl poses a challenging coordination problem for centralized teams such as IT. BigPanda helps by ingesting all the data and extracting signal from the noise.
New pegacorn companies continue to emerge, many of which operate in the application layer. This trend is not surprising: compared to the infrastructure layer, the application layer offers a wider range of personas to target, which vary by function, industry, and geography. The pain point at the infrastructure level tends to be similar across companies, leading to a concentration of larger vendors.
The proliferation of foundation models also opens up opportunities
Our research also identified a growing number of companies utilizing foundation models. Jasper, for example, has already achieved $75M in annual recurring revenue. The question remains as to where value will be generated in this market – will it be the handful of research labs offering APIs as a service or the applications built on top of them? We believe the latter will prove to be more lucrative.
The emergence of new pegacorn companies brings with it both risks and opportunities. One key risk is the increasing commoditization of products as a result of the availability of pre-trained models and open-source tools, which make it easy to create similar products with simple user interfaces. To overcome this, companies must differentiate themselves by offering a deeper product suite tailored to specific personas or even specific users. On the other hand, the proliferation of foundation models and decentralized custom models opens up opportunities for tooling companies to provide solutions for building, fine-tuning, and optimizing models, as seen in projects like LangChain and GPT Index.
Kenn So writes about the most consequential AI trends and companies via Quild. He also works at Smartsheet but his writing reflects his personal views only.
Ben Lorica helps organize the Data+AI Summit and the Ray Summit, is co-chair of the NLP Summit, and principal at Gradient Flow. He is an advisor to Databricks, Anyscale, and other startups.