What is DataOps? - Gradient Flow

The rise of tools and processes to manage and control data.

By Assaf Araki and Ben Lorica.

Data has emerged as a foundational asset for all organizations. It fuels significant initiatives such as digital transformation and the adoption of analytics, machine learning, and AI. Organizations that are able to tame, manage, and unlock their data assets stand to benefit in myriad ways, including improvements to decision-making and operational efficiency, better fraud prediction and prevention, better risk management and control, and more. In addition, data products and services can often lead to new or additional revenue.

As companies increasingly depend on data to power essential products and services, they are investing in tools and processes to manage essential operations and services. In this post, we describe these tools as well as the community of practitioners using them. One sign of the growing maturity of these tools and practices is that a community of engineers and developers is beginning to coalesce around the term “DataOps” (data operations).

Our conversations with members of this nascent community revealed a few key activities associated with DataOps: automation, monitoring, and incident response. In brief, DataOps is composed of tools and processes for monitoring and automating tasks and software that raise the efficiency of operations in support of all data products and services. DataOps tools and processes allow organizations to deliver data products and services quickly, reliably, and efficiently.

The Need for DataOps

More than a decade after the rise of big data management systems, the amount of data that companies need to collect, manage, and unlock keeps growing. Both data volume and the number of data sources have exploded. The emergence of cloud computing, SaaS, mobile computing, and sensors has made operational tasks pertaining to data assets much more challenging. The types of data companies are collecting have also expanded. Machine learning tools have made it possible for companies to unlock unstructured data by incorporating new techniques from computer vision, language models, and speech technologies.

Companies are under increasing pressure to use data and machine learning to modernize their operations and decision-making and gain a competitive advantage in their markets. This means adopting tools that expand the pool of workers who use data on a regular basis beyond developers, engineers, and data scientists. Frontline workers, analysts, managers, and executives all need to incorporate data in their decision-making and operations. To raise the productivity of workers who use data, companies will need to adopt tools, such as feature stores and data catalogs, that facilitate collaboration, discovery, and reuse.

More workers and services depend on data, and these new users also expect a certain amount of reliability and freshness (near real-time updates in certain scenarios) in their data assets. As more people come to rely on and use data, companies need to adopt technologies and processes that ensure critical data pipelines and infrastructure are actively being monitored and managed. Failures are inevitable in a world of complex systems. The best companies have tools and processes in place that minimize their mean time to recovery from failures.
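To make the recovery metric concrete, here is a minimal sketch of how mean time to recovery might be computed from incident records. The `Incident` class, its field names, and the sample timestamps are hypothetical and chosen only for illustration; real incident tooling will record this differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    """Hypothetical record of one pipeline failure."""
    detected_at: datetime
    recovered_at: datetime


def mean_time_to_recovery(incidents: list[Incident]) -> timedelta:
    """Average of (recovered_at - detected_at) across all incidents."""
    if not incidents:
        raise ValueError("need at least one incident")
    total = sum((i.recovered_at - i.detected_at for i in incidents), timedelta())
    return total / len(incidents)


# Illustrative data: one 30-minute outage and one 90-minute outage.
incidents = [
    Incident(datetime(2021, 6, 1, 9, 0), datetime(2021, 6, 1, 9, 30)),
    Incident(datetime(2021, 6, 3, 14, 0), datetime(2021, 6, 3, 15, 30)),
]
mttr = mean_time_to_recovery(incidents)
print(mttr)  # 1:00:00
```

Tracking this number over time is one simple way to tell whether monitoring and incident response are actually improving.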

These challenges are occurring at a time when regulators and users are increasingly concerned with issues related to data privacy and security. Landmark privacy regulations in many jurisdictions have forced companies to improve their tools, not only for data security and privacy, but also for data retention and governance. Data teams are also increasingly under pressure to account for important concerns that fall under the umbrella of Responsible AI (aside from security and privacy, Responsible AI includes such issues as fairness and transparency). DataOps provides a formal set of processes and tools that can help detect, prevent, and mitigate many of the issues that fall under Responsible AI.

The DataOps Landscape

In this section, we organize the ecosystem of DataOps tools. First, we distinguish two major areas that make up the logical architecture of data products and services: the data plane and the control plane. The data plane includes data itself as well as tools for managing data. Data plane elements move data between systems, transform data, train machine learning models, and create artifacts used in data products and services. The control plane monitors the data plane and initiates a response to an event or a failure.
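The split between the two planes can be illustrated with a small sketch. Here the data plane is reduced to a dictionary of last-update timestamps, and a single control-plane pass checks freshness and triggers a response for anything stale. The function names, dataset names, and the one-hour threshold are all assumptions made for this example, not part of any particular tool.

```python
from datetime import datetime, timedelta
from typing import Callable


def check_freshness(last_updated: datetime, max_age: timedelta, now: datetime) -> bool:
    """Return True if the dataset was updated within max_age of now."""
    return now - last_updated <= max_age


def control_step(
    datasets: dict[str, datetime],
    max_age: timedelta,
    now: datetime,
    respond: Callable[[str], None],
) -> list[str]:
    """One control-plane pass: inspect data-plane state, respond to stale datasets."""
    stale = [
        name for name, ts in datasets.items()
        if not check_freshness(ts, max_age, now)
    ]
    for name in stale:
        respond(name)  # e.g. page someone or restart a job (hypothetical hook)
    return stale


now = datetime(2021, 6, 1, 12, 0)
datasets = {
    "orders": datetime(2021, 6, 1, 11, 45),      # 15 minutes old
    "clickstream": datetime(2021, 6, 1, 8, 0),   # 4 hours old
}
alerts: list[str] = []
stale = control_step(datasets, timedelta(hours=1), now, alerts.append)
print(stale)  # ['clickstream']
```

Passing `now` and `respond` in as arguments keeps the check easy to test; a production setup would more likely read timestamps from a metadata store and route responses to an alerting system.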

While DataOps resides within the control plane, not all control plane elements are DataOps. For example, there are elements of Infrastructure Ops in Figure 2 that include ITOps and other operational areas that are not focused on data or analytics. The data plane includes data from operational systems like customer relationship management (CRM), enterprise resource planning (ERP), and customer-facing websites. These are source systems that supply data into data warehouses, data lakes, and lakehouses. DataOps includes a Metadata stack that creates a map of organizational data flow, tracks the flow of the data, and enforces access and data quality.

Figure 2: The DataOps stack, Data and Control planes.

Another layer consists of development tools used to manage complex processes that may include multiple experiments and iterations, as well as collaboration that cuts across teams and units. The MLOps and DevOps for data systems reside in this layer. MLOps includes model management and operations for the entire model lifecycle, from data preparation to training to inference. DevOps provides a set of practices that combines software development and IT operations, and aims to shorten the development life cycle and provide continuous delivery with high software quality.

The Data Product & Services layer includes tools used to track business key performance indicators (KPIs), and machine learning products and services. MLOps is a new set of tools and processes that includes monitoring and managing models in production to identify data drifts, model accuracy decline, and adversarial attacks. A related set of tools and processes—which we refer to as “BusinessOps”—focuses on key metrics, such as revenue, cost, and other business KPIs. Examples of BusinessOps capabilities include anomaly detection, forecasting, and root cause analysis.
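As a rough illustration of the anomaly detection capability mentioned above, the following sketch flags KPI values whose z-score exceeds a threshold. This is one simple technique among many, not a description of how any specific product works; the `find_anomalies` function, the sample revenue figures, and the threshold are assumptions for the example.

```python
from statistics import mean, stdev


def find_anomalies(values: list[float], threshold: float = 3.0) -> list[int]:
    """Return the indices of values whose absolute z-score exceeds threshold."""
    if len(values) < 2:
        return []
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Illustrative daily revenue with one obvious spike at index 6.
daily_revenue = [100.0, 102.0, 98.0, 101.0, 99.0, 100.0, 250.0, 101.0]
print(find_anomalies(daily_revenue, threshold=2.0))  # [6]
```

A lower threshold is used here because, in a sample this small, the outlier itself inflates the standard deviation and keeps its own z-score below 3. Production systems typically rely on more robust methods that account for seasonality and trend.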

Figure 3 provides a partial list of companies providing solutions in the areas we listed. It describes a physical architecture and lists a representative sample of companies in each major category. We previously described the logical architecture of data products and services as being composed of a data plane and control plane. We note that there are companies that provide elements residing in both the control and data planes. For example, some extract, transform, load (ETL) companies provide tools for moving, loading, and transforming data, and also supply accompanying tools for ETL monitoring and observability.

Figure 3: Representative examples of tools, services, and companies in our DataOps stack.

Which of these areas are enterprises and companies focusing on? Based on a recent analysis of job postings for “data engineers”, beyond the usual focus areas – data pipelines, data management, data warehouses – the topic of data quality has emerged as a priority for companies hiring data engineers.

Closing Thoughts

In this post, we describe a new set of tools and processes that are aimed at helping companies manage and control their data assets and infrastructure. DataOps includes tools and processes used to automate and monitor everything that supports an organization’s data products and services. We aren’t alone in using this term: there are startups and data engineers who are beginning to coalesce around DataOps.

We attempted to offer a detailed, formal structure, while also highlighting a group of companies and projects in this exciting new area. XXOps, including DataOps, is a prime area for engineers, entrepreneurs, and investors. What makes DataOps particularly exciting is that data is used everywhere, and the opportunities to help organizations with DataOps cut across domains, users, and systems.

Related content: Other posts by Assaf Araki and Ben Lorica.

Assaf Araki is an investment manager at Intel Capital. His contributions to this post are his personal opinion and do not represent the opinion of the Intel Corporation. Intel Capital is an investor in Anodot, Hypersonix, Immuta and Verta. #IamIntel

Ben Lorica is co-chair of the Ray Summit, chair of the NLP Summit, and principal at Gradient Flow. He is an advisor to Metaphor Data and Anodot.
