MLOps platform consolidation: Databricks, Snowflake AI, SageMaker
The MLOps landscape of 2023 — a dozen point tools, three feature stores, four monitoring vendors, two training frameworks — has collapsed. By 2026 the workload lives on Databricks, Snowflake Cortex, SageMaker Unified Studio, or Vertex. Here's what each platform actually does, what got swept under, and who wins which workload.
In 2023 I had a file on my laptop called mlops-vendors-to-evaluate.md. There were eighteen names in it. Twelve have since been acquired, shut down, or quietly pivoted into a new product category. Of the remaining six, four are open-source projects whose commercial sponsors are looking for an exit, and two are still independent companies, for now.
That graveyard is the story of the MLOps consolidation. The bet you made on a standalone feature store in 2022 has been quietly liquidated. The “unified observability for ML” platform you picked is now either a Datadog acquisition or competing with the data platform you already pay for. The training-infra vendor you signed a contract with was bought by Databricks or partnered into irrelevance.
By mid-2026, the picture has crystallised. Most enterprise ML and AI workloads run on one of four platforms — Databricks, Snowflake Cortex, AWS SageMaker, Google Vertex AI — and the point-tool world that surrounded them has been mostly absorbed. This post is about what that consolidation looks like, what each platform actually does in 2026, and what got swept under.
What “MLOps platform” means now
A few years ago the MLOps platform was a stack you assembled. Notebook service, training orchestrator, model registry, feature store, serving endpoint, monitoring, governance: each came from a different vendor, and you glued them together with custom code. The promise of MLOps platforms was that one team would own the whole stack, and you’d pay one vendor.
The promise has been mostly delivered. The 2026 definition of an MLOps platform is roughly:
Every box in that diagram was a standalone vendor category in 2022. By 2026, almost all of them ship as features of the data platform — and the ones that don’t ship cleanly (the agents/orchestration row, the monitoring row) are the categories where the next wave of point tools will eventually be absorbed.
Databricks — what MosaicML and Tecton actually bought
Databricks’ acquisition strategy is the most aggressive on the list, and the most coherent. Two acquisitions defined the platform:
The $1.3 billion MosaicML acquisition closed in July 2023. At the time it looked like Databricks paying a premium for a generative-AI startup; in retrospect it bought them the training and serving infrastructure for foundation models without having to build it. The MosaicML team became the core of Mosaic AI — which by mid-2026 includes Agent Bricks, Vector Search, Model Serving, and the GPU-serverless compute fabric that lets Databricks customers train and serve LLMs without leaving the platform.
The August 2025 Tecton acquisition closed the real-time data loop. Tecton was the leading standalone feature store, with a credible AI-context product layered on top, and Databricks absorbed it into the broader Mosaic AI platform with sub-10ms latency serving for both classical-ML features and agent context. In one move, Databricks both eliminated its strongest standalone-feature-store competitor and added the real-time serving capability that its lakehouse architecture had previously lacked.
What you get on Databricks today: Unity Catalog as the unifying governance layer, Delta and Iceberg interoperability for the storage layer, Mosaic AI for training and serving, MLflow for experimentation and registry, Vector Search for retrieval, the Tecton-derived feature platform for real-time, and Agent Bricks for building and deploying agents. It is the most complete single-platform story for combined ML + AI workloads in 2026. The cost of that completeness is that you become Databricks-deep — your training, your serving, your governance, your feature store all live in one vendor’s ecosystem.
Snowflake Cortex — the warehouse goes agentic
Snowflake’s approach is the inverse of Databricks’. Where Databricks acquired, Snowflake built. Snowflake Cortex AI is a suite of LLM and ML capabilities native to the warehouse. It includes Snowflake’s own Arctic models, third-party endpoints (Anthropic, Meta, Mistral, and Google), Cortex Search for unstructured retrieval, Cortex Analyst for natural-language-to-SQL, and Cortex Agents, which orchestrate across both structured (via Analyst) and unstructured (via Search) sources.
The defining moment for Snowflake’s AI story was Cortex Agents reaching general availability on November 4, 2025. That GA matters because it crossed the threshold from “Snowflake can do LLM things in a warehouse” to “Snowflake has a credible agent platform without you having to leave the data.”
The architectural argument is sharper than it sounds. If your data lives in Snowflake, as it does for many enterprises, then running the agent inside the same security and governance perimeter is operationally simpler than shipping the data to a separate AI platform. You inherit row-level security, masking policies, the data catalog, and the audit log automatically. The agent’s SQL queries can only see what the calling user can see. The compliance review is roughly the same one your team has already passed for the warehouse.
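To make the perimeter argument concrete, here is a toy Python model of it. This is a sketch under stated assumptions, not Snowflake’s API: `ROWS`, `ROW_POLICY`, `Session`, `run_query`, and `agent_answer` are all hypothetical names, and the policy is a plain dictionary rather than a real row access policy. The only point it illustrates is that the agent has no data path of its own. It goes through the same policy-filtered query function as any other caller, under the identity of the user who invoked it.

```python
from dataclasses import dataclass

# Hypothetical warehouse table: each row carries the region it belongs to.
ROWS = [
    {"region": "EMEA", "account": "acme", "revenue": 120},
    {"region": "EMEA", "account": "globex", "revenue": 80},
    {"region": "APAC", "account": "initech", "revenue": 200},
]

# Hypothetical row-level policy: role -> regions that role may read.
ROW_POLICY = {
    "emea_analyst": {"EMEA"},
    "global_admin": {"EMEA", "APAC"},
}


@dataclass(frozen=True)
class Session:
    user: str
    role: str


def run_query(session: Session) -> list[dict]:
    """Return only the rows the session's role is allowed to see."""
    allowed = ROW_POLICY.get(session.role, set())  # unknown role sees nothing
    return [row for row in ROWS if row["region"] in allowed]


def agent_answer(session: Session, question: str) -> str:
    """Toy agent: answers a revenue question using the caller's own session."""
    rows = run_query(session)  # same filtered path as a direct query
    total = sum(row["revenue"] for row in rows)
    return f"{question} -> total revenue visible to {session.user}: {total}"


if __name__ == "__main__":
    print(agent_answer(Session("dana", "emea_analyst"), "Revenue?"))
    print(agent_answer(Session("root", "global_admin"), "Revenue?"))
```

The design choice worth noticing is that `agent_answer` takes a `Session` rather than holding credentials of its own. If the agent ran under a separate service identity, the policy would have to be re-implemented and re-reviewed for that identity.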
What Snowflake gives up: depth in classical ML workflows. Cortex’s ML story is real but younger than Databricks’. Teams that came from gradient-boosting-and-feature-engineering pipelines (see the previous post in this series) are still mostly happier on Databricks. Teams whose ML problem is fundamentally about extracting structured insight from already-clean warehouse data and exposing it through an LLM interface have found Cortex to be the right shape.
SageMaker Unified Studio — AWS catches up
AWS rebuilt SageMaker at re:Invent 2024, launching SageMaker Unified Studio in preview and pulling EMR, Glue, Redshift, Bedrock, and the existing SageMaker Studio into one product. The supporting components — SageMaker Lakehouse for unified storage across S3, Redshift, and federated sources; SageMaker Catalog built on DataZone for governance; SageMaker Data and AI Governance for policy enforcement — closed the gaps that had made SageMaker feel like a constellation of disconnected services.
This was AWS’s clearest acknowledgement that the standalone-service era was ending. For years SageMaker was a loose federation: SageMaker Notebooks, SageMaker Training, SageMaker Inference, SageMaker Feature Store, SageMaker Model Monitor, each evolving on its own roadmap with its own console. The Unified Studio is the consolidation pass — one workbench, shared metadata, a single permission model.
The honest read in mid-2026: SageMaker Unified Studio is now competitive on feature completeness with Databricks and Snowflake for AWS-native shops. It still lags on the AI-agent layer (Bedrock provides the foundation models and a credible agent framework, but the integration is less polished than Databricks’ Mosaic AI or Snowflake’s Cortex Agents). For AWS-first organisations, it is now a reasonable default; for multi-cloud organisations, it remains harder to argue for than the data-warehouse-anchored alternatives.
Vertex AI — Google’s quiet completeness
Google’s Vertex AI covers the same surface area as the others: notebooks, training, serving, model garden, feature store, monitoring, agents, governance. Vertex’s defining advantage is its tight integration with BigQuery. For organisations that have standardised on BigQuery, Vertex’s ML and AI workloads feel native to the data in a way that the other clouds don’t quite match for their own warehouses.
The defining disadvantage is GCP’s smaller share of enterprise data gravity. Vertex’s feature set is on par with the rest; the customer base that needs it is smaller. In practice, Vertex shows up in GCP-first organisations (media, gaming, some retail) and as the second cloud for enterprises whose primary is AWS or Azure.
What got swept under
The interesting story isn’t who survived. It’s who didn’t.
Standalone feature stores. Tecton sold to Databricks. Feast remains open source but has no commercial sponsor at scale. Hopsworks is the last credible independent — and is repositioning as an AI lakehouse, not a feature store. SageMaker and Vertex ship native feature stores; Snowflake exposes feature serving through Cortex.
Standalone monitoring vendors. WhyLabs, Fiddler, Truera — all shrunk or pivoted. The general-purpose APM vendors (Datadog, New Relic) absorbed the LLM and model monitoring use case. Arize survives by being deepest on the ML-evaluation heritage and pivoting cleanly into LLM observability. The standalone-ML-monitoring category as it existed in 2022 is largely gone.
Standalone training infrastructure. MosaicML went to Databricks. Foundry, Lambda Labs, CoreWeave, and the other GPU-as-a-service vendors still exist, but as compute providers, not as MLOps platforms. The “we’ll train your foundation model for you” startup category effectively became a feature of the cloud platforms.
Pipeline orchestrators. Kubeflow, Metaflow, Argo Workflows — the open-source pipeline tools still have users, but the commercial “managed Kubeflow” pitch is dead. Airflow remains as the data-engineering orchestrator; ML pipelines mostly migrated to the platform-native workflow tools.
What survived as standalone: dbt for data transformation; Airflow for orchestration; Hugging Face as the model repository (though that’s a different category); Anyscale/Ray for distributed Python; the LLM-observability vendors (LangSmith, Langfuse, Helicone, Braintrust) — though that’s a younger category. Notice these are all either adjacent-to-MLOps or unambiguously open-source-flagship plays.
Who wins which workload
A simplified procurement guide for 2026, derived from watching dozens of teams make this decision:
- Classical ML + LLMs in one place. Databricks. The Mosaic AI + Tecton + MLflow + Unity Catalog combination is the most complete story, and it works for teams that have both gradient-boosting pipelines and agentic AI workloads.
- Already deep on the warehouse, want AI without moving the data. Snowflake Cortex. The agent platform inside the same security perimeter as the warehouse is the right shape for teams whose data gravity is in Snowflake.
- AWS-native organisation, want the cloud-native default. SageMaker Unified Studio. Bedrock for foundation models, SageMaker for ML, Unified Studio as the workbench.
- GCP-native organisation, BigQuery is the data centre of gravity. Vertex AI. The integration with BigQuery is the differentiator.
- You want to stay open-source-flexible at all costs. Open-source MLflow + Kubeflow + Feast + Triton, run on Kubernetes. This stack works, requires real platform-engineering investment, and is what a lot of the most sophisticated tech-company ML teams actually run. It’s not a “vendor choice”; it’s a “we’ll build it” choice.
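The guide above can be read as a small decision function. The sketch below is one way to encode it, under assumptions that are mine rather than the guide’s: the `Team` fields and the `recommend` function are hypothetical, and so is the order of the checks, since the bullets list the cases without ranking them. Data gravity is checked before cloud preference here because the rest of this post argues that ML and AI follow the data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Team:
    """Hypothetical team profile; every field is a simplification."""
    wants_open_source_only: bool = False
    data_gravity: str = ""        # e.g. "snowflake", "bigquery", "databricks"
    primary_cloud: str = ""       # e.g. "aws", "gcp"
    classical_ml_and_llms: bool = False


def recommend(team: Team) -> str:
    """Return the guide's suggested default for a team profile.

    The order of the checks is an assumption: the guide lists the cases
    but does not rank them. Data gravity is checked before cloud choice.
    """
    if team.wants_open_source_only:
        return "Open-source MLflow + Kubeflow + Feast + Triton on Kubernetes"
    if team.data_gravity == "snowflake":
        return "Snowflake Cortex"
    if team.data_gravity == "bigquery" or team.primary_cloud == "gcp":
        return "Vertex AI"
    if team.classical_ml_and_llms or team.data_gravity == "databricks":
        return "Databricks"
    if team.primary_cloud == "aws":
        return "SageMaker Unified Studio"
    return "No clear default; decide on data and governance fit"


if __name__ == "__main__":
    print(recommend(Team(data_gravity="snowflake")))
    print(recommend(Team(primary_cloud="aws", classical_ml_and_llms=True)))
```

Real procurement decisions weigh more than four fields, so treat this as a way to make the precedence between the bullets explicit, not as a tool to run against your organisation.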
Anti-patterns worth flagging
A few patterns I’d skip if I were re-doing 2024:
Picking a platform because the LLM endpoints are slightly cheaper. The LLM API gateway is the most commoditised part of the stack. Every platform integrates Anthropic, OpenAI, Meta, Mistral, and the open-weight models. Don’t pick your platform on the LLM line item; pick it on the data and governance layer, where the lock-in actually lives.
Avoiding lock-in by stitching together six point tools. This was the 2023 conventional wisdom. In 2024 and 2025 it cost teams enormous platform-engineering effort, much of it spent integrating vendors who were quietly going out of business. Lock-in is real, but it’s not the only risk; vendor fragility is the other. The platforms have won partly because they’re the ones still around.
Treating “MLOps” as separate from “data platform.” The two categories merged. The MLOps decision and the data platform decision are now the same decision. If your data is in Snowflake, your ML and AI mostly live in Cortex. If your data is in Databricks, ML and AI live in Mosaic AI. Trying to separate them is fighting the gravity of where the data sits.
What to take away
- The MLOps platform decision is now a data platform decision. Databricks, Snowflake, SageMaker, Vertex — pick where your data lives, and ML/AI follows.
- The point-tool era is over. Standalone feature stores, monitoring vendors, and training-infra vendors have been acquired or commoditised. The exceptions (dbt, Airflow, the younger LLM-observability vendors) are special cases.
- LLM endpoints are the cheapest part of the platform decision. The lock-in lives in the data, governance, and feature-serving layers. Pick on those, not on the foundation-model line item.
- The next consolidation wave is the agent/orchestration layer. It’s the one row in the platform diagram where point tools (LangChain, LlamaIndex, the agent frameworks) still operate independently. Expect acquisitions there over 2026-2027.
The MLOps consolidation isn’t a sign that the field has stopped innovating. It’s a sign that the field has stopped fragmenting. The innovation has moved up the stack — to agents, to evals, to RAG infra, to the new feature-store renaissance — and the platforms have absorbed the layers below. That pattern is the same one that happened to container orchestrators in 2018, to data warehouses in 2020, and to APM in 2015. It tends to be a sign of a category maturing, not stalling. The bet, in 2026, is on the platforms.
Further reading: AWS’s own next-gen SageMaker announcement is the cleanest read on the consolidation play. Databricks’ Tecton acquisition post is worth reading alongside the MosaicML completion announcement to see the strategy in two beats. Sanjeev Mohan’s re:Invent 2024 recap is the best independent analysis of the AWS half of the story.