Let’s start with a clarification: this article is not about picking sides. The major cloud providers (AWS, Azure, Google Cloud) offer outstanding services, and Witboost runs beautifully on all of them. We work with customers using every major hyperscaler, and we have no intention of changing that.
This article is a practical walkthrough for organisations that want to understand what a fully EU-native data platform looks like in practice. We’ll walk through a reference architecture that combines three European technologies: Scality, Stackable, and Witboost. This combination acts as an end-to-end stack for data products and data contracts. Not as the only way, but as one credible, production-ready way for those who have this specific requirement.
We’ve noticed something in the conversations we’ve been having with CDOs and Heads of Data across Europe over the past 18 months. A question that used to come from public-sector procurement offices is now landing in the boardrooms of private enterprises: "Can we run our entire data platform on infrastructure that is fully European-controlled?"
The drivers are varied. For some, it’s regulatory: NIS2, DORA, the AI Act, and the ongoing debate around the European Cybersecurity Certification Scheme (EUCS) are raising questions about jurisdictional control that didn’t exist five years ago.
For others, it’s about risk management in a geopolitical landscape that has become harder to predict. And for a growing number, it’s simply about having a credible option on the table, not as a replacement for hyperscaler services, but as a well-understood alternative that can be activated if circumstances change.
In 2024, data sovereignty was mostly a concern for defence contractors and government agencies. Today, it’s a boardroom topic at banks, utilities, telcos, and manufacturing groups across Europe. What changed?
Several regulatory and geopolitical shifts converged:
| Driver | Implementation date | What It Means for Data Platforms |
| --- | --- | --- |
| NIS2 Directive | October 2024 | Extended cybersecurity obligations to a wider set of “essential” and “important” entities. Supply chain risk assessment now explicitly includes cloud service dependencies. |
| DORA | January 2025 | Financial entities must demonstrate operational resilience of their ICT supply chain, including concentration risk on non-EU providers. |
| AI Act | Phased 2025-2026 | High-risk AI systems require transparency and auditability of data pipelines. Jurisdictional clarity of data processing is becoming a compliance accelerator. |
| EUCS debate (ongoing) | Ongoing | The European Cybersecurity Certification Scheme initially included a “sovereignty” tier requiring EU jurisdiction. The debate continues and signals the direction of travel. |
| Geopolitical dynamics | Ongoing | Shifting policies, supply chain bottlenecks due to regional conflict, and tariff discussions have increased awareness of dependency risks in technology stacks, even where no regulation mandates change. |
None of these drivers, on their own, mandate that European enterprises abandon hyperscaler cloud services. Most organisations will, and should, continue to use them where they provide the best fit. But taken together, they create a strategic imperative for boards and CDOs: know your options.
Understand what a European-controlled alternative looks like, how it performs, and how quickly you could activate it if the regulatory or geopolitical landscape shifts further.
We’ve had customers who started exploring this question purely as a risk management exercise and ended up discovering that a sovereign stack gave them unexpected advantages. Just to name a few:
One of the traps in the sovereignty conversation is treating it as binary: either you’re on a hyperscaler or you’re sovereign. Reality is more nuanced. Data residency (where data is stored) is not the same as data sovereignty (who controls the infrastructure, the software, and the operational processes around it).
We help our customers with sovereignty requirements think in three layers that need to be addressed independently:

| Layer | What It Covers | Sovereignty Question |
| --- | --- | --- |
| Infrastructure | Compute, storage, networking | Is the hardware in an EU data centre, operated by an EU-headquartered entity, under EU legal jurisdiction? |
| Data Platform | Processing engines, query engines, orchestration, data formats, streaming | Is the software open-source or EU-controlled? Are there dependencies on non-EU SaaS services for core functionality? |
| Governance & Lifecycle | Data product management, data contracts, metadata, access control, change management | Does the governance layer impose technology choices, or does it work across any infrastructure? |
A common pattern we see is organisations that solve the first layer by putting their data in an EU data centre, but overlook the other two. They run proprietary SaaS processing tools that route control plane traffic through non-EU jurisdictions. Or they adopt a governance tool that is tightly coupled to a specific cloud provider, making portability difficult.
True architectural sovereignty means addressing all three layers. And critically, it means doing so without sacrificing the governance and lifecycle management capabilities that make data products operationally viable. Sovereignty without governance is just ungoverned data sitting in a European data centre. It solves a compliance checkbox but not the actual business problem.
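To make the three-layer view concrete, here is a minimal sketch of how a platform team might record its components and surface the layers that still have gaps. Everything in it (the `Component` fields, the `sovereignty_gaps` function, the example component names) is a hypothetical illustration, not part of Witboost or any other product, and the two yes/no flags are a deliberate simplification of the questions in the table above.

```python
from dataclasses import dataclass
from enum import Enum


class Layer(Enum):
    INFRASTRUCTURE = "Infrastructure"
    DATA_PLATFORM = "Data Platform"
    GOVERNANCE = "Governance & Lifecycle"


@dataclass(frozen=True)
class Component:
    name: str
    layer: Layer
    eu_controlled: bool         # EU-headquartered operator, or open-source software you run yourself
    non_eu_control_plane: bool  # core functionality depends on a non-EU SaaS control plane


def sovereignty_gaps(components):
    """Return, per layer, the reasons that layer does not yet meet the bar."""
    gaps = {layer: [] for layer in Layer}
    # A layer nobody has assessed is itself a gap.
    for layer in Layer:
        if not any(c.layer is layer for c in components):
            gaps[layer].append("no component assessed")
    for c in components:
        if not c.eu_controlled:
            gaps[c.layer].append(f"{c.name}: not EU-controlled or open-source")
        if c.non_eu_control_plane:
            gaps[c.layer].append(f"{c.name}: depends on a non-EU control plane")
    # Keep only the layers that actually have findings.
    return {layer: reasons for layer, reasons in gaps.items() if reasons}


# Hypothetical stack: storage is sorted out, the other two layers are not.
stack = [
    Component("eu-object-store", Layer.INFRASTRUCTURE, True, False),
    Component("saas-etl-tool", Layer.DATA_PLATFORM, False, True),
]
for layer, reasons in sovereignty_gaps(stack).items():
    print(layer.value, "->", reasons)
```

The example stack reproduces the pattern described above: the infrastructure layer passes, while the data platform and governance layers are flagged.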
What follows is a concrete, production-tested reference architecture for organisations that want to run data products and data contracts end-to-end on an EU-native stack. Each component is European-headquartered, open-source or open-core, and independently replaceable. All these components can run in private and air-gapped environments.
The stack is organised in three tiers, matching the three sovereignty layers we’ve outlined:
Scality is a French company that provides enterprise-grade object storage, deployed on-premises or in sovereign cloud environments. Its S3-compatible API means that any application written for cloud object storage works without modification.
Scality replaces the role that S3, Azure Blob Storage, or Google Cloud Storage would play in a hyperscaler deployment but with full EU jurisdictional control and on-premises flexibility.
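In practice, "works without modification" usually comes down to pointing an existing S3 client at a different endpoint. The sketch below assumes the widely used `boto3` library and a hypothetical endpoint URL and credentials; the region value is a placeholder, so check what your own deployment expects. The `boto3` import is deferred so the helper that builds the settings can be used and tested without it.

```python
from urllib.parse import urlparse


def s3_client_kwargs(endpoint_url, access_key, secret_key, region="us-east-1"):
    """Build the keyword arguments for boto3.client("s3", ...).

    The only difference from a hyperscaler setup is the explicit endpoint_url.
    """
    parsed = urlparse(endpoint_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not a usable endpoint URL: {endpoint_url!r}")
    return {
        "endpoint_url": endpoint_url,
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
        "region_name": region,  # placeholder; assumption, not a Scality requirement
    }


def make_s3_client(**kwargs):
    """Create the client. boto3 is a third-party package, imported lazily."""
    import boto3
    return boto3.client("s3", **kwargs)


# Hypothetical usage (endpoint and keys are made up):
# client = make_s3_client(**s3_client_kwargs(
#     "https://s3.storage.example.internal", "ACCESS_KEY", "SECRET_KEY"))
# client.list_buckets()
```

Application code that calls `client.put_object(...)` or `client.get_object(...)` stays the same; only the connection settings change.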
Stackable is a German company that provides a modular, Kubernetes-native data platform built entirely on open-source components: Apache Spark, Trino, Apache NiFi, Apache Kafka, Apache Airflow, Apache Hive, and others.
Stackable replaces the managed data services that a hyperscaler would provide (e.g., EMR, Dataproc, Synapse), but with fully open-source components that you operate on your own terms.
Witboost is the governance and data product management layer. This is where data products are bootstrapped, data contracts are defined and enforced, metadata is curated, and the entire lifecycle (from creation to retirement) is managed.
The critical point is that Witboost’s governance and lifecycle management capabilities are identical whether you run on a hyperscaler or on this EU-native stack. You don’t lose any functionality by choosing a sovereign deployment. Your data contracts, data products, policies, metadata, and lifecycle processes are portable across any infrastructure.
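One way to picture that portability is a contract check that only sees rows, never the storage or engine they came from. The sketch below is a deliberately simplified, hypothetical model: `DataContract` and `check_contract` are illustrative names and do not reflect Witboost's actual contract format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataContract:
    product: str
    required_columns: dict    # column name -> expected Python type
    max_null_fraction: float  # tolerated share of missing values per column


def check_contract(contract, rows):
    """Return a list of violations; an empty list means the rows comply."""
    violations = []
    for column, expected_type in contract.required_columns.items():
        values = [row.get(column) for row in rows]
        nulls = sum(v is None for v in values)
        if rows and nulls / len(rows) > contract.max_null_fraction:
            violations.append(f"{column}: too many nulls ({nulls}/{len(rows)})")
        if any(v is not None and not isinstance(v, expected_type) for v in values):
            violations.append(f"{column}: expected {expected_type.__name__}")
    return violations


# Hypothetical contract and sample rows.
contract = DataContract("orders", {"order_id": int, "amount": float}, 0.0)
rows = [
    {"order_id": 1, "amount": 9.5},
    {"order_id": 2, "amount": None},
]
print(check_contract(contract, rows))
```

Because the check takes plain rows, the same contract definition applies whether those rows were read from an on-premises object store or from a hyperscaler bucket.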
We believe in honest assessments. A sovereign EU-native stack is not a free lunch. Here’s what you gain and what you’re trading off:
| Dimension | What You Gain | What Requires More Effort |
| --- | --- | --- |
| Jurisdictional control | Full EU control over data, infrastructure, and software. No non-EU entity can access your data by legal compulsion. | You need to manage your own infrastructure or work with an EU hosting partner (e.g., IONOS, OVHcloud, Hetzner). |
| Vendor independence | Fully open-source data platform components. No proprietary lock-in at any layer. Every component is replaceable. | You lose the convenience of fully managed services. Your platform team takes on operational responsibility. |
| Regulatory readiness | Clean compliance story for NIS2, DORA, AI Act. No concentration risk on a single non-EU provider. | You still need to do the regulatory work: the stack provides the foundation, not automatic compliance. |
| Pricing predictability | No surprise egress fees, no opaque pricing tiers. Infrastructure cost is fully within your control. | You need capacity planning skills. There’s no elastic auto-scaling managed by someone else. |
| Governance parity | Identical Witboost governance capabilities as on any hyperscaler. No feature gaps. | The initial integration between Witboost, Stackable, and Scality requires platform team investment, but our starter kit covers it. |
The truth is that this architecture is best suited for organisations that already have (or are willing to build) a capable platform team. The operational model is closer to what you’d expect from a self-hosted Kubernetes environment than from a managed cloud service. For organisations with the right skills, this is a feature, not a bug: it gives you complete control and eliminates the “someone else’s computer” risk factor.
For organisations that prefer fully managed services and don’t have sovereignty as a hard requirement, the hyperscaler path remains excellent. Witboost supports both equally well.
This is the power of technology agnosticism: the same governance layer, the same data contracts, the same lifecycle management, regardless of what runs underneath.
If there’s one message we want to leave you with, it’s this: data sovereignty is an architectural requirement, not a political statement.
The organisations we see making the best decisions are those that approach sovereignty as a dimension of their platform architecture — like scalability, security, or cost efficiency. They don’t start from ideology (“we must avoid US cloud”) or from inertia (“we’ll deal with it if regulations force us”).
They start from optionality: building a governance and lifecycle layer that works across any infrastructure, so that the infrastructure choice becomes a deployment decision rather than an architectural constraint.
Here’s the practical approach we recommend:
Witboost was built on this principle from day one. Our founding conviction is that a platform should never make an architectural decision on your behalf. It should never constrain your technology choices, your cloud strategy, or your sovereignty posture. Whether you run on AWS, Azure, Google Cloud, or a fully European stack, the governance, the data contracts, and the lifecycle management work the same way.
That’s not neutrality for the sake of neutrality. It’s the recognition that the enterprises we serve operate in complex, multi-geography, multi-regulatory environments where the right answer today might not be the right answer tomorrow. The only responsible architecture is one that gives you the freedom to adapt.
If you’re exploring sovereignty requirements and want to understand how an EU-native stack would work with your specific data landscape, we’re happy to walk through it with you. No ideology. Just architecture.