The Missing Layer in Europe's AI Strategy: Data Ownership

The European AI debate has a structural problem. Not a lack of ambition – there's plenty of that. Not a shortage of regulation – the AI Act exists. The problem is that the conversation keeps circling the same questions about models, compute, and governance frameworks while largely ignoring the substrate on which all of it depends: data ownership.

This matters because the disagreement about European AI strategy isn't really one disagreement. It's at least three, tangled together in ways that make productive debate difficult. Disentangling them reveals why the data ownership question deserves more attention than it currently receives.

The Sovereignty Debate Needs Disaggregation

When policymakers invoke AI sovereignty, they might mean any of several distinct things: domestic compute infrastructure, European foundation models, regulatory autonomy, or reduced dependency on specific foreign providers. These are four different goals with different costs, different timelines, and different trade-offs. Someone who argues for openness might support three of them while opposing the fourth.

But there's a fifth dimension that rarely gets equal billing: data sovereignty at the operational level. Not data protection as a compliance exercise, but data ownership as a strategic asset.

A recent Tech.eu analysis frames this precisely: "As models become increasingly commoditised, competitive advantage is shifting to the data layer, thereby raising the stakes around who owns and controls it." This is a factual claim, not a values claim, and it deserves to be evaluated on its merits.

The argument runs as follows: foundation models are converging in capability. The marginal advantage of one frontier model over another is shrinking. What remains differentiated is the data that fine-tunes, grounds, and contextualizes these models for specific applications. If that's true – and the evidence increasingly suggests it is – then the strategic question shifts from who builds the best model to who controls the most valuable data pipelines.

Three Types of Disagreement

The data ownership debate contains at least three distinct types of disagreement, and conflating them produces more heat than light.

The facts disagreement: Does data ownership actually confer competitive advantage in an AI economy, or is this overstated? Proponents point to the success of companies that have built proprietary data moats. Skeptics note that synthetic data generation and transfer learning may reduce the premium on proprietary datasets. This is an empirical question that deserves empirical investigation, not ideological assertion.

The values disagreement: Even if data ownership confers advantage, should European policy prioritize it? Some argue that data sovereignty is a prerequisite for democratic AI governance – that systems trained on data controlled by foreign entities cannot be fully accountable to European citizens. Others contend that data localization requirements fragment markets, raise costs, and ultimately harm European competitiveness. Both positions have merit; the question is how to weigh them.

The incentives disagreement: Who benefits from different data ownership regimes? Large incumbents may prefer arrangements that entrench their existing data advantages. Startups may prefer open data ecosystems that reduce barriers to entry. Public sector organizations may prioritize control over citizen data regardless of efficiency costs. Understanding these incentive structures doesn't resolve the debate, but it does clarify why different actors advocate different positions.

The Operational Reality

Onur Alp Soner, CEO of analytics platform Countly, offers a practitioner's perspective that cuts through some of the abstraction. As quoted in Tech.eu:

"Basically, our main focus is data control and data ownership. We want companies to have complete control over the data they collect – that's why we've existed since day one."
Onur Alp Soner

Soner's framing is instructive because it shifts the conversation from policy aspiration to operational capability. The question isn't just whether Europe should pursue data sovereignty – it's whether European organizations have the technical infrastructure to exercise meaningful control over their data pipelines.

This is where the debate gets interesting. GDPR (General Data Protection Regulation, the EU's comprehensive data protection law) established a strong legal framework for data protection. But legal rights without operational capacity are incomplete. An organization that nominally owns its data but processes it through third-party platforms with opaque data handling practices has sovereignty in name only.

The Center for the Study of Democracy's analysis of DeepSeek illustrates this tension. When European data protection authorities investigated the Chinese AI company, the core questions centered on where data goes, who has access, and whether EU citizens' data rights are being upheld. But as the analysis notes, the concern is not about corporate privacy policy. It's about the legal and political system behind the company – and what that system could do with access to foreign data and users.

The GDPR-AI Tension

The relationship between GDPR and AI development contains genuine tensions that deserve acknowledgment rather than dismissal. A European Parliament study on this relationship notes that "there is indeed a tension between the traditional data protection principles – purpose limitation, data minimisation, the special treatment of 'sensitive data', the limitation on automated decisions – and the full deployment of the power of AI and big data."

The study concludes that AI can be deployed in a way that is consistent with the GDPR, but also that the GDPR does not provide sufficient guidance for controllers, and that its prescriptions need to be expanded and concretised.

This is a more nuanced position than either "GDPR kills AI innovation" or "GDPR and AI are perfectly compatible." The truth is that reconciling data protection principles with AI development requires ongoing interpretation, technical innovation, and regulatory evolution. Pretending otherwise serves neither side of the debate.

What Would Have to Be True

The strongest version of the data sovereignty argument would require several things to be true simultaneously:

First, that data ownership genuinely confers durable competitive advantage in AI applications – not just temporary advantage that erodes as synthetic data and transfer learning mature.

Second, that European organizations can build the technical infrastructure to exercise meaningful data control without prohibitive costs or capability gaps.

Third, that data sovereignty requirements don't fragment European markets in ways that undermine the scale advantages that AI development requires.

Fourth, that the governance benefits of data sovereignty – accountability, democratic oversight, alignment with European values – are substantial enough to justify potential efficiency costs.

Each of these claims is contestable. But they're contestable on their merits, not as proxies for tribal affiliations. The question worth asking: which of these conditions is most uncertain, and what evidence would help resolve that uncertainty?

The Question That Changes the Room

Perhaps the most productive reframe is this: instead of asking "Should Europe pursue data sovereignty?", ask "What specific capabilities would European organizations need to exercise meaningful control over their AI data pipelines, and what would it cost to build them?"

This shifts the conversation from abstract principle to concrete implementation. It acknowledges that sovereignty isn't just policy – it's operational. And it creates space for the kind of detailed, sector-specific analysis that actually informs decision-making.

The data ownership debate will continue. But it might generate more light and less heat if participants are clearer about what type of disagreement they're having – and what evidence would change their minds.

These questions – about data sovereignty, operational capability, and the relationship between policy aspiration and technical reality – deserve sustained attention from policymakers, technologists, and governance scholars alike. They're exactly the kind of questions that benefit from bringing diverse perspectives into the same room. Human x AI Europe, taking place May 19 in Vienna, is designed for precisely this kind of conversation: the room where Europe decides what kind of future it wants to build.

Frequently Asked Questions

Q: What is data sovereignty in the context of European AI strategy?

A: Data sovereignty refers to an organization's or jurisdiction's ability to control where data is stored, who can access it, and how it is processed. In AI contexts, this extends beyond legal compliance to include operational capability – the technical infrastructure to exercise meaningful control over data pipelines used for AI training and deployment.

Q: How does GDPR affect AI development in Europe?

A: GDPR creates tensions with AI development through principles like purpose limitation and data minimization, which can conflict with AI's need for large, diverse datasets. However, a European Parliament study concludes that AI can be deployed consistently with GDPR, though the regulation does not provide sufficient guidance for controllers and requires ongoing interpretation.

Q: Why is data ownership becoming more important than AI model development?

A: As foundation models converge in capability, competitive advantage is shifting to the data layer. Proprietary data that fine-tunes and contextualizes models for specific applications becomes the differentiating factor, making control over data pipelines strategically valuable.

Q: What are the main types of disagreement in the European data sovereignty debate?

A: The debate contains three distinct disagreements: a facts disagreement about whether data ownership actually confers competitive advantage; a values disagreement about whether sovereignty should be prioritized over efficiency; and an incentives disagreement about which actors benefit from different data ownership regimes.

Q: What concerns have European regulators raised about foreign AI companies like DeepSeek?

A: European data protection authorities have investigated where user data goes, who has access, and whether EU citizens' data rights are upheld. The deeper concern is not corporate privacy policy but the legal and political system behind foreign companies – particularly whether authoritarian governance structures could compel data sharing with state intelligence operations.

Q: What capabilities do European organizations need for meaningful data sovereignty?

A: Organizations need technical infrastructure to capture, process, and analyze their own data without relying on third-party platforms with opaque data handling practices. This includes self-hosted analytics, secure data pipelines, and the expertise to maintain these systems – moving beyond legal rights to operational control.
