Press · June 1, 2026 · 16 min read

EU AI Act August 2026: what the Digital Omnibus did not postpone — and the 60-day corpus plan

The Digital Omnibus pushed Annex III to December 2027 — not Article 50, not AI literacy, not Annex IV for systems already on the market. What follows is a 60-day corpus plan for the obligations that remain.

On 7 May 2026, the EU Council and Parliament reached a political agreement on the "Digital Omnibus" (official Council communiqué). The package moves enforcement of Annex III obligations for new high-risk AI systems — banking, insurance, HR, access to essential services — from 2 August 2026 to 2 December 2027. In the three weeks since, legal commentary has focused almost exclusively on what was deferred. The companion question — what is still mandatory on 2 August 2026 — has stayed largely unanswered. The answer is a much longer list than executive committees have been told.

Sixty-two days from the deadline, here is what actually applies on 2 August 2026, why your document corpus is its silent prerequisite, and the plan we use at K-AI for organizations whose AI systems are already in production.

What the 7 May 2026 Digital Omnibus actually did — and what it did not

The political agreement announced by the EU Council on 7 May 2026 postpones by sixteen months the enforcement of obligations for new high-risk systems listed in Annex III: enforcement moves from 2 August 2026 to 2 December 2027 (Hogan Lovells, Latham & Watkins, White & Case). For legal teams that had not finished risk-classification work in time, this is a real relief. It is also a narrow one. Several obligation blocks remain anchored on 2 August 2026.

First, Article 50 on transparency for AI systems interacting with natural persons. The official text remains applicable on 2 August 2026 in its entirety, with the exception of paragraph 50(2) on watermarking of generated content, which is moved to 2 December 2026 (Article 50 — official text). Any chatbot, AI agent, or assistant in production that interacts with a customer, an employee, or a citizen must be identifiable as such, and any generated content must be traceable to its source.

Second, Article 4 on AI literacy, in force since 2 February 2025 (Squirro). Any organisation that develops or deploys an AI system must ensure a sufficient level of AI literacy among the people concerned.

Third — and this is the point that legal commentary has under-covered — Annex IV obligations for high-risk systems already placed on the market or already in service before 2 August 2026. The Digital Omnibus addresses new systems; it does not relieve operators of existing ones (Gibson Dunn).

In the conversations I have had with CIOs and DPOs of large European groups since the agreement, it is this third point that consistently surprises. For an operator running a credit-scoring system, an HR decision-support tool, or a customer-support assistant in production today, the Digital Omnibus brings no additional time. Annex IV technical documentation remains required. Article 12 logging remains required. Article 26 deployer obligations remain required. The market has fallen asleep on a calendar that was not addressed to it.

What is still mandatory on 2 August 2026 — for whom, for what

Three obligation blocks apply on 2 August 2026, independent of the Digital Omnibus.

Block one — Article 50, transparency. Any deployer of an AI system interacting with natural persons must inform them that they are interacting with an AI system, except for explicitly listed exceptions. Any deployer of a system generating or manipulating content published to inform the public on matters of public interest must disclose that the content has been artificially generated or manipulated (official text). For an enterprise RAG, this requirement cascades down to the source document: demonstrating that an answer is not hallucinated requires producing the documents consulted. This traceability is no longer a UX best practice.

Block two — Article 4, AI literacy. In force since 2 February 2025, it obliges providers and deployers to ensure a sufficient level of AI literacy among the staff and contractors involved in operating their AI systems. Among Circle 1-2 vendors, Squirro has been the only one to formalize this obligation as a product discipline (Squirro). The blind spot, however, is intact: AI literacy training materials are themselves a document corpus, subject to versioning, dating, and attribution. When a regulator asks an operator what AI training its procurement team received, the regulator is asking for a documented artefact, not for pedagogy. That artefact must exist, be current, and be traceable.

Block three — high-risk systems already placed on the market before 2 August 2026. Annex IV requires detailed technical documentation including the description of the data used and the traceability of its lifecycle (Annex IV — official text, Article 11). Article 12 mandates automatic logging that supports traceability of decisions, with a minimum retention of six months (PipeLab). Article 26 frames deployer obligations: monitoring, logging, incident reporting. Penalties reach €35 million or 7 % of worldwide annual turnover for the most serious failures (Atlan). That is 1.75 times the GDPR ceiling of €20 million or 4 % of worldwide annual turnover.

Four recent benchmarks converge on the gap between obligation and readiness. Fivetran finds 15 % of organisations agentic-ready across 400 surveyed data executives (Fivetran via BusinessWire, 5 May 2026). Cloudera × HBR finds 7 % completely ready for AI (Cloudera, March 2026). Cisco classes 13 % as Pacesetters across 2,500 surveyed CEOs (Cisco, 18 May 2026). Gartner forecasts that 60 % of AI projects will be abandoned by end-2026 owing to non-AI-ready data (Gartner). On that basis, the gap between operators able to document their Annex IV compliance and operators exposed to sanctions on 2 August 2026 is set to be substantial.

The blind spot: why Annex IV includes your inference corpus

Since the Omnibus agreement, pure-play AI-governance vendors (Atlan, Collibra, Alation, Credo AI, Holistic AI, Informatica) have published compliance guides and templates. All of them address compliance at the model layer (registries, model cards, high-risk classification) or the structured dataset layer (lineage, quality, provenance) (Collibra AI Command Center, Alation, 11 May 2026, Atlan). None, to our knowledge, descends to the level of the individual unstructured document (PDF, Confluence page, SharePoint document, archived email), which makes up 70 to 90 % of the inference source of an enterprise RAG or AI agent.

This is a legal point worth the attention of CIOs and DPOs. Annex IV §2(b) requires the provider of a high-risk system to document "the design specifications of the system, namely the general logic of the AI system and of the algorithms; the key design choices including the rationale and assumptions made" and "the main classification choices" (Annex IV — official text). For a production RAG system, the design choice includes, by construction, the document corpus selected for grounding. The assumptions include hypotheses about quality, freshness, absence of contradictions, absence of divergent duplicates in that corpus. Without documentation of those hypotheses, the system is non-compliant by default.

That is exactly the role a Document Knowledge Platform plays: producing and maintaining, automatically, the technical trace that Annex IV requires without naming it. The six axes we apply at K-AI (internal anomalies, inter-document conflicts, divergent duplicates, unmarked obsolescence, traceability, freshness) are not an editorial framework. They are the measurable dimensions that turn an inference corpus into documentable evidence in the Annex IV sense. We published the complete method in our 15 May 2026 research note; its connection with the AI Readiness frameworks of 2026 (Cisco, Microsoft, Cloudera, Iris.ai, Atlan) is set out in our 25 May piece.

Article 12 and the "document retrieval log": the evidence category nobody names

Article 12 mandates automatic logging that supports traceability of the operations of a high-risk system; the resulting logs must be kept for at least six months under Articles 19 and 26(6) (PipeLab). Market practice interprets this obligation at the prompt-and-output level: the user query, the generated response, the user identifier, the execution time are kept. One log category is almost always missing: the trace of the source documents consulted at generation time.

For a RAG system, this trace exists technically. The retriever pulls a set of chunks; the reranker orders them; a subset is injected in the model context. At no point is this operation logged as documentary evidence. Yet it is precisely this trace that allows the operator, six months later, to demonstrate to a regulator or an internal risk committee that the answer provided by the AI agent was based on a specific document, in a specific version, at a specific date. Without that log, the operator cannot reconstruct the decision chain. With it, they can.

We formalize this category at K-AI under the name document retrieval log: for each agent or RAG response, retention of the list of source documents consulted, their version at retrieval time, their quality score from the corpus audit, and their validity status (current, expired, in conflict with another source). It is, to our knowledge, the only category of log that makes Article 12 actionable at the document-corpus level. No Document Knowledge Platform (DKP) or Enterprise Search vendor has published that formalization to date; that is the flag we plant this week.
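As an illustration only, one entry of such a log could be modelled as below. This is a minimal sketch built from the four fields listed above; the class names, field names and JSON layout are assumptions made for the example, not a published K-AI schema or a format prescribed by the AI Act.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class Validity(Enum):
    """Validity statuses named in the text above."""
    CURRENT = "current"
    EXPIRED = "expired"
    IN_CONFLICT = "in_conflict"


@dataclass(frozen=True)
class RetrievedDocument:
    doc_id: str            # hypothetical stable identifier of the source document
    version: str           # version of the document at retrieval time
    quality_score: float   # score from the corpus audit (scale is an assumption)
    validity: Validity


@dataclass(frozen=True)
class RetrievalLogEntry:
    response_id: str       # links the entry to the logged prompt and output
    generated_at: str      # ISO 8601 timestamp, UTC
    documents: tuple       # RetrievedDocument items injected into the model context


def to_json_line(entry: RetrievalLogEntry) -> str:
    """Serialise one entry as a single JSON line for an append-only log."""
    payload = {
        "response_id": entry.response_id,
        "generated_at": entry.generated_at,
        "documents": [
            {
                "doc_id": d.doc_id,
                "version": d.version,
                "quality_score": d.quality_score,
                "validity": d.validity.value,
            }
            for d in entry.documents
        ],
    }
    return json.dumps(payload, sort_keys=True)


example = RetrievalLogEntry(
    response_id="resp-001",
    generated_at=datetime(2026, 8, 2, 9, 30, tzinfo=timezone.utc).isoformat(),
    documents=(
        RetrievedDocument("hr-policy-12", "v3", 0.82, Validity.CURRENT),
        RetrievedDocument("procedure-7", "v1", 0.41, Validity.EXPIRED),
    ),
)
print(to_json_line(example))
```

Writing one line per response keeps the log append-only, which makes it straightforward to retain for the required period alongside the prompt-and-output log.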

The operational window August 2026 → December 2027: three phases to prepare for Annex III

The Annex III deferral does not remove enforcement; it shifts it. For an operator whose AI systems will fall into the high-risk perimeter in December 2027 — HR tools, scoring, education, access to essential services — the sixteen-month window opened by the Digital Omnibus is an operational calendar, not a reprieve. We split it into three phases with the teams we support.

Phase 1 — June to August 2026: putting existing systems in order. Every organisation already runs at least one AI system in production that will be subject to Article 50 on 2 August: a customer assistant, an internal copilot, an employee-recommendation tool. On that perimeter, the urgency is not Annex III but Article 50, Article 4, and — for high-risk systems already on the market — Annex IV. This window is where the 60-day corpus plan described below plays out.

Phase 2 — September 2026 to June 2027: industrializing continuous monitoring. Once the corpus has been cleaned, compliance does not maintain itself. The six audit axes (anomalies, conflicts, duplicates, obsolescence, traceability, freshness) drift by nature. The Stay Clean discipline — continuous monitoring, threshold alerts, scheduled re-audit — is the condition of compliance durability. This phase is also the natural moment to install the documentary governance that Annex III will require by end-2027.

Phase 3 — July to December 2027: preparing the Annex III crossover. For new high-risk systems the organisation deploys in this window, conformity assessment (Article 43) and declaration of conformity (Article 47) become applicable. Annex IV technical documentation must be produced before placing on the market. If the corpus has been cleaned in Phase 1 and maintained in Phase 2, the third phase is formalization, not a rebuild.

This sequencing is not K-AI-specific; it is consistent with the guides published by law firms, consultancies and professional bodies (IAPP, Latham & Watkins, McKenna Consultants). What is specific is placing the document corpus — and not the model or the structured dataset — at the centre of the design.

A 60-day corpus plan: six work streams for AI systems already in production

Here is the plan we run at K-AI for an operator whose AI systems are already in production. It operationalizes the six-axis method published on 15 May, on the D-62 → D-0 calendar to 2 August 2026. Each work stream runs over roughly ten days (twelve for the last), in parallel where possible.

Work stream 1 — Internal anomalies (D-62 → D-52) covers the detection of internal inconsistencies within a single document: contradictory tables, incompatible dates, circular references. Deliverable: a counted inventory of anomalies, prioritized by business criticality.

Work stream 2 — Inter-document conflicts (D-52 → D-42) addresses the failure mode that weighs most on RAG reliability in production, as we documented on 27 May. Deliverable: a graph of contradicting claims, validated by business stakeholders.

Work stream 3 — Divergent duplicates (D-42 → D-32) identifies the multiple versions of the same document whose content has diverged without lineage tracking. This is the main cause of answers that were "true yesterday" but are wrong today.
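The divergent-duplicate check of work stream 3 can be illustrated with a deliberately naive sketch: documents that share a normalised title but not a content hash are flagged as diverged copies. Grouping by title, the `title`/`path`/`text` keys and the whitespace normalisation are all assumptions made for the example; a production audit would need a more robust notion of "same document".

```python
import hashlib
from collections import defaultdict


def normalise(text: str) -> str:
    """Lower-case and collapse whitespace so cosmetic differences are ignored."""
    return " ".join(text.lower().split())


def divergent_duplicates(documents):
    """Return {normalised title: [paths per distinct content]} for diverged copies.

    `documents` is an iterable of dicts with hypothetical keys
    'title', 'path' and 'text'.
    """
    groups = defaultdict(dict)  # title -> {content hash: [paths]}
    for doc in documents:
        title = normalise(doc["title"])
        digest = hashlib.sha256(normalise(doc["text"]).encode("utf-8")).hexdigest()
        groups[title].setdefault(digest, []).append(doc["path"])
    # A title is divergent when it maps to more than one distinct content hash.
    return {
        title: [sorted(paths) for paths in by_hash.values()]
        for title, by_hash in groups.items()
        if len(by_hash) > 1
    }


corpus = [
    {"title": "Travel Policy", "path": "sharepoint/travel.pdf", "text": "Economy class only."},
    {"title": "travel  policy", "path": "confluence/travel", "text": "Economy class only."},
    {"title": "Travel Policy", "path": "mail/travel-old.pdf", "text": "Business class allowed."},
    {"title": "Expense Policy", "path": "sharepoint/expense.pdf", "text": "Receipts required."},
]
print(divergent_duplicates(corpus))
```

Exact copies collapse into one group, so only genuinely different contents under the same title are reported.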

Work stream 4 — Unmarked obsolescence (D-32 → D-22) classifies the corpus by explicit or inferred validity horizon. An HR policy is valid for three years, a technical procedure for six months, a product manual until the next release. Without that marking, the RAG cannot exclude expired sources.

Work stream 5 — Traceability (D-22 → D-12) is the axis that addresses Annex IV §2(b) head-on: for each document, its author, validation date, source of truth, business owner. This is where the document retrieval log takes root.

Work stream 6 — Freshness by segment (D-12 → D-0) measures the lag between the last effective update and the rhythm expected by the business, segment by segment. This is the metric that drives re-audit in Stay Clean mode.
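The validity-horizon marking of work stream 4 can be sketched as a simple filter applied before retrieval. The two horizons below reuse the examples given in the text (three years, six months), approximated in days; the document-type keys, the `validated_on` field and the choice to keep documents with an unknown horizon for human review are assumptions made for the illustration.

```python
from datetime import date, timedelta

# Horizons from the examples in the text, approximated in days (an assumption).
VALIDITY_HORIZON = {
    "hr_policy": timedelta(days=3 * 365),
    "technical_procedure": timedelta(days=183),
}


def is_expired(doc_type: str, validated_on: date, today: date) -> bool:
    """True when the document has outlived the horizon for its type."""
    horizon = VALIDITY_HORIZON.get(doc_type)
    if horizon is None:
        # No horizon defined (e.g. a manual valid "until the next release"):
        # not excluded automatically in this sketch.
        return False
    return today > validated_on + horizon


def retrievable(documents, today: date):
    """Keep only the documents a retriever should still be allowed to use."""
    return [d for d in documents if not is_expired(d["type"], d["validated_on"], today)]


docs = [
    {"id": "hr-1", "type": "hr_policy", "validated_on": date(2024, 1, 15)},
    {"id": "proc-1", "type": "technical_procedure", "validated_on": date(2025, 12, 1)},
    {"id": "manual-1", "type": "product_manual", "validated_on": date(2023, 3, 1)},
]
print([d["id"] for d in retrievable(docs, today=date(2026, 8, 2))])
```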

At D-0, on 2 August 2026, the operator has a documented inventory, an operational document retrieval log, a costed remediation plan, and a continuous monitoring setup. Annex IV documentation for high-risk systems already in service is produced, Article 50 is wired for generated-content identification, Article 12 is logged at the document level. Compliance will not be perfect, but it will be defensible before a regulator or an auditor. That is the realistic objective at 62 days.
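For planning purposes, the D-62 → D-0 offsets of the six work streams translate into calendar dates as follows. The offsets and the 2 August 2026 deadline come from the plan above; the function and variable names are illustrative.

```python
from datetime import date, timedelta

DEADLINE = date(2026, 8, 2)

# (work stream, start offset, end offset) in days before the deadline.
WORK_STREAMS = [
    ("Internal anomalies", 62, 52),
    ("Inter-document conflicts", 52, 42),
    ("Divergent duplicates", 42, 32),
    ("Unmarked obsolescence", 32, 22),
    ("Traceability", 22, 12),
    ("Freshness by segment", 12, 0),
]


def schedule(deadline: date = DEADLINE):
    """Return (name, start date, end date) for each work stream."""
    return [
        (name, deadline - timedelta(days=start), deadline - timedelta(days=end))
        for name, start, end in WORK_STREAMS
    ]


for name, start, end in schedule():
    print(f"{name}: {start.isoformat()} to {end.isoformat()} ({(end - start).days} days)")
```

The first work stream starts on 1 June 2026, the publication date of this article, and the last one closes on the deadline itself.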

Frequently asked questions

What documentation obligations apply to high-risk AI systems on 2 August 2026?

The 7 May 2026 Digital Omnibus postpones the obligations of new high-risk systems listed in Annex III to 2 December 2027. Remaining applicable on 2 August 2026: Article 50 on transparency (with the exception of 50(2) on watermarking, moved to December 2026), Article 4 on AI literacy (in force since February 2025), and the full set of Annex IV, Article 11, Article 12 and Article 26 obligations for high-risk systems already placed on the market or already in service. In practical terms: technical documentation of the system, description of the data used and of its lifecycle, automatically generated logs with six-month retention, deployer monitoring. Penalties reach up to €35 million or 7 % of worldwide turnover.

Does the 7 May 2026 Digital Omnibus remove the AI Act’s documentation obligations?

No. The Digital Omnibus, in its 7 May 2026 political agreement, postpones the enforcement of Annex III for new high-risk systems to 2 December 2027, but removes no obligation. Documentation obligations for high-risk systems already in production remain applicable on 2 August 2026. AI literacy (Article 4) has been applicable since February 2025. Transparency (Article 50) remains applicable on 2 August 2026. Law firms converge on the same advice: do not use the postponement as an excuse to delay compliance work.

How do you prove the traceability of decisions made by an AI system?

Article 12 requires automatic logging that supports traceability, for a minimum of six months. Market practice interprets this obligation at the prompt-and-output level. A third log category is almost always missing: the trace of source documents consulted at generation time — what we call at K-AI the document retrieval log. For each agent or RAG response, retention of the list of source documents consulted, their version, their quality score at retrieval time, and their validity status. This category is what allows the operator, six months after the fact, to demonstrate to a regulator that an answer was based on a specific document in a specific version.

Does Article 50 on transparency remain in force on 2 August 2026?

Yes. The 7 May 2026 Digital Omnibus does not touch Article 50 on transparency of AI systems interacting directly with natural persons. The official text remains applicable on 2 August 2026 in its entirety, with the exception of paragraph 50(2) on watermarking of generated content, which is moved to 2 December 2026. Any chatbot, AI agent or assistant in production must be identifiable as such, and generated content must be traceable to its source. For an enterprise RAG, this obligation cascades down to the source document consulted.

What are the penalties for documentation non-compliance under the AI Act?

EU AI Act penalties reach €35 million or 7 % of worldwide annual turnover — whichever is higher — for the most serious violations, particularly those touching prohibited systems. For failures related to Annex IV, Article 12 or Article 50, the ceilings reach up to €15 million or 3 % of worldwide turnover. The upper tier exceeds the GDPR ceiling of €20 million or 4 %. National supervisory authorities designated by each Member State will be in charge of enforcement starting on 2 August 2026.
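To make the "whichever is higher" rule concrete, the ceiling for a given turnover can be computed as below. This is a rough sketch using only the two tiers quoted above; it assumes the same "whichever is higher" rule for the second tier, ignores the separate treatment of SMEs, and says nothing about the fine an authority would actually impose. The tier names are invented for the example.

```python
from decimal import Decimal

# (fixed ceiling in EUR, share of worldwide annual turnover), as quoted in this article.
TIERS = {
    "most_serious": (Decimal("35000000"), Decimal("0.07")),
    "documentation_logging_transparency": (Decimal("15000000"), Decimal("0.03")),
}


def penalty_ceiling(tier: str, worldwide_turnover_eur: Decimal) -> Decimal:
    """Upper bound of the fine: the higher of the fixed amount and the turnover share."""
    fixed, share = TIERS[tier]
    return max(fixed, share * worldwide_turnover_eur)


# A group with EUR 2 billion in turnover: the percentage dominates in both tiers.
print(penalty_ceiling("most_serious", Decimal("2000000000")))
print(penalty_ceiling("documentation_logging_transparency", Decimal("2000000000")))
# A company with EUR 100 million in turnover: the fixed amount dominates.
print(penalty_ceiling("most_serious", Decimal("100000000")))
```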

Going further

AI Act documentation compliance is not solved by reading the regulation; it is solved by auditing the corpus that feeds the systems already in production. That is what we do at K-AI, on the corpora of large European groups. To discuss your perimeter — which obligations apply, what state your corpus is in, what a 60-day plan looks like — write to us at contact@k-ai.ai.
