Your AI Is Scaling. Your Understanding Isn't.
- Jeroen Janssen

- Jan 7
- 6 min read
This is part 3 of a series on AI strategy governance.
There's a number making the rounds in enterprise AI circles, and it should worry you more than it reassures you.
OpenAI reports an eightfold increase in enterprise message volume year over year. Reasoning-token consumption per organization is up more than 300×. AI is no longer experimental. It is becoming part of the operational fabric of firms across every industry, function, and geography. The future is here, the slide deck says, and the usage numbers prove it.
Except usage is not value. And volume is not control.
What's actually happening is something far more uncomfortable: organizations are scaling capability faster than they are scaling understanding. And the distance between those two curves is where the real risk lives. Invisible, growing, and almost never on the board agenda.

The Readiness Illusion
Most organizations believe they're AI-ready because they've checked the infrastructure boxes. Cloud architecture: done. Models: available. Data: abundant. Therefore, value should follow.
It doesn't.
A comprehensive study on AI-ready data identifies six minimum conditions that must be met before AI creates value rather than noise: data must be diverse, timely, accurate, secure, discoverable, and machine-consumable. These are not best practices. They are prerequisites. Without them, AI systems don't generate insight — they amplify whatever mess already exists.
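To make "prerequisite" concrete, here is a minimal sketch of what gating an initiative on those six conditions could look like. It is illustrative only: the class, the field names, and the pass/fail logic are assumptions for the sake of the example, not the study's methodology.

```python
from dataclasses import dataclass, fields

@dataclass
class DataReadiness:
    """Illustrative checklist for the six AI-ready data conditions.
    How each condition is actually assessed is organization-specific
    and not shown here."""
    diverse: bool
    timely: bool
    accurate: bool
    secure: bool
    discoverable: bool
    machine_consumable: bool

def readiness_gaps(assessment: DataReadiness) -> list[str]:
    """Return the conditions that are not yet met."""
    return [f.name for f in fields(assessment) if not getattr(assessment, f.name)]

# Treat the conditions as prerequisites, not as a parallel workstream:
# an initiative with open gaps does not proceed to scaled deployment.
assessment = DataReadiness(
    diverse=True, timely=True, accurate=False,
    secure=True, discoverable=False, machine_consumable=True,
)
gaps = readiness_gaps(assessment)
if gaps:
    print(f"Not AI-ready: unresolved conditions: {', '.join(gaps)}")
else:
    print("Minimum data conditions met; value measurement can begin.")
```

Nothing about the check itself is sophisticated. What matters is where it sits: before launch, not alongside it.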
But here's what actually happens. AI initiatives launch while data foundations remain unresolved. Nobody calls a halt, because the initiative has momentum and a sponsor. The assumption is that data quality is a parallel workstream — something that will catch up. It never catches up. It runs behind, silently, until outputs degrade. And when outputs degrade, something far more dangerous happens than a bad dashboard number.
People stop trusting the system. And when people stop trusting the system, they stop reporting problems. And when they stop reporting problems, you lose the ability to measure value at all.
This is not a technology failure. This is a sequencing failure. And it's happening everywhere.
What 100 Trillion Tokens Actually Tell You
If you want to understand how AI is really used — not how vendors say it's used, not how strategies assume it's used, but how it's actually used — there is now an empirical answer.
The OpenRouter State of AI study analyzed more than 100 trillion tokens of real-world inference. That's not a survey. That's not a focus group. That's the largest behavioral dataset on AI usage ever published.
Three findings should change how you think about your AI portfolio.
First: usage is wildly heterogeneous. AI is not being used primarily for the neat, contained productivity tasks that most business cases are built around. Creative work, coding, experimentation, agentic workflows — these dominate real-world usage in proportions that most enterprise strategies don't anticipate, let alone plan for.
Second: there is no "best model." Open-weight models now account for roughly a third of all token usage. Organizations switch between models constantly, depending on task, cost, and context. The ecosystem is fragmenting. If your strategy is locked to a single vendor or a single model family, you're not being focused — you're being fragile.
Third — and this is the one that should keep strategists awake — early fit determines everything. The researchers call it the "Glass Slipper effect." When an AI tool or workflow fits organizational reality early, adoption persists. When it doesn't, usage decays rapidly — regardless of how technically superior the model is. Regardless of how much you spent. Regardless of the executive sponsor's enthusiasm.
The implication is brutal in its simplicity: AI value does not come from model capability. It comes from fit. Fit with data. Fit with workflows. Fit with how people actually work.
And fit is exactly what most AI strategies don't test.
The Governance Cliff
Here's where it gets worse.
While enterprises are scaling deployment, the governance world is scaling expectations. And those two trajectories are on a collision course.
The 2025 Responsible AI Impact Report documents a clear shift. Responsible AI is no longer a set of ethical principles pinned to a wall. It is becoming operational risk management. Standards, benchmarks, audits, red-teaming — these are moving from nice-to-have to must-demonstrate. Governance is transitioning from aspiration to evidence.
This is being driven by two forces simultaneously. On the regulatory side, the EU AI Act now requires that risks are classified, evidence is retained, controls are demonstrable, and post-deployment monitoring is continuous. This is not future regulation. This is current law.
On the technical side, the shift to agentic AI introduces failure modes that didn't exist before. When AI systems act autonomously — making decisions, triggering workflows, interacting with other systems — the risk surface expands faster than any governance framework designed for supervised tools can handle. Data poisoning, for instance, scales unexpectedly well. A small number of compromised inputs can subvert an entire model. And once the system is acting on its own, you need real-time failure detection. Not quarterly reviews. Not annual audits. Real-time.
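What "real-time" means in practice is easier to see in a sketch. The guard below is a minimal illustration, not a reference implementation: the window size, the failure threshold, and what counts as a failed check are assumptions that every organization has to define for its own agentic workflows.

```python
from collections import deque

class RuntimeGuard:
    """Minimal sketch of real-time failure detection for an autonomous workflow.
    It tracks recent action outcomes in a rolling window and suspends autonomous
    execution when the observed failure rate crosses a threshold."""

    def __init__(self, window: int = 200, max_failure_rate: float = 0.05):
        self.outcomes = deque(maxlen=window)   # True = failed check, False = passed
        self.max_failure_rate = max_failure_rate
        self.autonomy_enabled = True

    def record(self, failed_check: bool) -> None:
        """Record one automated action's outcome and re-evaluate immediately."""
        self.outcomes.append(failed_check)
        failure_rate = sum(self.outcomes) / len(self.outcomes)
        if failure_rate > self.max_failure_rate:
            self.autonomy_enabled = False      # stop acting autonomously
            self.escalate(failure_rate)

    def escalate(self, failure_rate: float) -> None:
        """Placeholder for paging a human owner; a real system would alert and log."""
        print(f"Autonomy suspended: failure rate {failure_rate:.1%} exceeds threshold.")

# Each automated action is checked as it happens, not at the next quarterly review.
guard = RuntimeGuard(window=50, max_failure_rate=0.10)
for outcome in [False] * 45 + [True] * 6:      # simulated burst of failed checks
    if not guard.autonomy_enabled:
        break                                  # fall back to human-in-the-loop
    guard.record(outcome)
```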
The direction is unmistakable. Organizations will not just be asked whether they use AI. They will be asked whether they can prove they are in control of it.
And most can't. Not yet. Not honestly.
The Dip Nobody Talks About
There's a pattern underneath all of this that explains why so many AI initiatives feel simultaneously promising and broken.
Every serious technology implementation follows a J-curve. Productivity dips before it rises. Processes need redesign. People need to learn. Informal coordination — the kind that actually makes organizations work — breaks before it stabilizes. This is well-documented empirically.
AI makes this J-curve worse. For three specific reasons.
Because outputs are probabilistic, failure is ambiguous. You can't always tell whether the system got it wrong, the human got it wrong, or the process got it wrong. So nobody flags it.
Because models improve rapidly, hope consistently outpaces evidence. The next version will be better — so why raise concerns about this one?
Because investments are visible and politically loaded, stopping feels like career risk. The pilot continues. The numbers get smoothed. The board sees progress. Operations feels friction. And the gap between the two grows — silently, steadily — until something breaks publicly.
At that point, organizations are no longer managing AI. They are betting on it. And the J-curve, which should have been explained to the board on day one, becomes the autopsy finding.
What This Actually Requires
Strip away the frameworks and the jargon, and what emerges from every serious source on enterprise AI is the same conclusion.
AI value is not constrained by model capability. It is constrained by the organization's ability to hold strategy, data, people, and governance in alignment under uncertainty.
That alignment requires three things that most organizations talk about but very few operationalize.
Evidence discipline. A clear, enforced distinction between what is known, what is assumed, and what is hoped for. Portfolio approaches help — but only if investments are gated by demonstrated learning, not by activity. A pilot that consumes resources and produces no falsifiable evidence is not an investment. It's a subsidy for assumptions.
Socio-technical measurement. Value metrics alone tell you nothing if the people using the system have stopped trusting it. Adoption patterns and trust dynamics need explicit, continuous tracking; a short sketch of what that tracking can look like follows these three points. When people stop escalating issues, performance degradation becomes invisible. You're flying blind and the instruments say everything's fine.
Adversarial scrutiny. Not as obstruction. Not as a one-off audit. As hygiene. Assumptions must be challenged before they are operationalized at scale. Failure modes must be explored deliberately, not discovered accidentally. This is standard in aviation, medicine, and nuclear energy. AI is rapidly becoming the next domain where this standard applies. Most organizations haven't realized that yet.
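As an illustration of the socio-technical measurement point above, the sketch below treats a collapse in issue escalations, while usage holds steady, as a trust warning rather than a success signal. The numbers and the 50% drop threshold are invented for illustration.

```python
# Illustrative numbers only: weekly usage volume and weekly issue escalations.
usage = [1200, 1350, 1400, 1500, 1550, 1600]
escalations = [30, 28, 27, 12, 5, 2]

def silence_warning(usage: list[int], escalations: list[int],
                    drop_ratio: float = 0.5) -> list[int]:
    """Flag weeks where usage holds steady or grows while escalations collapse.
    A sharp drop in reported issues with no drop in usage is treated as a
    trust-erosion signal, not as evidence that the system got better."""
    flagged = []
    for week in range(1, len(usage)):
        usage_stable = usage[week] >= usage[week - 1]
        reports_collapsed = escalations[week] < drop_ratio * escalations[week - 1]
        if usage_stable and reports_collapsed:
            flagged.append(week)
    return flagged

for week in silence_warning(usage, escalations):
    print(f"Week {week}: usage steady, escalations collapsed; investigate trust, not just output quality.")
```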
Why Strategic Red Teaming
This is precisely the work that Strategic Red Teaming does. Not in theory — in practice.
What you just read is, in essence, what we do: confront every assumption, every initiative, every promise with reality before reality does it for you. Systematically. From every angle that matters: the CFO who demands proof, the COO who feels the friction, the CHRO who sees trust quietly eroding, the regulator who will ask the questions you'd rather have answers to today.
Apparens delivers that confrontation at a depth that is not standard in the market. Tens of thousands of scenarios. Hundreds of hypotheses. A diagnostic that doesn't stop at the surface layer, but drills down to where the real vulnerabilities live, the ones no dashboard reveals and no internal team dares to name out loud.
The result is not a report that disappears into a drawer. It's a decision document. Invest, stop, or adjust with your eyes wide open.
Your AI is scaling.
The question is whether your understanding is scaling with it.
Sources
All Tech Is Human (2025). Responsible AI Impact Report 2025.
Aubakirova, M., Atallah, A., Clark, C., Summerville, J. & Midha, A. (2025). State of AI: An empirical 100 trillion token study with OpenRouter.
OpenAI (2025). The State of Enterprise AI.
Qlik (2024). The Six Principles of AI-Ready Data.

