From Pilot to P&L: Building Owned AI Across a Portfolio

For Private Equity
Most AI pilots never reach the P&L, and the hold period is not getting any longer. Here is how to build owned AI across a portfolio that pays back inside the hold, compounds from one company to the next, and shows up in the exit.
The short version
  • The hold period is the constraint. Holding periods have reached a record average of 6.6 years, and value-creation work, not multiple expansion, now drives most of the return, so AI has to pay back inside the hold.
  • Most AI never reaches the P&L. Gartner expected at least 30% of generative AI projects to be abandoned after proof of concept, and only about half of AI projects ever reach production at all.
  • Run it at two altitudes. Prioritize opportunities at the fund level, build production systems at the company level, and use one repeatable playbook so what is learned in the first portfolio company compounds across the rest.
  • Build owned, measure to the business case, document for exit. The same discipline that converts a pilot into P&L also turns the work into a defensible asset a buyer will pay for.
The part in between
The pillar in this series made the argument that AI now moves the exit multiple, and the buy-side piece showed how a disciplined acquirer prices that into a deal. This article is about the part in between, the hold period, where the thesis either becomes real or quietly does not. It is the hardest part, because building AI that reaches the profit-and-loss statement is a different exercise from buying a company with promising AI or writing a value-creation plan that mentions it. Most AI initiatives never make the trip from a convincing demo to a measurable line in the financials, and a portfolio full of stalled pilots is hold-period time a fund does not get back.
The goal of this piece is practical: how to build owned AI across a portfolio so that it pays back inside the hold, compounds from one company to the next rather than being rebuilt from scratch each time, and arrives at the exit as a documented asset rather than a story. That requires treating AI value creation the way good funds treat every other lever, with prioritization, execution discipline, and measurement against a business case, and it requires a model that solves the constraint most likely to stall it, which is talent rather than capital.
It helps to be clear about what this article is not. It is not a survey of AI use cases, because the right use cases are specific to each company and change quickly. It is an operating argument about how a fund turns AI from a portfolio of hopeful experiments into a managed capability that reliably reaches the financials and the exit. The pieces are prioritization, a two-altitude operating model, a repeatable playbook, and a discipline of building owned, measuring against a business case, and documenting as you go. Each one exists to close the gap between an AI demo and a number a buyer will pay for.
The hold period is the clock
Start with the constraint, because it shapes everything else. According to McKinsey's Global Private Markets Report 2026, the average global holding period for portfolio companies has reached a historic high of 6.6 years, and the backlog of companies held longer than four years now sits around 16,000 globally, the highest on record. That backlog is not a comfortable cushion. It is pressure, because limited partners are waiting on distributions and sponsors need to return capital, and exits are expected to broaden through 2026. A long hold under pressure is the worst environment in which to start an AI program late.
The same research reframes what the hold period is actually for. McKinsey attributes roughly 54% of the revenue growth from a private equity deal to value-creation initiatives and a smaller share to multiple expansion, which means the operating work done during the hold, not the entry or the market, drives most of the return. AI now sits inside that value-creation work as one of its sharpest levers. The implication is uncomfortable but clarifying: a sponsor cannot treat AI as something to explore and revisit, because the hold is finite and the return depends on what gets built and proven before the clock runs out.
Speed therefore matters as much as ambition. FTI Consulting found that the share of private equity leaders reporting AI benefits within twelve months has roughly doubled year over year, to around two-thirds, which sets a realistic bar: a well-chosen AI initiative should pay back inside a year, not at some indefinite point in the future. A capability that returns inside twelve months can be deployed, measured, and documented with time to spare before a sale process begins. One that promises value in year four of a hold that may end in year three is not a value-creation initiative. It is a liability dressed as one.
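The twelve-month bar can be written down as a simple screen. The sketch below is illustrative only: the function names, the assumption of a flat monthly benefit after a ramp period, and the example figures are all hypothetical and do not come from the FTI or McKinsey research.

```python
import math
from typing import Optional


def payback_month(upfront_cost: float, monthly_benefit: float,
                  ramp_months: int = 0) -> Optional[int]:
    """Return the month in which cumulative benefit first covers the cost.

    Assumes no benefit during `ramp_months` (build and rollout) and a flat
    monthly benefit afterwards. Returns None if it never pays back.
    """
    if monthly_benefit <= 0:
        return None
    return ramp_months + math.ceil(upfront_cost / monthly_benefit)


def fits_the_hold(payback: Optional[int], months_to_exit: int,
                  payback_bar: int = 12) -> bool:
    """True when payback clears the bar and lands before the expected exit."""
    return (payback is not None
            and payback <= payback_bar
            and payback < months_to_exit)


# Hypothetical figures, for illustration only.
quick_win = payback_month(600_000, 75_000, ramp_months=3)
late_bet = payback_month(2_400_000, 100_000, ramp_months=24)

print(quick_win, fits_the_hold(quick_win, months_to_exit=36))
print(late_bet, fits_the_hold(late_bet, months_to_exit=36))
```

A real screen would use a ramped benefit curve and ongoing run costs, but even this crude version makes the "year four of a three-year hold" problem visible before any capital is committed.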
The funds that handle this well do not treat AI as separate from the rest of the value-creation plan. McKinsey notes that leading sponsors build detailed value-creation plans that tie the investment thesis to specific initiatives and tangible results, and AI belongs inside that plan rather than beside it as a standalone experiment. Framed that way, an AI initiative competes for capital and management attention against every other lever on the same terms: expected impact, time to value, and contribution to the exit. That is a healthier test than the one most AI projects actually face, which is simply whether they are interesting.
Why most AI never reaches the P&L
The reason AI programs burn hold-period time is that most of them never reach production, and the failure is remarkably consistent. Gartner expected at least 30% of generative AI projects to be abandoned after the proof-of-concept stage, citing poor data quality, inadequate risk controls, escalating costs, and unclear business value. Its broader analysis found that only about 48% of AI projects ever reach production at all, taking an average of eight months to get there. MIT's research went further still, reporting that 95% of generative AI investments had produced no measurable return. The pattern has a name in the field, pilot purgatory, and it describes the gridlock of companies that can build a working demo but cannot convert it into a reliable, enterprise-grade asset.
What is striking about the failure modes is how few of them are about the model. The blockers Gartner names are data that was never made ready for production, governance and risk controls that were treated as afterthoughts, costs that looked negligible in a pilot and became a budget problem at scale, and use cases chosen without a clear line to business value. In other words, the bottleneck is almost always the operating environment around the AI, not the intelligence inside it. A pilot runs on a clean, static slice of data and a single happy path. Production faces messy, changing data, real users, real regulation, and real cost curves, and an initiative that was never engineered for that environment stalls the moment it meets it.
Data is the most common single point of failure. Gartner has found that the large majority of AI projects that fail do so because of poor data quality, and that without an AI-ready data foundation a growing share will be abandoned before they ever scale. The newest wave carries the same risk in a sharper form. Gartner expects more than 40% of agentic AI projects to be canceled by the end of 2027, again citing escalating costs, unclear business value, and inadequate risk controls, alongside a wave of what it calls agent washing, in which ordinary automation is rebranded as something more autonomous than it is. For a portfolio company, the lesson is blunt: the unglamorous work of getting data, governance, and cost visibility right is not a prerequisite to the AI program. It is most of the AI program.
For a private equity owner, pilot purgatory is more expensive than it looks. It consumes capital and the scarce attention of a management team, it diverts both from initiatives that could have worked, and it quietly erodes the credibility of AI inside the company, so that the next genuinely good idea is met with fatigue rather than support. The cost is not only the wasted project. It is the hold-period quarters that went with it. Avoiding that outcome is less about picking the cleverest model and more about refusing to start work that was never set up to finish.
Run it at two altitudes
AI value creation works best when it is run at the fund level and inside the portfolio company at the same time, with one methodology connecting the two. The two altitudes do different jobs, and skipping either one is where programs go wrong.
At the fund level, the work is allocation and pattern. Someone has to look across the whole portfolio and rank the AI opportunities company by company, so that value-creation capital flows to the highest-return uses rather than to whichever management team is most enthusiastic. The fund level is also where a repeatable playbook lives: the same methodology, the same standards, and the same senior engineers deploying across companies, so that the data architecture, the governance pattern, and the hard-won lessons from the first portfolio company become a head start for the next. This is the multiplier that turns AI from a per-company science project into a fund-level capability, and it is also the answer to the binding constraint. FTI has identified talent, not capital, as the primary limit on scaling AI, named by roughly a third of leaders. A repeatable model deployed by a shared, certified team solves that constraint far better than asking every portfolio company to hire its own scarce AI talent and learn the same lessons independently.
At the portfolio company level, the work is to build. That means production systems built on the company's own data and owned by the company, senior engineers embedded with the team rather than handing over a slide deck, the high-cost manual workflows re-engineered and run by AI with governance built in, and the underlying architecture that lets a company's data, models, and agents work together instead of in isolated pockets. The point of embedding engineers rather than advising from a distance is that production AI is built, not recommended, and the team that builds alongside the company leaves it more capable than it found it, which matters because the asset has to keep running long after the engagement ends.
A third piece sits across both altitudes and is the one most often left out: enablement. AI returns are won or lost in adoption, and a system the team does not trust or use produces nothing no matter how well it was built. The fund level can get operating partners fluent enough to sponsor the right work and challenge the wrong work, and the company level can bring frontline teams along so the new workflow becomes the way the work is done rather than a tool sitting unused beside the old process. Diligence and exit readiness complete the fund-level role, so the same team that prices AI on the way in and documents it on the way out is the one building it in between.
Prioritize like capital, not curiosity
The single most important fund-level decision is what to build first, and it should be made the way capital allocation is made, not the way pilots usually get chosen. Most stalled AI programs began as a list of interesting ideas. A portfolio program begins instead by scoring every candidate opportunity on the few axes that actually predict whether it will reach the P&L and matter at exit. The matrix below is the core of a Portfolio AI Assessment, and it turns a wish list into a ranked, fundable plan.
| Axis | The question it answers | Scores high when | Scores low when |
| --- | --- | --- | --- |
| Impact | How much enterprise value is actually at stake, through margin lift, revenue, or a category shift? | It targets a core, high-cost or high-volume function where the money concentrates. | It is a peripheral efficiency that never really reaches the financials. |
| Feasibility | Can it ship inside the hold, given the data, the complexity, and a realistic time to value? | The data exists and is usable, the path to production is short, and payback lands within about a year. | It depends on data that does not exist yet or a multi-year rebuild to even begin. |
| Durability | Does the result become an owned asset at exit, or rented efficiency a competitor can buy tomorrow? | It produces proprietary, embedded, documented AI the company owns and can evidence. | It is a thin layer over a commodity tool with nothing defensible underneath. |
The opportunities worth funding first are the ones that score well on all three axes, and the discipline is to resist the ones that score high on only one. A flashy idea with no usable data is not feasible, however large its impact might be. A quick win that produces rented efficiency lifts this quarter's margin but adds nothing a buyer will pay a premium for. The strongest programs deliberately sequence both kinds of work: a few high-feasibility wins early to build credibility and free up cash, alongside the durable, owned assets that take longer but move the multiple. As a starting point for where impact concentrates, BCG has found that roughly 70% of AI value sits in core functions such as sales and marketing, operations, and the supply chain, which is usually where a portfolio company's own money is made and lost. Scored this way, company by company, the matrix does at the fund level exactly what an investment committee does with capital: it sends the scarce resource where the return is highest and the risk is understood.
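One way to make the "all three axes" rule concrete is to rank on the weakest axis rather than the total. The sketch below is a hypothetical illustration: the 1-to-5 scale, the floor of 3, the class and function names, and the example opportunities are assumptions for this example, not part of any published scoring method. It covers only the first-pass ranking, not the deliberate sequencing of quick wins alongside durable builds.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Opportunity:
    company: str
    name: str
    impact: int       # 1 = peripheral efficiency, 5 = core high-cost function
    feasibility: int  # 1 = data does not exist, 5 = data ready, short path
    durability: int   # 1 = rented efficiency, 5 = owned, documented asset

    @property
    def weakest_axis(self) -> int:
        return min(self.impact, self.feasibility, self.durability)

    @property
    def total(self) -> int:
        return self.impact + self.feasibility + self.durability


def rank(opportunities: List[Opportunity], floor: int = 3) -> List[Opportunity]:
    """Drop anything below `floor` on any axis, then rank what is left.

    Sorting on the weakest axis first penalizes ideas that are strong on
    one axis only; the total breaks ties.
    """
    fundable = [o for o in opportunities if o.weakest_axis >= floor]
    return sorted(fundable, key=lambda o: (o.weakest_axis, o.total), reverse=True)


# Hypothetical candidates and scores, for illustration only.
candidates = [
    Opportunity("Distributor A", "Generic chat assistant", 2, 5, 1),
    Opportunity("Services C", "Invoice processing automation", 3, 4, 3),
    Opportunity("Distributor A", "Demand forecasting", 5, 4, 4),
    Opportunity("Manufacturer B", "Dynamic pricing, no usable data yet", 5, 1, 4),
]

for o in rank(candidates):
    print(o.company, "|", o.name, "|", o.weakest_axis, o.total)
```

Ranking on the minimum is a deliberately strict choice. A weighted sum would let a very high impact score hide a feasibility score of 1, which is exactly the failure the matrix is meant to prevent.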
A concrete case makes the scoring tangible. Demand forecasting in a distribution business tends to score well on all three axes at once: the impact is large because inventory and service levels move real margin, the feasibility is strong because the historical data usually already exists, and the durability is real because a model trained on the company's own operating history is not something a competitor can simply buy. EisnerAmper has reported AI-driven logistics improvements on the order of 15% cost reductions and 35% inventory gains, the kind of result that lands directly in EBITDA. Contrast that with a generic assistant bolted onto a public model: it may demo well, but it scores low on durability because it rents a capability anyone can rent, and it rarely survives the move from a pilot to a number someone is willing to underwrite.
The same logic extends across the portfolio, not just within one company. A pattern that scores well in the first distribution business is a strong candidate for the next one, which is how a fund avoids the trap of trying to do everything everywhere at once. Better to prove a high-scoring use case in one company, capture the playbook, and roll it across the companies where it fits, than to scatter attention across a dozen unrelated experiments that each have to clear pilot purgatory on their own. Prioritization at the fund level is as much about sequencing across companies as it is about ranking within them.
Build owned, measure to the business case, document for exit
Once the priorities are set, three habits separate the AI that reaches the P&L from the AI that does not. The first is to build owned. Production systems should run on the company's own data and be owned outright, because that is the version of AI that earns a premium at exit rather than a shrug. A thin wrapper over a commodity service, with nothing proprietary beneath it, can lift a metric for a quarter, but it is not an asset, and the pillar in this series laid out why owned, embedded, documented AI is the only version a buyer values as intellectual property. Building for ownership from the start is far cheaper than retrofitting it under deadline pressure later.
In practice this sequences into a natural hold-period rhythm. The first hundred days are for the high-feasibility wins that prove AI can reach the financials and earn the management team's trust, run on real data and measured from day one. The quarters that follow are for the durable, owned assets that take longer to build but are what a buyer ultimately pays for. Governance is built in from the first project rather than bolted on before a sale, because a system designed with oversight, monitoring, and audit trails from the outset is both safer to run and far easier to evidence later. The aim throughout is a company more capable at the end of the engagement than at the start, because the asset has to keep running and improving long after the builders have moved on.
The second habit is to measure against the business case. FTI found that 95% of funds report their AI initiatives are meeting or exceeding the original business case, but that a much smaller share are significantly exceeding it, which tells you the spread between disciplined execution and box-checking is wide and that measurement is what sorts the two. The practical version is unglamorous: define the expected return before building, instrument the system to track it, kill what underdelivers quickly so it stops consuming the hold, and concentrate resources on what works. Measurement is also what lets a sponsor tell the value-creation story in numbers at exit, which is worth more to a buyer than any narrative.
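The define, instrument, kill, concentrate loop can be sketched in a few lines. Everything here is an assumption made for illustration: the class name, the three-month window, and the 50% and 100% thresholds are hypothetical, and a real program would set them per initiative when the business case is approved.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BusinessCase:
    name: str
    expected_monthly_benefit: float      # fixed before the build starts
    actual_monthly_benefit: List[float]  # instrumented actuals, oldest first

    def attainment(self, window: int = 3) -> float:
        """Average of the last `window` months of actuals as a share of plan."""
        recent = self.actual_monthly_benefit[-window:]
        if not recent or self.expected_monthly_benefit <= 0:
            return 0.0
        return (sum(recent) / len(recent)) / self.expected_monthly_benefit


def decide(case: BusinessCase, kill_below: float = 0.5,
           scale_above: float = 1.0, min_months: int = 3) -> str:
    """Return a hypothetical portfolio decision for one initiative."""
    if len(case.actual_monthly_benefit) < min_months:
        return "keep measuring"
    share = case.attainment(window=min_months)
    if share < kill_below:
        return "kill"
    if share >= scale_above:
        return "scale"
    return "fix or hold"


# Hypothetical initiatives and figures, for illustration only.
cases = [
    BusinessCase("Demand forecasting", 50_000, [40_000, 55_000, 60_000]),
    BusinessCase("Generic assistant", 50_000, [10_000, 15_000, 20_000]),
    BusinessCase("Invoice automation", 50_000, [30_000, 35_000, 40_000]),
    BusinessCase("Pricing engine", 50_000, [30_000, 40_000]),
]

for case in cases:
    print(case.name, "->", decide(case))
```

The useful property is that the expected benefit is recorded before the build and never edited afterwards, so the same numbers that drive the kill decision during the hold can be shown to a buyer at exit.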
The third habit is to document for exit while building, not after. The buy-side piece in this series showed that a buyer's diligence now tests ownership, data provenance, governance, and architecture in detail, and that the absence of documentation gets priced as risk. The answer is to capture model ownership, data lineage, governance, and development history as the work happens, so the AI assets are exit-ready by default. Ocean Tomo advises companies to begin comprehensive IP work twelve to eighteen months before a sale, but the cheapest version of that work is the documentation produced in real time during the build. Done this way, the handoff runs cleanly in both directions: the diligence file that justified the deal becomes the value-creation plan, and the value-creation work becomes the evidence that defends the multiple at exit, which is the subject of the exit-readiness piece later in this series.
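Documenting while building is easier when the four items a buyer tests are fields in a record rather than a memo written at the end. The sketch below is a minimal, hypothetical structure: the class name, field names, and example entries are assumptions for illustration, not a diligence standard.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class AIAssetRecord:
    system_name: str
    model_owner: str = ""  # legal entity that owns the model and code
    data_lineage: List[str] = field(default_factory=list)         # sources and usage rights
    governance_controls: List[str] = field(default_factory=list)  # monitoring, audit trail
    development_history: List[str] = field(default_factory=list)  # dated build log entries

    def gaps(self) -> List[str]:
        """Names of the fields that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def exit_ready(self) -> bool:
        """True only when every field has been filled in."""
        return not self.gaps()


# Hypothetical entries, for illustration only.
record = AIAssetRecord("demand-forecast")
record.model_owner = "Example PortCo Ltd"
record.data_lineage.append("ERP order history, company-owned")

print(record.gaps())
print(record.exit_ready())

record.governance_controls.append("Monthly drift review with sign-off")
record.development_history.append("v1 released to production")

print(record.exit_ready())
```

Checking for gaps at every release, rather than twelve months before a sale, is what makes the asset exit-ready by default.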
The compounding play
Put the pieces together and the advantage compounds in two directions at once. Within each company, quick wins fund the credibility and the cash that make the durable, owned assets possible, and those assets are what move the multiple at exit. Across the portfolio, a repeatable playbook means the second deployment is faster and cheaper than the first, and the tenth is faster still, because the methodology, the architecture patterns, and the lessons travel from one company to the next instead of being rediscovered each time. That is the difference between a fund that runs ten separate AI experiments and a fund that runs one AI capability across ten companies. The first burns hold-period time. The second turns it into owned, documented, defensible assets, and arrives at every exit with the value-creation story already written in numbers.
There is a quieter benefit that accrues to the fund itself. A sponsor that has run this playbook across a dozen companies has built something its competitors have not: a tested method, a bench of engineers who have shipped production AI in real operating environments, and a library of patterns that make the next deal faster to underwrite and the next portfolio company faster to improve. That capability is the kind of fund-level coordination that limited partners and buyers increasingly expect to see, and it compounds in exactly the way a good platform does. The firms that start building it now, while the hold periods are long and the exit backlog is large, will be the ones holding evidenced, defensible AI assets when the market clears and the deals that were waiting finally come to market.
Get Started
Find the AI value hiding in your portfolio.

Start with a Portfolio AI Assessment. We rank the highest-ROI AI opportunities across your portfolio by impact, feasibility, and effect on enterprise value, then build the ones that pay back inside the hold and own them on your behalf. You leave with a prioritized roadmap you can act on. Every engagement runs on Generative-Driven Development and is delivered by certified Forward Deployed Engineers.

Sources
  • McKinsey, Global Private Markets Report 2026 (record 6.6-year average holding period; value creation as the primary driver of deal returns).
  • Gartner, press research on generative AI project abandonment after proof of concept, and share of AI projects reaching production.
  • MIT Project NANDA, The GenAI Divide: State of AI in Business 2025.
  • FTI Consulting, 2026 Private Equity AI Radar (AI benefits within twelve months; talent as the primary constraint; share meeting or exceeding the business case).
  • BCG, research on where AI value concentrates across core business functions.
  • Ocean Tomo (J.S. Held), Increasing Exit Multiples: IP and AI Asset Management in M&A Transactions.