META · 2022 — 2026

Horizon Planner

Designing the canonical AI capacity surface at Meta

$95M+

IN COMPUTE COST SAVINGS

+60%

ADOPTION (2K→3.2K MAU)

3

NEW ENG HIRES UNLOCKED

+242%

CTR · LIVE QUOTA USAGE

When I joined Meta's AI infrastructure design team in 2022, I was looking for the next version of a problem I'd worked on at Shure: how to design rigorous tools for experts who know more about their domain than I do. At Meta, those experts would turn out to be the ML engineers, capacity planners, and infrastructure leaders making decisions about hundreds of millions of dollars of GPU spend per year.

This is the story of Horizon Planner — a long-range capacity planning tool I designed end-to-end over two years. It grew from a concept I sketched against a real internal need into what Meta's infrastructure leadership eventually recognized as the canonical surface for AI capacity planning decisions.

The problem

Meta's AI infrastructure team makes capacity planning decisions that span years and billions of dollars. By 2022, those decisions were being made across half a dozen disconnected tools, with planners assembling the picture by hand. The fragmentation produced predictable failures: decisions made on stale data, long-range plans drifting from operational reality, engineering teams unable to trust numbers they couldn't trace.

The work sits inside one of the most consequential planning disciplines in the technology industry. Hyperscalers — Meta, Microsoft, Google, AWS — collectively committed over $600B in 2026 capital expenditures, the majority for AI infrastructure. Long-range capacity planning translates uncertain multi-year demand projections into binding commitments that determine whether a company has enough compute when it matters.

A third audience, infrastructure leadership, sits above both ML engineers and capacity planners, opening the tool to ask: is this plan credible, is the budget on track, and what requires my decision today? The hardest design problem was building a single surface that served all three without bifurcating the product into a 'user mode' and an 'expert mode.'

My role and the team

I was the sole product designer on this work from concept through launch, partnering with a rotating group of 4–6 engineers depending on phase. Over the project's life I onboarded 2 additional designers and 5 engineers into the AI infrastructure design space.

Before any visual design, I spent the first two months in user research: contextual inquiries with ML engineers on three teams, interviews with seven capacity planners, and shadowing two infrastructure leadership reviews. The research output became the brief: a single tool, structured around three questions every capacity decision answers — what's the situation, where are the gaps, what should we do.

FEATURED DESIGN DECISION 01

The AI-generated executive summary

The hardest moment in any capacity planning review is the first thirty seconds. An infrastructure leader opens the tool, and they need to know: is this plan credible, is the budget on track, and what requires my decision today? Everything else can wait.

The earliest design I shipped had a generic 'Overview' tab — a few charts, some key metrics, no narrative. It tested poorly in feedback sessions with org leads. Leaders would open the tab, scan for ninety seconds, and ask their analyst the same question I was trying to design away: 'so… what should I be paying attention to?'

The redesign replaced the overview with an AI-generated executive summary. The system reads the current scenario state, identifies the most-decision-relevant metrics, and produces a structured summary with three components: a one-paragraph narrative, five KPI cells flagging where the scenario is healthy or stressed, and a side-by-side block listing the top recommended actions paired with the risks of delaying them.

Executive summary card with one-paragraph narrative, KPI cells, and paired Recommended Actions / Risks if Delayed block.

The narrative bolds specific values, not whole phrases.

Usability testing revealed that bolded phrases competed with the KPI cells above them, and readers' eyes bounced between the two. The shipping version bolds only the specific quantitative values inside the narrative (for example, -172 MW, $0.2B over, and 84%) and leaves connecting phrases unbolded.
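The bolding rule can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `bold_values` helper, the `**` markers, and the regular expression are hypothetical, not the production implementation. The pattern only knows a few unit suffixes and would also bold bare numbers such as years.

```python
import re

# Hypothetical pattern: optional sign, optional "$", a number, then an
# optional unit suffix (%, a B/M/K magnitude letter, or MW/GW).
VALUE = re.compile(r"[-+]?\$?\d+(?:\.\d+)?(?:%|[BMK]\b| ?(?:MW|GW)\b)?")


def bold_values(narrative: str) -> str:
    """Wrap each quantitative value in ** markers; leave other words alone."""
    return VALUE.sub(lambda m: f"**{m.group(0)}**", narrative)


if __name__ == "__main__":
    # Made-up sentence reusing the example values from the text above.
    print(bold_values("The plan is -172 MW short and $0.2B over budget at 84% utilization."))
```

Connecting words such as "short" and "over budget" stay unbolded, so the narrative does not compete with the KPI cells for attention.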

Recommended actions and risks live side-by-side.

Capacity decisions are always trade-offs. Surfacing both the action and the risk-if-delayed at the same visual level makes the trade-off legible. A leader can scan the recommended action, then immediately scan the risk if delayed, without having to scroll between sections.

In the six months after launch, leadership reviews consistently opened with this screen. Engagement analytics confirmed the pattern: the executive summary became the entry point for the majority of leadership-tier sessions.

FEATURED DESIGN DECISION 02

Natural-language scenario simulation

Capacity planning is a what-if discipline. Plans aren't static — they get stress-tested against scenarios: what if Singapore comes online two quarters late? What if AI demand grows 20% faster than we forecast?

I designed a natural-language interface that lets users describe what they want to test in plain English. 'Delay Singapore DC by 2 quarters.' 'Increase AI demand 20% from Q1 2027.' 'Move 15% of training capacity from US East to Europe.'

Filled state of the simulate-changes modal, with parsed adjustments shown before commit.

The interesting design problem wasn't the natural language itself — large language models handle that well. The harder problem was building user trust in the interpretation. Capacity decisions touch budgets at the scale where misinterpretation has real consequences.

The solution was a two-step interaction. After the user types their hypothetical, the system parses the input and displays a 'Detected Adjustments' preview — a structured list of what it interpreted. The user reads the parsed interpretation, confirms it matches their intent, and only then clicks Apply. If the AI misinterprets, the user sees the misinterpretation before any state changes.
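The two-step interaction can be sketched in Python as follows. This is a sketch under stated assumptions, not the shipped system: `Adjustment`, `parse_request`, `preview`, and `apply_adjustments` are hypothetical names, and a pair of regular expressions stands in for the language model so the example stays self-contained. The point it shows is structural: parsing and previewing never touch scenario state, and applying requires explicit confirmation.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Adjustment:
    kind: str      # hypothetical kinds: "delay_site" or "scale_demand"
    target: str    # what the adjustment applies to
    amount: float  # quarters or percent, depending on kind


def parse_request(text: str) -> list[Adjustment]:
    """Toy stand-in for the language model: handles two phrasings only."""
    found = []
    m = re.search(r"delay (.+?) by (\d+) quarters?", text, re.IGNORECASE)
    if m:
        found.append(Adjustment("delay_site", m.group(1), float(m.group(2))))
    m = re.search(r"increase (.+?) (\d+(?:\.\d+)?)%", text, re.IGNORECASE)
    if m:
        found.append(Adjustment("scale_demand", m.group(1), float(m.group(2))))
    return found


def preview(adjustments: list[Adjustment]) -> list[str]:
    """Step 1: render the 'Detected Adjustments' list. Changes no state."""
    return [f"{a.kind}: {a.target} ({a.amount:g})" for a in adjustments]


def apply_adjustments(scenario: dict, adjustments: list[Adjustment],
                      confirmed: bool) -> dict:
    """Step 2: return an updated copy only after the user confirms."""
    if not confirmed:
        return scenario  # nothing changes until the user clicks Apply
    return {**scenario, "adjustments": [*scenario["adjustments"], *adjustments]}


if __name__ == "__main__":
    baseline = {"name": "baseline", "adjustments": []}
    detected = parse_request("Delay Singapore DC by 2 quarters")
    print(preview(detected))  # the user reads this before committing
    updated = apply_adjustments(baseline, detected, confirmed=True)
    print(len(updated["adjustments"]), len(baseline["adjustments"]))
```

Returning a copy rather than mutating the scenario keeps the baseline intact, so a misread request costs the user one glance at the preview and nothing else.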

In post-launch UXR, planners reported that scenario simulation moved from 'something I do once a quarter' to 'something I do every week to stress-test our assumptions.'

FEATURED DESIGN DECISION 03

Information-architecture revamp

The first version of the platform shipped with an IA that mirrored Meta's internal data taxonomy — surfaces organized around the engineering systems that produced the data, not the questions users were trying to answer. Adoption grew steadily but plateaued at around 2K monthly users by mid-2024.

One quarter of user research later, I had a clear picture of the problem. The three most decision-relevant components (live quota usage, capacity spenders, and the tenant job queue) were buried two or three levels deep in the navigation and far below the page fold.

I introduced an Overview tab that housed redesigned components for live quota usage, top capacity spenders, and the tenant job queue. I A/B tested the new IA against the existing one over six weeks. The results:

+242% clicks on the Live Quota Usage card

+64% clicks on the Capacity Spenders card

+39% clicks on the Tenant Job Queue card

The IA revamp became the design decision I'm proudest of. The design itself wasn't particularly clever, but it required convincing leadership to run a six-week test against an existing pattern that was already adopted, and the test ultimately showed overwhelmingly positive signals for the updated IA.

Adoption jumped from 2K to 3.2K monthly users in the following half — a 60% increase. The IA revamp also unlocked new use cases that hadn't been visible in the old structure, which the team worked on in subsequent quarters.

Other design moves worth noting

The three decisions above were the ones I designed against measurable goals and validated with A/B data. Two other moves embedded in the product are worth surfacing because they shaped the product's character and generalize beyond it.

Provenance as a first-class affordance on every number.

A small information icon on every metric tile and every chart value, opening a popover that traces the number back to the model that produced it, the inputs the model consumed, and the source-of-truth document or owner for each input. The principle: any number a user has to defend in conversation needs to be traceable from the surface it appears on.
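One way to picture the data behind that popover is a small provenance record attached to every metric. The sketch below is an assumption for illustration only: the `Metric` and `InputSource` types, the `trace` function, and all sample values (model name, input names, owners) are hypothetical and do not describe Meta's internal schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InputSource:
    name: str             # an input the model consumed
    source_of_truth: str  # the document or owner behind that input


@dataclass(frozen=True)
class Metric:
    label: str
    value: str
    model: str                      # the model that produced the number
    inputs: tuple[InputSource, ...]


def trace(metric: Metric) -> list[str]:
    """Build the popover lines: number, then model, then each input's source."""
    lines = [f"{metric.label}: {metric.value}", f"Produced by: {metric.model}"]
    lines += [f"Input: {i.name} (source: {i.source_of_truth})" for i in metric.inputs]
    return lines


if __name__ == "__main__":
    # All values below are invented placeholders.
    gap = Metric(
        label="Power gap",
        value="-172 MW",
        model="example_demand_model",
        inputs=(
            InputSource("site schedule", "example construction tracker"),
            InputSource("demand forecast", "example planning owner"),
        ),
    )
    print("\n".join(trace(gap)))
```

Carrying the provenance on the metric itself, instead of looking it up elsewhere, is what lets the trace open from whatever surface the number appears on.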

Failed solver runs as a first-class navigation flow.

When a solver run fails, the parent scenario surface shows a red banner naming the failed step, and that banner is itself the navigation into the per-model debug inspector. Engineers don't hunt for the failure — the product surfaces it and offers the right next action.

What I learned

Design leadership in engineering-heavy orgs is partly the work of teaching the org to evaluate design.

Early in the project, design critiques from my engineering partners were unhelpful. The lesson I took from that: the rationale for a design decision has to be stated before the decision is challenged, not after. Over the project's life, the engineering team's critique vocabulary shifted from 'I don't think this is right' to 'this conflicts with the discoverability principle we agreed on.'

For products with experts as users, design for the question first and the data second.

The IA revamp wasn't about better visualizations. It was about reorganizing the surface around the three questions every user was actually trying to answer. The hardest design move was figuring out which three questions to organize around, and then having the discipline to keep everything else subordinate to those three.

AI features need verification scaffolding.

The natural-language simulation works because users can verify the AI's interpretation before committing. The AI-generated executive summary works because every claim it makes is backed by a KPI cell or chart that users can drill into. Trust in AI-assisted decision tools is built by exposing the AI's reasoning, not by hiding it behind a confident answer.

Accessibility in dense data UIs is design from the foundation, not a layer added late.

I designed the system from the start with a color-blind-safe palette, redundant non-color encoding on every status indicator, keyboard-navigable data grids with explicit focus states, and screen-reader-friendly alternative text. The constraint that mattered most: every chart had to communicate its primary story even rendered in grayscale.
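The grayscale constraint can be approximated with a quick palette check. The sketch below uses the WCAG 2.x relative-luminance and contrast-ratio formulas as a proxy for "still distinguishable in grayscale". The `too_close_in_grayscale` helper, the 3.0 threshold, and the sample colors are assumptions for illustration, not the actual palette or the audit process used on the project.

```python
from itertools import combinations


def _linear(channel: int) -> float:
    """Convert one 0-255 sRGB channel to linear light (WCAG 2.x formula)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def luminance(hex_color: str) -> float:
    """Relative luminance of a '#rrggbb' color."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(a: str, b: str) -> float:
    """WCAG contrast ratio between two colors (1.0 means identical luminance)."""
    hi, lo = sorted((luminance(a), luminance(b)), reverse=True)
    return (hi + 0.05) / (lo + 0.05)


def too_close_in_grayscale(palette: dict[str, str],
                           min_ratio: float = 3.0) -> list[tuple[str, str]]:
    """Return pairs of named colors whose luminance contrast is below min_ratio."""
    return [
        (n1, n2)
        for (n1, c1), (n2, c2) in combinations(palette.items(), 2)
        if contrast_ratio(c1, c2) < min_ratio
    ]


if __name__ == "__main__":
    # Hypothetical status colors: pure red and a mid green.
    print(too_close_in_grayscale({"error": "#ff0000", "ok": "#008000"}))
```

Any pair this check flags relies on hue alone, which is exactly where the redundant non-color encoding (an icon, a label, or a pattern) has to carry the status.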

ℹ️ Screens shown are portfolio reconstructions with fictional data. Product names, financial figures, and internal references have been replaced. Design decisions, interaction patterns, and outcomes reflect the actual work.

Senior product designer for experts in complex domains. ©2026 Maria Jimbo
