Playbook 01 · The Maturity Assessment

01 —

Meet Halcyon Financial.

Worked example · the pre-programme assessment

A 700-person fintech with more ambition than evidence.

Halcyon Financial is a regulated digital financial services platform: the kind of mid-cap fintech that has scaled past start-up and is now building the operating muscle it didn't need before. The board has just signed off on a twelve-month transformation programme. The phrase "be AI-Native within twelve months" appears in the deck.

No one has yet asked the harder question: where does Halcyon actually start from?

The CEO believes they're well-positioned. The CTO has a long list of caveats. The CX team — closest to the work — has neither been asked nor noticed. The CFO is funding the programme on the strength of the deck. None of these views are wrong. None of them are evidence.

This playbook is the work of replacing that ambiguity with a defensible baseline — across six dimensions, scored honestly by a cohort, calibrated openly, briefed back to the executive in a way that lands.

The worked example. This playbook follows Halcyon through six weeks of pre-programme assessment. The disagreement that breaks open in Week 4 on Governance & Risk, between Maya (the CX Officer), the COO, and the Head of Compliance, is the recurring example. The gap between scores is often more useful than the average.

6 Dimensions assessed
· strategy to people

6 Stages of maturity
· absent to embedded

10 In the assessor cohort
· not just the exec

1 Brief at the end
· not a dashboard

// And the people you'll meet · perspectives across the organisation

// Cast 01

The Transformation Lead

You, the reader

Brought in to run the programme. The role this playbook is written for. Holds the pen on the assessment — but not the answers.

// Cast 02

Diana Whitfield

CEO · the sponsor

Signed off on the programme. Believes the organisation is ready. The person whose mental model the brief has to actually move.

// Cast 03

Sam Patel

CTO · the sceptic

Has the longest list of caveats. Scores Data & Architecture lower than anyone. Often right — and easy to dismiss as a brake.

// Cast 04

Helen Bautista

COO · the operator

Runs the day-to-day. Scores Governance & Risk at 4. Reads the policy; doesn't yet see what's happening at the desks.

// Cast 05

Priya Nair

Head of CX · the listener

Sits between the frontline and the executive. Brings Maya into the cohort because she knows the desk-level reality the policy can't see.

// Cast 06

Maya Chen

CX Officer · the frontline

Has been using ChatGPT to triage tickets for six months. No one has asked. No one knows. Scores Governance & Risk at 2.

// Cast 07

Anna Petrović

Head of Compliance · the line

Holds the regulator-tolerance position. Scores Governance & Risk at 3 — between policy and practice. The honest reading.

// Cast 08

Daniel Okafor

Head of People & Culture

Owns the People Impact dimension. The voice that asks "what does this mean for the twenty-three roles in the call centre?" when no one else does.

// Cast 09

Robert Lin

CFO · the funder

Approved the budget on the strength of the deck. The brief has to land with him too — and the diagnosis may change the budget shape.

// Cast 10

External perspective

Optional · advisor or peer

One outside voice — a non-exec director, a peer transformation lead, an industry advisor — who can score without inheriting the politics.

02 —

Four phases. Frame, gather, score, brief.

The framework

// The framework is deliberately small. Most assessments fail not because they're too shallow but because they collapse into surveys, dashboards, or political theatre. The discipline is to stay narrow and honest.

Phase 01

Frame

Week 1

Decide what you're assessing, against what version of the model, and with which cohort. The frame is the diagnosis-of-the-diagnosis.

Phase 02

Gather

Weeks 2–3

Three input streams. Structured interviews. Document review. A short self-assessment survey. The same six dimensions, three ways.

Phase 03

Score

Week 4

A calibration session. Honest stages, defended scores. Where the cohort disagrees is the signal — don't average it away.

Phase 04

Brief

Weeks 5–6

One short brief. Current state, the gap, the one or two dimensions where the gap matters most. The output is the input to Playbook 02.

03 —

The assessment matrix.

Six × six

// Six dimensions down the side, six stages across the top. The grid is the instrument the whole assessment runs on — interviews, documents, and the pulse all score against it. The discipline isn't precision. It's a defensible relative position, captured as a range where the cohort splits, never flattened to an average.

// Maturity matrix · stage range plotted, not the mean

Dimension ↓ / Stage → · 1 Absent · 2 Ad hoc · 3 Emerging · 4 Defined · 5 Managed · 6 Embedded

Strategy & Ambition
Leadership & Sponsorship
Governance & Risk
Data & Architecture
Capability & Fluency
People Impact

Legend: Scored range · cohort spread. Priority dimension · widest gap.

// A two-cell band is a one-point spread. A three-cell band is the finding.

// Read it as Halcyon's cohort scored it. Governance & Risk and People Impact carry the widest spreads — policy at one end, practice at the other. Those two bands are why the brief in Phase 4 points at Playbooks 02 and 06. The grid doesn't decide; it makes the disagreement impossible to hide.
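One row of the grid can be drawn as a band rather than a point. The sketch below is illustrative only: the function names `render_band` and `spread` and the cell glyphs are assumptions, not part of the methodology. The two ranges are the ones the Halcyon brief reports later in this playbook (Governance & Risk at Stage 2 to 4, People Impact at Stage 2 to 3). Ranges for the other four dimensions are not stated in the text, so they are left out rather than guessed.

```python
STAGES = ["Absent", "Ad hoc", "Emerging", "Defined", "Managed", "Embedded"]


def render_band(low: int, high: int) -> str:
    """Draw one grid row: filled cells from `low` to `high`, empty elsewhere."""
    if not 1 <= low <= high <= len(STAGES):
        raise ValueError("stages run from 1 to 6, and low must not exceed high")
    cells = ["■" if low <= stage <= high else "·" for stage in range(1, len(STAGES) + 1)]
    return " ".join(cells)


def spread(low: int, high: int) -> int:
    """Spread in points: a two-cell band is a one-point spread."""
    return high - low


# Ranges reported in the Halcyon brief; other dimensions are not given in the text.
halcyon_ranges = {
    "Governance & Risk": (2, 4),
    "People Impact": (2, 3),
}

for dimension, (low, high) in halcyon_ranges.items():
    print(f"{dimension:<18} {render_band(low, high)}  spread {spread(low, high)}")
```

Storing each row as a `(low, high)` pair keeps the range as the primary record, so there is no single number to average by accident.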

04 —

Phase one · Frame.

Week 1

The first week is the cheapest place to fix a bad assessment. Decide what you're assessing, why, against what version of the model, and with which cohort. The frame is the diagnosis-of-the-diagnosis. Get it wrong here and the rest of the playbook produces a confident answer to the wrong question.

01

Name the purpose of the assessment.

Is it diagnosis before a new programme? A health-check on a programme already running? A pulse to decide whether to invest? Each has a different cohort, a different time horizon, and a different threshold for honesty. Most assessments fail because they conflate the three. Write a one-sentence purpose at the top of the brief and refuse to drift from it.

02

Confirm the model fits.

The methodology's six-by-six grid is a starting point, not a contract. Read each dimension and stage with the executive sponsor. Where the language doesn't land for this organisation, adapt the language — not the structure. The dimensions earn their place; the words can change to match the vocabulary of the business. Resist re-engineering the grid itself.

03

Choose the cohort.

Eight to twelve people. Three rules. One — perspective over rank. A frontline officer who uses AI daily is more useful than a director who reads about it. Two — at least one sceptic. If everyone in the room agrees the organisation is ready, the room is wrong. Three — one external voice. A non-exec director, peer transformation lead, or industry advisor who can score without inheriting the politics.

04

Set the truth-seeking norms publicly.

Three norms, named in the kick-off. Confidence is part of the score. A Stage 3 with 80% confidence is different from a Stage 3 with 50%. Disagreement is data. Where the cohort splits, that's the diagnosis; don't average it away. The exec doesn't break ties. When the CEO and a CX officer disagree on Governance, the gap is the finding. Publishing these norms in week one makes them safe to enforce in week four.

// In practice — Halcyon Financial

A two-page framing memo, signed by three people.

The Transformation Lead lands at Halcyon on a Monday. By Friday of week one: a two-page framing memo, agreed and signed by Diana (CEO), Robert (CFO), and Daniel (Head of P&C). The third signature is deliberate — having P&C sign protects the People Impact dimension from being treated as optional. The memo names what the assessment is for, who's in the cohort, and the three norms above. It is also the document the brief in Phase 4 will be measured against.

Purpose: Diagnosis before the twelve-month programme. Not a health-check on existing AI work.
Cohort: 10 people · CEO, CTO, COO, CFO, Head of CX, CX Officer, Compliance, P&C, plus one external advisor
Timebox: Six weeks · brief delivered by end of week 6
Output: One brief, three pages · maturity scores with confidence, two priority dimensions, recommendation on Playbook 02
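Two of the three cohort rules from "Choose the cohort", plus the eight-to-twelve size, can be checked mechanically before the memo is signed. The sketch below is a hedged illustration: `CohortMember`, `check_cohort`, and the two boolean flags are assumed names, and the ten entries follow the cast list at the top of this playbook. The first rule, perspective over rank, is a judgement call and is deliberately not encoded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CohortMember:
    role: str
    is_sceptic: bool = False
    is_external: bool = False


def check_cohort(members: list[CohortMember]) -> list[str]:
    """Return a list of rule violations; an empty list means the cohort passes."""
    problems = []
    if not 8 <= len(members) <= 12:
        problems.append(f"cohort has {len(members)} members; expected 8 to 12")
    if not any(m.is_sceptic for m in members):
        problems.append("no sceptic in the cohort")
    if not any(m.is_external for m in members):
        problems.append("no external voice in the cohort")
    return problems


# The ten cast entries; which flags apply is read from the cast descriptions.
halcyon_cohort = [
    CohortMember("Transformation Lead"),
    CohortMember("CEO"),
    CohortMember("CTO", is_sceptic=True),
    CohortMember("COO"),
    CohortMember("Head of CX"),
    CohortMember("CX Officer"),
    CohortMember("Head of Compliance"),
    CohortMember("Head of People & Culture"),
    CohortMember("CFO"),
    CohortMember("External advisor", is_external=True),
]

print(check_cohort(halcyon_cohort) or "cohort passes the checkable rules")
```

A check like this only catches a missing sceptic or outside voice; whether the frontline perspective is really represented still has to be argued in the framing memo.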

// Insight from the field

Most assessments are framed too narrowly — "score us on AI readiness" — and produce surface-level findings. The frame that earns its weight is one stage broader: "score where this organisation actually is, across the six dimensions, with calibrated honesty about confidence." The broader frame invites disagreement; the narrow one buries it. Sources: this pattern shows up across the reputable models — Gartner's five-level, MIT CISR's four-stage, McKinsey's six scaling practices. The breadth matters more than the count.

// Phase 1 deliverables

Framing memo · two pages, three signatures

Cohort confirmed · 8–12 names, perspectives balanced

Model adapted · language fit to the business

Norms published · confidence, dissent, no tie-breakers

05 —

Phase two · Gather.

Weeks 2–3

Three inputs, the same six dimensions. Structured interviews with the cohort. Document review across strategy, comms, and hiring. A short self-assessment survey to a broader sample. The three triangulate. Where they agree, the picture is steady. Where they disagree, you've found the gap that matters.

01

Run the interviews.

Forty-five minutes per cohort member. Same script for all. For each of the six dimensions: where do you score the organisation today, on the one-to-six scale, and what evidence supports that score? Don't accept a score without evidence. The evidence is the data; the score is the summary. Record the score, the evidence, and one quote per dimension. Interview in pairs — one person leads, one takes notes, switch halfway.

02

Review the documents.

Six artefacts. Current strategy document. Most recent all-staff comms about AI. A sample of job ads from the last six months. The AI policy (or absence of one). Recent board pack on transformation. A sample of recent staff-survey results. The documents speak for themselves. Score each dimension against the evidence in them. The gap between document scores and interview scores is the gap between what the organisation says and what it does.

03

Send the pulse survey.

Eight questions. Six dimensions plus two open boxes. "For each of the six dimensions below, where do you think the organisation sits today, on the one-to-six scale? And what's one thing you'd want the assessment to surface?" Send to 50–100 staff across functions and levels — not the cohort. Anonymised. The survey is the broad pulse; the interviews are the depth. Treat the survey as direction, not data.

04

Hold a frontline shadow.

Half a day each, with three people doing the actual work the transformation will change. Not a meeting. Sit at the desk. Watch the workflow. Note where AI shows up, where it doesn't, what people work around, what tools are open in the background. Most maturity gaps are visible at the desk and invisible in the doc. Shadow notes won't appear in the final scores but they will shape every interpretation in Phase 3.

// In practice — Halcyon Financial

The Governance & Risk score has a 2-point spread.

By end of week three, the Transformation Lead has run ten interviews, reviewed all six artefacts, and shadowed three frontline workflows. The pulse survey has 68 responses.

The first signal arrives in the Governance & Risk row. The COO (Helen) and CFO (Robert) both score Halcyon at 4 — based on the policy document. Compliance (Anna) scores it at 3 — between policy and practice. The CX Officer (Maya) and the Head of CX (Priya) both score it at 2 — because they've watched staff paste customer data into consumer AI tools for six months and no one in the organisation has noticed.

The document review confirms the policy exists, was last updated in March, and references generative AI in a single paragraph. The frontline shadow confirms the consumer-AI workflows. The pulse survey shows 41% of respondents have used a consumer AI tool for work in the past month.

Cohort spread: 2 to 4 across the ten scorers · two-point spread
Document score: 4 · policy exists, lightly applied
Pulse signal: 41% using consumer AI for work · unmeasured
The diagnosis: Stage 2 in practice · Stage 4 on paper · the gap is the work
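The triangulation in the Governance & Risk row is simple arithmetic, sketched below under stated assumptions. The function name `paper_vs_practice` and the result keys are illustrative. Only the five scores the text names are used; the other five cohort scores are not given, so they are omitted.

```python
# Governance & Risk scores named in the text. The other five cohort scores are
# not stated, so they are omitted rather than guessed.
interview_scores = {
    "Helen (COO)": 4,
    "Robert (CFO)": 4,
    "Anna (Compliance)": 3,
    "Maya (CX Officer)": 2,
    "Priya (Head of CX)": 2,
}
document_score = 4  # policy exists, lightly applied


def paper_vs_practice(doc_score: int, scores: dict[str, int]) -> dict[str, object]:
    """Compare the document score with the range of interview scores."""
    low, high = min(scores.values()), max(scores.values())
    return {
        "range": (low, high),
        "spread": high - low,
        # Gap between what the policy says and the lowest practice score.
        "paper_practice_gap": doc_score - low,
        "below_document": sorted(n for n, s in scores.items() if s < doc_score),
    }


result = paper_vs_practice(document_score, interview_scores)
print(result)
```

Listing who scored below the document, by name, is what lets the brief report the gap between two perspectives instead of a blended number.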

// The signature insight · evidence, not assertion

The most common failure mode in Phase 2 is scoring without evidence. "I'd say we're a 3" — based on what? Senior people are skilled at producing confident scores that are mostly intuition. The rule for the assessor: every score must be defensible against at least one piece of evidence — a document, a workflow observation, a specific interview quote. Where the evidence runs out, the score gets a confidence label of "low" and goes onto the watchlist for Phase 3.
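The evidence rule can be written down as a small check, shown here as a sketch rather than a prescribed tool. The `Score` record, `apply_evidence_rule`, and `watchlist` are assumed names, and the two example entries are hypothetical illustrations, not Halcyon's recorded scores.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Score:
    dimension: str
    stage: int          # 1 to 6
    confidence: str     # "low", "medium", or "high"
    evidence: tuple[str, ...] = ()


def apply_evidence_rule(score: Score) -> Score:
    """Downgrade confidence to "low" when a score has no evidence behind it."""
    if not 1 <= score.stage <= 6:
        raise ValueError("stage must be between 1 and 6")
    if score.evidence:
        return score
    return replace(score, confidence="low")


def watchlist(scores: list[Score]) -> list[Score]:
    """Low-confidence scores are carried into Phase 3 for a closer look."""
    return [s for s in map(apply_evidence_rule, scores) if s.confidence == "low"]


# Hypothetical entries for illustration; the wording of the evidence is assumed.
scores = [
    Score("Governance & Risk", 4, "high", ("AI policy document, last updated in March",)),
    Score("Governance & Risk", 3, "medium"),  # "I'd say we're a 3", no evidence offered
]

for s in watchlist(scores):
    print(f"{s.dimension}: Stage {s.stage}, confidence {s.confidence}")
```

The rule leaves the stage alone and changes only the confidence label, so an intuition-only score stays visible in the record instead of being deleted.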

// Watch for · the executive halo

When the CEO scores the organisation high and a frontline officer scores it low, the room often defers to seniority. Don't. The frontline officer is closer to the workflow the score is supposed to describe. Senior scores reflect the deck; junior scores reflect the desk. The gap between them is the most useful single number in the whole assessment. Capture both. Average neither.

// Phase 2 deliverables

Ten interviews · scored across six dimensions

Document review · six artefacts, scored

Pulse survey · 50–100 responses

Frontline shadows · three workflows observed

06 —

Phase three · Score.

Week 4

A half-day calibration session with the cohort. Honest stages, defended scores, confidence labels per dimension. The temptation in this phase is to manufacture consensus. Don't. Where the cohort splits, that's the finding — capture it, don't average it. Two scores three points apart are more useful than a single average that hides the gap.

01

Run the calibration session.

Four hours, the full cohort, in person if possible. The grid on the wall. One dimension at a time. Each cohort member shows their score and the one piece of evidence behind it. The room debates. Scores can move. They are not required to converge. Use the same truth-seeking norms from Phase 1 — confidence stated, dissent rewarded, no exec tie-breaks.

02

Capture confidence alongside the score.

For each cell, the cohort records two things: the stage (1–6) and the confidence (low / medium / high). A 4-with-low-confidence is a different finding from a 4-with-high-confidence. The first goes onto the watchlist. The second goes into the brief. Confidence is the calibration on the calibration — it tells the executive how much weight to put on each number.

03

Map the disagreement.

For each dimension, plot the spread of scores from the cohort on the wall. Where the spread is one point or less: agreement. Where it's two or three: investigate. Where it's four or more: the dimension is doing two different jobs in the organisation, and that itself is the finding. Disagreement maps are the most useful artefact most assessments never produce.

04

Decline the urge to average.

When the cohort splits, the temptation is to call it a 3 and move on. Resist. The grid records the range, not the mean. The brief in Phase 4 explains the range. The executive can see for themselves where the organisation tells the same story and where it tells two. This is the discipline most assessments quietly drop. Holding it is the difference between a defensible diagnosis and a comfortable one.
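The spread thresholds from "Map the disagreement" and the range-not-mean rule can be sketched as one function. The name `classify_spread` and the wording of the three readings are assumptions. The example input is the five Governance & Risk scores the text names; the remaining cohort scores are not given.

```python
def classify_spread(scores: list[int]) -> tuple[str, str]:
    """Return the stage range as text and the reading the playbook gives it."""
    if not scores:
        raise ValueError("at least one score is needed")
    low, high = min(scores), max(scores)
    spread = high - low
    if spread <= 1:
        reading = "agreement"
    elif spread <= 3:
        reading = "investigate"
    else:
        reading = "dimension is doing two different jobs"
    label = f"Stage {low}" if low == high else f"Stage {low} to {high}"
    return label, reading


governance_scores = [4, 4, 3, 2, 2]  # Helen, Robert, Anna, Maya, Priya

label, reading = classify_spread(governance_scores)
mean = sum(governance_scores) / len(governance_scores)

print(f"{label} ({reading})")
print(f"mean, shown only for contrast: {mean}")
```

For these five scores the mean is exactly 3, the clean number the room is tempted to land on, while the range keeps both the Stage 4 reading and the Stage 2 reading in view.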

// In practice — Halcyon Financial

The Governance & Risk row gets three numbers.

The calibration session lasts four hours. Diana (CEO) chairs the room but doesn't break ties — the norm is held. By the end of the session, every row on the grid has a stage range, a confidence label, and a one-line interpretation.

The Governance & Risk row is the one the room spends longest on. Helen and Robert hold at 4. Anna moves from 3 to "3 with caveats." Maya holds firm at 2. The room briefly debates landing on 3 to "give the assessment a clean number." Anna, who has taken the framing memo's discipline to heart, calls it out: "If we land on 3, the brief tells the executive nothing they don't already think they know. The 2-to-4 spread is the story."

The room agrees. The Governance & Risk row is logged as Stage 2 – 4, low confidence, spread of 2 points. The interpretation, captured live: "policy at 4, practice at 2; the gap is unmeasured shadow AI and the absence of governance over consumer-tool usage. Highest-priority dimension for Playbook 02 to address."

Helen (COO): Stage 4 · the policy exists and is current
Anna (Compliance): Stage 3 with caveats · between policy and practice
Maya (CX): Stage 2 · staff use consumer AI tools daily, unmeasured
The brief entry: Stage 2 – 4 · low confidence · highest-priority gap

// The signature insight · judge the assessment, not the number

Borrowed from Annie Duke's Thinking in Bets. The job of the assessment isn't to produce the "right" score — it's to produce the most defensible score given the evidence available. A 3 with high confidence and ten pieces of evidence is a different finding from a 3 with low confidence and one. Capture both. Resist the temptation, in the year-end review, to judge the assessment by whether the score "turned out to be right." Judge it by whether the reasoning was sound at the time. This is the same discipline the prioritisation forum in Playbook 03 will run. Naming it here builds the muscle for everything downstream.

// Watch for · the comfort score

When a dimension lands at exactly Stage 3 with medium confidence, the cohort has often agreed to disagree without saying so. Pressure-test it. Ask the room: "if we had to defend this score to a sceptical board member tomorrow, what would we say?" If the defence is thin, the score is comfort, not calibration. Split it into a range. Or label the confidence honestly as low.
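The comfort-score pattern is narrow enough to scan for once the grid is filled in. This is a minimal sketch: `is_comfort_score` is an assumed name, and the second row is a hypothetical example, not a Halcyon result. A flag here is only a prompt to ask the sceptical-board-member question, not a verdict on the score.

```python
def is_comfort_score(low: int, high: int, confidence: str) -> bool:
    """True when a row sits at exactly Stage 3 with medium confidence."""
    return low == high == 3 and confidence == "medium"


# Each row: (dimension, lowest stage, highest stage, confidence label).
rows = [
    ("Governance & Risk", 2, 4, "low"),      # as logged in the Halcyon session
    ("Example dimension", 3, 3, "medium"),   # hypothetical row, for illustration
]

to_pressure_test = [name for name, low, high, conf in rows if is_comfort_score(low, high, conf)]
print(to_pressure_test)
```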

// Phase 3 deliverables

The scored grid · ranges, not averages

Confidence per dimension · low / medium / high

Disagreement map · where the spread is, and what it means

Watchlist · low-confidence cells flagged for Phase 4

07 —

Phase four · Brief.

Weeks 5–6

The assessment is only useful if the brief lands. One short document. Current state, the gap, the one or two dimensions where the gap matters most. The brief isn't a dashboard. It's a decision aid for the executive sponsor and the input to Playbook 02. If the brief can't be read in fifteen minutes and discussed in an hour, it's the wrong length.

01

Write the three-page brief.

Page one: the scored grid and a one-paragraph headline finding. Page two: the two priority dimensions, each with the evidence and the disagreement spread. Page three: the recommendation — which of Playbooks 02 through 06 the diagnosis points at, and in what order. Nothing else. No appendix. No executive summary that paraphrases what the next page says. The discipline is to make every word load-bearing.

02

Hold the brief-back session.

90 minutes with the executive sponsor and the cohort. Read the brief together — physically, in the room. Then debate, dimension by dimension, what the brief means for the next six months of investment. Resist the urge to present. The cohort has done the work; the brief speaks for it. The lead's job in this session is to keep the room honest about the findings, not to defend them.

03

Capture what the sponsor heard.

At the end of the session, ask the sponsor to summarise the diagnosis back to the room — in their own words. If the summary matches the brief, the assessment landed. If it doesn't, you've found the gap between the work and the reception, and that's the gap to close before Playbook 02 starts. This is the single most-skipped step in any assessment, and the one that determines whether the brief becomes action or wallpaper.

04

Recommend the next playbook.

The output of this playbook is the input to Playbook 02 (Strategic Planning · OKRs). The recommendation names the two dimensions the OKRs should move, the starting stage of each, and the target stage at twelve months. Concrete enough to write OKRs against. Honest enough that the sponsor can defend them. If the diagnosis points at a different playbook — say, Playbook 06 (Change Readiness) — because the people-impact gap dominates, the recommendation says so.

// In practice — Halcyon Financial

Three pages, three signatures, one clear recommendation.

The Transformation Lead writes the brief on the Tuesday of week six. Three pages. Diana, Robert, and Daniel — the three signatories on the framing memo — receive it forty-eight hours before the brief-back session, so they can read it in advance.

The brief-back session is held on the Friday. Ninety minutes. The cohort is in the room. The grid is on the wall. Diana opens by reading her own summary back: "We're better on strategy and capability than I thought. We're materially worse on governance and people-impact than I thought. Those two are where the OKRs need to focus." The room agrees. The CFO commits to a budget shape that follows the diagnosis. The CTO, who came in expecting to be vindicated on data-architecture risk, finds he's been vindicated — and that he's also being asked to support the governance work he hadn't expected to lead.

The recommendation: Playbook 02 (Strategic Planning) is the next step. The OKRs to come out of it will target Governance & Risk and People Impact as the two priority dimensions. Capability & Fluency is named as the supporting dimension; the other three are flagged as watchlist for the next assessment, twelve months out.

Headline finding: Governance gap (Stage 2 to 4) and people-impact gap (Stage 2 to 3) are the priority
Recommendation: Playbook 02 next · OKRs target Governance and People Impact
Re-assessment: Twelve months · same cohort · same dimensions
Sponsor read: Diana's summary matched the brief · the assessment landed

// The signature insight · the gap, not the score

The most-read line in the brief is usually the worst-written one: the headline finding. Most assessments default to "our overall maturity is Stage 3." That sentence does no work. The headline that earns its keep names a specific gap, between two specific perspectives, on a specific dimension, with a specific implication. Halcyon's headline isn't "we're at Stage 3 overall." It's "we score Stage 4 on policy and Stage 2 on practice in Governance & Risk; the gap is forty-one percent of pulse respondents using consumer AI tools for work, unmeasured." One sentence. Three pieces of evidence. A direction.

// Watch for · the wallpaper trap

Assessments that produce beautiful artefacts and no decisions are worse than assessments that don't happen. The test, three months after the brief: has the diagnosis changed anything? If the brief is in a SharePoint folder and the programme is being run on the original ambition, the assessment was wallpaper. Schedule the three-month check-in at the brief-back session. Put it in the calendar before the room leaves.

// Phase 4 deliverables

Three-page brief · grid, gap, recommendation

Brief-back session · 90 minutes, sponsor + cohort

Sponsor's own summary · in their words, captured

Playbook recommendation · which is next, and why

08 —

Three cadences. Quarterly, annual, programme-end.

After the first assessment

// A maturity assessment is a project, not a system. But the diagnosis goes stale, and the gap between this assessment and the next is where most transformations quietly lose the plot. Three cadences keep the assessment doing work after the brief lands.

Quarterly · 30 min

The check-in.

Has the diagnosis changed anything?
Is the priority dimension still the priority?
Any new evidence on the watchlist?
Three lines · captured, circulated

Annual · 2–3 wks

The re-assessment.

Same cohort, same dimensions
Score against the prior year's grid
Which stages moved · and which didn't
Lighter than the first · same discipline

Programme-end · 1 day

The retrospective.

Where did the assessment get it right?
Where did it miss?
Judge the reasoning · not the outcome
What we'd ask differently next time

09 —

Four ways this fails.

Common pitfalls

// Every maturity assessment that didn't change anything failed in one of these four ways. Watch for them across the six weeks — and especially in the brief-back session.

// Pitfall 01

The comfort average.

The cohort scores split badly on a dimension, and the room agrees to call it a 3 to "produce a clean number." The split was the diagnosis; the average is wallpaper. By the time anyone notices, the OKRs have been written against a fiction.

The fix

Make the disagreement-map a required artefact. The brief reports ranges where the cohort splits. The discomfort of writing "Stage 2–4, low confidence" is the discomfort that protects the diagnosis.

// Pitfall 02

The executive halo.

The CEO scores high, the frontline scores low, and the room defers to seniority. The senior view describes the deck; the frontline view describes the desk. Averaging them produces a number that's accountable to neither.

The fix

Capture both scores by name in the brief. The CEO sees that the gap is real — and that closing it is what the programme is for. The norm "the exec doesn't break ties" was published in Phase 1 for this exact moment.

// Pitfall 03

The evidence-free score.

Confident scores produced by intuition alone. The cohort says "I'd put us at a 4" and the room nods. Three months later, the OKRs aren't moving and no one can remember why the score was a 4. The assessment was an opinion poll dressed as a diagnosis.

The fix

No score without one piece of evidence behind it. The evidence column is required, not optional. Where the evidence runs out, the confidence label drops to "low" — and the dimension goes onto the watchlist for the re-assessment.

// Pitfall 04

The shelved brief.

The brief is delivered, the room nods, the SharePoint folder receives it. Three months later, the programme is running on the original ambition. The diagnosis didn't lose to a better argument — it lost to inertia.

The fix

Schedule the quarterly check-in inside the brief-back session, before the room leaves. Capture the sponsor's own summary — in their words — and circulate it. The next session opens with "three months ago, you said..."

10 —

Three disciplines underneath.

What the methodology page covers in full

// The four phases are the mechanics. These three are the thinking habits that keep them honest. They're surfaced in full on the methodology page; named here so the playbook is honest about what it rests on.

// Discipline 01

Score the evidence, not the intuition.

Every score, one piece of evidence
Where evidence runs out, label low
Documents vs. desk · triangulate
Don't score what can't be defended

// Discipline 02

Judge the assessment, not the number.

A good call can be a wrong score
A lucky guess can read as right
Judge the reasoning at the time
Retrospective on the assessment, annually

// Discipline 03

Make the cohort truth-seeking.

State confidence levels
Reward dissent · the gap is data
Accuracy over agreement
The exec doesn't break ties

11 —

By week six, you should have.

Starter checklist

// If you can tick all fifteen, the assessment is defensible. Anything missing at week six is debt — it'll surface in month three when the OKRs land and the cohort has moved on.

✓ A framing memo · two pages, three signatures · Phase 1
✓ A cohort confirmed · 8–12 people, perspectives balanced · Phase 1
✓ Norms published · confidence, dissent, no tie-breaks · Phase 1
✓ The model adapted · language fit to the business · Phase 1
✓ Ten interviews · scored across six dimensions, with evidence · Phase 2
✓ Six documents reviewed · strategy, comms, ads, policy, board, survey · Phase 2
✓ A pulse survey · 50+ responses, anonymised · Phase 2
✓ Three workflows shadowed · half a day each, at the desk · Phase 2
✓ A calibration session held · four hours, full cohort · Phase 3
✓ Confidence labels · low / medium / high per dimension · Phase 3
✓ A disagreement map · ranges captured, not averaged · Phase 3
✓ A three-page brief · grid, gap, recommendation · Phase 4
✓ A brief-back session held · 90 minutes, sponsor + cohort · Phase 4
✓ The sponsor's own summary captured · in their words · Phase 4
✓ A quarterly check-in scheduled · before the room leaves · Phase 4

12 —

Using this in practice.

Closer

The assessment is the floor, not the ceiling.

This playbook is a starting point, not a prescription. Every organisation has its own gravity — political, technical, cultural — that bends the assessment in different ways. Halcyon Financial is one shape. Yours will be different.

What travels is the discipline, not the artefacts. The framing memo will be a different two pages. Your cohort will have your faces around the table, your sceptic, your CEO who scored too high. The shape of the work — frame, gather, score, brief — is the thing that doesn't move. Neither does the discipline underneath it: score the evidence, judge the assessment, keep the cohort honest.

If you're running an assessment and want to talk through where it's getting stuck — particularly the calibration session, which is where most assessments quietly collapse — I'm happy to.
