Playbook · The Four Layers of AI Guardrails

01 —

Meet Halcyon Financial.

Worked example

A 700-person fintech shipping AI faster than it can govern it.

Halcyon Financial is a regulated digital financial services platform — scaled past the start-up phase, not yet carrying the operating muscle of a mature enterprise. AI is already everywhere. The CX team drafts customer comms with it. Engineering ships AI-assisted features. Half the business uses tools no one approved.

The board wants the upside and none of the headlines. Legal wants a moratorium. Engineering wants to keep moving. The instinct is to stand up a committee and write a forty-page policy — and watch both get ignored.

None of that is a governance problem. It's a guardrail problem. The question isn't whether to allow AI. It's where the rails go, and which of them can't depend on anyone remembering.

The worked example. This playbook follows one Halcyon use case — an AI-assisted KYC-expiry notification flow that touches customer identity documents — through all four layers, from the board's risk appetite down to PII detection in the model itself.

700 · People across the business

40+ · AI tools in use, most unsanctioned

1 · Regulator watching customer-facing AI

0 · Guardrails in the system itself

// And the people you'll meet

// Cast 01

The Transformation Lead

You, the reader

Holds the pen on the guardrails. Accountable for none of the risk, responsible for making the rails fit the work.

// Cast 02

Dianne Walsh

Chief Operating Officer

The single accountable owner for AI risk. The name on the one-page appetite — not a committee.

// Cast 03

Anna Petrović

Head of Compliance

Holds the regulator-tolerance line. Signs off anything the customer can see.

// Cast 04

Sam Patel

Engineering Director

Owns the system layer — the guardrails that live in the code, not the policy.

// Cast 05

Priya Nair

Head of CX

Owns the KYC-notification use case. Wants it live this quarter, not next year.

// Cast 06

Maya Chen

CX Officer · 2.5 yrs

Already drafts customer emails with an AI tool no one approved. The shadow-AI story, in one person.

02 —

Four layers. Governance, operating model, process, system.

The framework

// The layers nest. Each one translates the layer above it into something more concrete — intent becomes policy, policy becomes process, process becomes code. A guardrail that only exists at the top is a slogan.

Layer 01

Governance

Set the appetite

Decide how much risk you'll carry, and who owns it. One accountable name, one page, an existing oversight body.

Layer 02

Operating model

Translate to behaviour

Turn the appetite into a policy people can follow — risk tiers, a usage policy, shadow AI brought into the light.

Layer 03

Process

Embed in delivery

Put the checks where the work happens — human review where it counts, approval and escalation paths, a runbook for when it's wrong.

Layer 04

System

Build into the AI

The guardrails that don't depend on anyone remembering — input and output checks, PII detection, content filters, monitoring.

03 —

Layer one · Governance.

Set the appetite

Governance is one decision made well: how much risk you'll carry, and who owns it. Everything below this layer inherits from that decision. Most organisations skip it and build the committee instead — a body to govern a thing they haven't yet defined.

01

Name one accountable owner.

A single executive accountable for AI risk — not a committee, not "the business". A committee can advise; only a person can be accountable. The name belongs on the one-page appetite, and on the hook when something goes wrong.

02

Write a one-page risk appetite.

Three statements, not thirty pages. What you'll never do. What you'll do with sign-off. What teams can just do. The appetite says yes as clearly as it says no — a document that only forbids gets routed around.

03

Put oversight on an existing agenda.

AI risk is a standing item on a body you already have — the risk committee, the exec meeting. Don't stand up a new board. A new committee is months of process to govern a thing that needs governing now.

// In practice — Halcyon Financial

One name, one page, no new body.

Halcyon names Dianne Walsh, the COO, as the single accountable owner. Not the most obvious pick — but AI risk at Halcyon is an operational-and-regulatory problem, and the COO already owns both.

The one-page appetite fits on a slide. Never: AI making a final lending or compliance decision unreviewed. With sign-off: anything a customer sees. Just do it: internal drafting, summarisation, code assistance on non-production data.

AI risk joins the existing fortnightly risk committee as a standing item. Net-new governance bodies created: zero.

Owner: Dianne Walsh, COO · single accountable name
Appetite: One page · never / with sign-off / just do it
Oversight: Standing item on the existing risk committee

// Insight from the field

The most common first move — stand up an AI ethics board — is usually the wrong one. A board with nothing concrete to rule on becomes a ritual. Define the appetite first; let the volume of real decisions tell you whether you ever need a dedicated body. Most organisations under 5,000 people don't.

// Layer 1 deliverables

Accountable owner · one named executive

Risk appetite · one page, three tiers

Oversight · on an existing agenda

04 —

Layer two · Operating model.

Translate to behaviour

The operating model turns the appetite into something a person at their desk can act on. If a guardrail can't be followed without reading a policy, it won't be. The work here is making the right path the easy one.

01

Publish a usage policy people actually read.

One page, traffic lights, real examples. Green: go. Amber: sign-off. Red: stop. A policy is only a guardrail if someone can recall it without opening it.

02

Tier use cases by risk, route by tier.

Three tiers, set once. Low-risk use cases self-serve. High-risk ones route to sign-off. The tier does the routing — not a queue, not a person's inbox. Proportionate beats uniform every time.

03

Bring shadow AI into the light.

A large share of staff already use AI tools their employer can't see. An amnesty plus a sanctioned path beats a ban — prohibition doesn't stop the usage, it just stops you seeing it. Register what's in use; bless the safe, replace the rest.

04

Train for judgement, not tools.

The skill that matters isn't prompting — it's knowing when not to trust the output. Train people on the failure modes: confident wrong answers, fabricated detail, the moment a draft needs a human before it ships.

// In practice — Halcyon Financial

A traffic-light page, and an amnesty.

Halcyon's usage policy is a single page on the intranet. The KYC-notification idea is amber — customer-facing and regulated — so it routes to Compliance sign-off automatically, without anyone deciding to send it there.

A two-week shadow-AI amnesty surfaces forty-plus tools already in use. Maya Chen's email-drafting tool is one of them. Three tools are blessed on the spot; the rest get a sanctioned replacement, and the drafting workflow moves onto an approved platform with no loss of speed.

Policy: One page · green / amber / red, with examples
Routing: KYC flow = amber → auto-routes to sign-off
Shadow AI: 40+ tools surfaced · registered, not banned
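
Routing by tier can be sketched in a few lines. This is a minimal illustration, not Halcyon's actual implementation: the `UseCase` fields, the `classify` rules and the route names are all assumptions made for the example, loosely derived from the never / with sign-off / just do it appetite described earlier.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    GREEN = "green"  # just do it
    AMBER = "amber"  # needs sign-off
    RED = "red"      # never


@dataclass(frozen=True)
class UseCase:
    # Hypothetical attributes; a real register would carry more.
    name: str
    customer_facing: bool = False
    regulated: bool = False
    final_decision_unreviewed: bool = False  # AI makes a lending/compliance call alone


def classify(use_case: UseCase) -> Tier:
    """Map a use case to a tier using the appetite's three statements."""
    if use_case.final_decision_unreviewed:
        return Tier.RED
    if use_case.customer_facing or use_case.regulated:
        return Tier.AMBER
    return Tier.GREEN


# Assumed routing table: the tier, not a person's inbox, decides the destination.
ROUTES = {
    Tier.GREEN: "self-serve",
    Tier.AMBER: "compliance-sign-off",
    Tier.RED: "blocked",
}


def route(use_case: UseCase) -> str:
    """Return the destination for a use case based purely on its tier."""
    return ROUTES[classify(use_case)]


kyc = UseCase("KYC-expiry notification", customer_facing=True, regulated=True)
print(route(kyc))
```

The point of the design is that nobody decides to send the KYC flow to Compliance. Its attributes put it in amber, and amber has exactly one route.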

// Watch for

The blanket ban. It feels decisive and it photographs well in a board pack. What it actually does is push every AI tool one layer underground — off the corporate account, onto personal devices, out of your sight entirely. You don't get less AI. You get less visible AI, which is the opposite of a guardrail.

// Layer 2 deliverables

Usage policy · one page, traffic lights

Risk tiers · three, with routing

Shadow-AI register · surfaced, not banned

Judgement training · failure modes, not prompts

05 —

Layer three · Process.

Embed in delivery

Process is where the guardrail meets the work. The guardrail that matters most is the one for when the AI is wrong, not when it's right. Most teams design for the happy path and improvise the rest under pressure.

01

Put a human in the loop where it counts.

Human review on the decisions that carry consequence — and only there. Review everything and the reviewers stop reading. Reserve the human for the irreversible, the regulated, the customer-facing; auto-pass the rest.

02

Name the approval and escalation paths.

Who signs off before launch. Who you escalate to when the model does something no one predicted. Both written down before go-live — an escalation path invented mid-incident is not a path.

03

Write the "when it's wrong" runbook first.

Before launch, not after the first bad output. What gets paused, who's told, how a customer is made whole, what gets logged. The runbook is cheap to write in calm and impossible to write in a crisis.

// In practice — Halcyon Financial

A human on the words, not the trigger.

For the KYC-notification flow, Halcyon splits the work. The trigger logic — detecting an expiring document — runs automatically; it's deterministic and low-risk. The customer-facing wording, which carries regulated language, gets human review on the template before launch, then runs unattended.

Anna Petrović signs off the comms template. The escalation path is named on one line: unexpected output → pause the flow → notify Compliance & Engineering on-call. The runbook is two pages, written the week before go-live.

Human review: On the regulated wording · not the trigger
Approval: Compliance signs off the template pre-launch
Escalation: Named path · pause → notify · written runbook
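
The one-line escalation path (unexpected output → pause → notify) is small enough to express directly. The sketch below is illustrative only: the `Flow` class, the contact names and the `looks_expected` flag are hypothetical stand-ins, and a real system would page people rather than append to a list.

```python
from dataclasses import dataclass, field


@dataclass
class Flow:
    name: str
    paused: bool = False
    notified: list[str] = field(default_factory=list)
    incident_log: list[str] = field(default_factory=list)


# The named path, written down before go-live (contact names are assumptions).
ESCALATION_CONTACTS = ("compliance-on-call", "engineering-on-call")


def escalate(flow: Flow, reason: str) -> None:
    """Pause first, then notify, then record what happened."""
    flow.paused = True
    for contact in ESCALATION_CONTACTS:
        flow.notified.append(contact)  # stand-in for a real paging call
    flow.incident_log.append(reason)


def send_notification(flow: Flow, message: str, looks_expected: bool) -> bool:
    """Return True if the message was sent, False if it was held."""
    if flow.paused:
        return False  # nothing goes out while the flow is paused
    if not looks_expected:
        escalate(flow, f"unexpected output: {message!r}")
        return False
    return True


flow = Flow("kyc-expiry")
sent_first = send_notification(flow, "Your ID expires soon.", looks_expected=True)
sent_second = send_notification(flow, "Guaranteed loan approval!", looks_expected=False)
```

After the second call the flow is paused, both on-call contacts are recorded as notified, and every later send is held until someone deliberately resumes it.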

// Insight from the field

Human-in-the-loop fails in two opposite ways. Too little, and an unreviewed model makes a decision it shouldn't. Too much, and reviewers rubber-stamp at volume — the review becomes theatre, and theatre is worse than no review because it manufactures false confidence. Put the human exactly where a wrong answer is expensive, and nowhere else.

// Layer 3 deliverables

Human-in-the-loop map · where, and only there

Approval & escalation paths · named, pre-launch

Incident runbook · written before go-live

06 —

Layer four · System.

Build into the AI

The system layer is the only one that doesn't depend on anyone remembering. It's also the one most often skipped — because it needs engineering, not a policy. A guardrail in code holds at 2am on a Sunday. A guardrail in a document holds only while someone's reading it.

01

Guard the input and the output.

Constrain what can go in; validate what comes out. Block prompts that try to pull data they shouldn't; check outputs against rules before they reach a customer. The model is a component, not a trusted employee.

02

Detect personal information, filter content.

Automated PII detection wherever the AI touches customer data, plus content filters for the outputs that go outside the building. The checks that protect people shouldn't rely on a person catching them.

03

Monitor in production.

Log inputs, outputs, and decisions. Watch for drift — the model that was right in testing and quietly degrades in the wild. You can't govern what you can't see, and the audit trail is the thing the regulator asks for first.
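
A minimal sketch of the input and output guards described above, assuming simple regular-expression matching. The two patterns are deliberately crude and purely illustrative; production PII detection would normally rely on a dedicated, tested library or service, and the function names here are hypothetical.

```python
import re

# Illustrative patterns only: real PII detection needs far broader coverage.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "long_number": re.compile(r"\b\d{8,}\b"),  # e.g. document or account numbers
}


def redact_pii(text: str) -> tuple[str, list[str]]:
    """Replace matches with a placeholder and report which kinds were found."""
    found = []
    for kind, pattern in PII_PATTERNS.items():
        text, count = pattern.subn(f"[{kind.upper()}]", text)
        if count:
            found.append(kind)
    return text, found


def guard_input(prompt: str) -> str:
    """Strip PII before the prompt ever reaches the model."""
    clean, _ = redact_pii(prompt)
    return clean


def guard_output(text: str) -> str:
    """Refuse to release model output that still contains PII."""
    _, found = redact_pii(text)
    if found:
        raise ValueError(f"output blocked, PII detected: {found}")
    return text


print(guard_input("Contact jo@example.com about passport 123456789"))
```

The asymmetry is intentional in this sketch: inputs are cleaned and allowed through, while outputs that fail the check are stopped outright, because an output is the thing a customer actually sees.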

// In practice — Halcyon Financial

The rails that hold without a human.

Sam Patel's team builds three things into the KYC flow. PII detection, because it touches identity documents. Output validation on the generated wording, so nothing reaches a customer that the template didn't sanction. And full logging — every notification, every trigger, every override — for the audit trail Compliance will need.

None of it depends on Maya, or Anna, or anyone remembering a policy. The non-negotiables are in the code. The human layer above handles judgement; the system layer handles the things that should never be a judgement call.

PII detection: Automated · wherever identity data is touched
Output validation: Generated wording checked before it sends
Monitoring: Full logging · drift watch · audit trail
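
One way to make "nothing reaches a customer that the template didn't sanction" concrete is to check generated wording against the approved template and log every decision. The sketch below is an assumption about how that could look, not a description of Sam Patel's build: the template text, slot names, 40-character slot limit and in-memory log are all invented for illustration.

```python
import json
import re
from datetime import datetime, timezone

# Hypothetical sanctioned template: fixed wording with named slots.
TEMPLATE = (
    "Hi {first_name}, your {document} on file expires on {expiry_date}. "
    "Please upload a new one."
)


def template_to_regex(template: str) -> re.Pattern:
    """Turn the template into a pattern: fixed wording, short free slots."""
    parts = re.split(r"\{(\w+)\}", template)  # literal, slot, literal, slot, ...
    pattern = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            pattern += re.escape(part)
        else:
            pattern += "(?P<" + part + r">[^\n]{1,40}?)"
    return re.compile(pattern)


SANCTIONED = template_to_regex(TEMPLATE)
AUDIT_LOG: list[str] = []  # stand-in for an append-only audit store


def audit(event: str, **details: str) -> None:
    """Append one timestamped JSON record per decision."""
    record = {"ts": datetime.now(timezone.utc).isoformat(), "event": event, **details}
    AUDIT_LOG.append(json.dumps(record))


def release(message: str) -> bool:
    """Send only if the wording matches the template; log either way."""
    ok = SANCTIONED.fullmatch(message) is not None
    # A real log would store a redacted form or a reference, not raw text.
    audit("notification_sent" if ok else "notification_blocked", message=message)
    return ok


good = "Hi Maya, your passport on file expires on 2025-01-31. Please upload a new one."
bad = good + " You are pre-approved for a loan."
```

Here `release(good)` passes and `release(bad)` is blocked, and both leave a record. The slot check is loose, so a stricter version would also validate each slot's value (a real date, a known document type) rather than only its length.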

// Watch for

The paper programme. Layers one to three are quick to stand up because they're documents and meetings — so it's tempting to declare victory there. But a guardrail that only exists on paper is a guardrail that fails the first time someone's busy. If the system layer keeps slipping because "it needs engineering," that's not a reason to skip it — it's the signal that it's the layer actually doing the work.

// Layer 4 deliverables

Input / output guardrails · in the system

PII detection & content filters · automated

Production monitoring · logs, drift, audit trail

07 —

Keeping the rails true.

Operating rhythm

// Guardrails aren't a launch, they're a maintenance job. Three cadences keep the four layers from quietly drifting out of date as the AI estate grows.

Continuous · automated

The system watch.

Monitor outputs and drift
Flag PII and filter breaches
Keep the audit log complete
Alert on anything unexpected
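
As a rough illustration of the continuous watch, the sketch below tracks the share of blocked outputs over a rolling window and raises a flag when it crosses a threshold. Block rate is only one crude proxy for drift, and the class name, window size and threshold are assumptions chosen for the example.

```python
from collections import deque


class BlockRateWatch:
    """Track the share of blocked outputs over the last `window` events."""

    def __init__(self, window: int = 100, threshold: float = 0.05):
        self.events = deque(maxlen=window)  # oldest events fall off automatically
        self.threshold = threshold

    def record(self, blocked: bool) -> bool:
        """Record one outcome; return True if the alert threshold is breached."""
        self.events.append(blocked)
        rate = sum(self.events) / len(self.events)
        return rate > self.threshold


watch = BlockRateWatch(window=10, threshold=0.2)
quiet = [watch.record(False) for _ in range(8)]  # eight clean sends
alerts = [watch.record(True) for _ in range(3)]  # then three blocks in a row
```

With these numbers the first two blocks stay at or under the 20% threshold and the third tips the window over it, which is the moment the alert should fire.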

Monthly · 45 min

The use-case review.

Tier and route new use cases
Review the shadow-AI register
Walk any incidents and near-misses
Update the usage policy if needed

Quarterly · 90 min

The appetite reset.

Re-test the risk appetite
Pressure-test the system controls
Retire guardrails that no longer earn their place
Re-secure the owner's sign-off

08 —

Four ways this fails.

Common pitfalls

// Guardrail programmes fail in four predictable ways. All four are easier to fix in the first 90 days than after the first incident.

// Pitfall 01

The policy no one reads.

Forty pages, written by Legal, signed off by everyone, opened by no one. It satisfies the audit and changes no behaviour at all.

The fix

One page, traffic lights, real examples. If it can't be recalled from memory, it isn't a guardrail — it's a filing.

// Pitfall 02

The committee that governs nothing.

An AI board stood up before there's anything concrete to rule on. It meets, it minutes, it defers — and the real decisions happen elsewhere.

The fix

Define the appetite first. Add AI to a body you already have. Let the volume of real decisions tell you if you ever need a dedicated one.

// Pitfall 03

The ban that breeds shadow AI.

Prohibition feels decisive. In practice it moves usage onto personal devices and out of view — you get the same risk, minus the visibility.

The fix

Amnesty, then a sanctioned path. Make the approved route the easy one. Sunlight governs better than prohibition.

// Pitfall 04

Policy without the system.

All four layers declared "done" when only the paper ones are. The guardrails depend on people remembering — which holds until the day someone's busy.

The fix

Push the non-negotiables into the code — PII detection, output validation, logging. The system layer is the slowest to build and the only one that doesn't forget.

09 —

Three principles underneath.

What holds the four layers together

// The four layers are the mechanics. These three principles are the posture that keeps them from hardening into bureaucracy.

// Principle 01

Proportionate, not uniform.

Match the rail to the risk
Low-risk use cases self-serve
Reserve the heavy controls
Uniform controls train people to ignore them

// Principle 02

Enable, don't just restrict.

The appetite says yes clearly
Make the safe path the easy path
Guardrails exist to allow speed
A document that only forbids gets routed around

// Principle 03

Sunlight over prohibition.

Surface shadow AI, don't ban it
Visible risk is governable risk
Amnesty beats enforcement
What you can't see, you can't guard

10 —

By the time you ship, you should have.

Starter checklist

// Thirteen items across the four layers. If the system-layer rows are still empty when you launch, you have a paper programme — the gap will surface at the first incident.

✓

One accountable owner named · a person, not a committee

Layer 1

✓

A one-page risk appetite · never / with sign-off / just do it

Layer 1

✓

AI risk on an existing oversight agenda

Layer 1

✓

A one-page usage policy · traffic lights, examples

Layer 2

✓

Three risk tiers with automatic routing

Layer 2

✓

A shadow-AI register · surfaced, not banned

Layer 2

✓

Judgement training · failure modes, not prompts

Layer 2

✓

A human-in-the-loop map · where it counts, and only there

Layer 3

✓

Approval & escalation paths · named before launch

Layer 3

✓

An incident runbook · written while it's calm

Layer 3

✓

Input / output guardrails in the system

Layer 4

✓

PII detection & content filters · automated

Layer 4

✓

Production monitoring · logs, drift, audit trail

Layer 4

11 —

Using this in practice.

Closer

Guardrails are the floor, not the cage.

This playbook is a starting point, not a prescription. Your risk appetite, your regulator, your engineering capacity — all of it bends the four layers into a shape that's yours. Halcyon Financial is one shape. Yours will differ.

What travels is the order of operations: governance, operating model, process, system — intent translated downward until the most important rails live in code, not in a document. Skip the system layer and you have a programme that reads well and fails quietly. Build it and you can let people stop checking.

// Where this connects — the change side of standing this up lives in the transformation methodology: Change readiness & risk →

If you're putting guardrails around an AI programme and want to talk through where it's getting stuck, I'm happy to.
