Crypto-agility is one of those phrases that has been around for two decades and is only now being taken seriously, because PQC migration is making the cost of inflexibility visible. NIST SP 800-131A sets out how to transition between algorithms and key lengths; ENISA and NCSC have published detailed guidance; the principle is simple. The implementation isn't.
The four-layer pattern
The application code, the policy decision, the cryptographic interface, and the library implementation each live in a different layer, with explicit contracts between them:
Layer 1 — Application
The application code never names an algorithm. It calls
typed verbs that describe purpose:
sign(payload, profile="user-token"),
encrypt(blob, profile="archive"),
verify(signature, profile="firmware-update").
The profile name encodes the use case. The algorithm choice
doesn't appear here.
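A minimal sketch of what the application-facing verbs might look like. The `_PROFILES` table and the demo key are hypothetical stand-ins for Layers 2 to 4, and HMAC (a MAC, not a signature scheme) is used only so the example runs on the standard library. Note that this `verify` also takes the payload, which the shorthand above leaves out.

```python
import hmac

# Hypothetical stand-in for Layers 2-4: profile name -> (key, hash name).
# A real system would resolve this through the policy service.
_PROFILES = {
    "user-token": (b"demo-key-not-for-production", "sha256"),
}


def sign(payload: bytes, profile: str) -> bytes:
    """Sign payload for a use case; the caller never names an algorithm."""
    key, digest = _PROFILES[profile]
    return hmac.new(key, payload, digest).digest()


def verify(payload: bytes, signature: bytes, profile: str) -> bool:
    """Check a signature for a use case, comparing in constant time."""
    return hmac.compare_digest(sign(payload, profile), signature)


# Application code states the purpose only.
token = sign(b"user=42", profile="user-token")
```

The call site stays the same whatever `user-token` resolves to, which is the whole point of the layer.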
Layer 2 — Crypto policy
A policy service resolves profile names to algorithm
tuples. user-token maps to
{sig: "ML-DSA-44", hash: "SHA-256"} today and
could map to something else tomorrow. The mapping lives in a
config source — a config service, a Git-backed YAML, a feature
flag system — with audit logging on changes and
hot-reload support.
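One way the policy lookup could be sketched, assuming the config arrives as JSON text. The `PolicyStore` class, its `reload` method and the in-memory `audit_log` are illustrative names, not part of any particular product; a real deployment would pull from its config source and write to a proper audit sink.

```python
import json
import threading


class PolicyStore:
    """Resolves profile names to algorithm tuples; reload() swaps the mapping."""

    def __init__(self, raw_json: str):
        self._lock = threading.Lock()
        self._profiles = {}
        self.audit_log = []  # in-memory stand-in for a real audit sink
        self.reload(raw_json, actor="bootstrap")

    def reload(self, raw_json: str, actor: str) -> None:
        new_profiles = json.loads(raw_json)
        with self._lock:
            # Record every profile whose mapping changed, with who changed it.
            for name, spec in new_profiles.items():
                old = self._profiles.get(name)
                if old != spec:
                    self.audit_log.append((actor, name, old, spec))
            self._profiles = new_profiles

    def resolve(self, profile: str) -> dict:
        with self._lock:
            return dict(self._profiles[profile])


store = PolicyStore('{"user-token": {"sig": "ML-DSA-44", "hash": "SHA-256"}}')
# Hot reload: the mapping changes without restarting the process.
store.reload('{"user-token": {"sig": "ML-DSA-65", "hash": "SHA-256"}}', actor="alice")
```

Callers hold only the profile name, so the second `reload` changes what `resolve("user-token")` returns on the very next call.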
Layer 3 — Primitive interface
Vendor-neutral interfaces for each cryptographic
operation: Signer, Verifier,
KeyEncapsulator, Encryptor,
KeyAgreement. The application's typed verbs
resolve through Layer 2 to a concrete implementation that
satisfies the right interface. The application never imports
a specific library directly.
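In Python, the interfaces could be expressed as structural `Protocol` types. The method signatures below are an assumption made for illustration (the text names the interfaces but not their methods), and `HmacStandIn` is a placeholder implementation so the sketch runs without a PQC library.

```python
import hmac
from typing import Protocol, runtime_checkable


@runtime_checkable
class Signer(Protocol):
    def sign(self, message: bytes) -> bytes: ...


@runtime_checkable
class Verifier(Protocol):
    def verify(self, message: bytes, signature: bytes) -> bool: ...


class HmacStandIn:
    """Placeholder primitive; satisfies both protocols without inheriting."""

    def __init__(self, key: bytes):
        self._key = key

    def sign(self, message: bytes) -> bytes:
        return hmac.new(self._key, message, "sha256").digest()

    def verify(self, message: bytes, signature: bytes) -> bool:
        return hmac.compare_digest(self.sign(message), signature)


def issue_token(signer: Signer, claims: bytes) -> bytes:
    # Business code depends on the interface, never on a library import.
    return signer.sign(claims)


impl = HmacStandIn(b"demo-key")
token = issue_token(impl, b"user=42")
```

Because the protocols are structural, a library-backed class and an HSM-backed class can both satisfy `Signer` without sharing a base class.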
Layer 4 — Library / HSM
Concrete primitives, behind the interfaces. ML-DSA-65 from AWS-LC, SLH-DSA-128f via PKCS#11 against the HSM, AES-256-GCM from OpenSSL, ML-KEM-768 from BoringSSL. Multiple implementations of the same interface coexist; the policy layer picks one per profile per call.
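The per-profile selection can be pictured as a registry keyed by algorithm name. In this sketch the two HMAC variants stand in for real library or HSM bindings, and `REGISTRY`, `POLICY` and `signer_for` are hypothetical names.

```python
import hmac
from typing import Callable, Dict


class HmacSigner:
    """Stand-in primitive; a real entry would wrap a library or HSM binding."""

    def __init__(self, key: bytes, digest: str):
        self._key, self._digest = key, digest

    def sign(self, message: bytes) -> bytes:
        return hmac.new(self._key, message, self._digest).digest()


# Several implementations of the same interface coexist under different names.
REGISTRY: Dict[str, Callable[[bytes], HmacSigner]] = {
    "HMAC-SHA-256": lambda key: HmacSigner(key, "sha256"),
    "HMAC-SHA-512": lambda key: HmacSigner(key, "sha512"),
}

# Stand-in for the Layer 2 policy: profile name -> algorithm name.
POLICY = {"user-token": "HMAC-SHA-256", "archive": "HMAC-SHA-512"}


def signer_for(profile: str, key: bytes) -> HmacSigner:
    """Resolve the profile on every call so policy changes take effect at once."""
    return REGISTRY[POLICY[profile]](key)


short_tag = signer_for("user-token", b"demo-key").sign(b"payload")
long_tag = signer_for("archive", b"demo-key").sign(b"payload")
```

The two tags differ in length, which is a small preview of why consumers must not assume fixed output sizes.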
What this buys you
An algorithm migration becomes a config change. Update the
user-token profile to map to
ML-DSA-65 instead of ML-DSA-44. The
hot-reload picks up the change. New tokens are signed under
the new algorithm. Old tokens continue to verify under the
old one until they age out. No code review, no deploy, no
maintenance window.
This is the gap between PQC migrations that take quarters and ones that take years.
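For old tokens to keep verifying after the flip, the verifier has to know which algorithm produced each token. One way to sketch that, assuming each token carries an algorithm label and the profile lists the algorithms it still accepts: the `sign_with`/`accept` split and the `alg-old`/`alg-new` labels are illustrative, and the HMAC variants stand in for ML-DSA-44 and ML-DSA-65.

```python
import hmac

ALGORITHMS = {"alg-old": "sha256", "alg-new": "sha512"}  # stand-in primitives
KEY = b"demo-key"

policy = {"user-token": {"sign_with": "alg-old", "accept": ["alg-old"]}}


def sign(payload: bytes, profile: str) -> str:
    alg = policy[profile]["sign_with"]
    tag = hmac.new(KEY, payload, ALGORITHMS[alg]).hexdigest()
    return f"{alg}:{tag}"  # the token records which algorithm produced it


def verify(payload: bytes, token: str, profile: str) -> bool:
    alg, _, tag = token.partition(":")
    # Only algorithms the policy still accepts are honoured, so a token
    # cannot choose its own verification algorithm.
    if alg not in policy[profile]["accept"]:
        return False
    expected = hmac.new(KEY, payload, ALGORITHMS[alg]).hexdigest()
    return hmac.compare_digest(expected, tag)


old_token = sign(b"user=42", "user-token")
# The "migration": a policy change, with the old algorithm still accepted.
policy["user-token"] = {"sign_with": "alg-new", "accept": ["alg-new", "alg-old"]}
new_token = sign(b"user=42", "user-token")
```

Once the old tokens have aged out, dropping `alg-old` from `accept` is a second config change that retires the algorithm.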
Anti-patterns to refactor away
Five patterns in source code make crypto-agility hardest. Find them in your bank's codebases and refactor them first.
- Hard-coded algorithm strings. "RSA", "ECDSA", "AES/GCM" appearing in source files. Replace with profile lookup.
- Hard-coded key lengths and curve names. 2048, P-256, secp384r1 as magic numbers. Encode in the profile, not the call site.
- Fixed-size key buffers. byte[256] for an RSA-2048 modulus. PQC keys can be up to an order of magnitude larger (an ML-DSA-65 public key is 1,952 bytes). Use the interface's opaque key type; don't size buffers based on RSA assumptions.
- Direct library imports in business code. import javax.crypto.Cipher; or from cryptography.hazmat.primitives import... in a service layer. Business code should import only the Layer 3 interfaces.
- Stable signature/ciphertext sizes assumed by consumers. Databases with fixed-width signature columns. Wire protocols that assume 64-byte signatures. Make these variable-length or rev the schema before migration begins.
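A rough first pass at finding the first four anti-patterns is a regex scan over source text. The patterns below are deliberately narrow and illustrative, built only from the examples in the list; they will produce false positives and misses, and the fifth anti-pattern (size assumptions in schemas and wire formats) needs a schema review rather than a grep.

```python
import re

# Illustrative patterns derived from the examples above; extend per codebase.
PATTERNS = {
    "algorithm string": re.compile(r"""["'](RSA|ECDSA|AES/GCM[^"']*)["']"""),
    "curve or key length": re.compile(r"\b(P-256|secp384r1|2048)\b"),
    "fixed-size buffer": re.compile(r"\bbyte\[\d+\]"),
    "direct library import": re.compile(
        r"^\s*(import javax\.crypto\.|from cryptography\.hazmat)"
    ),
}


def scan(source: str):
    """Return (line number, label) for every pattern hit in the source text."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, label))
    return findings


sample = '''import javax.crypto.Cipher;
Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
byte[] modulus = new byte[256];
String greeting = "hello";'''

findings = scan(sample)
```

Running this across a repository gives a worklist to triage by hand, not a verdict.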
Retrofitting an existing service
You rarely get to build crypto-agility into a greenfield service. The realistic question is how to retrofit existing ones. Three-phase approach:
- Phase 1 — wrap, don't rewrite. Introduce the Layer 3 interfaces. Implement them by delegating to existing direct-library calls. Replace call-sites one at a time. No behaviour change.
- Phase 2 — introduce the policy layer. Add the Layer 2 profile lookup behind the interfaces. Initially every profile resolves to "the same algorithm we used before". Test that nothing changes.
- Phase 3 — actually move the algorithm. Update a profile to point at a PQC algorithm. Roll out behind a feature flag. Monitor. Promote.
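The three phases can be sketched in a few lines. Everything here is a stand-in: `legacy_sign` represents the existing direct-library call, `NewSigner` represents a PQC implementation, and the `candidate`/`rollout_percent` fields are a hypothetical way to express the feature flag.

```python
import hashlib
import hmac

KEY = b"demo-key"


def legacy_sign(data: bytes) -> bytes:
    """Stand-in for the existing direct-library call."""
    return hmac.new(KEY, data, "sha256").digest()


class LegacySigner:
    """Phase 1: the interface delegates to the old call; no behaviour change."""

    def sign(self, message: bytes) -> bytes:
        return legacy_sign(message)


class NewSigner:
    """Stand-in for the PQC implementation introduced in Phase 3."""

    def sign(self, message: bytes) -> bytes:
        return hmac.new(KEY, message, "sha512").digest()


IMPLS = {"legacy": LegacySigner(), "new": NewSigner()}

# Phase 2: the profile exists but resolves to what was used before.
policy = {"user-token": {"current": "legacy", "candidate": None, "rollout_percent": 0}}


def bucket(request_id: str) -> int:
    """Map a request id to a stable bucket in the range 0-99."""
    digest = hashlib.sha256(request_id.encode()).digest()
    return int.from_bytes(digest[:2], "big") % 100


def sign(payload: bytes, profile: str, request_id: str) -> bytes:
    p = policy[profile]
    # Phase 3: a percentage flag routes some requests to the candidate.
    use_candidate = bool(p["candidate"]) and bucket(request_id) < p["rollout_percent"]
    return IMPLS[p["candidate"] if use_candidate else p["current"]].sign(payload)
```

Promotion is then a matter of raising `rollout_percent` while watching the metrics, and finally making the candidate the current algorithm.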
The whole programme is months, not years, for any single service. For an enterprise estate of dozens of services, it's a year-plus initiative. But each service goes through the same playbook, which means it parallelises if the team has the bandwidth.
What good observability looks like
A crypto-agile system has runtime observability that a crypto-hardcoded system doesn't:
- Every cryptographic operation logs the resolved profile and the chosen algorithm. "What was I signing with at this time on this request?" is answerable.
- Migration progress is the time-series ratio of new vs old algorithm usage per profile. You can watch a profile flip in production.
- Changes to the policy layer are first-class audit events. Every algorithm reconfiguration is approvable and revertible.
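The first two points might be sketched as a single hook called from the crypto layer on every operation. The logger name, the log field layout and the in-process `Counter` are assumptions for illustration; a production system would emit to its existing logging and metrics pipeline.

```python
import logging
from collections import Counter

log = logging.getLogger("crypto.audit")  # hypothetical logger name
usage = Counter()  # in-process stand-in for a metrics backend


def record_operation(request_id: str, operation: str, profile: str, algorithm: str) -> None:
    """Log the resolved profile and algorithm, and count the usage."""
    log.info(
        "crypto_op request=%s op=%s profile=%s alg=%s",
        request_id, operation, profile, algorithm,
    )
    usage[(profile, algorithm)] += 1


def migration_ratio(profile: str, new_algorithm: str) -> float:
    """Share of a profile's operations that used the new algorithm."""
    total = sum(n for (p, _), n in usage.items() if p == profile)
    return usage[(profile, new_algorithm)] / total if total else 0.0


for i in range(3):
    record_operation(f"req-{i}", "sign", "user-token", "ML-DSA-44")
record_operation("req-3", "sign", "user-token", "ML-DSA-65")
```

Sampled over time, `migration_ratio` is the series that shows a profile flipping from the old algorithm to the new one.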
The bank that builds this once gets it for the next migration too. PQC won't be the last algorithm migration anyone ever does. Crypto-agility is what makes the next one cheap.