Our Methodology

How we actually do the work.

A five-phase process for AI security work. Mapped to the standards your auditors, regulators, and insurers already use, without the alphabet soup that usually comes with them.

The process

From "we don't know what could go wrong" to "we've fixed it, and we can prove it stays fixed."

Every engagement runs the same five phases. Each one builds on the last, and the work doesn't close until the original problem can no longer be exploited.

Scope and threat model.

We sit down with you and map what the AI system actually does, what data it touches, what actions it can take, and where things would hurt your business if they went wrong. We agree on what is in scope and what is off-limits. The output is a short written threat model that everyone signs off on before any testing begins.

ALIGN TO NIST AI RMF “MAP”

Learn the system from the outside in.

Before we attack anything, we use your system the way a curious customer would. We map its real behaviours, the guardrails it has, the tools it can call, the documents it can read, and the memory it carries between conversations. This is where we find the edges that public benchmarks always miss.

APPLICATION ORIENTATION

Try to break it.

We combine hand-crafted attacks built from your threat model with automated test suites covering the well-known failure patterns: hidden instructions in retrieved content, tool misuse, leaked system prompts, poisoned memory, and the rest. Every working attack ships with a script that re-runs it on demand; no reproduction, no finding.

OWASP LLM TOP 10 + MITRE ATLAS

Chain the small problems into the real one.

A single quirky behaviour is rarely the real story. Impact comes from chains: a hidden instruction in a document triggers a tool call, which exposes a credential, which unlocks a customer record. We try those chains the way a motivated attacker would, so what you see at the end is business risk, not technical curiosities.

IMPACT AND EXPLOITATION

Fix it, then prove the fix held.

Each finding comes with a recommended fix, and we pair with your engineers to land the change. Then we re-run the exact same attack. The engagement closes only when the original exploit no longer works, and the test that proves it stays in your CI pipeline, so a future change cannot quietly bring the issue back six months later.

CLOSED-LOOP REMEDIATION

Non-negotiables

How you can tell the real work from the theatre.

A lot of AI security services produce slide decks that age out within weeks. These are the rules we hold ourselves to, so what you buy is something engineering can actually act on.

01. EVIDENCE

No screenshot, no finding.

Every issue ships with the exact script that triggers it. If we can't reproduce it on demand, we don't report it.

02. DELIVERABLES

Tests, not PDFs.

The thing you keep at the end isn't a report, it's an executable test suite that runs in your CI.

03. SCOPE

The whole system, not just the model.

Most real AI failures live at the seams, the tool calls, the document retrieval, the memory, and the retry model output is handed off to the rest of your stack.

04. CADENCE

Continuous, not annual.

A point-in-time audit ages out the moment you ship the next prompt, tool, or retrieval source.

05. DEFENSIBILITY OF DATA

Done means the exploit fails.

The work isn't finished when the report is delivered. It's finished when re-running the original attack against your production system no longer succeeds.

06. STANDARDS

Mapped to language your auditors already speak.

Every finding is tagged with the relevant frameworks, so your security, legal, and compliance teams can each read the same report.

Mapped to

OWASP LLM Top 10 2025

MITRE ATLAS

NIST AI RMF

Google SAIF

Frequently Asked Questions

How does an engagement start?

With a 30-minute scoping call. We cover what is deployed, what you are worried about, and what success would look like.

How long does a typical engagement take?

Most work runs one to six weeks depending on system size, access, and whether the target is research or production.

Can the test harness keep running after you leave?

Yes. We prefer to leave teams with repeatable tests and documentation, not just a static report.

Starting point

One scoping call. One short brief.

A 30-minute call covers what's deployed, what you're worried about, and what success would look like. We follow up with a short written brief: phases, time on team, cost, and only then does anyone commit. Most engagements run between one and six weeks.

Reach us at hello@ryvane.ai.

Start a scoping call ↗See our expertise