Confidential · FinTech · Germany · Series A

RAG-based compliance assistant — from zero to production in 4 weeks

A German Series A fintech needed a compliance assistant that could review regulatory documents against their product terms — a task their team spent 3+ hours on per submission. We built a multi-stage RAG pipeline and shipped it to production in 4 weeks.

−65% manual review time

99.2% accuracy on test corpus

4 wks zero to production

Challenge

What was broken

Every new financial product at this Series A startup required a compliance review against EU regulation (MiFID II, PSD2) and the company's internal product terms. Each review took a senior compliance officer 3–4 hours. As the product line expanded, the compliance bottleneck was delaying new launches by weeks. Hiring more compliance staff wasn't viable at their stage. They needed software that could do the heavy lifting, but the accuracy bar was non-negotiable.

Our Approach

How we thought about it

The accuracy requirement drove the key design decision. Standard single-pass RAG retrieval wouldn't get above 85–88% on their test set, which wasn't good enough. We designed a three-stage pipeline: broad semantic retrieval, precise cross-encoder reranking, and a final reasoning pass with GPT-4o that cites specific clause numbers. On top of the pipeline, a verification layer compared the AI's output against a manually labelled golden dataset before any result was surfaced to the compliance team.
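The retrieve, rerank, reason funnel can be illustrated with a minimal in-memory sketch. This is not the production code: the toy three-dimensional vectors stand in for OpenAI embeddings and a Pinecone query, the word-overlap scorer is only a placeholder for a cross-encoder, and every name here (`Chunk`, `retrieve`, `rerank`, `buildContext`, the clause ids) is hypothetical.

```typescript
interface Chunk {
  id: string;          // hypothetical clause reference used for citations
  text: string;
  embedding: number[]; // toy vector; a real system would store model embeddings
}

interface Scored {
  chunk: Chunk;
  score: number;
}

function dot(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
}

// Cosine similarity; returns 0 if either vector has zero length.
function cosine(a: number[], b: number[]): number {
  const denom = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b));
  return denom === 0 ? 0 : dot(a, b) / denom;
}

// Stage 1: broad semantic retrieval, keeping the top `k` chunks by cosine similarity.
function retrieve(query: number[], index: Chunk[], k: number): Scored[] {
  return index
    .map((chunk) => ({ chunk, score: cosine(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Stage 2: rescore the shortlist with a more precise scorer and keep the top `n`.
type Reranker = (queryText: string, chunkText: string) => number;

function rerank(queryText: string, candidates: Scored[], scorer: Reranker, n: number): Scored[] {
  return candidates
    .map(({ chunk }) => ({ chunk, score: scorer(queryText, chunk.text) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, n);
}

// Stage 3 input: label each passage with its id so the reasoning model can cite it.
function buildContext(passages: Scored[]): string {
  return passages.map(({ chunk }) => `[${chunk.id}] ${chunk.text}`).join("\n");
}

// Placeholder scorer (assumption): counts chunk words that also appear in the query.
const overlapScorer: Reranker = (q, t) => {
  const queryWords = new Set(q.toLowerCase().split(/\s+/));
  return t.toLowerCase().split(/\s+/).filter((w) => queryWords.has(w)).length;
};

const toyIndex: Chunk[] = [
  { id: "REG-A §1", text: "customer authentication for electronic payments", embedding: [0.9, 0.1, 0.0] },
  { id: "REG-B §7", text: "information to clients must be clear", embedding: [0.1, 0.9, 0.0] },
  { id: "TERMS §4.2", text: "payments are confirmed with a one-time code", embedding: [0.8, 0.2, 0.1] },
];

const shortlist = retrieve([1, 0, 0], toyIndex, 2);
const best = rerank("one-time code for payments", shortlist, overlapScorer, 1);
const context = buildContext(best);
```

In this toy run the broad pass ranks `REG-A §1` first, while the reranker promotes `TERMS §4.2` because it matches the query wording more closely. That reordering is the reason for adding a second, more precise stage before the reasoning pass.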

Solution

What we built

We built a Next.js frontend with a Node.js backend, Pinecone for vector storage of chunked regulatory documents, and the OpenAI API for both embedding and reasoning. All document uploads, query logs, and AI decisions are audit-logged to PostgreSQL on AWS RDS — a hard compliance requirement. The system integrates with their existing Slack workflow: compliance officers receive a structured report with cited clauses, a confidence score, and a clear recommendation. Borderline cases are automatically escalated for human review.
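A minimal sketch of how escalation and audit logging could fit together. The source does not say how "borderline" is defined, so the 0.9 confidence threshold, the rule that uncited results are always escalated, and all type and field names below are assumptions for illustration only.

```typescript
type Recommendation = "compliant" | "non_compliant";
type Route = "deliver" | "escalate";

interface ReviewResult {
  submissionId: string;
  recommendation: Recommendation;
  confidence: number;     // assumed to be in the range 0..1
  citedClauses: string[]; // clause references backing the recommendation
}

interface RoutingDecision {
  route: Route;
  reason: string;
}

interface AuditEntry extends RoutingDecision {
  submissionId: string;
  recommendation: Recommendation;
  confidence: number;
  citedClauses: string[];
  loggedAt: string; // ISO 8601 timestamp
}

// Assumed cut-off: anything below this goes to a human reviewer.
const ESCALATION_THRESHOLD = 0.9;

// Decide whether a result is delivered as-is or escalated for human review.
function routeReview(result: ReviewResult, threshold: number = ESCALATION_THRESHOLD): RoutingDecision {
  if (result.citedClauses.length === 0) {
    return { route: "escalate", reason: "no cited clauses" };
  }
  if (result.confidence < threshold) {
    return { route: "escalate", reason: "confidence below threshold" };
  }
  return { route: "deliver", reason: "confident and cited" };
}

// Build the record that would be written to the audit table for every decision.
function toAuditEntry(result: ReviewResult, decision: RoutingDecision, now: Date): AuditEntry {
  return {
    submissionId: result.submissionId,
    recommendation: result.recommendation,
    confidence: result.confidence,
    citedClauses: [...result.citedClauses],
    route: decision.route,
    reason: decision.reason,
    loggedAt: now.toISOString(),
  };
}
```

Producing the audit entry from the same decision object that drives routing keeps the log and the behaviour from drifting apart, which matters when the log itself is a compliance requirement.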

Results

What shipped

The system handled its first real compliance review in week 5. Accuracy on the production corpus came in at 99.2%, well above the 95% threshold set as a go/no-go criterion. Manual review time dropped 65%. The compliance team now handles twice the volume with the same headcount, and the CTO reported that new product launches have accelerated by an average of 3 weeks.
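The go/no-go check against a labelled golden dataset can be sketched as below. How accuracy was actually scored is not stated in the source; exact-match on a single label per case is an assumption, and `LabelledCase`, `accuracy`, and `goNoGo` are hypothetical names. Only the 95% threshold comes from the text above.

```typescript
interface LabelledCase {
  id: string;
  expected: string;  // label assigned by a human reviewer
  predicted: string; // label produced by the pipeline
}

interface GateResult {
  accuracy: number;
  go: boolean;
  failures: string[]; // ids of mismatched cases, for manual inspection
}

// Fraction of cases where the prediction exactly matches the human label.
function accuracy(cases: LabelledCase[]): number {
  if (cases.length === 0) {
    throw new Error("golden dataset is empty");
  }
  const correct = cases.filter((c) => c.predicted === c.expected).length;
  return correct / cases.length;
}

// Pass the gate only if accuracy meets or exceeds the threshold.
function goNoGo(cases: LabelledCase[], threshold: number = 0.95): GateResult {
  const acc = accuracy(cases);
  const failures = cases.filter((c) => c.predicted !== c.expected).map((c) => c.id);
  return { accuracy: acc, go: acc >= threshold, failures };
}
```

Returning the failing case ids alongside the verdict gives reviewers a concrete list to inspect when the gate fails, instead of only a percentage.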

Architecture

System overview

Regulatory Docs: MiFID II, PSD2, internal

Chunking + Embedding: OpenAI text-embedding-3

Pinecone Index: semantic retrieval

Reranking + Reasoning: GPT-4o with citations

Compliance Report: Slack + Next.js dashboard
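The chunking step at the start of this flow could look like the sketch below. The actual chunking strategy is not described in the source; fixed-size word windows with overlap are an assumption chosen for simplicity, and `DocChunk` and `chunkWords` are hypothetical names.

```typescript
interface DocChunk {
  docId: string; // which source document the chunk came from
  index: number; // position of the chunk within that document
  text: string;
}

// Split text into windows of `size` words, where consecutive windows share `overlap` words.
function chunkWords(docId: string, text: string, size: number, overlap: number): DocChunk[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("need size > 0 and 0 <= overlap < size");
  }
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const step = size - overlap;
  const chunks: DocChunk[] = [];
  for (let start = 0; start < words.length; start += step) {
    chunks.push({
      docId,
      index: chunks.length,
      text: words.slice(start, start + size).join(" "),
    });
    // Stop once a window has reached the end, to avoid a redundant trailing chunk.
    if (start + size >= words.length) break;
  }
  return chunks;
}
```

Keeping `docId` and `index` on every chunk lets a retrieved passage be traced back to its place in the source document, which any clause-level citation depends on. The overlap reduces the chance that a sentence split across a window boundary is missed by retrieval.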

Node.js · OpenAI API · Pinecone · Next.js · AWS

“The accuracy was the thing that surprised us most. We had budgeted for 80% and planned to have humans review the rest. Getting 99.2% changes everything — it means we can actually scale compliance without scaling headcount.”

— Head of Compliance — FinTech, Germany (confidential)
