Ingress/Case Studies/Financial Services Data Platform

Unified data platform for a top-30 financial services firm.

14 siloed source systems. Manual regulatory reporting consuming 12 analysts and three weeks per cycle. We consolidated everything into a single Snowflake platform and cut reporting time to under three days.

Financial Services Data Platform Engineering SOC 2 Type II 6-month delivery
Compliance posture
SOC 2 Type II
Audit-ready from day one
Reporting cycle
3 weeks to 3 days
BCBS 239 and CCAR automation
6mo
Time to Production
12TB
Data Unified
60%
Faster Regulatory Reporting
Engagement Overview

14 source systems, no source of truth.

A top-30 financial services firm was operating trading systems, risk models, CRM, core banking, and nine additional source systems with no integration layer and no single authoritative data source. Regulatory reporting (CCAR, BCBS 239) was entirely manual: 12 analysts running spreadsheet-based reconciliations for three weeks every reporting cycle.

Data lineage was undocumented, which created audit exposure. When examiners asked how a number was derived, the answer required manually tracing through multiple systems. The firm had been flagged in an internal audit for lineage gaps and needed to demonstrate a remediation path to the board risk committee.

Engagement snapshot
Client typeTop-30 financial services firm
SegmentCapital markets and banking
Duration6 months to production
PlatformSnowflake on AWS
ComplianceSOC 2 Type II, BCBS 239, CCAR
AI layerOtonmi (NL-to-SQL analytics)
The Challenge

Manual reporting. Audit exposure. No lineage.

The reporting problem was operationally painful: 12 analysts locked in a three-week sprint every quarter, running reconciliations that should take hours. But the audit exposure was the harder issue. BCBS 239 requires documented data lineage for all risk-aggregation inputs. Without it, the firm was carrying regulatory risk that examiners had already noticed.

The 14 source systems also had inconsistent data definitions for the same concepts. "Trade date" meant different things in three systems. "Credit exposure" was calculated three different ways. Building a unified platform required resolving these semantic conflicts before any technology decisions, or the consolidation would just move the reconciliation problem from spreadsheets to SQL.

  • 14 siloed source systems with inconsistent data definitions for core concepts like trade date, exposure, and counterparty.
  • 3-week manual reporting cycle consuming a 12-analyst team every quarter for CCAR and BCBS 239 submissions.
  • Undocumented data lineage flagged in an internal audit, creating examiner exposure and a board-level remediation commitment.
  • SOC 2 Type II gap on data access controls across the fragmented environment, with no unified governance model in place.
The Work

Semantics first, then the platform.

Before writing any Snowflake SQL, we spent three weeks on data modeling workshops with source system owners to resolve semantic conflicts. Every concept that appeared in more than one system got a canonical definition, a data owner, and a contract. The platform was built on top of agreed facts, not assumptions.

01.
Data modeling and semantic resolution
Three-week discovery and modeling engagement with owners of all 14 source systems. Output: canonical data dictionary, data ownership matrix, and data contracts for every concept crossing system boundaries. This step prevented the reconciliation problem from moving downstream into the platform.
Data contractsSemantic layer
โ†’
02.
Snowflake platform medallion architecture
Snowflake data warehouse built with a Bronze (raw ingestion), Silver (cleansed and conformed), and Gold (business-ready, report-ready) medallion architecture. dbt handled all transformations with automated documentation and column-level lineage. Every transformation traceable from source to report.
SnowflakedbtMedallion arch
โ†’
03.
Regulatory reporting automation
CCAR and BCBS 239 report generation automated from the Gold layer. Reporting cycle reduced from manual three-week analyst sprint to automated run with human review, completing in under three days. BCBS 239 lineage documented and examiner-ready at the column level for all risk aggregation inputs.
BCBS 239CCARAutomated reporting
โ†’
04.
Governance, SOC 2 and AI analytics
Snowflake data governance built with role-based access, row-level security, and dynamic data masking for PII-adjacent fields. SOC 2 Type II controls documented against the platform. Power BI semantic layer for executive dashboards. Otonmi AI layer added for natural language analytics, letting analysts query in English against the Gold layer.
SOC 2Power BIOtonmi AI
โ†’
Results

In production in six months. Audit-ready on day one.

The unified Snowflake platform went to production in six months. The quarterly regulatory reporting cycle, previously a three-week manual sprint consuming 12 analysts, now runs in under three days as an automated process with human review. BCBS 239 data lineage is fully documented at the column level, resolving the examiner exposure flagged in the prior audit.

Annual analyst time savings exceed 2,400 hours. SOC 2 Type II report was issued at the end of year one. Executive dashboards in Power BI gave the CFO and Chief Risk Officer real-time views of key risk and performance metrics that previously required a data request with a 72-hour turnaround. The Otonmi NL-to-SQL analytics layer let non-technical business users query the platform directly, removing the backlog from the data team.

Reporting cycle from 3 weeks to 3 days

Quarterly CCAR and BCBS 239 submissions now run as automated processes. The 12-analyst sprint is a historical reference, not an operating model.

BCBS 239 lineage documented at column level

Every risk aggregation input traceable from source table to regulatory output. Examiner exposure from prior audit fully remediated.

SOC 2 Type II report issued at year one

Data access controls, change management, and availability SLAs documented and tested. Type II report issued with no exceptions noted.

2,400+ analyst hours saved annually

Time previously spent on manual reporting reconciliation redirected to analysis, model validation, and strategic work.

Stack and Frameworks

What we built with.

Data platform

Snowflake on AWS.

Medallion architecture (Bronze, Silver, Gold). Snowflake Data Sharing for cross-team access. Dynamic data masking and row-level security for SOC 2 and PCI-adjacent data.

SnowflakeAWSMedallion arch
Transformation and lineage

dbt and data contracts.

dbt Core for all transformations with automated documentation and column-level lineage tracking. dbt Data Contracts for inter-team source agreements. Airflow for orchestration.

dbtAirflowData contracts
Analytics and AI

Power BI and Otonmi.

Power BI semantic layer for executive and risk dashboards. Otonmi natural-language-to-SQL analytics for self-serve business user queries against the Gold layer without data team involvement.

Power BIOtonmi AINL-to-SQL
Related

More from this practice.

Start a conversation

Building a financial data platform?

// 30 minutes to a written brief.

Bring the regulatory constraint or platform challenge. We return with a written approach, phased plan, and honest cost ranges. SOC 2, PCI, and audit-ready from day one.

Emailconnect@ingressits.com
GSA MAS#47QTCA26D000K
Reply< 24 hrs