Data Extraction and Document Intelligence Systems
Integrated government portals, financial platforms, and document processing pipelines to transform fragmented external data into structured APIs and product workflows.
Overview
Built systems that collected, normalized, and exposed financial data from a wide range of external sources. The work combined large-scale web automation, document parsing, certificate-based authentication, and frontend tooling to transform fragmented data into reliable product experiences.
Problem
Financial data was distributed across numerous external systems with different authentication methods, technologies, and formats. Some platforms exposed modern APIs, while others relied on legacy SOAP services, digital certificates, server-rendered HTML, or complex JavaScript applications with anti-automation protections. The challenge was keeping integrations reliable while those sources changed frequently and returned inconsistent data structures.
Approach
Developed scraping and automation pipelines that interacted with government portals, banking systems, and financial platforms built on a variety of technologies, including Angular applications, traditional HTML interfaces, and SOAP-based services. Worked extensively with digital certificates, TLS authentication flows, and browser automation to access protected data sources. Built document processing systems that extracted and normalized information from PDFs, XML files, JSON payloads, and other financial documents. Exposed the resulting data through structured APIs and product interfaces, so customers could reach complex information through simple workflows.
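To illustrate the normalization step described above, here is a minimal sketch that maps a JSON payload and an XML document onto one shared record type. The `Record` fields, the function names, and the payload shapes (`id`, `amount`, `currency`, `<Reference>`, `<Amount>`) are assumptions made for the example, not the formats of any real portal or the schema actually used in this project.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Record:
    """Hypothetical common shape every source is normalized into."""
    source: str
    reference: str
    amount: Decimal
    currency: str


def from_json(payload: str, source: str) -> Record:
    # parse_float=Decimal avoids binary floating-point rounding on money values.
    data = json.loads(payload, parse_float=Decimal)
    return Record(
        source=source,
        reference=str(data["id"]),
        amount=Decimal(data["amount"]),
        currency=data["currency"].upper(),
    )


def from_xml(payload: str, source: str) -> Record:
    # Assumed layout: <Doc><Reference>..</Reference><Amount currency="..">..</Amount></Doc>
    root = ET.fromstring(payload)
    amount = root.find("Amount")
    if amount is None or amount.text is None:
        raise ValueError("missing <Amount> element")
    return Record(
        source=source,
        reference=(root.findtext("Reference") or "").strip(),
        amount=Decimal(amount.text.strip()),
        currency=(amount.get("currency") or "").upper(),
    )
```

With one parser per source format and a single output type, downstream APIs only ever deal with `Record`, so adding a new source in this sketch means adding one more `from_*` function.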
Outcomes
- Integrated multiple external financial and government platforms through automated data extraction.
- Handled diverse authentication mechanisms including certificates, TLS-based flows, and legacy protocols.
- Processed and normalized data from PDFs, XML documents, JSON payloads, and web portals.
- Reduced manual collection and verification of financial information.
- Made previously fragmented external data accessible through product APIs and user interfaces.
- Created reusable integration patterns that accelerated the development of new financial workflows.
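The certificate-based authentication mentioned in the outcomes can be sketched with Python's standard `ssl` module. This is an illustrative example under stated assumptions, not the project's actual code: the function name and the certificate and key paths are hypothetical, and real portals may require additional steps such as hardware tokens or PKCS#12 conversion.

```python
import ssl
from typing import Optional


def build_client_context(
    cert_file: Optional[str] = None,
    key_file: Optional[str] = None,
) -> ssl.SSLContext:
    """Build a TLS context that verifies the server and, when a certificate
    is supplied, presents it as the client identity."""
    # Default context: verifies the server certificate and checks the hostname.
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    # Refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    if cert_file is not None:
        # cert_file and key_file are hypothetical PEM paths supplied by the caller.
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context
```

The returned context can be passed to standard-library clients, for example `http.client.HTTPSConnection(host, context=context)`, so the same helper could serve several integrations that differ only in which certificate they present.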