Senior Data Engineer → AI Platform Engineer
Databricks Lakehouse · Azure · LLM Pipelines · Agentic RAG · Banking · Cybersecurity
I’m a Senior Data Engineer with 10+ years of experience building production-grade data platforms across banking, cybersecurity, risk technology, and healthcare.
My work has focused on scalable lakehouse architectures, governed data pipelines, metadata-driven frameworks, and platform reliability. I’m now applying that same foundation to AI platform engineering by building LLM pipelines, retrieval systems, and agent-based workflows on top of strong data infrastructure.
Most AI systems fail because the data layer is broken. I build both.
Security data lakehouse + AI analysis. Ingests multi-source security telemetry into a medallion architecture and enables natural-language investigation of enterprise security posture.
Text-to-SQL platform. Includes prompt versioning, SQL validation, automated evaluation, and feedback loops for continuous improvement.
Pipeline monitoring + anomaly detection. Detects schema drift, volume drops, and SLA breaches, then generates AI-assisted root cause analysis.
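To make the monitoring checks named above concrete, here is a minimal sketch of schema drift and volume drop detection in plain Python. It is an illustration under stated assumptions, not the project's actual code: `BatchStats`, `detect_schema_drift`, `detect_volume_drop`, the 50% drop threshold, and the sample column names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BatchStats:
    """Hypothetical per-run summary of a pipeline batch."""
    columns: dict[str, str]  # column name -> data type
    row_count: int


def detect_schema_drift(expected: dict[str, str], observed: dict[str, str]) -> list[str]:
    """Return readable differences between two column -> type mappings."""
    issues = []
    for name in sorted(expected.keys() - observed.keys()):
        issues.append(f"missing column: {name}")
    for name in sorted(observed.keys() - expected.keys()):
        issues.append(f"unexpected column: {name}")
    for name in sorted(expected.keys() & observed.keys()):
        if expected[name] != observed[name]:
            issues.append(f"type change: {name} {expected[name]} -> {observed[name]}")
    return issues


def detect_volume_drop(baseline_rows: int, current_rows: int, max_drop: float = 0.5) -> bool:
    """Flag a batch whose row count fell by more than max_drop relative to baseline."""
    if baseline_rows <= 0:
        return False  # no usable baseline, so nothing to compare against
    return (baseline_rows - current_rows) / baseline_rows > max_drop


# Assumed sample data: yesterday's batch versus today's batch.
baseline = BatchStats({"event_id": "string", "ts": "timestamp", "severity": "int"}, 10_000)
current = BatchStats({"event_id": "string", "ts": "string", "source": "string"}, 3_200)

drift = detect_schema_drift(baseline.columns, current.columns)
volume_alert = detect_volume_drop(baseline.row_count, current.row_count)
```

In this sketch the list of drift messages and the volume flag would be the inputs handed to a downstream root cause step; how that step works is outside what the example assumes.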
Enterprise RAG capstone for private equity due diligence. Built to surface contradictions across earnings calls, SEC filings, and other research sources using multi-source retrieval and agentic reasoning.
Stack: Knowledge Graph · SEC EDGAR · USPTO API · LangGraph · Agentic AI · Multimodal RAG
Domain-specific RAG system with adaptive retrieval and multi-step reasoning. Includes query rewriting, corrective retrieval patterns, hallucination grading, semantic caching, and evaluation with RAGAS.
Stack: LangGraph · LangChain · Chroma · OpenAI API · RAGAS · Pydantic
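The corrective retrieval pattern mentioned above (retrieve, grade the results, rewrite the query and retry when the grade fails) can be sketched without any framework. This is a toy illustration, not the project's implementation: the word-overlap retriever and grader, the synonym-based rewriter, and every function name here are hypothetical stand-ins for the vector search and LLM calls a real system would use.

```python
def overlap(query: str, doc: str) -> int:
    """Count lowercase words shared by the query and a document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))


def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Toy retriever: return the k documents with the most shared words."""
    return sorted(corpus, key=lambda doc: overlap(query, doc), reverse=True)[:k]


def grade(query: str, docs: list[str], min_overlap: int = 2) -> bool:
    """Toy relevance grader: pass if any document shares enough words."""
    return any(overlap(query, doc) >= min_overlap for doc in docs)


def rewrite(query: str, synonyms: dict[str, str]) -> str:
    """Toy query rewriter: replace each known term with its synonym."""
    return " ".join(synonyms.get(word, word) for word in query.lower().split())


def corrective_retrieve(query: str, corpus: list[str], synonyms: dict[str, str],
                        max_attempts: int = 2) -> tuple[str, list[str]]:
    """Retrieve, grade, and retry with a rewritten query if the grade fails."""
    attempt = query
    for _ in range(max_attempts):
        docs = retrieve(attempt, corpus)
        if grade(attempt, docs):
            return attempt, docs
        attempt = rewrite(attempt, synonyms)
    return attempt, []  # give up: no attempt produced relevant documents


# Assumed sample corpus and synonym table.
corpus = [
    "delta lake supports time travel queries",
    "chroma stores embedding vectors",
]
synonyms = {"lakehouse": "delta lake", "rollback": "time travel"}

final_query, docs = corrective_retrieve("lakehouse rollback", corpus, synonyms)
```

Here the first attempt shares no words with either document, so the grader rejects it and the rewritten query is tried on the second pass.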
- Data platform architecture for analytics and AI workloads
- Pipeline reliability, observability, and governance
- Retrieval systems and agent-based application design
- LLMOps, evaluation, and production readiness
Data Platforms
Databricks · Delta Lake · PySpark · Snowflake · Azure Data Factory · ADLS Gen2
Cloud & DevOps
Azure · Docker · GitHub Actions · Azure DevOps · Azure Functions
Languages
Python · SQL · T-SQL · Spark SQL
Vector Databases
Pinecone · Qdrant · Chroma
Agent Frameworks
LangGraph · LangChain
AI Developer Tools
Claude Code · Codex · GitHub Copilot
LLMOps / Evaluation
MLflow · RAGAS
Databases
PostgreSQL · Oracle · SQL Server · Cosmos DB
Microsoft
- Azure Data Engineer Associate (DP-203)
- Azure Data Scientist Associate (DP-100)
- Fabric Analytics Engineer (DP-600)
- Power BI Data Analyst (PL-300)
- Azure AI Fundamentals (AI-900)
- Azure Data Fundamentals (DP-900)
- Azure Fundamentals (AZ-900)
Databricks
- Lakehouse Fundamentals
- Building Single-Agent Apps on Databricks
- GenAI App Deployment and Monitoring
- Building Retrieval Agents on Databricks
- Data Engineer Professional (in progress)
Anthropic
- Claude Code 101
- Claude Code Architect Foundations (in progress)
- Building with the Claude API (in progress)
- Introduction to Agent Skills (in progress)
- Databricks Mosaic AI
- Azure AI Foundry
- LLMOps and evaluation frameworks
- Agentic system design
- Enterprise RAG architecture
- LinkedIn: linkedin.com/in/roopmathi
- Email: roopmathi.gj@gmail.com
- Location: New Jersey
- Work Authorization: Canadian citizen, eligible for the TN visa pathway