About The Project
Join Neurons Lab as a Senior GCP Data Architect working on banking data lake and reporting systems for large financial institutions. This is an end-to-end role where you'll start with presales and architecture (gathering requirements, designing solutions, establishing governance frameworks), then progress to implementing your designs through to MVP delivery.
Our Focus: Banking and Financial Services clients with stringent regulatory requirements (Basel III, MAS TRM, PCI-DSS, GDPR). You'll architect data lake solutions for critical use cases like AML reporting, KYC data management, and regulatory compliance - ensuring robust data governance, metadata management, and data quality frameworks.
Your Impact: Design end-to-end data architectures combining GCP data services (BigQuery, Dataflow, Data Catalog, Dataplex) with on-premise systems (e.g., Oracle). Establish data governance frameworks with cataloging, lineage, and quality controls. Then build your designs, implementing data pipelines and governance tooling and delivering working MVPs for mission-critical banking systems.
Duration: Part-time, long-term engagement with project-based allocations
Reporting: Directly to the Head of Cloud
Objective
Design and deliver data lake solutions for banking clients on Google Cloud Platform:
- Architecture Excellence: Design data lake architectures, create technical specifications, lead requirements gathering and solution workshops
- MVP Implementation: Build your designs, implementing data pipelines, deploying governance frameworks, and delivering working MVPs with data quality controls built in
- Data Governance: Establish and implement comprehensive governance frameworks including metadata management, data cataloging, data lineage, and data quality standards
- Client Success: Own the full lifecycle from requirements to MVP delivery, ensuring secure, compliant, scalable solutions aligned with banking regulations and GCP best practices
- Knowledge Transfer: Create reusable architectural patterns, data governance blueprints, implementation code, and comprehensive documentation
KPI
- Design data architectures with comprehensive documentation and a governance framework
- Deliver MVPs from architecture through to working implementation
- Establish data governance implementations including metadata catalogs, lineage tracking, and quality monitoring
- Achieve 80%+ client acceptance rate on proposed data architectures and technical specifications
- Implement data pipelines with data quality controls and comprehensive monitoring
- Create reusable architectural patterns and IaC modules for banking data lakes and regulatory reporting systems
- Document solutions aligned with banking regulations (Basel III, MAS TRM, AML/KYC requirements)
- Deliver cost models and ROI calculations for data lake implementations
Areas of Responsibility
Phase 1: Data Architecture & Presales
- Elicit and document requirements for data lake, reporting systems, and analytics platforms
- Design end-to-end data architectures: ingestion patterns, storage strategies, processing pipelines, consumption layers
- Create architecture diagrams, data models (dimensional, data vault), technical specifications, and implementation roadmaps
- Data Governance Design: metadata management frameworks, data cataloging strategies, data lineage implementations, and data quality monitoring
- Evaluate technology options and recommend optimal GCP and on-premise data services for specific banking use cases
- Calculate ROI, TCO, and cost-benefit analyses for data lake implementations (see the cost-model sketch after this list)
- Banking Domain: Design solutions for AML reporting, KYC data management, regulatory compliance, risk reporting
- Hybrid Cloud Architecture: Design integration patterns between GCP and on-premise platforms (e.g., Oracle, SQL Server)
- Security & Compliance Architecture: IAM, VPC Service Controls, encryption, data residency, audit logging
- Participate in presales activities: technical presentations, client workshops, demos, proposal support
- Create detailed implementation roadmaps and technical specifications for development teams
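To make the cost-modeling work concrete, below is a minimal sketch of a BigQuery cost estimate in Python. The rate constants are placeholder assumptions, not current GCP pricing; a real model would pull rates from the published pricing pages and account for slot reservations, storage tiers, and network egress.

```python
# Illustrative BigQuery cost model. The rates are PLACEHOLDER ASSUMPTIONS,
# not current GCP pricing; verify against cloud.google.com/bigquery/pricing.
STORAGE_USD_PER_TIB_MONTH = 20.0   # assumed active-storage rate
QUERY_USD_PER_TIB_SCANNED = 6.25   # assumed on-demand analysis rate

def monthly_cost(storage_tib: float, tib_scanned_per_month: float) -> float:
    """Rough monthly cost: storage held plus bytes scanned by queries."""
    return (storage_tib * STORAGE_USD_PER_TIB_MONTH
            + tib_scanned_per_month * QUERY_USD_PER_TIB_SCANNED)

# Example: a 50 TiB lake whose reporting queries scan 200 TiB per month.
print(f"Estimated monthly cost: ${monthly_cost(50, 200):,.2f}")
```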
Phase 2: MVP Implementation & Delivery
- Build production data pipelines based on approved architectures
- Implement data warehouses: schema creation, partitioning, clustering, optimization, and security setup (see the BigQuery sketch after this list)
- Deploy data governance frameworks: Data Catalog configuration, metadata tagging, lineage tracking, quality monitoring
- Develop data ingestion patterns from on-premise systems
- Write production-grade data transformations, validation logic, and business rule implementations
- Develop Python applications for data processing automation, quality checks, and orchestration
- Build data quality frameworks with validation rules, anomaly detection, and alerting (see the validation sketch after this list)
- Create sample dashboards and reports for business stakeholders
- Implement CI/CD pipelines for data pipeline deployment using Terraform
- Deploy monitoring, logging, and alerting for data pipelines and workloads
- Tune performance and optimize costs for production data workloads
- Document implementation details, operational runbooks, and knowledge transfer materials
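As an illustration of the warehouse setup described above, here is a minimal sketch that creates a partitioned, clustered BigQuery table with the google-cloud-bigquery client. The project, dataset, table, and column names are hypothetical.

```python
from google.cloud import bigquery

client = bigquery.Client()  # uses Application Default Credentials

# Hypothetical project/dataset/table and schema, for illustration only.
table = bigquery.Table(
    "my-project.banking_dw.transactions",
    schema=[
        bigquery.SchemaField("txn_id", "STRING", mode="REQUIRED"),
        bigquery.SchemaField("customer_id", "STRING", mode="REQUIRED"),
        bigquery.SchemaField("txn_ts", "TIMESTAMP", mode="REQUIRED"),
        bigquery.SchemaField("amount", "NUMERIC"),
    ],
)
# Daily time partitioning keeps regulatory-report scans cheap,
# and clustering on customer_id speeds per-customer AML lookups.
table.time_partitioning = bigquery.TimePartitioning(
    type_=bigquery.TimePartitioningType.DAY, field="txn_ts"
)
table.clustering_fields = ["customer_id"]

client.create_table(table)  # raises Conflict if the table already exists
```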
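And a minimal, dependency-light sketch of the kind of validation rule the data quality item refers to. Thresholds and column names are assumptions; a production framework would route failures to alerting and quarantine bad records rather than print them.

```python
import pandas as pd

def check_transactions(df: pd.DataFrame) -> list[str]:
    """Run simple validation rules; return human-readable failure messages."""
    failures = []
    if df["txn_id"].duplicated().any():   # uniqueness rule
        failures.append("duplicate txn_id values found")
    if df["amount"].isna().any():         # completeness rule
        failures.append("null amounts found")
    # Crude anomaly detection: flag amounts beyond 4 standard deviations.
    z = (df["amount"] - df["amount"].mean()) / df["amount"].std()
    if (z.abs() > 4).any():
        failures.append("outlier amounts beyond 4 sigma")
    return failures

df = pd.DataFrame({"txn_id": ["a", "b", "b"], "amount": [10.0, None, 12.0]})
for failure in check_transactions(df):
    print("DQ FAILURE:", failure)  # a real framework would alert/page here
```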
Skills & Knowledge
Certifications & Core Platform:
- GCP Professional Cloud Architect (strong plus, not mandatory) - demonstrates GCP expertise
- GCP Professional Data Engineer (alternative certification)
- Core GCP data services: BigQuery, Dataflow, Pub/Sub, Data Catalog, Dataplex, Dataform, Composer, Cloud Storage, Data Fusion
Must-Have Technical Skills:
- Data Architecture (expert level) - data lakes, lakehouses, data warehouses, modern data architectures
- Data Governance (expert level) - metadata management, data cataloging, data lineage, data quality frameworks, hands-on implementation
- SQL (advanced to expert level) - production-grade queries, complex transformations, window functions, CTEs, query optimization, performance tuning (see the windowed-query example after this list)
- Data Modeling (expert level) - dimensional modeling, data vault, entity-relationship, schema design patterns for banking systems
- ETL/ELT Implementation (advanced level) - production data pipelines using Dataflow (Apache Beam), Dataform, and Composer, plus orchestration (see the Beam sketch after this list)
- Python (advanced level) - production data applications, pandas/numpy for data processing, automation, scripting, testing
- Data Quality (advanced level) - validation frameworks, monitoring strategies, anomaly detection, automated testing
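To make the SQL bar concrete, here is a sketch of the kind of windowed query the role involves, executed through the BigQuery Python client. Table and column names are hypothetical; the query computes a trailing 7-day transaction total per customer, a common AML velocity signal.

```python
from google.cloud import bigquery

# Hypothetical table and columns. BigQuery RANGE frames require a numeric
# ORDER BY expression, hence UNIX_SECONDS over the timestamp column.
QUERY = """
WITH txn AS (
  SELECT customer_id, txn_ts, amount
  FROM `my-project.banking_dw.transactions`
)
SELECT
  customer_id,
  txn_ts,
  SUM(amount) OVER (
    PARTITION BY customer_id
    ORDER BY UNIX_SECONDS(txn_ts)
    RANGE BETWEEN 604800 PRECEDING AND CURRENT ROW  -- trailing 7 days
  ) AS rolling_7d_total
FROM txn
"""

client = bigquery.Client()
for row in client.query(QUERY).result():
    print(row.customer_id, row.rolling_7d_total)
```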
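Likewise, a minimal Apache Beam sketch of the Dataflow-style pipeline work: read newline-delimited JSON from Cloud Storage, parse it, and load it into BigQuery. The bucket, table, and field names are assumptions; locally this runs on the DirectRunner, and on Dataflow when passed --runner=DataflowRunner plus project, region, and temp_location options.

```python
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

def parse(line: str) -> dict:
    """Parse one NDJSON record, keeping only the fields we load."""
    rec = json.loads(line)
    return {"txn_id": rec["txn_id"], "amount": float(rec["amount"])}

options = PipelineOptions()  # add Dataflow options here for a remote run
with beam.Pipeline(options=options) as p:
    (
        p
        | "Read" >> beam.io.ReadFromText("gs://my-bucket/txns/*.json")
        | "Parse" >> beam.Map(parse)
        | "Load" >> beam.io.WriteToBigQuery(
            "my-project:banking_dw.transactions_stg",  # hypothetical table
            schema="txn_id:STRING,amount:FLOAT",
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    )
```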
BFSI Domain Knowledge (MANDATORY):
- Banking data domains: AML (Anti-Money Laundering), KYC (Know Your Customer), regulatory reporting, risk management
- Financial regulations: Basel III, MAS TRM (Monetary Authority of Singapore Technology Risk Management), PCI-DSS, GDPR
- Understanding of banking data flows, reporting requirements, and compliance frameworks
- Experience with banking data models and financial services data architecture
Strong Plus:
- On-premise data platforms: Oracle, SQL Server, Teradata
- Data quality tools: Great Expectations, Soda, dbt tests, custom validation frameworks
- Visualization tools: Looker, Looker Studio, Tableau, Power BI
- Infrastructure as Code: Terraform for GCP data services
- Streaming data processing: Pub/Sub, Dataflow streaming, Kafka integration
- Vector databases and search: Vertex AI Vector Search, Elasticsearch (for GenAI use cases)
Communication:
- Advanced English (written and verbal)
- Client-facing presentations, workshops, and requirement gathering sessions
- Technical documentation and architecture artifacts (diagrams, specifications, data models)
- Stakeholder management and cross-functional collaboration
Experience
- 7+ years in data architecture, data engineering, or solution architecture roles
- 4+ years hands-on with GCP data services (BigQuery, Dataflow, Data Catalog, Dataplex) - production implementations
- 3+ years in data governance (MANDATORY) - metadata management, data lineage, data quality frameworks, data cataloging
- 3+ years in BFSI/Banking domain (MANDATORY) - AML, KYC, regulatory reporting, compliance requirements
- 5+ years with SQL and relational databases - complex query writing, optimization, performance tuning
- 3+ years in data modeling - dimensional modeling, data vault, or other data warehouse methodologies
- 2+ years in presales/architecture roles - requirements gathering, solution design, client presentations
- Experience with on-premise data platforms (MANDATORY) - e.g., Teradata, Oracle, or SQL Server integration with cloud