Senior Data Architect

Kab systems
Washington, DC, USA
Full-time
AI
Posted 3 days ago

Job Description

**JOB OVERVIEW**

We are seeking a hands-on **Senior Data Architect** to lead the design, optimization, and modernization of our enterprise data infrastructure. In this role, you will bridge the gap between high-level architectural strategy and hands-on implementation. You will work closely with federal and contracting staff to leverage and expand our existing cloud ecosystems on both AWS and Microsoft Azure. You will be required to implement a modern Medallion Architecture and Data Fabric strategy to deliver a shared data and analytics services platform to support other analytics teams and business users.

This role owns the security, automation, and discoverability foundations of the data platform, designing and documenting Role-Based Access Control and provisioning strategies to secure the data lake, warehouses, and pipelines. It drives data model change control through automated CI/CD pipelines and Infrastructure as Code, while building a cloud-based data cataloging and metadata repository to enable self-service discovery across diverse data sources. The role also integrates downstream BI tools like Tableau, Power BI, Qlik, and Socrata with the data lakehouse, and authors clear technical documentation (such as Data Access Approach documents and architectural blueprints) to support development and design work.

**RESPONSIBILITIES**

**Architectural Leadership & Strategy**

● **Target State Delivery:** Lead the logical and physical design of data warehouses, data lakes, and data marts, ensuring a robust, secure, and optimized architecture.
● **Requirements Gathering:** Assist in soliciting, assessing, and translating complex data requirements from diverse data consumers, including analysts, visualization specialists, programmers, and engineers.
● **Platform Governance & Expansion:** Administer, govern, and expand the program's Databricks platform (hosted in AWS), including performance tuning, configuration, and structural growth.
● **Governance Integration:** Support the onboarding, configuration, and eventual administration of the **Collibra** data governance platform to establish enterprise-wide data standards.
● **AI/ML Enablement:** Engage with and evaluate emerging AI, Machine Learning, and advanced analytics tools (including LLM-based analytics, vector databases, or AI agent frameworks) to inform forward-looking architecture decisions.

**Data Engineering & Migration Execution**

● **Hands-on Implementation:** Actively deliver solutions through the creation of tables, relationships, optimized schemas, and scheduled data processing jobs.
● **Medallion Migration:** Execute the migration of legacy data pipelines to a modern **Medallion Architecture** (Bronze, Silver, Gold processing states) within Databricks.
● **Pipeline Modernization:** Migrate legacy data sources (e.g., AWS EDMAS PostgreSQL) to Databricks Bronze and rewrite existing AWS Glue jobs into optimized **PySpark/Databricks** jobs.
● **Core Engineering Components:** Build and implement foundational pipeline components, including comprehensive error handling, data quality checks, and ETL audit logs across critical subject areas.
● **Data Preparation & Quality:** Enrich data repositories to support Machine Learning (ML), Deep Learning (DL), and Natural Language Processing (NLP) using advanced data science techniques such as data cleaning, preprocessing, standardization, and Master Data Management (MDM).
● **System Integration:** Seamlessly connect heterogeneous data systems and manage data replication using **MuleSoft** as the primary integration tool.
● **Technical Documentation:** Author, maintain, and guide complete, high-quality technical documentation deliverables (including Data Access Approach documents and architectural blueprints) for all development and design efforts.

**REQUIRED SKILLS & QUALIFICATIONS**

● **Experience:** 10+ years of dedicated Data Warehousing and Data Architect experience.
● **Core Paradigms:** Deep mastery of logical/physical modeling, Medallion Architecture, Data Fabric designs, and building transactional application models.
● **Advanced Databricks:** Strong hands-on expertise with advanced Databricks features, including Unity Catalog, Delta Live Tables (DLT), and Lakeflow/pipeline orchestration frameworks.
● **Cloud Components:** Proven track record architecting data lake and data warehouse models using cloud infrastructure components (including AWS S3, AWS Redshift, AWS Aurora PostgreSQL).
● **Technical Documentation:** 5+ years of experience defining, guiding, and delivering comprehensive technical documentation for large-scale engineering efforts.

**PREFERRED SKILLS & QUALIFICATIONS**

● **Enterprise Governance Platforms:** Direct experience working with **Collibra** or comparable enterprise data governance and cataloging tools (e.g., Alation, Informatica, Ataccama).
● **Modern DevOps & BI:** Experience utilizing Infrastructure as Code (IaC) for data infrastructure deployment and integrating BI tools (e.g., Tableau, Power BI, Socrata).
● **Domain Experience:** Prior experience delivering data systems within **Federal Government** or Department of Defense (DoD) environments.

Pay: From $145,000.00 per year

Benefits:

* Dental insurance
* Employee assistance program
* Flexible schedule
* Health insurance
* Health savings account
* Paid time off
* Parental leave
* Professional development assistance
* Retirement plan
* Vision insurance

Work Location: Hybrid remote in Washington, DC 20590