Sacramento, California, United States
Experienced Data Engineer with a demonstrated history of learning technology and business models quickly. Skilled in SQL, Data Modeling, ETL, dbt, Data Visualization, Enterprise and Cloud-based Modern Data Stack. Passionate about transforming data from source to insights.
- Develop dbt data pipelines on AWS Redshift, efficiently delivering data for the company’s 2nd largest revenue stream
- Optimize AWS Redshift SQL code by analyzing and removing bottlenecks, improving runtime across a 200+ model project for an enterprise-wide AWS Data Lakehouse modernization effort
- Engineer an automated data quality solution (dbt, star schema model, Power BI dashboard), reducing a daily 6+ hour manual process to 6 minutes and providing data quality metrics to 3 teams
- Lead code reviews and approve Git pull requests for a 5+ person team, ensuring code alignment with dbt best practices and data governance standards
- Design scope and technical requirements for a code migration effort, generating cost savings by consolidating 2 dbt projects into 1
- Build dimensional data models to provide data for 2 new Power BI dashboards, resulting in highly performant models and accurate data for 32 clubs
- Develop data pipelines to transform data from AWS S3 into dimensionally modeled tables, resulting in accurate and performant data ready for consumption in Looker dashboards and self-service analytics
- Leverage REST API calls to ingest metadata from a legacy Matillion ETL project to identify and remove unused components, resulting in an improved development team workflow
- Redesign DDL for 1 trillion+ row tables to ensure alignment with AWS Redshift table design best practices, resulting in optimized performance and minimized job runtimes
- Provide technical guidance for a project to migrate from Matillion ETL to Data Build Tool (dbt), aiding in the successful implementation of dbt
- Conduct GitHub-based code reviews to ensure alignment of committed code with dbt coding best practices
- Commit code changes to an AWS CloudFormation-based Infrastructure as Code platform, successfully building new AWS Redshift clusters to increase the compute power of our data warehouse
- Develop SQL scripts to identify data gaps across multiple tables at a time, allowing our team to more effectively target the root cause of downstream issues
- Conduct root cause analysis to determine the source of data errors in the pipeline and implement necessary changes to prevent future occurrences
• Develop data models and ETL data pipelines integrating product data and 10+ business systems for company-wide use (AWS S3, Fivetran, Snowflake, dbt, Looker)
• Lead dbt project implementation and promote best practices to enable effective team collaboration and data development operations
• Establish automated data quality testing to allow proactive identification and correction of inaccurate data
• Advise on data modeling and data infrastructure changes to drive best practices and optimize performance
• Provide in-depth technical troubleshooting to identify and resolve company-wide data issues
• Collaborate with multi-departmental stakeholders to scope and design solutions that translate business requirements into new dashboards and reports
• Develop database models, ETL pipelines, and dashboards used by department leadership to track usage trends of multiple applications across the University of California system
• Clean and prep semi-structured data using ETL processes consuming data from NoSQL, graph, and relational sources, resulting in ready-to-use data for reports and dashboards
• Utilize Microsoft Power BI and the Microsoft data stack to develop enterprise-level dashboards and reports, enabling client research institutions and government agencies to effectively manage risk and compliance regulations
• Co-develop a solution to acquire and visualize geospatial data to help research institutions locate and plan hazardous waste removal
- Develop custom reports with Cognos Business Intelligence to meet client data reporting needs
- Utilize detailed knowledge of core product business logic and SQL Server-based relational databases to customize relational data models to meet client-specific business requirements
- Write T-SQL queries in SQL Server Management Studio to identify data needed to develop reports and to research the root cause of defects reported by clients
- Develop a dimensional data model for a SaaS-based data warehouse solution to effectively meet the large-scale enterprise reporting needs of client companies
- Administer Cognos Business Intelligence systems to allow for the development of models and reports, and to facilitate client acceptance testing
- Conduct SSRS report development to gather data needed to provide test case documentation and traceability for clients
- Achieve an 80% retest-success rate using ad-hoc methods on highly customized SOAP-based software integrating approximately 15 large-scale enterprise systems