Chris Shumaker's Career
Senior Software Engineer
October 2024 - Present
Recommended key solutions for resolving incidents in a production Spark data pipeline on AWS EMR that processes 1.5TB of data per day and powers up-to-date traffic data products and services used by millions of people
Addressed backpressure in a production Flink data pipeline processing 280MB/s of data on AWS Managed Flink (KDA) that ingests billions of spatial data points for analysis and product development
Principal Software Engineer
August 2018 - May 2024
Built a data platform on AWS from scratch that ingests hundreds of terabytes of data per day, stores petabytes of data, and supports over 300 analytics users and millions of customers
Invented a cloud-based obstacle avoidance method that was patented
See US 11,467,585 B2
Built numerous PoCs using a wide variety of databases, compute frameworks, and infrastructure management technologies
Associate Consultant
March 2016 - August 2018
Replaced Oracle SQL batch jobs with Spark SQL ETL, reducing transfer time by several days
Indexed and visualized data from source systems using Elasticsearch and Kibana dashboards
Developed microservices in Spring Boot with Maven for business process management
Spearheaded installation of Docker engines and adoption of containers to streamline operations
Standardized inter-service communication and authorization with RabbitMQ and Kong
Enforced testing of code early using Jenkins and SVN for continuous integration
Streamlined application delivery using Docker Compose and Bash for continuous delivery
Assisted in developing a web application user interface in Angular and JavaScript on Nginx
Senior Consultant
November 2015 - March 2016
Overhauled Python Behave and Selenium test harness for Salesforce including user login
Researched SSL and OpenAuth to integrate with Salesforce while writing automated tests
Improved accountability with behavior-driven development during requirements gathering
Expanded testing abilities by training on Salesforce SOQL and user interface development
Introduced testing best practices for high-volume, realistic mock data preparation and cleanup
Consultant
June 2012 - November 2015
Simplified Java ETL development in Spring Batch, reducing network overhead with chunking
Maintained Qpid-based messaging system for Positive Train Control (PTC) in C++ and Python
Managed server farm of 400 virtual machines for distributed testing with Bash, Python, Ruby
Improved relations with a major financial firm, architecting a guaranteed transaction manager
Led and trained a team of 6 developers to extend PTC systems in C++, Python, and Ruby
Systematized the complex environment setup of a distributed system with Vagrant and Puppet
Aided the PTC team as Scrum Master by facilitating effective meetings and swarming blockers
Authored automated validation tests in Cucumber and Ruby using behavior-driven development
Additional Skills
Data
SQL, Iceberg, PySpark, Flink, Parquet, NiFi, Kafka, Oracle RDBMS
Cloud
AWS: CloudFormation, S3, ECS, ECR, Batch, CloudFront, Lambda, Kinesis (Streams, Firehose, Video, and Analytics), CloudWatch, Managed Grafana, CodePipeline, IAM, Service Catalog, RDS, Redshift, Athena, EMR, SageMaker, Glue, SNS, SQS, EventBridge, Step Functions, Lake Formation, SES, SAR, Timestream, CodeBuild, IoT, KMS, Secrets Manager, DynamoDB
Architecture
Data Mesh/Lake/Warehouse, Service Oriented Architecture, ETL, Messaging, Microservices, Distributed Systems, Serverless
Programming
Python, SQL, Jupyter, Java, JavaScript, C/C++, Git, Jenkins, Red Hat Linux
Libraries
Shapely, Pandas, Scikit-learn, Matplotlib, Keras, OpenCV, Spring
Models and Algorithms
Linear/Logistic Regression, Neural Networks, SVMs, Reinforcement Learning, Dimensionality Reduction, Recommendation Systems, Anomaly Detection, Iterative Closest Point (Point Cloud Registration)