Our client is looking for a Data Engineer to develop some of our most complex data products and own the evolution of our data platform. In addition to shipping code that processes our largest and most complex data sets, you will ensure that the data platform evolves to support business priorities, thinking through resiliency, scale, speed, and cost.
What is required?
- 5+ years of experience as a data engineer working with a modern tech stack
- Expertise in using SQL and Python to process large-scale datasets
- Proficiency in batch processing and orchestration technologies such as Spark and Airflow
- Ability to architect data processing systems within a cloud environment (we’re an AWS shop)
- A mindset for data quality, especially around using technology to automate the detection and resolution of data quality issues
What will be a plus?
- Exposure to data streaming technologies such as Firehose and Kafka
- Experience with additional scripting languages such as Ruby and R
- Comfort working with Docker and/or Kubernetes
- Experience either standing up a data warehouse from scratch or migrating to a new one
What you will do:
- Design and ship data pipelines that power internal analytics, logistics, e-commerce, and marketing.
- Create automated monitoring that provides observability and resiliency for our data systems.
- Maintain the infrastructure and tooling powering our analytics and data products, including but not limited to our data warehouse (Redshift/Snowflake), Airflow, and Fivetran.
- Help architect systems (infrastructure, pipeline design, etc.) that handle complex data flows and/or large datasets, such as product recommendations, event streaming, or ERP systems.