The world’s highest-traffic job website, Indeed.com has 250 million unique monthly visitors and 9.8 jobs posted every second. A giant on a mission to help people find jobs, Indeed works with product teams in Austin, Tokyo, Seattle, San Francisco, Singapore, and Hyderabad.
On behalf of Indeed, AgileEngine is looking for a Middle Data Engineer passionate about Data Governance.
As part of this project, you will help your team understand the data they rely on, create data definitions, and outline how data can and should be used across 125 petabytes of storage. You will also take ownership of designing and implementing automated data analysis tools that front-end teams will use in their products.
What is required?
- 2+ years of experience with big data modeling using the Hadoop ecosystem
- 2+ years of experience developing in Python/Scala/Java to transform large datasets on distributed and cluster infrastructure
- Experience with SQL, including the ability to write complex, highly optimized queries across large volumes of data
- Ability to take initiative to ask questions, identify patterns, and share discoveries or recommendations from your technical analysis of the code
- Curiosity and passion about data, visualization, and solving problems
- Experience with reporting, descriptive statistics, probability, and cleaning big datasets
- Experience with version control systems, GitLab in particular
- Experience with Docker and Jenkins
- Willingness to question the validity and accuracy of data and assumptions
- Enjoyment of collaborating with others in a team environment
- Eagerness to learn in a fast-paced environment
- Drive and self-reliance
- Intermediate+ English
- B.S. degree in math, statistics, computer science, or equivalent technical field
What will be a plus?
- Experience with Apache Spark, Apache Hive, Apache Flink, Apache Kafka
- Knowledge of Unix-based operating systems (bash/ssh/ps/grep etc.)
What will you do?
- Partner with product and engineering teams to define requirements for capturing/logging/curating new data; coordinate with product and engineering on new product lines to ensure all new data is incorporated into our data governance model
- Work with a set of stakeholders and analysts to identify the data required to operate an area of the business. Define what it means to have complete and accurate data
- Drive consistency of data across front-end systems (web app, third-party tools) and back-end systems (application to application)
- Work as part of a team of data governance analysts to ensure consistent data use across the entire company
What about the project?
Ready to help Indeed improve people’s lives, one job at a time?