Senior Data Engineer

  • Netflix
  • Los Gatos, CA, United States
  • Jan 22, 2018
Full time Engineering Information Technology Internet STEM Media

Job Description

Ever wonder how Netflix serves a great streaming experience with high-quality video and minimal playback interruptions? Or how we are confident that major UI redesigns across thousands of devices are well received? 

It’s because new ideas are constantly being explored across all aspects of the Netflix product, including our UI experience, recommendation/search algorithms, sign-up flows, messaging, adaptive streaming algorithms, etc. To ensure that these ideas deliver experiences our subscribers love, we diligently test and measure the impact of all these proposed enhancements using ABlaze, Netflix’s Experimentation Platform.  

Given the size and complexity of our datasets, this analysis is no trivial task. Our platform must process data for A/B test analysis across 100M+ subscribers and 100M+ hours of video streamed every day. Our current infrastructure has served us well, but now that Netflix is in almost every country around the world, it’s time to evolve and scale.

Consequently, we’re re-architecting our data pipelines to provide our users with more flexible self-serve experimentation analysis capabilities. This means allowing them to specify their own data sources, metrics, dimensions, and statistical tests, and finally choose from a variety of data visualization options in order to construct reusable reports for each experimentation area.

Since almost every product decision at Netflix depends on A/B test analysis, your efforts on this team will drive improvements throughout our product and, in turn, the experience of millions of our users across the globe.

What You’ll Be Doing:

  • Partnering with internal customers. You’ll partner with teams across Netflix to understand and enable their analysis needs. You should be comfortable in an environment of context, not control, where you’ll receive context on your users’ needs and drive follow-up conversations to help you and your peers decide how best to serve those needs.
  • Prototyping and productizing configurable data pipelines. You’ll do this as part of a team of experienced data engineers who are continually learning from each other and collectively improving our craft.

Your Skills & Characteristics:

  • You have at least 5 years of experience in a data-driven environment designing and building distributed data processing systems.
  • You have experience building production data pipelines (using Hadoop, Hive, Spark, etc.) on large-scale datasets. You should have an unmistakable passion for elegant and intuitive dataset design. You have deep hands-on experience with schema design and data modeling (using datastores such as Druid, Elasticsearch, etc.).
  • You have proficiency programming in Java and/or Scala. You strive to write beautiful code and you're comfortable working in a variety of tech stacks.
  • You're ambitious and quick to take action, while also open to new ideas. You recognize when you're wrong and move past your own mistakes.
  • You enjoy staying current with technology and continually strive to be better at your craft.
  • Experience with stream processing technologies (e.g. Spark Streaming, Kafka, Flink) is a major plus.
  • Some exposure to analytics, statistics, or A/B or multivariate testing is also a plus.

A couple of other things:

  • We expect a lot. Our culture is unique and we live by our values, so it's worth learning more about it here.
  • We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.