Site Reliability Engineer

Job Number: R10007730
Brand: Fox Networks Group
Job Type: Engineering
Location: Playa Vista, California, United States of America
Job Posting Date: September 12, 2019


Fox Networks Group

We are a collection of enduring brands and iconic, unforgettable characters. We create content for audiences big and small, reaching billions of people every day. Most of all, we are many individuals, each uniquely talented, each a critical piece to our puzzle, who collectively become a power. Desire to thrill and engage? Join us as we engage and inspire.


Team Mission:

This team currently consists of passionate engineers who strive to demonstrate excellence in the field of DevOps. We are responsible for the uptime of the various .COM websites, the Broadcast Data Network and backend services, but a large portion of the job is also to innovate. Historically, once the infrastructure is configured and working, it is mandated that no more changes be made to the system. Not here. We frequently and deliberately rebuild our entire system in an automated fashion. These activities not only help us discover pain points in our system, but also give us the opportunity to improve continually. How do we guarantee geographic redundancy? How do we get the code to production faster? It takes 15 minutes to rebuild a system; how can we get it down to 2? Instead of Mongo, should we use DocumentDB or DynamoDB? How can we orchestrate a cloud failover from one provider to another?

Who are we looking for?

  • Someone who is looking to be creative, build and script solutions from top to bottom and become a part of an A+ team.
  • The Site Reliability Engineer is part of an innovative team on a continuous mission to build bulletproof, scalable, secure private and public cloud environments for our customers and users.
  • If you think hard is fun and get bored easily when you aren’t challenged, this might be the place for you. We want someone with an insatiable thirst for technology and a desire to learn and grow individually, with the team, and with the business.
  • This is a challenging position but would be the perfect fit for someone who wants to contribute, grow or get started in their career.

The Challenge:

  • The SRE is responsible for any and all tasks related to the performance, stability, reliability, efficiency, and security of both the sites and general team operations. Responsibility also extends to how incidents are managed and operated.
  • Design and develop a complete end-to-end automation environment using configuration/auto-scaling tools.
  • Define standards for configuration, monitoring, reliability, scalability, performance optimization and capacity planning of new infrastructure focused on 99.9%+ uptime.
  • Respond to off-hours and weekend emergency alerts, alarms, and requests, in keeping with the team's on-call rotation schedule.
  • Document solutions and create diagrams.
  • Strategize with the teams to develop new technology initiatives with a primary focus on availability, supportability, scalability, security, and performance.
  • Configure and tune enterprise monitoring and instrumentation systems to efficiently detect existing issues and predict future issues based on trends.
  • Stay up-to-date with technology. Continually advance your technical skills.
  • Continuously improve by taking justifiable risks and not being afraid to fail.
  • Be flexible and at the same time push back respectfully to ensure we are doing what is best for the company in the long run.
  • Challenge the status quo by recommending / pushing for changes that improve reliability and velocity.


Requirements:

  • Bachelor's degree
  • Strong desire to learn
  • Proficiency in Python
  • Team player

The rest of the requirements are nice to have. You will get to learn all this and more!

  • Experience with configuration management systems such as Ansible.
  • Understanding of end-to-end technology stacks, including but not limited to OS, Network, Application, Relational & Non-relational Databases, interacting with APIs and Security (network & application).
  • Understanding of cloud-based architectures and concepts. Knowledge of and hands-on experience with AWS and GCP (including serverless technologies, APIs, Kubernetes, etc.).
  • Treat infrastructure as code - You will build infrastructure inside of AWS/GCP via code. All our environments are expected to be scripted and checked in, so familiarity with tools such as Terraform and CloudFormation will come in handy here.
  • Experience implementing self-service solutions to reduce workload on the DevOps team and allow Development and business teams to be more self-sufficient.
  • Experience working in a collaborative environment such as Bitbucket or Git.
  • Knowledge and experience in automation.

We are an equal opportunity employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, gender identity, disability, protected veteran status, or any other characteristic protected by law. We will consider for employment qualified applicants with criminal histories consistent with applicable law.



