Locations

  • Romania

Company Background

Our client is a publicly traded technology company focused on family safety and connectivity, serving millions of users across 140 countries. Their platform provides real-time location sharing, crash detection, roadside assistance, and other safety features. The company operates in a remote-first environment, fostering inclusivity, innovation, and collaboration.

Project Description

The Network and Systems Operations (NSO) Team is part of Cloud Operations, supporting over 325 engineers. The team's mission is twofold:

  • Providing world-class observability infrastructure and tooling for system monitoring and reporting;
  • L1 service support and incident management, ensuring high availability and reliability of services.

The role involves monitoring, responding to alerts, and executing runbooks with manual and/or automated steps to resolve problems. The system comprises dozens of microservices, all requiring tracking, reporting, and optimization. The position requires strong troubleshooting skills, familiarity with observability tools, and a proactive approach to automation.

Technologies

  • Prometheus
  • Grafana
  • Datadog
  • Java
  • Python
  • Shell
  • Ruby
  • Docker
  • Kubernetes
  • AWS
  • Terraform
  • CloudFormation
  • Chef
  • Ansible

What You'll Do

  • Use tools such as Prometheus, Grafana, and Datadog to create and maintain observability infrastructure and tooling, including creating alerts, production reporting, and writing documentation;
  • Serve as a member of L1 support, working alone or with teammates to answer pages for all onboarded services and resolve or escalate issues in a timely manner;
  • Utilize anomaly detection and alerting, respond to alerts in PagerDuty, drive incidents to their conclusion, and lead the effort to strengthen the system based on post-mortem action items;
  • Coordinate cross-team and cross-functional efforts with processes, documentation, and tooling to ensure operational excellence.

Job Requirements

  • Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience;
  • 5+ years of experience writing, reading, and debugging code in one or more languages, such as Java, Python, Shell, or Ruby;
  • 5+ years of experience working with large-scale distributed systems and managing Linux-based systems in a cloud environment such as AWS;
  • In-depth experience with large-scale observability and reporting systems (New Relic, Datadog, Elastic, Prometheus, etc.);
  • 3+ years of experience with solutions such as Docker, Kubernetes, system virtualization, and cloud monitoring and logging;
  • 3+ years of experience with IaC and configuration management tools such as Terraform, CloudFormation, Chef, Ansible, and similar;
  • Experience working as part of a team, with strong analytical and problem-solving skills;
  • Excellent troubleshooting skills and attention to detail;
  • Ability to quickly learn new technologies and follow industry trends;
  • Ability to analyze and optimize high-traffic internet applications.

What We Offer

The global benefits package includes:

  • Technical and non-technical training for professional and personal growth;
  • Internal conferences and meetups to learn from industry experts;
  • Support and mentorship from an experienced employee to help with your professional growth and development;
  • Internal startup incubator;
  • Health insurance;
  • English courses;
  • Sports activities to promote a healthy lifestyle;
  • Flexible work options, including remote and hybrid opportunities;
  • Referral program for bringing in new talent;
  • Work anniversary program and additional vacation days.
