Staff Site Reliability Engineer Job at Ipro Networks Pte. Ltd., Palo Alto, CA

  • Ipro Networks Pte. Ltd.
  • Palo Alto, CA

Job Description

Staff Site Reliability Engineer (Remote, US)

Compensation: $200K–$250K + Equity
Full-Time | Remote | Infrastructure Team

We’re hiring a Staff Reliability Engineer to help scale and maintain the massive GPU infrastructure that powers our cutting-edge AI systems. If you're passionate about building robust, scalable systems and solving deep infrastructure challenges at scale, this role is for you.

What You’ll Be Doing

  • Work closely with engineers and researchers to define and meet system performance, availability, and efficiency requirements.
  • Operate and manage thousands of GPUs distributed across multiple cloud providers and clusters.
  • Design scalable solutions to support rapid growth in compute demands for AI model training, data processing, and inference.
  • Build resilient, fault-tolerant systems to ensure continuous uptime and seamless performance.
  • Develop automation tools to eliminate toil and streamline infrastructure operations.
  • Set up and maintain monitoring systems to proactively detect issues and drive performance improvements.
  • Define and track SLOs and SLIs that uphold system reliability standards.
  • Participate in an on-call rotation to ensure 24/7 system availability.

Qualifications

  • 7+ years of proven experience as a reliability engineer, infrastructure engineer, or production engineer in fast-paced, high-growth environments.
  • Deep knowledge of GPU infrastructure, including scheduling, scaling, cloud networking, storage, and security.
  • Proficiency in one or more scripting or programming languages.
  • Strong experience with Kubernetes or similar container orchestration systems.
  • Familiarity with Infrastructure-as-Code tools like Terraform or CloudFormation.
  • Experience working with observability tools like Prometheus, Grafana, DataDog, ELK, or Splunk.
  • Excellent troubleshooting, debugging, and systems-thinking skills.
  • Strong communication skills and a collaborative mindset.
  • Bonus: Experience in AI/ML infrastructure or managing large-scale GPU clusters.
What We're Building

We're developing highly complex infrastructure to support advanced AI research and production systems running on thousands of GPUs. This is an opportunity to work on some of the most demanding reliability and performance challenges in tech today, at scale. You’ll have direct impact on how infrastructure supports foundation model development and deployment.

Compensation & Benefits

  • Base Salary: $200K–$250K/year
  • Competitive equity package (stock options)
  • Comprehensive health benefits
  • Generous PTO and flexible work policies
  • Support for ongoing professional development

Job Tags

Full time, Remote job, Flexible hours
