IT Site Reliability Architect (MCCA)

Requisition ID: 2025-1765
# of Openings: 1
Job Locations: US-GA-Savannah
Posted Date: 1 day ago (10/29/2025 7:59 AM)
Department: Corporate Center

Overview

We are seeking an experienced Site Reliability Architect to lead the design, implementation, and continuous improvement of our on-premises IT infrastructure at Hyundai MOBIS Corporate Center America (MCCA), supporting operations across the United States, Canada, Mexico, and Brazil. This role will focus on enhancing system reliability, performance, and scalability across our regional data centers and enterprise environments. You will translate global reliability standards into region-specific strategies, optimize incident response, automate operational workflows, and ensure high availability of mission-critical systems—all while fostering cross-functional collaboration and compliance with local regulations.

Responsibilities

(To perform within this position successfully, the incumbent must be able to perform each essential duty satisfactorily. Other duties may be assigned.)

Reliability Architecture & Design:

Design resilient, scalable, and secure on-prem infrastructure solutions aligned with global SRE principles
Evaluate existing systems for reliability gaps and propose architectural improvements
Develop region-specific reliability standards based on global guidelines and local operational needs
Integrate observability tools and telemetry systems to monitor infrastructure health and performance

Automation & Operational Efficiency:

Lead automation initiatives for infrastructure provisioning, configuration management, and incident response
Collaborate with Infrastructure and Security teams to streamline operational workflows and reduce manual effort
Define and implement service-level objectives (SLOs), indicators (SLIs), and error budgets for key systems
Drive continuous improvement through post-incident reviews and reliability-focused retrospectives

Regional IT Operations Support:

Serve as a technical escalation point for major infrastructure incidents across the region
Conduct thorough root cause analyses and implement corrective actions to prevent recurrence
Maintain and update runbooks, incident playbooks, and recovery procedures
Participate in the regional change control board and ensure reliability considerations are embedded in all changes

Infrastructure Governance & Compliance:

Ensure infrastructure reliability practices comply with regional regulations and global standards
Maintain accurate documentation of system architecture, configurations, and operational procedures
Support audits and compliance reviews by providing technical insights and documentation
Champion reliability-focused governance across infrastructure projects and operational processes

Partner Relationship Management:

Work closely with Hyundai AutoEver and other IT partners to align reliability goals and service delivery
Provide technical leadership in regional infrastructure projects, ensuring reliability is prioritized
Mentor infrastructure engineers and promote a culture of reliability and operational excellence
Evaluate and onboard new tools and vendors that enhance regional reliability capabilities

Supervisory Responsibilities:

Qualifications

(The requirements listed below are representative of the knowledge, skills, and/or ability required and preferred for this position.)

Required Education & Experience:

Bachelor’s degree in Computer Science, Information Technology, or a related field.
10+ years of experience in an IT Infrastructure Engineering or similar role in a corporate environment, preferably in the automotive industry.
5+ years of experience in site reliability engineering or infrastructure architecture roles

Required Knowledge, Skills, & Abilities:

Excellent verbal and written communication skills in English
Strong expertise in on-prem environments including virtualization (VMware vSphere, Microsoft Hyper-V)
Deep understanding of network architecture, protocols, and security (routing, switching, firewalls)
Experience with storage systems (SAN, NAS), backup/recovery strategies, and disaster recovery planning
Proficiency in infrastructure automation tools (Ansible, Terraform, Puppet, etc.)
Familiarity with observability platforms (Prometheus, Grafana, ELK stack, etc.)
Solid grasp of Windows Server and Linux (Red Hat, Rocky, Ubuntu, CentOS)
Proven incident management and root cause analysis capabilities
Knowledge of regulatory frameworks (GDPR, SOX) and IT governance practices

Preferred Education & Experience:

Master’s degree in a relevant technical or business discipline
Advanced certifications such as VMware VCAP, Cisco CCNP/CCIE, or Microsoft Certified: Azure Solutions Architect Expert
ITIL Foundation certification and experience with ITIL-based operations
Multiregional project leadership and cross-cultural stakeholder engagement experience
Bilingual proficiency (English and Korean) is a plus.
