Job Information
System One Site Reliability Engineer Lead - Contractor Pittsburgh, PA
Position Title: SRC Lead (Site Reliability Center Lead)
Location: Pittsburgh, PA; Cleveland, OH; Birmingham, AL; Dallas, TX; or Phoenix, AZ
Hybrid - 2-3 days in office
Years of Experience: 12+ years applicable experience required
*For immediate consideration, you can reach me at 412.516.1987 or shafique.mohammed@systemone.com.*
Team Dynamic: The SRC Lead will oversee a team of global contractors (L1.5 Engineers).
Role Overview:
As an SRC Lead, you’ll be at the forefront of ensuring the reliability, availability, and performance of critical enterprise technology and security applications. Your leadership will drive operational excellence, foster collaboration, and elevate the overall reliability of our systems within the Site Reliability Center (SRC). You’ll work closely with cross-functional teams, mentor engineers, and contribute to the success of the organization.
NOTE FOR THE SKILLS/TECHNOLOGIES
Be knowledgeable enough to jump in, drive conversations to resolution, and escalate if needed to BANK Application System Managers/SMEs (e.g., "Here is the problem, here is what we think is causing it, here is the solution we think we should pursue; what do you want to do?").
Top Technologies:
• Monitoring and Debugging Tools (LogScale, Splunk, Dynatrace)
• DevOps pipeline (Git, Jenkins, Artifactory)
• Infrastructure (Red Hat Linux, OpenShift, Windows)
• Networking (DNS, Load-balancing, Network tracing, Firewall)
• Database (Oracle, SQL)
• API understanding & Web services technologies: (SOAP, JSON, REST)
• Directories (LDAP, Active Directory)
• Java
Secondary:
• Python/Java Scripting, Ansible, PowerShell for automation purposes
• Modern development technologies and tools: (Agile, CI/CD, Git, Jenkins)
• Kafka Event Streaming
• ETL/Informatica
Nice to Have:
• Database (Mongo, Cassandra, other databases)
• Evolve
Responsibilities Summary:
Production Support. NOT new development. Troubleshoot highly technical problems, which may require assessing source code to analyze and resolve them. This requires advanced troubleshooting skills and the ability to adapt and create non-standard approaches to problem solving.
*There are 185 applications and platforms combined in this space. Expertise in all of them is not expected, but the emphasis will be on developing SME-level knowledge of the Criticality Level 0/1 mnemonics, which are reflected in the top skills.
We are looking for someone who is astute enough to see a problem and fix it or escalate it to BANK SME teams and learn from how they fix the problem. Runbooks should then be updated accordingly.
Key responsibilities:
• Create and maintain documentation to ensure knowledge accessibility.
• Liaise with other application support teams and internal/external business and technical partners.
• Provide ad hoc and on-demand reports.
• Perform timely escalation of critical issues and proactively identify patterns of recurring issues to improve production.
• Lead problem resolution, conduct root cause analysis, and establish processes that help prevent incidents.
• Participate in the Incident and Problem Management processes as a resolver accountable for root cause analysis, resolution and reporting.
• Provide guidance to all involved staff and vendors to drive a coordinated approach to results.
• Reduce escalations to Level 3 based on incremental learning about applications.
Technical Acumen and System Familiarity:
While the majority of the role involves management, the SRC Lead should possess a solid understanding of the systems and technical stacks they are supporting. They should be able to pull up dashboards, troubleshoot issues, and guide conversations related to system health. Additionally, they must effectively manage impact and risk.
System Monitoring and Health:
Lead the production environment by monitoring availability and taking a holistic view of system health.
Quality and Time-to-Market:
Drive improvements in reliability, quality, and time-to-market for software solutions.
Performance Optimization:
Continuously optimize system performance, anticipating customer needs and innovating for excellence.
Operational Leadership:
Provide primary operational support for large-scale distributed software applications.
Mentorship:
Mentor and guide engineers within your shift team, fostering growth and technical expertise.
Stakeholder Communication:
Manage team operations while effectively communicating with directors and other executives/CIOs who have a stake.
Qualifications:
• Proactive Approach:
Take a proactive approach to identifying problems, performance bottlenecks, and areas for improvement.
• Leadership Experience:
Demonstrated leadership in technical roles, preferably within Site Reliability Engineering (SRE) or DevOps.
• Continuous Improvement:
Foster a culture of continuous improvement and technical excellence, proactively identifying patterns of recurring issues to enhance stability and improve processes (automation opportunities, etc.).
Thanks,
Shafique Mohammed
Recruiting Manager
210 Sixth Avenue, Suite 3100 | Pittsburgh, PA 15222
412.516.1987 (o)
systemone.com | LinkedIn (https://www.linkedin.com/in/shafique-m-519421bb/)
Ref: #404-IT Pittsburgh
System One, and its subsidiaries including Joulé, ALTA IT Services, CM Access, TPGS, and MOUNTAIN, LTD., are leaders in delivering workforce solutions and integrated services across North America. We help clients get work done more efficiently and economically, without compromising quality. System One not only serves as a valued partner for our clients, but we offer eligible full-time employees health and welfare benefits coverage options including medical, dental, vision, spending accounts, life insurance, voluntary plans, as well as participation in a 401(k) plan.
System One is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, age, national origin, disability, family care or medical leave status, genetic information, veteran status, marital status, or any other characteristic protected by applicable federal, state, or local law.