The Network Reliability Engineering department within IBM Cloud’s IaaS Network Engineering team provides incident and problem management, internal and external customer-driven changes, network evolution guidance, network testing, break-fix troubleshooting, and maintenance services for all aspects of data networking for IBM. This includes, but is not limited to, ongoing support of our global network infrastructure: routers, switches, datacenter networks, and the transit and transport links that connect datacenter sites to other networks and to one another. The Network Engineer in Network Reliability Engineering is an accomplished engineer with experience in network engineering and network operations.
The Network Engineer role includes the following responsibilities:
· Receive and manage escalated tickets for service-affecting issues, both from other teams and from monitoring tools.
· Troubleshoot, isolate, and correct service-affecting issues on the network in areas including but not limited to: routing protocols, routers, switches, firewall administration, MPLS, BGP, VPN, and load balancing.
· Plan, schedule, and implement network maintenance activities which include firmware upgrades, hardware replacement and network infrastructure augments/changes.
· As needed, implement approved routing policy changes/corrections to mitigate points of traffic congestion on the network due to planned or unplanned incidents.
· Work with network infrastructure/service providers to identify and correct causes of circuit disruption.
· Work with hardware vendors to determine causes of device failure/issues.
· Create network connectivity diagrams and other documentation of live network environments for internal and customer use.
· Manage incidents during critical events that impact the network, including internal and external communications, coordination of repair efforts, and subsequent root cause analysis.
· Communicate effectively with internal and external audiences with varying levels of technical expertise.
· Maintain high-quality customer service for internal and external groups when needed.
This is a shift-based position in a department that operates 24/7/365. Individuals must be prepared to work a schedule of shifts that may include nights, weekends, and holidays. You must also be willing to work on call.
Typical shifts can include, but are not limited to:
- Sunday – Thursday: 8:30am - 5:00pm, GMT
- Tuesday – Saturday: 8:30am - 5:00pm, GMT
- Demonstrated experience in network engineering and operations in a multi-vendor environment
- Working knowledge of the following network protocols/technologies:
  - Border Gateway Protocol (BGP)
  - Multiprotocol Label Switching (MPLS)
  - Open Shortest Path First (OSPF)
  - IPv4 and IPv6 addressing
  - Link Aggregation Control Protocol (LACP)
  - Dark fiber / DWDM systems
- Ability to initiate and effectively utilize “remote hands” assistance within datacenter and carrier hotel environments for items such as fault isolation, installation/test of structured cabling/cross-connects and troubleshooting physical equipment as well as replacing faulty components within a chassis
- Advanced configuration of Juniper and Cisco devices
- Demonstrated history of organization and time management skills
- Demonstrated history of verbal and written communication skills
- Exhibit a strong understanding of customer service
- Must be self-motivated and disciplined
- Ability to recognize and prioritize critical tasks independently
- One of the following certifications: Juniper Networks Certified Internet Expert (JNCIE), Cisco Certified Network Professional (CCNP), Cisco Certified Design Professional (CCDP), or Cisco Certified Internetwork Expert (CCIE)
- Ability to build mechanisms to automate routine tasks, and to participate in automation efforts within the team
- Working knowledge of software-defined networking (SDN) technologies
- Experience with microservices (Kubernetes, Docker)
- Comfortable operating in a fast-paced environment