This Customer NEEDED 100% uptime, on a budget
We know how to solve complex customer problems with elegance and with cost in mind!
The Business Problem:
This customer provided voice translation services, in real time, to 911 operators throughout the United States, in addition to other real-time and non-real-time translation services. They have 3 physical locations in the Pacific Northwest: Portland, Oregon; Vancouver, Washington; and Seattle, Washington. The customer built a fat-client application that allowed translators, from ANYWHERE in the world, to connect into their data centers. This was a very innovative approach to finding translators wherever they resided, one that helped keep costs down AND find the right talent at the right time.
To service the 911 contract, and because lives were potentially at stake, the customer required 100% uptime from ALL components of the infrastructure, applications (SIP, web and DB tiers), and across the Internet. However, the customer was on a budget as the translation arena is extremely competitive. This is where NetraVine engineers proved their skillset and elegant design solutions.
Our Solution:
Here's the really cool WAN diagram!
For the Nerds, and we know you are reading this, the Layer 2 and Layer 3 infrastructure that aligns with the WAN diagram above looks like the following! (NOTE: this network was designed in 2015 and is STILL operating on the same WAN edge (Cisco ASR routers) and internal switching (Cisco 3750 stacked switches). The firewalls have been upgraded to Cisco Firepower appliances.)
Layer 2 network
Layer 3 Network
The Customer Requirements
The following are elements and requirements of the Customer Network data center and interconnect infrastructure.
- Utilize existing hardware where it makes sense. For the existing hardware, make sure it is covered with Cisco SmartNet
- Replace EOL/EOS hardware (or soon to be EOS)
- Provide a fully redundant network that can support 24/7 operations. Some of these operations are in support of 911 services
- Build out a DR / Hot standby / Active capable data center facility in the Westin Building in Seattle, WA
- Provide a routing infrastructure that can support the fastest possible convergence of network routing protocols. This requires the use of Bidirectional Forwarding Detection
- Implement a Guest Wi-Fi solution in the Corporate Office.
How did we meet the customer requirements?
The diagrams above show the end-game physical design that supports the logical design. For Internet routing, we acquired a /23 of public IPv4 address space as well as a public Autonomous System Number (ASN) for BGP routing. We provisioned 2 x diverse ISPs in both data centers and used our ASN and public IPv4 space to advertise the customer's routes as follows:
- For the SEA data center, advertise 1 x /24 out of the /23 space as a primary path and advertise the full /23 as an additional route
- For the PDX data center, advertise the OTHER /24 out of the /23 space as a primary path and advertise the full /23 as well
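- What does this do? From the Internet route table perspective, we have 8 route advertisements (4 prefixes x 2 upstream ISPs per site): 1 x /24 from SEA, 1 x /24 from PDX, 1 x /23 from SEA and, finally, 1 x /23 from PDX. Because Internet routers prefer the most specific matching prefix, traffic for each /24 normally enters through its own data center, while the /23 keeps both halves reachable through the other site if a /24 is withdrawn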
- We can lose any Internet circuit and BGP converges as expected
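To make the advertisement scheme above concrete, here is a minimal Python sketch of longest-prefix matching over the two /24s and the covering /23. The prefix shown is a placeholder from the benchmarking range, not the customer's real address space, and the function names are hypothetical. It only models which site attracts inbound traffic; it does not model BGP attributes or the individual ISPs.

```python
import ipaddress

# Placeholder prefix (assumption): NOT the customer's real /23.
AGGREGATE = ipaddress.ip_network("198.18.0.0/23")
SEA_24, PDX_24 = AGGREGATE.subnets(new_prefix=24)


def build_advertisements(down_sites=frozenset()):
    """Return the (prefix, site) pairs announced by every site that is up."""
    ads = []
    for site, specific in (("SEA", SEA_24), ("PDX", PDX_24)):
        if site in down_sites:
            continue
        ads.append((specific, site))   # primary path for this site's /24
        ads.append((AGGREGATE, site))  # covering /23 from the same site
    return ads


def ingress_sites(destination, ads):
    """Sites that attract traffic for destination under longest-prefix match."""
    addr = ipaddress.ip_address(destination)
    matching = [(prefix, site) for prefix, site in ads if addr in prefix]
    if not matching:
        return set()
    longest = max(prefix.prefixlen for prefix, _ in matching)
    return {site for prefix, site in matching if prefix.prefixlen == longest}


if __name__ == "__main__":
    normal = build_advertisements()
    print(ingress_sites("198.18.0.10", normal))  # address in the SEA /24
    print(ingress_sites("198.18.1.10", normal))  # address in the PDX /24
    sea_down = build_advertisements(down_sites={"SEA"})
    print(ingress_sites("198.18.0.10", sea_down))  # falls back to the /23
```

With both sites up, each /24 draws traffic to its own data center. With SEA marked down, an address in the SEA /24 is still reachable because the /23 from PDX covers it.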
For physical diversity, and to support Layer 3 routing:
- Provision 3 x 10 Gigabit WAVE circuits to form a 'triangle' of circuits between the 3 locations. A wavelength service is muxed onto a DWDM system, such that you get a full, non-blocking 10 Gbps pipe with fixed latency and jitter. In essence, it's a cable (via light waves on fiber) dedicated to you that can span long distances.
- The WAVE circuits must follow fully diverse paths. We verified this with the telco through interviews with key telco engineers AND by looking over telco KMZ files (KMZ files are the providers' way of showing their fiber paths throughout their networks).
- Because WAVE circuits can be used however you'd like (trunk access VLANs or configure Layer 3 SVIs), we did both.
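The value of the 'triangle' can be checked with a small Python sketch: with three sites and three circuits, any single circuit failure still leaves every pair of sites connected through the third site. The site label "VAN" and the function name are assumptions for illustration. This is a plain breadth-first search, not a model of OSPF itself.

```python
from collections import deque
from itertools import combinations

# Site labels: PDX and SEA appear in the text; "VAN" is an assumed label
# for the Vancouver, Washington location.
SITES = ("PDX", "VAN", "SEA")

# One 10G wave circuit between every pair of sites forms the triangle.
LINKS = frozenset(frozenset(pair) for pair in combinations(SITES, 2))


def shortest_path(src, dst, links):
    """Breadth-first search over the circuits that are up.

    Returns the list of sites on a shortest path, or None if dst is
    unreachable from src.
    """
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == dst:
            return path
        for link in links:
            if node in link:
                (neighbor,) = link - {node}
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(path + [neighbor])
    return None


if __name__ == "__main__":
    print(shortest_path("SEA", "PDX", LINKS))
    failed = frozenset({"SEA", "PDX"})
    print(shortest_path("SEA", "PDX", LINKS - {failed}))
```

With all circuits up, SEA reaches PDX directly. With the SEA to PDX circuit removed, the path goes through the third site instead. Losing a second circuit would isolate one site, which is why path diversity between the three circuits matters.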
For internal WAN routing, the idea is that should any of the 10G wave circuits between the data centers fail, the OSPF routing protocol will converge as quickly as possible to the backup 10G path. This will be accomplished by tweaking OSPF timers and integrating OSPF and BGP with Bidirectional Forwarding Detection (BFD) to quickly detect Layer 2 issues across the 10G circuits. This means we need switches in all 3 locations that support BFD and OSPF, at a minimum. We also need support for 10G modules and/or uplinks between locations and, in some cases, internally between switches. Additionally, the ASRs will be interconnected between SEA and PDX by VLANs routed over the 10G circuits. These VLANs will provide 2 x IP paths for connectivity between the ASRs. This will provide us with the following capabilities:
- Best path outbound routing based on BGP metrics we can control
- Ability to provide inbound and outbound routing via any data center
- Ability to find the closest SIP peering endpoint. This customer relies heavily on VoIP, so the ability to route SIP traffic via either PDX or SEA is essential
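To illustrate why BFD matters for the fast-convergence goal described above, the following Python sketch compares failure detection times. It assumes BFD asynchronous mode, where detection time is the negotiated interval times the detect multiplier, and an OSPF dead interval of 4 x the hello interval. The timer values are illustrative assumptions, not the customer's actual configuration, and the function names are hypothetical.

```python
def bfd_detection_time_ms(peer_desired_min_tx_ms, local_required_min_rx_ms,
                          peer_detect_multiplier):
    """BFD asynchronous-mode detection time in milliseconds.

    The negotiated interval is the slower of what the peer wants to send
    and what the local side is willing to receive; a session is declared
    down after that many intervals pass without a packet.
    """
    negotiated = max(peer_desired_min_tx_ms, local_required_min_rx_ms)
    return negotiated * peer_detect_multiplier


def ospf_dead_interval_ms(hello_interval_s, dead_multiplier=4):
    """OSPF dead interval in milliseconds (commonly 4 x the hello interval)."""
    return hello_interval_s * dead_multiplier * 1000


if __name__ == "__main__":
    # Assumed example timers: 50 ms BFD intervals with a multiplier of 3,
    # and a 10 second OSPF hello interval.
    bfd_ms = bfd_detection_time_ms(50, 50, 3)
    ospf_ms = ospf_dead_interval_ms(10)
    print(f"BFD detects a failure in about {bfd_ms} ms")
    print(f"OSPF hellos alone would take up to {ospf_ms} ms")
```

Under these assumed timers, BFD flags a dead path in a fraction of a second, while hello-based detection alone takes tens of seconds. Total convergence also includes the time to recompute routes and update forwarding, which this sketch does not model.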
In Conclusion!
We've had several circuit outages over the years. Between our fully redundant AND diverse network and the customer's unique approach to moving application and SIP workloads between facilities, the customer has achieved 100% uptime of the networking and application services! This customer employs NetraVine using our Team as a Service (TaaS) model for ongoing architecture and staff augmentation needs.