high availability
ChatGPT explaining application high availability to a high school kid
Before going into the details, it’s worth figuring out what the application (or system) users need as opposed to what they think they need:
- Fifty Shades of High Availability (2020)
- Figure Out What the Customer Really Needs (2017)
- Are Business Needs Just Excuses for Vendor Shenanigans? (2020)
- Redundancy Does Not Result in Resiliency (2017)
- High Availability Planning: Identify the Weakest Link (2016)
- Meaningful Availability (2020)
- Differential Availability (2020)
Not surprisingly, IT vendors sell magic infrastructure solutions as the high-availability panacea based on the assumption that redundant infrastructure cannot fail. Nothing could be further from the truth:
- High Availability Fallacies (2011)
- If Something Can Fail, It Will (2012)
- How Hard Is It to Think about Failures? (2016)
- This Is What Makes Networking So Complex (2013)
- Decide How Badly You Want to Fail (2019)
- Sometimes You Have to Decide How You Want to Fail (2015)
- Some People Don’t Get It: It Will Eventually Fail (2016)
- The Network Is Reliable and Other Stories (2016)
- Circular Dependencies Considered Harmful (2021)
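To make the point about redundant infrastructure concrete, here is a minimal sketch of the usual series/parallel availability arithmetic. The component availabilities are made-up illustrative numbers, and the `parallel` helper assumes independent failures, which is exactly the assumption that tends to break in real life (shared software bugs, shared power, operator errors).

```python
from math import prod


def serial(*availabilities: float) -> float:
    """Every component is required, so the availabilities multiply."""
    return prod(availabilities)


def parallel(*availabilities: float) -> float:
    """Any single component is sufficient (assumes independent failures)."""
    return 1 - prod(1 - a for a in availabilities)


if __name__ == "__main__":
    switch = 0.999  # hypothetical availability of one switch
    app = 0.99      # hypothetical availability of a single application instance

    switch_pair = parallel(switch, switch)
    print(f"redundant switch pair:          {switch_pair:.6f}")
    print(f"switch pair + one app instance: {serial(switch_pair, app):.6f}")
```

Even under the optimistic independence assumption, the end-to-end result stays below the availability of the weakest non-redundant component, so making the network redundant does not help much if the application remains a single point of failure.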
High Availability Concepts, Technologies, and Solutions
You can use a plethora of approaches depending on your availability targets:
- Disaster recovery is the right tool for the job if you’re OK with the system being down for a few hours.
- Automatic restart of application instances combined with disaster recovery is acceptable if you can tolerate the system being down ~0.1% of the time (99.9% availability).
- Availability targets higher than 99.9% can only be reached reliably with proper application design supported by well-designed infrastructure.
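The availability percentages above translate into a yearly downtime budget with simple arithmetic. The following is a minimal sketch; the function name and the 365-day year are assumptions made for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours; leap years ignored for simplicity


def downtime_hours_per_year(availability_percent: float) -> float:
    """Return the yearly downtime (in hours) allowed by an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99, 99.999):
        hours = downtime_hours_per_year(target)
        print(f"{target}% -> {hours:.2f} hours/year ({hours * 60:.1f} minutes)")
```

A 99.9% target leaves roughly 8.76 hours of downtime per year, which a single multi-hour disaster recovery event can consume on its own.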
I wrote over 130 blog posts on these topics. It would be impossible to list all of them on a single page; major high-availability technologies or concepts thus have dedicated pages:
- Disaster recovery and avoidance
- High availability clusters
- Public and private cloud deployments
- Global and local load balancing with IP anycast
One of the prerequisites for highly available services is also redundant networking infrastructure:
- Redundant Data Center Internet Connectivity – Problem Overview (2013)
- Redundant Data Center Internet Connectivity – High-Level Design (2013)
- Coping with Byzantine Routing Failures (2014)
- Site and Host Multihoming (2023)
- High Availability Switching (2024)
Regardless of your approach, the only sustainable way to get highly available services is the correct design of the application stack. For more details, watch the Designing Active-Active and Disaster Recovery Data Centers webinar; I also wrote a few blog posts on the topic:
- Swimlanes, Read-Write Transactions and Session State (2017)
- Solving the Problem in the Right Place (2017)
- Moving Complexity to Application Layer? (2017)
- Optimizing the Time-to-First-Byte (2021)
Notable Outages
Finally, here are a few notable outages. TL;DR: it can happen to the big guys and will eventually happen to you.
Other High Availability Blog Posts
- 2015
- 2014
- 2013
- 2012
video
We published hundreds of public videos covering dozens of technologies on ipSpace.net. Networking technologies covered in free videos include:
Artificial Intelligence and Machine Learning
- Introduction to AI/ML Hype (2021)
- Machine Learning 101 (2021)
- Machine Learning Techniques (2022)
- Use Cases for AI/ML in Networking (2022)
- The Long Tail of AI/ML Problems (2022)
- Ugly Challenges of Using AI/ML in Networking (2022)
- Language Models in AI/ML Landscape (2023)
- Language Model Basics (2023)
More in the AI/ML in Networking: The Good, the Bad and the Ugly webinar (with more videos coming soon).
Border Gateway Protocol (BGP)
- Simplify BGP Configurations (2017)
- History of BGP Route Leaks (2023)
- Hacking BGP for Fun and Profit (2023)
- Outages Caused by Bugs in BGP Implementations (2023)
More in the Network Security Fallacies part of the How Networks Really Work webinar and the Internet Routing Security webinar.
Business Aspects of Networking Technologies
- Define the Problem Before Searching for a Solution (2020)
- Know Your Users' Needs (2020)
- Should You Build or Buy a Solution? (2020)
- High-Level Technology Guidelines (2021)
- Lessons Learned: Technology Still Matters (2021)
- Lessons Learned: Fundamentals Haven't Changed (2021)
- Lessons Learned: Complexity Will Kill Your System (2021)
- Some Services Are Not Worth Delivering (2021)
- Lesson Learned: The Way Forward (2022)
More in the Business Aspects of Networking Technologies webinar.
Cloud Networking
- Cloud Models, Layers and Responsibilities (2019)
- Public Cloud Networking Overview (2020)
- We Still Need Networking in Public Clouds (2021)
- Public Cloud Networking Is Different (2021)
- How Can You Master Public Cloud Networking? (2021)
- Cloud Services Hierarchy (2022)
- Functions-as-a-Service Demo (2022)
- Cloud-Native Environments (2022)
- Cloud Infrastructure-as-Code (2022)
- Migrating into a Cloud (2023)
Cumulus Linux
- What Is Cumulus Linux All About? (2015)
- Cumulus Linux Base Technologies (2015)
- Cumulus Linux Architecture (2015)
- What is Cumulus Linux All About (2020)
- Simplify Device Configurations with Cumulus Linux (2020)
- NetQ and Cumulus Linux Data Models (2020)
Ethernet VPN (EVPN)
- EVPN Multihoming Taxonomy and Overview (2022)
- EVPN Multihoming Deep Dive (2022)
- MLAG with EVPN (2023)
- vPC Fabric Peering with EVPN Multihoming (2023)
- Advantages and Drawbacks of EVPN-based Multihoming (2023)
FRRouting
- FRRouting Overview (2019)
- FRRouting Architecture (2020)
- FRRouting Configuration and Performance Optimizations (2020)
- FRRouting Usability Enhancements (2020)
- FRRouting Deployment Guidelines (2020)
IPv6 Security
- Reconnaissance in IPv6 (2012)
- IPv6 Secure Neighbor Discovery (SEND) (2013)
- IPv6 Source Address Validation Improvement (2013)
- IPv6 uRPF and Neighbor Discovery Throttling (2013)
- IPv6 Address Assignment and Tracking (2013)
- Dual-Stack Security Exposures (2013)
- IPv6 Security Overview (2020)
- IPv6 Trust Model (2022)
- Practical Aspects of IPv6 Security (2022)
- Rogue IPv6 RA Challenges (2022)
- IPv6 RA Guard and Extension Headers (2022)
- Testing IPv6 RA Guard (2022)
- Traffic Filtering in the Age of IPv6 (2022)
- IPv6 Traffic Filtering Details (2022)
More in the IPv6 Security webinar.
Kubernetes
- Why Do We Need Kubernetes? (2021)
- Kubernetes Principles (2021)
- Kubernetes Architecture (2022)
- Kubernetes Networking Model (2022)
- Understanding Kubernetes Pods (2022)
- Typical Kubernetes Inter-Pod Traffic Walk (2022)
- Kubernetes Services Overview (2022)
- Kubernetes Services Types (2022)
- Exposing Kubernetes Services to External Clients (2022)
- Kubernetes SDN Architecture (2023)
- Sample Kubernetes SDN Implementations (2023)
- Kubernetes Container Networking Interface (CNI) (2023)
- Kubernetes Calico Plugin (2023)
More in the Kubernetes Networking Deep Dive webinar (with more videos coming soon).
Leaf-and-Spine Fabrics
- Multi-Stage Clos Fabrics (2013)
- Building a L3-Only Data Center with Cumulus Linux (2016)
- SPB Deep Dive (2017)
- Overlays in Data Center Fabrics (2017)
- Routing on Hosts Deep Dive (2017)
- Challenges of Data Center Fabric Deployments (2017)
- Building Data Center Fabrics with SPB (2017)
- Building a Pure Layer-3 Data Center with Cumulus Linux (2017)
- Data Center Fabric Validation (2017)
- Separate Data from Code (2017)
Networking Fundamentals
- Overview of Networking Challenges (2019)
- Introducing Transmission Technologies (2019)
- Beyond Two Nodes (2019)
- The Need for Network Layers (2019)
- Retransmissions and Flow Control in Computer Networks (2019)
- Putting the Networking Layers Together (2019)
- Breaking the End-to-End Principle (2019)
- Fallacies of Distributed Computing (2020)
- The Network Is Not Reliable (2020)
- End-to-End Latency Is Not Zero (2020)
- Bandwidth Is Neither Infinite Nor Cheap (2020)
- Networks Are (Not) Secure (2020)
- Internet Has More than One Administrator (2020)
- Networks Are Not Homogenous (2020)
- What Are Bridging, Routing, and Switching? (2020)
- Getting a Packet Across a Network (2020)
- Finding Paths Across the Network (2021)
- Path Discovery in Transparent Bridging and Routing (2021)
- Transparent Bridging Fundamentals (2021)
- IP Routing Fundamentals (2021)
- Comparing Routing and Bridging (2021)
- Typical Large-Scale Bridging Use Cases (2021)
- Introduction to Network Addressing (2021)
- Theoretical View of Network Addressing (2021)
- Early Data-Link-Layer Addressing (2021)
- Local Area Network Addressing (2022)
- Network Layer Addressing (2022)
- Comparing TCP/IP and CLNP (2022)
- Combining Data-Link- and Network Layer Addresses (2022)
- Network Address Assignments (2022)
- Network Address Scopes (2022)
- The Basics of Network Address Translation (NAT) (2022)
- Routing Protocols Overview (2022)
- Link State Routing Protocol Basics (2023)
- Link State Routing Protocol Implementations (2023)
More in the How Networks Really Work webinar (with more videos coming soon).
Networking Labs
- Could I Use netlab instead of GNS3? (2022)
- What Can Netlab Do? (2022)
- Getting Started with netlab (2023)
- netlab Topology File (2023)
- netlab IP Address Management (IPAM) (2023)
More in the Network Automation Tools webinar (with more videos coming soon).
Software-Defined WAN (SD-WAN)
- What Is SD-WAN? (2018)
- SD-WAN Reference Design (2018)
- Going Beneath the Cisco SD-WAN Surface (2020)
- Cisco SD-WAN Fundamentals and Definitions (2020)
- Cisco SD-WAN Solution Architecture and Components (2020)
- Cisco SD-WAN Routing Goodness (2020)
- Cisco SD-WAN Onboarding Process (2020)
- Cisco SD-WAN Policies and Centralized Magic (2021)
- Cisco SD-WAN Policies Review (2021)
- Cisco SD-WAN Routing Design (2021)
- Cisco SD-WAN Site Design (2021)
- Cisco SD-WAN Policy Design (2021)
- Managed SD-WAN Services (2022)
- Challenges of Managed SD-WAN Services (2022)
- SD-WAN Backend Architecture (2023)
- SD-WAN CPE Architecture (2023)
- Security Aspects of SD-WAN (2023)
More in Software-Defined WAN (SD-WAN) Overview, Cisco SD-WAN and Business Aspects of Networking Technologies webinars (with more videos coming soon).
Switching and ASICs
- Switch Buffer Architectures (2017)
- Big- or Small-Buffer Switches (2018)
- Tools and Knobs to Use when Tweaking TCP Performance (2018)
- ASICs 101 (2020)
- Packet Buffers in Data Center ASICs (2023)
- Chassis Switch Architectures (2023)
- Types of Switching ASICs (2023)
Other Videos or Video-Related Blog Posts
- 2024
- 2023
- 2021
- 2020
- 2019
- 2018
- Video: What Problem Are We Solving with SDDC?
- Real-Life Network Automation: How It All Started
- Making Sense of Software-Defined World
- Video: SPB Fabric Use Cases
- Video: Automatic Diagramming with PowerNSX
- Presentation and Video: Real-Life Automation Wins
- Video: Automated Data Center Fabric Deployment Demo
- Video: Create an NSX Logical Switch with PowerNSX
- [Video] Configure Data Center Devices with PowerShell
- Video: What Is PowerNSX?
- 2017
- 2016
- 2015
- 2014
- 2013
EIGRP
EIGRP was the best choice for an interior gateway protocol in the late 1990s – it was fast, efficient, and easy to deploy. OSPF and IS-IS implementations improved in the intervening 30 years, slowly turning EIGRP into a forgotten technology.
On a more serious note, I wouldn’t deploy EIGRP in new network designs for compatibility reasons (no major networking vendor apart from Cisco implemented it), and I’d use BGP in designs where a single router has to deal with hundreds of adjacent routers (the only scenario where EIGRP still outshines OSPF and IS-IS).
While the ultimate sources of EIGRP wisdom remain the EIGRP Network Design Solutions Cisco Press book and RFC 7868, you might want to read these articles and blog posts describing EIGRP implementation details and deployment guidelines.
The Basics
- Scaling EIGRP Networks with Stub Routers
- EIGRP Myths Debunked
- EIGRP: an MBA-Like Perspective
- EIGRP Loop Prevention Logic
- RFC 7868: The Definitive EIGRP Guide
- Missing Information for the EIGRP Network Design Solutions Cisco Press Book
Implementation Details
- EIGRP Goodbye Message
- EIGRP Load Balancing Based on Interface Load
- Changes in EIGRP Summary Address Are no Longer Disruptive
- EIGRP Neighbor Loss Detection
- EIGRP Load and Reliability Metrics
- EIGRP MTU “metric”
- EIGRP Offset Lists
- Beware of the Pre-Bestpath Cost Extended BGP Community
- EIGRP Third-Party Next Hops
EIGRP Deployment Scenarios
- Using EIGRP in MPLS VPN Networks
- Multihomed EIGRP Sites in MPLS VPN Network
- Leak Map Confusion
- Limitations of VRF Routing Protocols on Cisco IOS
- GRE Keepalives or EIGRP Hellos?
- Recommendations for Keepalive/Hello Timers
- Manipulating EIGRP Metrics
- Multiple EIGRP Autonomous Systems in a VRF
- EIGRP Summarization in DMVPN Phase 2 Networks
- Solution: EIGRP Summarization Breaks Phase 2 DMVPN
- OSPF Meets EIGRP
- IBGP, IGP Metrics, and Administrative Distances
- Does Unequal-Cost Multipathing Make Sense?
DMVPN
DMVPN is an old[1] Cisco-proprietary technology that combines NHRP, IPsec, IKEv2 and multipoint GRE tunnels to build dynamically provisioned multi-access VPNs.
The easiest way to master DMVPN is to watch the ipSpace.net DMVPN webinars, and every now and then someone still finds them somewhat useful:
- Advanced DMVPN Webinar: Router Configurations
- DMVPN: How to Get from Zero to Hero?
- DMVPN Deployment Success Story
- Feedback: DMVPN Webinars
I also wrote dozens of DMVPN-related blog posts. Hope you’ll enjoy them!
The Basics
DMVPN always relies on a hub-and-spoke topology, but enables direct communication between spokes (Phase-2 DMVPN) and simplified routing with NHRP redirects (Phase-3 DMVPN).
- DMVPN Phase 1 Fundamentals
- DMVPN Phase 2 Fundamentals
- The Fundamental Difference between Phase 2 and Phase 3 DMVPN
- DMVPN Scalability
- Is Anyone Using DMVPN-over-IPv6?
Routing Protocols in DMVPN Networks
Routing protocols face significant challenges in DMVPN networks due to the very large number of directly connected neighbors, with EIGRP faring better than OSPF, and BGP being the only viable solution in deployments with a very large number of spokes per hub.
- EIGRP Summarization in DMVPN Phase 2 Networks
- Solution: EIGRP Summarization Breaks Phase 2 DMVPN
- Can You Run OSPF over DMVPN?
- Using BGP in Phase 1 DMVPN network
- OSPF Configuration in Phase 1 DMVPN Network
- Configuring OSPF in a Phase 2 DMVPN network
- More OSPF-over-DMVPN Questions
- OSPF-over-DMVPN Using Two Hub Routers
- More Private AS Numbers
- BGP Routing in DMVPN Networks
- Scaling BGP-Based DMVPN Networks
- Changes in IBGP Next Hop Processing Drastically Improve BGP-based DMVPN Designs
- Reducing BGP SNMP Traps in DMVPN Networks
- DMVPN Split Default Routing
- Another DMVPN Routing Question
Typical DMVPN Designs
- Sometimes You Need to Step Back and Change Your Design
- VPN Network Design: Selecting the Technology
- DMVPN as a Backup for MPLS/VPN
- Redundant DMVPN designs, Part 1 (The Basics)
- Redundant DMVPN Designs, Part 2 (Multiple Uplinks)
- Regional Internet Exits in Large DMVPN Deployment
DMVPN Deployment Guidelines
- DMVPN: from Concept to Pilot in 36 Hours
- MPLS/VPN-over-GRE-over-IPSec: Does It Really Work?
- Migrating from Phase 1 DMVPN to Phase 2/3 Network
- Combining DMVPN with Existing MPLS/VPN Network
- Real Life BGP Route Origination and BGP Next Hop Intricacies
- Building a DMVPN Test Lab with netlab
Integration with Other Network Technologies
- End-to-End QoS marking in MPLS/VPN-over-DMVPN networks
- Spoke-to-Spoke IP Multicast over DMVPN?
- QoS in Large-Scale DMVPN Networks
- DMVPN: Spoke QoS Challenge
- RSVP over DMVPN
- Inter-VRF NAT in DMVPN Deployments
DMVPN Alternatives
Quirks and Implementation Details
I wrote numerous blog posts documenting DMVPN quirks while preparing the materials for the DMVPN webinars. Most of these blog posts were written in the early 2010s and might no longer be relevant.
- Tunnel Route Selection and DMVPN Tunnel Protection Don’t Work Together
- uRPF Violation Logging Is Not Working on 12.4T
- DMVPN: Non-Unique NHRP Registrations
- DMVPN Spoke NHRP Behavior Changed in IOS Release 15.0M
- NHRP Convergence Issues in Multi-Hub DMVPN Networks
- NHRP Rate Limiting Can Hurt Your DMVPN Network
- The Impact of Changed NHRP Behavior in DMVPN Networks
Other Blog Posts Vaguely Related to DMVPN
- DMVPN: Fishing Rod or Grilled Tuna?
- Where Would You Need GRE?
- Viptela SEN: Hybrid WAN Connectivity with an SDN Twist
- Should I Use L2VPN+MACSEC or L3VPN+GETVPN?
- Use Existing (DMVPN) Device Configurations in netlab
[1] As in: created around 2010. For more details, listen to the History of DMVPN with Mike Sullenberger.