by Eyvonne Sharp Dec 7, 2017
The long-awaited change window for your network upgrade has arrived. Managers stack pizza on the conference room table, snacks are scattered throughout the office, and your changes have been tested as thoroughly as possible.
Your team has been planning code upgrades to your network core for weeks. It’s been challenging, to say the least. In the last few years, your mid-size enterprise has invested significant resources to increase the redundancy and stability of the network core. You’ve been working hard to consolidate a network that began as a string of daisy-chained switches, ad-hoc configurations, and undocumented network surprises.
Although you’ve requested budget to create a lab environment to mirror your production core, the project remains unfunded. Instead, you’ve used every other resource at your disposal. You’ve read through the release notes document for the new code version and researched relevant issues. You’ve reached out to peers who have performed similar network upgrades in the past. You’ve backed up the configs on every device on the network. You’ve talked with the vendor SE to ensure you’re moving to a stable version of code.
On top of all this, you’ve opened a proactive ticket with the vendor to expedite support should you need help in the middle of the night. As an added measure, you’ve downloaded relevant documentation and verified software images on your local drive in case you lose Internet access during the network upgrade.
In preparation, you’ve documented the changes you need to make, step by step. You’ve developed a test for every incremental change, to be certain the network behaves as expected. You’ve determined key checkpoints along the way where you need to evaluate your progress.
Immediately before the change window starts, you grab a copy of the ARP table, the routing table, and the MAC address table from the devices you are upgrading, so you can compare after the reboot.
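If you save those tables as text, the before-and-after comparison is easy to script. The snippet below is a minimal sketch, not a vendor tool: it assumes a simplified, whitespace-separated table layout, and the function names `parse_entries` and `diff_snapshots` are made up for illustration. Real CLI output differs by platform, so the key columns would need adjusting.

```python
from typing import Dict, List, Sequence, Set, Tuple

Entry = Tuple[str, ...]


def parse_entries(raw: str, key_columns: Sequence[int]) -> Set[Entry]:
    """Reduce table output to a set of keys built from the chosen columns.

    Volatile columns (age timers, counters) are left out of the key so
    they do not show up as false differences.
    """
    entries: Set[Entry] = set()
    for line in raw.splitlines():
        fields = line.split()
        if len(fields) <= max(key_columns):
            continue  # blank line or too short to hold every key column
        entries.add(tuple(fields[i] for i in key_columns))
    return entries


def diff_snapshots(before: Set[Entry], after: Set[Entry]) -> Dict[str, List[Entry]]:
    """Report entries that disappeared and entries that are new."""
    return {
        "missing": sorted(before - after),
        "new": sorted(after - before),
    }


# Hypothetical ARP snapshots: IP address, MAC address, interface.
ARP_BEFORE = """\
10.0.0.1 aaaa.bbbb.0001 Vlan10
10.0.0.2 aaaa.bbbb.0002 Vlan10
"""
ARP_AFTER = """\
10.0.0.1 aaaa.bbbb.0001 Vlan10
10.0.0.3 aaaa.bbbb.0003 Vlan10
"""

if __name__ == "__main__":
    report = diff_snapshots(
        parse_entries(ARP_BEFORE, key_columns=(0, 1)),
        parse_entries(ARP_AFTER, key_columns=(0, 1)),
    )
    for kind, rows in report.items():
        for row in rows:
            print(kind, *row)
```

Header lines get parsed like any other row, but because they appear in both snapshots they cancel out of the diff. The same two functions work for routing and MAC address tables once you pick the columns that identify an entry.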
You are as prepared as you can possibly be.
Everyone’s in position. You’ve executed your plan and it’s time to see if your hard work will pay off. You enter the seven characters that will determine the fate of your night. . . .
Reload<enter>
You wait.
This scenario is familiar to every network engineer who has worked in the enterprise. Most of us have experienced the thrill of leaving the office in the wee hours of the morning with a successful network upgrade under our belts. We’ve also experienced the agony when calls come in with unexpected impacts, strange behaviors, or seemingly unrelated application errors.
So why do things still go wrong even when we did everything we could to validate the change beforehand?

One of the great challenges in networking is that we have very little control over the devices and application traffic that ride our networks. Even when we have complete control over network configuration, we don’t always have control over network state. What kinds of state conditions can cause problems during a network upgrade?
In short, the more you know about your network, the more you can do to prepare, plan, and prevent problems during a network upgrade. NetBrain’s Dynamic Network Mapping can help you visualize the network in real time, to discover your full network topology. You can visualize traffic flows through your network to understand where and how problems may arise. You can discover misconfigurations that do not have an impact until traffic fails over to an unused link.
As you plan your network upgrade, you can build predefined validation tests that can be automated as part of Executable Runbooks. These tools not only provide critical and timely information during your network upgrade, they also build confidence with your leadership team, because you can provide specific test plans and data to prove your success.
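To make the idea of predefined validation tests concrete, here is a generic sketch in Python. It is not NetBrain's runbook format or API. The `Check` class, the `run_checks` function, the state keys, and the 5% tolerance are all assumptions chosen for illustration. Each check is a named pass/fail rule evaluated against values you have already collected from the network.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Check:
    """A named pass/fail rule evaluated against collected network state."""
    name: str
    passes: Callable[[Dict[str, int]], bool]


def run_checks(checks: List[Check], state: Dict[str, int]) -> Dict[str, bool]:
    """Run every check and return a name -> result map for the change record."""
    return {check.name: check.passes(state) for check in checks}


# Hypothetical checks; the thresholds are illustrative, not recommendations.
CHECKS = [
    Check(
        "all routing neighbors up",
        lambda s: s["neighbors_up"] == s["neighbors_expected"],
    ),
    Check(
        "route count within 5% of baseline",
        lambda s: abs(s["routes_now"] - s["routes_baseline"])
        <= 0.05 * s["routes_baseline"],
    ),
]

if __name__ == "__main__":
    # Example values as they might look after the reload.
    observed = {
        "neighbors_up": 4,
        "neighbors_expected": 4,
        "routes_now": 940,
        "routes_baseline": 1000,
    }
    for name, ok in run_checks(CHECKS, observed).items():
        print("PASS" if ok else "FAIL", name)
```

Writing the checks down before the change window means the pass/fail criteria are agreed in advance, and the results double as the evidence you hand to leadership afterward.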