  1. Senior Member Danielh22185's Avatar
    Join Date
    Apr 2012
    Location
    DFW Area
    Posts
    1,161

    Certifications
    CCNP R&S, CCNA, CCENT
    #1

    Something I really never Understood...

    At my company I think we tend to "overdo it" when it comes to failed devices on the network.

    Example:
    When a major core switch fails, we generally shut down both the physical interfaces AND the routing protocols facing that device. I know it's always best to tread carefully in this type of scenario, but I've never been able to get an answer from anybody I work with as to why just shutting down the physical interfaces isn't "good enough". If no ingress interface on the device is UP/UP, how could the device possibly receive traffic?

    The only reason I can think of is that it gives an additional control point: you can confirm the interfaces are physically up/up before telling a routing protocol to establish during a restoration. Still, I think it's a bit redundant and unnecessary, because an interface always comes up before a routing protocol can establish over it, and if there is something actually wrong with an interface, you can simply shut it back down...

    Now, don't get me wrong. I fully understand that sometimes it's best to take down just the routing protocol. An example would be when we need to leave the physical link up but not allow traffic across it, so the routing protocol is shut down while we do the physical testing. However, that idea doesn't apply to my previous example.

    Can y'all think of a reason why I should steer clear of focusing only on the physical layer and think more like some of my work colleagues, or are they just being overly cautious?
    Currently Studying: IE Stuff...kinda...for now...
    My ultimate career goal: To climb to the top of the computer network industry food chain.
    "Winning means you're willing to go longer, work harder, and give more than anyone else." - Vince Lombardi

  3. Senior Member
    Join Date
    Dec 2014
    Posts
    259
    #2
    Why would you want a routing protocol down?

  4. Random Member docrice's Avatar
    Join Date
    Apr 2010
    Location
    Bay Area, CA
    Posts
    1,687

    Certifications
    GSEC, GCFW, GCIA, GCIH, GWAPT, GAWN, GPEN, GCFE, GCFA, GMON, OSWP, SFCP, SnortCP, Sec+; expired: CCNA (R&S, Security, Wireless), WCNA
    #3
    Some places have a strict procedure. Sure, shutting an interface down can effectively cut off the device's utility, but if there are many cooks in the kitchen it's possible that someone may come around and "fix it" by turning it back on not knowing about the issue that caused the problem.

    Another example - if you remove a server from a network, shutting the port down seems enough ... until someone later sees that port as available, plugs an arbitrary device in, turns the port back on, and all of a sudden you have a new device on a VLAN that it's not supposed to be on. This can be bad news.

  5. Senior Member
    Join Date
    Sep 2013
    Location
    Sweden
    Posts
    861

    Certifications
    CCNP
    #4
    So with my company we tend to I think "over do it" when it comes to a failed devices on the network.

    Example:
    Major core switch fails, we generally shut down physical interfaces AND routing protocols to that device.
    Manually changing routing protocol configuration at the CLI of an operational device when another device fails seems like a bigger risk to me than just shutting down interfaces and leaving the routing intact. If the person doing that reconfiguration gets something wrong, you now have two failed devices.
    Last edited by fredrikjj; 04-27-2016 at 10:44 AM.

  6. Senior Member White Wizard's Avatar
    Join Date
    Sep 2013
    Location
    KY
    Posts
    173

    Certifications
    A+, S+, CCENT, CCNA
    #5
    Quote Originally Posted by docrice View Post
    Some places have a strict procedure. Sure, shutting an interface down can effectively cut off the device's utility, but if there are many cooks in the kitchen it's possible that someone may come around and "fix it" by turning it back on not knowing about the issue that caused the problem.

    Another example - if you remove a server from a network, shutting the port down seems enough ... until someone later sees that port as available, plugs an arbitrary device in, turns the port back on, and all of a sudden you have a new device on a VLAN that it's not supposed to be on. This can be bad news.
    If an int is administratively down, then whoever wants to use that int should know it was shut down for a reason and find out why.

  7. Went to the dark side.... Moderator networker050184's Avatar
    Join Date
    Jul 2007
    Posts
    11,649

    Certifications
    CCNA, CCNP, CCIP, JNCIA-JUNOS, JNCIS-SP, JNCIP-SP, MCA200
    #6
    I usually don't remove routing protocols but I do raise the metric. That way you can turn it all the way up and gracefully shift traffic back on.
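    A rough illustration of why raising the metric is gentler than removing the protocol: the device stays reachable and its adjacencies stay up, but transit traffic prefers the other path. This is only a sketch. The topology, the node names (A, D, CORE1, CORE2), and the metric values are all made up, and plain Dijkstra stands in for whatever SPF calculation your IGP actually runs.

```python
import heapq
from typing import Dict, List, Tuple

# Adjacency map: node -> {neighbour: link metric}
Graph = Dict[str, Dict[str, int]]


def shortest_path(graph: Graph, src: str, dst: str) -> Tuple[int, List[str]]:
    """Plain Dijkstra: return (total cost, path) from src to dst."""
    queue: List[Tuple[int, str, List[str]]] = [(0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, metric in graph[node].items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + metric, neighbour, path + [neighbour]))
    raise ValueError(f"no path from {src} to {dst}")


def cost_out(graph: Graph, device: str, high_metric: int) -> Graph:
    """Return a copy of graph with every link touching device set to high_metric."""
    return {
        node: {
            nbr: (high_metric if device in (node, nbr) else metric)
            for nbr, metric in links.items()
        }
        for node, links in graph.items()
    }


# Hypothetical topology: A reaches D via CORE1 (preferred) or CORE2 (backup).
topology: Graph = {
    "A": {"CORE1": 10, "CORE2": 20},
    "CORE1": {"A": 10, "D": 10},
    "CORE2": {"A": 20, "D": 20},
    "D": {"CORE1": 10, "CORE2": 20},
}

if __name__ == "__main__":
    print("normal:    ", shortest_path(topology, "A", "D"))
    costed = cost_out(topology, "CORE1", 1000)
    print("costed out:", shortest_path(costed, "A", "D"))
    # CORE1 itself is still reachable, just at a very high cost.
    print("to CORE1:  ", shortest_path(costed, "A", "CORE1"))
```

    With CORE1 costed out, the A-to-D path moves to CORE2, and restoring the original metrics moves it back without any adjacency having to re-form.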
    An expert is a man who has made all the mistakes which can be made.

  8. Senior Member Danielh22185's Avatar
    Join Date
    Apr 2012
    Location
    DFW Area
    Posts
    1,161

    Certifications
    CCNP R&S, CCNA, CCENT
    #7
    Quote Originally Posted by networker050184 View Post
    I usually don't remove routing protocols but I do raise the metric. That way you can turn it all the way up and gracefully shift traffic back on.
    I believe the mindset is all about doing things gracefully.

    Now don't get me wrong: in the situation I described we already had the device isolated from the network, with cables pulled (told you we tread carefully). We sometimes even go as far as powering off a bad device like this. I get it, it quickly preserves business continuity, and I work for a large financial firm, so business / money is everything.

    The failure was a bad SUP (it had multiple parity errors). To better facilitate a graceful re-introduction to the network, I consoled into the switch and shut down all interfaces, so once the hardware is replaced we still have the device logically isolated from the network. From there we can gracefully reintroduce it to the network from that one device, instead of logging into 20+ surrounding networking devices and shutting down / re-enabling stuff manually (creating less opportunity for human error).

    The thing that just throws me is that I have several colleagues with years more experience than me who will take it a step further and make interfaces passive / shut down routing protocols on the devices. I never got an explanation from them for why it is done that way, other than it being a precaution. A routing protocol is only as good as its up/up interfaces...
    Last edited by Danielh22185; 05-01-2016 at 02:22 PM.

  9. Senior Member
    Join Date
    Apr 2014
    Location
    Ohio
    Posts
    300

    Certifications
    A+, Net+, CCNA R&S
    #8
    In networking, a lot of people have learned over the years based only on what they have seen. Some may not even bother to look at best practices and just come up with something on their own. This isn't wrong per se, but it can amount to uneducated guesses at times. You will also find a lot of people who have only a high-level view of how something works and lack the lower-level knowledge of how it really works. Those are usually the people who start making lots of "excessively safe" policies and creating lots of unnecessary work.

  10. Went to the dark side.... Moderator networker050184's Avatar
    Join Date
    Jul 2007
    Posts
    11,649

    Certifications
    CCNA, CCNP, CCIP, JNCIA-JUNOS, JNCIS-SP, JNCIP-SP, MCA200
    #9
    Those people with more experience have probably seen devices come back online and wreak havoc, which is why they're so cautious. Either that, or someone has told them war stories.

  11. Junior Member
    Join Date
    May 2016
    Posts
    28
    #10
    While that does sound a bit extreme, it's always wise to bring a device into service gracefully.

    You generally don't want to plug a device in, turn up its interfaces, and boom, it's forwarding traffic. What if a patch cable was damaged during the swap-out? You may get a whole load of TCP retransmits and severely limit traffic flow. Policy got fluffed when restoring config? You could have routing loops or other undesirable behaviour. There are many issues that could occur.

    Generally a good way of going about things is something like the following:

    Bring up interfaces. Run a few pings across, make sure no errors on any ports.
    Bring up IGP with high metrics on all links. Ensure all adjacencies are established and that all routes are being distributed/learnt.
    Bring up BGP. Ensure all neighbours establish, routes are being distributed.
    Drop IGP metrics back to what is normal.
    Restore any FHRP that are usually master on this box.

    YMMV, but that is roughly the kind of plan I would follow. It always pays to be safe.
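    The staged plan above can be sketched as a gated sequence, where each stage has to verify clean before the next one starts. This is a minimal sketch only: the stage names paraphrase the steps above, and `build_stages` and `run_restoration` are hypothetical helpers, with simple booleans standing in for the real checks (pings, error counters, adjacency and neighbour tables) you would actually run.

```python
from typing import Callable, Dict, List, Tuple

# Stage names paraphrase the plan in the post; the order matters.
STAGE_NAMES: List[str] = [
    "interfaces up, pings clean, no port errors",
    "IGP up with high metrics, adjacencies and routes verified",
    "BGP up, neighbours established, routes exchanged",
    "IGP metrics dropped back to normal",
    "FHRP master roles restored",
]

# A stage is a name plus a verification gate that returns True when healthy.
Stage = Tuple[str, Callable[[], bool]]


def build_stages(results: Dict[str, bool]) -> List[Stage]:
    """Build the plan; results maps a stage name to a stand-in check outcome.

    Stages missing from results are assumed to pass. In real use each gate
    would wrap an actual verification instead of a canned boolean.
    """
    stages: List[Stage] = []
    for name in STAGE_NAMES:
        ok = results.get(name, True)
        stages.append((name, lambda ok=ok: ok))
    return stages


def run_restoration(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first gate that fails.

    Returns the names of the stages that completed.
    """
    completed: List[str] = []
    for name, check in stages:
        if not check():
            # Later stages depend on this one, so do not continue.
            break
        completed.append(name)
    return completed


if __name__ == "__main__":
    # Simulate BGP failing to establish: the metrics stay high and the
    # FHRP roles stay where they are.
    done = run_restoration(build_stages({STAGE_NAMES[2]: False}))
    for name in done:
        print("completed:", name)
```

    The point of stopping at the first failed gate is that the device never starts attracting production traffic (normal metrics, FHRP master) until everything underneath has been checked.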

  12. Senior Member Danielh22185's Avatar
    Join Date
    Apr 2012
    Location
    DFW Area
    Posts
    1,161

    Certifications
    CCNP R&S, CCNA, CCENT
    #11
    Quote Originally Posted by daveyb View Post
    While that does sound a bit extreme, its always wise to bring a device into service gracefully.

    You generally don't want to plug a device in, turn up its interfaces, and boom its forwarding traffic. What if there is a faulty patch that has become damaged during the swap out? You may get a whole load of TCP retransmits and severely limit traffic flow. Policy got fluffed when restoring config? Could have routing loops/other undesirable behaviour. There are many issues that could occur.

    Generally a good way of going about things is something like the following:

    Bring up interfaces. Run a few pings across, make sure no errors on any ports.
    Bring up IGP with high metrics on all links. Ensure all adjacencies are established and that all routes are being distributed/learnt.
    Bring up BGP. Ensure all neighbours establish, routes are being distributed.
    Drop IGP metrics back to what is normal.
    Restore any FHRP that are usually master on this box.

    YMMV, but that is roughly the kind of plan I would follow. It always pays to be safe.

    All good advice. In the end, in the networking world there are multiple ways to skin a cat. I wanted to understand whether there is really a hard reason, other than caution, why some of my colleagues operate the way they do. It seems there is no hard reason other than caution, which I fully understand, because as I mentioned I work for a giant financial firm centered around a giant network. Network stability is huge for us. We are involved in almost any technical problem the firm faces, at the very least to prove out the network.

    I too have seen my fair share of craziness. This is why, when we do restore equipment, it is generally done in the middle of the night, when business traffic is at its lowest.

  13. Senior Member
    Join Date
    Apr 2014
    Location
    Ohio
    Posts
    300

    Certifications
    A+, Net+, CCNA R&S
    #12
    Quote Originally Posted by networker050184 View Post
    Those people with more experience have probably seen devices comeback online and wreak havoc which is why they're so cautious. Either that or someone that has told them war stories.

    I can't argue against this one. We all make mistakes, and there is nothing like not being cautious enough and causing an outage. I was just trying to make the point that some people use excessive caution as a way to mask the fact that they don't understand how something works. But you are correct that, in networking, people who have learned to lean towards caution probably have a few battle scars.

  14. Senior Member Node Man's Avatar
    Join Date
    Dec 2012
    Location
    LV426
    Posts
    600

    Certifications
    CCENT, CCNA R&S, CCNP-R&S, CE-A
    #13
    +1 for graceful shutdown and costing out traffic.
