Cisco switch-profile Syncing on NX-OS

Date: December 4, 2013

Man, I've been pulling my hair out for the past couple of days trying to get my pair of Cisco Nexus 5548s to synchronize their switch profile configurations, but I think I've finally got it, so I wanted to write a little bit and maybe help other people who got stuck, too.

Here's some background:

A while back, Cisco developed the idea of port profiles for the UCS environment, so that you could quickly and easily apply a templated configuration to a switch port.
With NX-OS 5.0(2)N1(1), Cisco included switch profiles, with a similar goal in mind. When you have a number of switches, all of which need to have the interfaces configured identically, then switch-profiles are what you're looking for.

You may be wondering why you would want a bunch of switches all configured the same, especially if you aren't familiar with Cisco Nexus switches. Just to clarify, here's a typical chassis switch:
[Image: Cisco chassis switch]

It's basically a bunch of ports. That's what I have right now, and I've got a whole bunch of wiring going back to it from all of my racks in the server room. It's great for management, since there's only one device, but I hate running 30ft cords all the time.

Here's what I'm replacing it with:

[Image: Cisco Nexus 5548UP switch]

Actually, a pair of them. As you can see, the port count doesn't quite add up. That's ok, because I've got a bunch of Fabric Extenders (FEX), too:

[Image: Cisco Nexus 2248 Fabric Extender]

Here's how the FEX connect to the switches:

[Diagram: Nexus logical connections]

As you can see, each FEX is connected to both switches, and more than once: each FEX has four 10Gb/s SFP+ uplink ports, so it gets a pair of 10Gb/s connections to each switch, bundled into a port channel.

What I end up with is a physical layout that looks like "Top of Rack" or "End of Row", but doesn't have the headache of trying to configure six different switches. And with switch-profile synchronization, I don't even really have to deal with configuring two switches that often.

When a switch is configured to use a fabric extender, the FEX is assigned a number, from 100-199 (I don't know why that particular range of numbers). That configuration is pretty simple:


configure terminal
feature fex
fex 100
description "FEX100"

Then you just configure each of the ports that the FEX is attached to:


interface port-channel100
switchport mode fex-fabric
fex associate 100

interface Ethernet1/3
description UPLINK-FEX-100
switchport mode fex-fabric
fex associate 100
channel-group 100

interface Ethernet1/4
description UPLINK-FEX-100
switchport mode fex-fabric
fex associate 100
channel-group 100

You can check that things are working the way you expect:


core01# sh fex
  FEX         FEX           FEX                       FEX
Number    Description      State            Model            Serial
------------------------------------------------------------------------
100         FEX100         Online     N2K-C2248TP-E-1GE   FOX1724GZKL

Assuming the FEX is actually plugged in, that creates a series of interfaces, Ethernet100/1/1 through Ethernet100/1/48 (in NX-OS, every port is named Ethernet, regardless of its speed).

Now, that configuration was done in isolation, with one switch. What about another switch that's also attached to the FEX? If you want to be able to use the second switch, something similar needs to be done on that switch, too.

The "right" way to do this is to set up Virtual Port Channels (vPC), as outlined in this document from Cisco. That's the document I used when I first started to configure the switches and FEX, and it worked. The problem is, that document doesn't actually explain to you that you should be using switch profiles. I mean, it mentions them twice in what are essentially footnotes, but by the time I got there, I assumed it was just another of the many Cisco technologies on the periphery that I don't know, don't use, and don't need to worry about.

But then, if that were the case, I wouldn't be writing this article, would I?

If you're doing this, you should use switch profiles. Seriously. And I can tell you from experience, it's harder to take an existing configuration and apply profiles into it than it is to start from a clean slate using profiles the first time.

So let's do this. I will assume that you have a couple of Nexus switches that have their interfaces, FEX, and so on unconfigured, and that you have their management ports on the network. They need to be able to talk to each other, so this should work from both sides:


ping -other-switch-mgmt0-ip- vrf management

At this point, the first thing we need to set up is Cisco Fabric Services. CFS is designed to distribute configuration information throughout the network. Fortunately, this is relatively straightforward for an infrastructure the size I'm dealing with.

Basically, you need to create a CFS region, tell it to distribute the configurations over IPv4 (or IPv6 if you're awesome) and, for me anyway, it worked. Here's my config and status:


core01# sh run | include cfs
cfs ipv4 distribute
cfs region 20
cfs eth distribute

core01# sh cfs status
Distribution : Enabled
Distribution over IP : Enabled - mode IPv4
IPv4 multicast address : 239.255.70.83
IPv6 multicast address : ff15::efff:4653
Distribution over Ethernet : Enabled

core01# sh cfs peers

Physical Fabric
-------------------------------------------------------------------------
 Switch WWN              IP Address
-------------------------------------------------------------------------
 20:00:00:2a:6a:47:3b:00 -switch1 IP-                          [Local]
                         core01
 20:00:00:2a:6a:1a:3c:00 -switch2 IP-

Total number of entries = 2
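
The output above is the end state rather than the commands that produce it. As a rough sketch, assuming the management interfaces can already reach each other, turning on CFS distribution over IPv4 looks something like the following on each switch. The region line from the running config above is deliberately left out here, since how a region is tied to applications depends on your setup.

```
! Sketch only: run on both switches
configure terminal
! Enable CFS distribution over IPv4
cfs ipv4 distribute
end
! Confirm that distribution is enabled and that the other switch shows up
show cfs status
show cfs peers
```

If the second switch never appears in the peer list, go back to the ping test over the management VRF before touching anything else.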

Once this works, you can start setting up the switch profile. To do that, you use a different configuration environment, one I'd never used before, called 'configure sync':


core01# configure sync
Enter configuration commands, one per line. End with CNTL/Z.
core01(config-sync)# switch-profile ?
WORD Enter the name of the switch-profile (Max Size 64)

core01(config-sync)# switch-profile core-shared
Switch-Profile started, Profile ID is 1
core01(config-sync-sp)#
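
The session above stops at the profile prompt, and one step is not shown: each switch has to be told who its peer is before anything gets synchronized. A sketch of that step follows, assuming the profile name core-shared from above. The address is a placeholder in the same style as the ping example, and the profile needs to exist under the same name on both switches.

```
! On core01, inside the switch profile (config-sync-sp mode)
configure sync
switch-profile core-shared
! Placeholder: substitute the mgmt0 address of the other switch
sync-peers destination -other-switch-mgmt0-ip-
! Repeat on core02, pointing back at core01
```

Once both sides point at each other, the peer should be listed in the "Peer information" section of 'show switch-profile status'.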

In the switch profile, you can make configuration changes to most of the switch. For the specifics of what you can and can't do, you probably want to read the docs.

The workflow for adding a FEX for me was to start off by pre-provisioning a "slot" for it. When I wanted to preconfigure FEX111, for instance, I did this in config-sync-sp mode:


slot 111
provision model N2K-C2248TP-E-1GE

That pre-creates interfaces Eth111/1/1-48, and more importantly, it does the same thing on both switches.
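
Once the provisioning has been committed, the pre-created interfaces can be configured through the same profile. The following is a sketch of what that might look like for a single host port; the description text and VLAN 10 are made-up example values, not taken from my config.

```
configure sync
switch-profile core-shared
! Eth111/1/1 exists because slot 111 was pre-provisioned earlier
interface Ethernet111/1/1
! Hypothetical values below; substitute your own
description example-host-port
switchport mode access
switchport access vlan 10
exit
! Dry run, then apply to both switches
verify
commit
```

Because the interface lives in the switch profile, the same port settings end up on both switches without typing them twice.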

I'm not going to walk through the entirety of my switch config, but if you have questions about a specific part, just ask.

While I was going through this work, the tools I used extensively were:

  • verify
    This command is a kind of dry run that checks that the configuration changes aren't going to do anything too wrong. If it returns successfully, there's a good chance that you can commit your change. Occasionally, though, a change passes 'verify' and then fails on commit because of something on the remote node that 'verify' didn't take into account, and I'm not sure exactly what it does and doesn't check.

    The case I ran into was changing the model number of a pre-provisioned FEX.

  • show switch-profile buffer
    This shows a numbered list of the proposed changes that you are trying to commit. It's very useful as a sanity check, to make sure that you haven't mistyped something or thought that you were in another configuration mode when you tried configuring something.

  • buffer-delete
    This gives you the ability to delete some or all of your buffered changes, as shown by the 'show switch-profile buffer' command above.

  • commit
    When you're ready to apply the changes, you commit them, which locks the configuration on both switches. When you commit, the first thing that runs is 'verify', and if that passes, it tries to apply the changes to the local switch. If that succeeds, it tries to apply the changes to the remote switch. If that succeeds too, the change is a success. If anything in this process fails, the entire change is rolled back and nothing is applied anywhere. This atomicity is what gives you a known-good, identical configuration everywhere.

  • show switch-profile status
    When something inevitably goes wrong, this command stands a good chance of helping you figure out what happened. Here's the output from one of my switches:


    core02# show switch-profile status

    switch-profile : core-shared
    ----------------------------------------------------------

    Start-time: 185468 usecs after Wed Dec 4 11:23:12 2013
    End-time: 414949 usecs after Wed Dec 4 11:23:14 2013

    Profile-Revision: 24
    Session-type: Commit
    Session-subtype: -
    Peer-triggered: Yes
    Profile-status: Sync Success

    Local information:
    ----------------
    Status: Commit Success
    Error(s):

    Peer information:
    ----------------
    IP-address: -other switch IP-
    Sync-status: In sync
    Status: Commit Success
    Error(s):

    Running it from both sides is helpful (and reassuring).

  • show run switch-profile
    This shows the running configuration for only the bits of configuration that are applied through the switch-profile. This is a lifesaver because 'show running-config' doesn't actually differentiate what was picked up through local config and what was synced.
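
Strung together, the tools above make up a review-and-commit loop. Here is a sketch of one pass through it, assuming the profile name core-shared and that some changes have already been typed into the buffer. The sequence number passed to buffer-delete is only an example; use whatever number the buffer listing shows for the line you want gone.

```
! In config-sync-sp mode, after entering some changes:
! list the buffered commands with their sequence numbers
show switch-profile core-shared buffer
! Drop a mistyped entry (2 is an example sequence number)
buffer-delete 2
! Or clear the whole buffer and start over:
! buffer-delete all
! Dry run, then apply to both switches
verify
commit
! Check the result, ideally from both switches
show switch-profile core-shared status
show running-config switch-profile
```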

Big thanks go to Markku Leiniö for his blog entries on the topic. They really helped open my eyes. If you deal with this kind of stuff, I'd recommend reading what he writes.

In general, my advice is to have two terminals open - one to each switch, and to make small atomic changes while you're learning how configuration is applied. When I was configuring my FEX, I would do one FEX, run verify, do the commit, verify that it happened on both sides, then I would do all of the other FEX in exactly the same way, just in bulk, copied from a text buffer. Then I'd run verify, commit, and then check the status on the other switch. Make sure that you can apply configurations from both switches and that things are working right.
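
The bulk step described above could be sketched like this. FEX numbers 112 through 114 are hypothetical, and the model is assumed to be the same 2248TP-E used earlier; adjust both to match what is actually in your racks.

```
configure sync
switch-profile core-shared
! Hypothetical FEX numbers; one slot/provision pair per FEX
slot 112
provision model N2K-C2248TP-E-1GE
exit
slot 113
provision model N2K-C2248TP-E-1GE
exit
slot 114
provision model N2K-C2248TP-E-1GE
exit
! One dry run and one commit cover the whole batch
verify
commit
```

Keeping the batch in a text file first makes it easy to diff against the single FEX that already committed cleanly.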

Being cautious is a good thing. Like any other distributed configuration system, this lets you make changes quickly and easily, but it also spreads your mistakes just as quickly.

Thanks for reading, and I hope it was helpful!