Ubuntu and SNMP

Date August 21, 2014

After running Ubuntu for about two years now, I have a laundry list of complaints. Whether Ubuntu is automagically starting daemons that I install, or the relative difficulty of running an internal repo, or (and I'm heartily agreeing with my coworker Nick here) that it doesn't actually include a firewall out of the box....there are very basic issues I have with running and managing Ubuntu machines.

The one that inspired this entry, though, is like, super dumb and annoying.

Suppose I'm trying to do something like snmpwalk on a switch:

$ snmpwalk -v 2c -c public myswitch.mydomain
-bash: /usr/bin/snmpwalk: No such file or directory

Of course, I need snmp. Lets install that:

~$ sudo apt-get install snmp
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following extra packages will be installed:
libperl5.18 libsensors4 libsnmp-base libsnmp30
Suggested packages:
lm-sensors snmp-mibs-downloader
The following NEW packages will be installed:
libperl5.18 libsensors4 libsnmp-base libsnmp30 snmp
0 upgraded, 5 newly installed, 0 to remove and 36 not upgraded.
Need to get 1,168 kB of archives.
After this operation, 4,674 kB of additional disk space will be used.

and try it again:

$ snmpwalk -v 2c -c public myswitch.mydomain
iso. = STRING: "Cisco NX-OS(tm) n5000, Software (n5000-uk9), Version 6.0(2)N2(3), RELEASE SOFTWARE Copyright (c) 2002-2012 by Cisco Systems, Inc. Device Manager Version 6.2(1), Compiled 12/17/2013 2:00:00"
iso. = OID: iso.
iso. = Timeticks: (575495258) 66 days, 14:35:52.58
iso. = STRING: "me@here"
iso. = STRING: "myswitch"
iso. = STRING: "snmplocation"
iso. = INTEGER: 70
iso. = Timeticks: (4294966977) 497 days, 2:27:49.77
iso. = OID: iso.
iso. = OID: iso.
iso. = OID: iso.

Well, we're close, but all we have are a bunch of OIDs. I'd really like names. If you've read my introduction to SNMP, you know that it's not loading the MIBs. Weird. On RHEL/CentOS, that's kind of automatic. Maybe there's another package?

Well, that snmp-mibs-downloader that was listed as a suggested package above sounds pretty promising. Lets install that.

$ sudo apt-get install snmp-mibs-downloader
...snip lots of installing MIBS...

So basically, 300+ MIBs were just installed into /var/lib/mibs/ - this is awesome. Lets run that command again:

$ snmpwalk -v 2c -c public myswitch.mydomain
iso. = STRING: "Cisco NX-OS(tm) n5000, Software (n5000-uk9), Version 6.0(2)N2(3), RELEASE SOFTWARE Copyright (c) 2002-2012 by Cisco Systems, Inc. Device Manager Version 6.2(1), Compiled 12/17/2013 2:00:00"
iso. = OID: iso.
iso. = Timeticks: (575577418) 66 days, 14:49:34.18
iso. = STRING: "me@here"
iso. = STRING: "myswitch"
iso. = STRING: "snmplocation"
iso. = INTEGER: 70
iso. = Timeticks: (4294966977) 497 days, 2:27:49.77
iso. = OID: iso.
iso. = OID: iso.
iso. = OID: iso.

That's strange. As it turns out, though, Ubuntu has yet another trick up its sleeve to screw with you. Check out /etc/snmp/snmp.conf:

msimmons@nagios:/var/log$ cat /etc/snmp/snmp.conf
# As the snmp packages come without MIB files due to license reasons, loading
# of MIBs is disabled by default. If you added the MIBs you can reenable
# loaging them by commenting out the following line.
mibs :

This file's entire purpose in life is to stop you from having MIBs out of the box.

Obviously, you can comment out that line and then things work:

$ snmpwalk -v 2c -c public myswitch.mydomain
SNMPv2-MIB::sysDescr.0 = STRING: Cisco NX-OS(tm) n5000, Software (n5000-uk9), Version 6.0(2)N2(3), RELEASE SOFTWARE Copyright (c) 2002-2012 by Cisco Systems, Inc. Device Manager Version 6.2(1), Compiled 12/17/2013 2:00:00
SNMPv2-MIB::sysObjectID.0 = OID: SNMPv2-SMI::enterprises.
DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (575602144) 66 days, 14:53:41.44
SNMPv2-MIB::sysContact.0 = STRING: me@here
SNMPv2-MIB::sysName.0 = STRING: myswitch
SNMPv2-MIB::sysLocation.0 = STRING: snmplocation
SNMPv2-MIB::sysServices.0 = INTEGER: 70
SNMPv2-MIB::sysORLastChange.0 = Timeticks: (4294966977) 497 days, 2:27:49.77
SNMPv2-MIB::sysORID.1 = OID: SNMPv2-MIB::snmpMIB
SNMPv2-MIB::sysORID.3 = OID: SNMP-FRAMEWORK-MIB::snmpFrameworkMIBCompliance
SNMPv2-MIB::sysORID.4 = OID: SNMP-MPD-MIB::snmpMPDCompliance
SNMPv2-MIB::sysORDescr.1 = STRING: The MIB module for SNMPv2 entities
SNMPv2-MIB::sysORDescr.2 = STRING: View-based Access Control Model for SNMP.
SNMPv2-MIB::sysORDescr.3 = STRING: The SNMP Management Architecture MIB.
SNMPv2-MIB::sysORDescr.4 = STRING: The MIB for Message Processing and Dispatching.
SNMPv2-MIB::sysORDescr.5 = STRING: The management information definitions for the SNMP User-based Security Model.
SNMPv2-MIB::sysORUpTime.1 = Timeticks: (4294966977) 497 days, 2:27:49.77
SNMPv2-MIB::sysORUpTime.2 = Timeticks: (4294966977) 497 days, 2:27:49.77

But...if you're not actually running an snmp server, and you just want to use snmp for querying...if you get rid of that file entirely, it ALSO fixes the problem.

Anyway, just another annoying thing I've found that I thought I'd share.

Adam Moskowitz's video: The Future of System Administration

Date August 6, 2014

My friend Adam Moskowitz presented a topic at LOPSA-East this past year that is one near and dear to my heart - the Future of System Administration.

I've written a blog entry with something very close to that title twice:

A while back, I started to realize something...and I haven't written about this before, but I'm firmly coming to see that a system administrator is not just something someone is, system administration is something someone does. Regardless of whether someone's title is SysAdmin, IT Operations, Systems Engineering, or Sparkly DevOps Prince(ess), you might not be a system administrator, but you are doing systems administration.

And in the future, a lot of the people who are going to be doing systems administration are developers.

<voice="Levar Burton">But you don't have to take my word for it...</voice>

The Future of System Administration from Adam Moskowitz on Vimeo.

DRACs and Macs and Java Hacks

Date August 5, 2014

I'm inherently lazy. Like, "I want a bottle of water, but it's all the way over there, so I guess I'll just dehydrate" style lazy.

Our server room is maybe, MAYBE, 50 feet from my desk. But I've got to go around a wall and unlock two doors, disable an alarm system, fight a dragon, and then find the machine I want to administer. Ain't nobody got time for that.

To make sure that I can maintain my standards of laziness, we have remote management available for all of our servers. Since we're a Dell shop, that consists of DRAC , the Dell Remote Access Controller (with the Enterprise license for remote console).

In the past, these have been of questionable quality and perhaps I have, on occasion, called into question the marital status of the DRAC's mother, but we seem to have settled into an uneasy truce where I try to do things, and it continually tells me no until I ask nicely enough.

The first case in point is the DRAC's remote console and the security surrounding it. If you ever take a look at the jnlp file, you'll see that the application pulls down a platform-specific jar file that is the actual KVM application.

When you run jarsigner -verbose -certs -verify on the jar file, here's what you get:

$ jarsigner -verbose -certs -verify ./avctVMLinux32.jar

135 Thu Sep 08 13:44:00 EDT 2011 META-INF/MANIFEST.MF
259 Thu Sep 08 13:44:00 EDT 2011 META-INF/DELL.SF
5406 Thu Sep 08 13:44:00 EDT 2011 META-INF/DELL.RSA
0 Thu Sep 08 13:42:48 EDT 2011 META-INF/
sm 160904 Thu Dec 11 19:54:40 EST 2008 libavmlinux.so

[entry was signed on 9/8/11 9:44 AM]
X.509, CN=Dell Inc., OU=PG Software Development, OU=Digital ID Class 3 - Java Object Signing, O=Dell Inc., L=Round Rock, ST=Texas, C=US
[certificate expired on 8/30/13 7:59 PM]

X.509, CN=VeriSign Class 3 Code Signing 2009-2 CA, OU=Terms of use at https://www.verisign.com/rpa (c)09, OU=VeriSign Trust Network, O="VeriSign, Inc.", C=US
[certificate is valid from 5/20/09 8:00 PM to 5/20/19 7:59 PM]
[KeyUsage extension does not support code signing]
X.509, OU=Class 3 Public Primary Certification Authority, O="VeriSign, Inc.", C=US
[certificate is valid from 1/28/96 7:00 PM to 8/1/28 7:59 PM]

s = signature was verified
m = entry is listed in manifest
k = at least one certificate was found in keystore
i = at least one certificate was found in identity scope

jar verified.

This jar contains entries whose signer certificate has expired.

Emphasis mine.

As you can see via the bolded line, the certificate used to sign this jar file expired in August of 2013. This creates an awkward situation, because Java 1.7 got all gung-ho about security and was like, "sorry dude, no can do. If you want to run this, you've got to turn your security into swiss cheese. Via con Dios". That might be a loose quote.

Of course, that's assuming you can actually run the thing. If you're on a Mac, your computer is conspiring against you. If you run Chrome, for reasons that have never been apparent to me, the jnlp file doesn't actually get downloaded as viewer.jnlp. It gets downloaded as viewer.jnlp(servername@0@idrac-6C61TS1,+PowerEdge+R410,+User-USERNAME@RANDOMNUMBERS).YOURTLDHERE@0@idrac-6C61TS1,+PowerEdge+R410,+User-USERNAME@MORERANDOMNUMBERS), and at least on a Mac, the OS is like, "Sorry, I've got no idea. Would you like to open this with like, text edit, or maybe XMLedit? Because I can't do anything with this".

So once you rename it to .jnlp (complete with scary "changing the file extension may lead to wormholes destroying the universe" warnings), then you double click the file and your computer goes full-on HAL9000. "I'm sorry, Matt, I'm afraid I can't let you do that".


The workaround is, of course, to right click the file then select Open. This lets the computer know that you're totally serial about opening the malware.

Once you do that (then click Open again), Java actually pulls down the jar file and runs it. If you haven't replaced the self-signed DRAC certificate (you lout), then you get an SSL warning when it pulls the jar file, of course. But then you have the jar file, and it runs it, and immediately pukes because of the aforementioned signing issue.

The way to trick it into running is to go into your System Preferences, launch the Java Control Panel, go to Security, say "I solomnly swear that I am up to no good", set your security slider to "High" or if you live dangerously, Medium. I'd say "Low", but much like the pizza place by my house, there is no "Small", only "Medium, "Large", and "Extra Large". You know what? That makes Medium the Small, you morons! Gah!

Anyway, where were we? Right, change your security level to High or Medium, then in the Exception Site List section, click the Edit Site List, then add the https:// url to your DRAC. Yes, all of them. No, you can't use wildcards. Awesome, right?

Anyway, add your DRACs to the list (or just edit the list once and distribute it) and then you'll be able to connect.

Of course...if you're trying to manage ESXi machines like I am, and you're logging in to the console to, say, reset the management agents, you're going to run into other problems...like the fact that VMware's ESXi console makes extensive use of the F11 key, something Macs have claimed as their own, as pressing F11 launches Expose. So basically, by default, you can't possibly press F11 in the virtual terminal because it makes all of your windows scatter like cockroaches.

You can also fix that by going into the control panel, clicking on Mission Control, and change the "Show Desktop" shortcut to something that your enterprise consoles haven't used for critical functionality.

So there. If you use a Mac to manage things, hopefully you can at least remove those particular hurdles. What other things do you run into with Macs (or just stupid platforms in general)? Comment and let me know!

Xenophobia Revisited

Date July 30, 2014

Way back in 2010, I wrote an article called Xenophobia and Elitism in the Community. The article talked mostly about xenophobia and elitism in the community through a technical lens - about the tendencies that we all sometimes get toward mocking some technology or technique, and treating people who practice using it as less intelligent than ourselves, or less worthy than ourselves. It's not a bad read. You might want to check it out.

A recent post on Reddit, titled I've gone off the deep end, brought that same feeling of revulsion back to me in full force. It's rare to see a display of open racism so blatant - regardless of what the original poster calls it. Here's the text, and you should be forewarned - this will probably upset you if you're easily upset. It pissed me off, for what that's worth:

Did any of you see the video on the front page of reddit a few days ago of a saudi guy beating the shit out of an Indian with a belt? It was pretty horrible.

However it made me laugh, and for that I'm ashamed. Working in IT I've developed a deep hatred for Indians, and I'm not even a bad or racist person. I'm just so sick of them as a group thinking they know so much more than everyone else when they often know nothing, and that damn accent, and the weird sense of superiority.

My initial thought was "I wish I could do that to the useless indian IT guys at my company" when I saw that saudi kid swinging the belt and beating him senseless.
I never used to be like that.

"I'm not even a bad or racist person"

I don't even really know how to properly respond to that, when it's immediately followed by "I'm just so sick of them as a group thinking they know so much more than everyone else when they often know nothing, and that damn accent, and the weird sense of superiority".

I'm not going to say, "this should make you mad", but if it doesn't, you might want to think about why it doesn't, and maybe reconsider some things.

What is just as depressing to me is the sheer number of comments of supporters in that thread. People who take the side of the person who wrote that he wishes he could take a belt and beat people in his company senseless.

Look, people can be pretty horrible. People can be monsters, and people can be awful to their fellow people, but I can't sit by and not comment on this blatant ...racist...xenophobic....asshole behavior. These aren't the words of someone who is frustrated. These are the words of someone who really needs to step back and understand that he is everything he claims to not be. He's racist and I don't know whether he's a bad person, but he definitely has some anger issues he should deal with.

Racism is real, and it's more than just "That person is Indian so I'm biased", it's deeply cultural, and it might be deeply biological, but regardless of where the xenophobic, racist distrust originates, we need to see it and we need to understand that it is a bias, and to compensate for it.

The temple of Apollo at Delphi, in ancient Greece, had a stone carved with the words, "γνῶθι σεαυτόν". We're more familiar with the Latin translation, "temet nosce". It means "know thyself", or literally, "get to know yourself", and that's the advice I would give to everyone, because it's something that I struggle with, too, but I work toward because it's important. It might be the most important thing that any of us can do.

We each see the world through a series of lenses handed to us by our parents, our teachers, our religious leaders, and others who we encounter through life. What we end up with is a very customized version of the world that is unique to each individual, and your experience on this Earth are very different than mine, so we see different things. If we're ever to truly understand each other, and work together effectively, I need to understand that my vision is impacted by my experiences, and I need to account for that, and you need to do the same. I can't do that unless I frankly consider how I look at things, and why, and neither can you.

If you talk with another admin or a support person, and you hear the words, "Do the needful", or "please advise", and you feel something negative, ask yourself why that is. Is a negative reaction helping your predicament? Are you better off for automatically having that kind of response? If not, work toward correcting for the response, however you can make that happen. Maybe you need to identify where in your life you got that reaction and isolate what makes you respond that way, so that you can separate that from your current experience. Or maybe you should just use the phrase occasionally yourself. Taken literally, there's nothing wrong with the phrase, assuming both sides of the conversation know what needs to be done.

In the end, I would just encourage you to 'temet nosce', and to work toward a better version of you in the future. I'm trying, and I know that it's an uphill battle, but I think it's worth fighting for. And hopefully you do, too.

Kerbal Space System Administration

Date July 28, 2014

I came to an interesting cross-pollination of ideas yesterday while talking to my wife about what I'd been doing lately, and I thought you might find it interesting too.

I've been spending some time lately playing video games. In particular, I'm especially fond of Kerbal Space Program, a space simulation game where you play the role of spaceflight director of Kerbin, a planet populated by small, green, mostly dumb (but courageous) people known as Kerbals.

Initially the game was a pure sandbox, as in, "You're in a planetary system. Here are some parts. Go knock yourself out", but recent additions to the game include a career mode in which you explore the star system and collect "science" points for doing sensor scans, taking surface samples, and so on. It adds a nice "reason" to go do things, and I've been working on building out more efficient ways to collect science and get it back to Kerbin.

Part of the problem is that when you use your sensors, whether they detect gravity, temperature, or materials science, you often lose a large percentage of the data when you transmit it back, rather than deliver it in ships - and delivering things in ships is expensive.

There is an advanced science lab called the MPL-LG-2 which allows greater fidelity in transmitted data, so my recent work in the game has been to build science ships which consist of a "mothership" with a lab, and a smaller lightweight lander craft which can go around whatever body I'm orbiting and collect data to bring to the mothership. It's working pretty well.

At the same time, I'm working on building out a collectd infrastructure that can talk to my graphite installation. It's not as easy as I'd like because we're standardized on Ubuntu Precise, which only has collectd 4.x, and the write_graphite plugin began with collectd 5.1.

To give you background, collectd is a daemon that runs and collects information, usually from the local machine, but there are an array of plugins to collect data from any number of local or remote sources. You configure collectd to collect data, and you use a write_* plugin to get that data to somewhere that can do something with it.

It was in the middle of explaining these two things - KSP's science missions and collectd - that I saw the amusing parity between them. In essence, I'm deploying science ships around my infrastructure to make it easier to get science back to my central repository so that I advance my own technology. I really like how analogous they are.

I talked about doing the collectd work on twitter, and Martijn Heemels expressed interest in what I was doing, since he would also like write_graphite on Precise, so I figured that other people probably might want to get in on the action, so to speak. I could give you the package I made, or I could show you how I made it. That sounds more fun.

Like all good things, this project involves software from Jordan Sissel - namely fpm, effing package management. Ever had to make packages and deal with spec files, control files, or esoteric rulesets that made you go into therapy? Not anymore!

So first we need to install it, which is easy, because it's a gem:

$ sudo gem install fpm

Now, lets make a place to stage files before they get packaged:

$ mkdir ~/collectd-package

And grab the source tarball and untar it:

$ wget https://collectd.org/files/collectd-5.4.1.tar.gz
$ tar zxvf collectd-5.4.1.tar.gz
$ cd collectd-5.4.1/

(if you're reading this, make sure to go to collectd.org and get the new one, not the version I have listed here.)

Configure the Makefile, just like you did when you were a kid:

$ ./configure --enable-debug --enable-static --with-perl-bindings="PREFIX=/usr"

Hat tip to Mike Julian who let me know that you can't actually enable debugging in the collectd tool unless you actually use the flag here, so save yourself some heartbreak by turning that on. Also, if I'm going to be shipping this around, I want to make sure that it's compiled statically, and for whatever reason, I found that the perl bindings were sad unless I added that flag.

Now we compile:

$ make

Now we "install":

make DESTDIR="/home/YOURUSERNAME/collectd-package" install

I've found that the install script is very grumpy about relative directory names, so I appeased it by giving it the full path to where the things would be dropped (the directory we created earlier)

We're going to be using a slightly customized init script. I took this from the version that comes with the precise 4.x collectd installation and added a prefix variable that can be changed. We didn't change the installation directories above, so by default, everything is going to eventually wind up in /opt/collectd/ and the init script needs to know about that:

$ cd ~
$ mkdir -p collectd-package/etc/init.d/
$ wget --no-check-certificate -O collectd-package/etc/init.d/collectd http://bit.ly/1mUaB7G
$ chmod +x collectd-package/etc/init.d/collectd

This is pulling in the file from this gist.

Now, we're finally ready to create the package:

fakeroot fpm -t deb -C collectd-package/ --name collectd \
--version 5.4.1 --iteration 1 --depends libltdl7 -s dir opt/ usr/ etc/

Since you may not be familiar with fpm, some of the options are obvious, but for the ones that aren't, -C changes directory to the given argument, --version is the version of the software, as opposed to --iteration is the version of the package. If you package this, deploy it, then find a bug in the packaging, when you package it again after fixing the problem, you increment the iteration flag, and your package management can treat it as an upgrade. The --depends is a library that collectd needs on the end systems. -s sets the source type to "directory", and then we give it a list of directories to include (remembering that we've changed directories with the -C flag).

Also, this was my first foray into the world of fakeroot, which you should probably read about if you run Debian-based systems.

At this point, in the current directory, there should be "collectd_5.4.1-1.deb", a package file that works for installing using 'dpkg -i' or in a PPA or in a repo, if you have one of those.

Once collectd is installed, you'll probably want to configure it to talk to your graphite host. Just edit the config in /opt/collectd/etc/collectd.conf. Make sure to uncomment the write_graphite plugin line, and change the write_graphite section. Here's mine:

    Port "2003"
    Protocol "tcp"
    LogSendErrors true
    # remember the trailing period in prefix
    #    otherwise you get CCIS.systems.linuxTHISHOSTNAME
    #    You'll probably want to change it anyway, because 
    #    this one is mine. ;-) 
    Prefix "CCIS.systems.linux."
    StoreRates true
    AlwaysAppendDS false
    EscapeCharacter "_"

Anyway, hopefully this helped you in some way. Building a puppet module is left as an exercise to the reader. I think I could do a simplistic one in about 5 minutes, but as soon as you want to intelligently decide which modules to enable and configure, then it gets significantly harder. Hey, knock yourself out! (and let me know if you come up with anything cool!)

Happy SysAdmin Appreciation Day 2014!

Date July 25, 2014

Well, well, well, look what Friday it is!

In 1999, SysAdmin Ted Kekatos decided that, like administrative professionals, teachers, and cabbage, system administrators needed a day to recognize them, to appreciate what they do, their culture, and their impact to the business, and so he created SysAdmin Appreciation Day. And we all reap the benefit of that! Or at least, we can treat ourselves to something nice and have a really great excuse!

Speaking of, there are several things I know of going on around the world today:

As always, there are a lot of videos released on the topic. Some of the best I've seen are here:

  • A karaoke power-ballad from Spiceworks:
  • A very funny piece on users copping to some bad behavior from Sophos:
  • A heartfelt thanks from ManageEngine:
  • Imagine what life would be like without SysAdmins, by SysAid:

Also, I feel like I should mention that one of my party sponsors, Goverlan, is also giving away their software today to the first thousand people who sign up for it. It's a $900 value, so if you do Active Directory management, you should probably check that out.

Sophos was giving away socks, but it was so popular that they ran out. Not before sending me the whole set, though!

I don't even remember signing up for that, but apparently I did, because they came to my work address. Awesome!

I'm sure there are other things going on, too. Why not comment below if you know of one?

All in all, have a great day and try to find some people to get together with and hang out. Relax, and take it easy. You deserve it.