Reminder (to self, too): Use Python virtualenv!

Date April 20, 2015

I’m really not much of a programmer, but I dabble at times, in order to make tools for myself and my colleagues. Or toys, like the time I wrote an entire MBTA library because I wanted to build a Slack integration for the local train service.

One of the things that I want to learn better, because it seems so gosh-darned helpful, is Python. I’m completely fluent (though non-expert level) in both Bash and PHP, so I’m decent at writing systems scripts and web back-ends, but I’m only passingly familiar with Perl. The way I see it, the two “modern” languages that get the most use in systems scripts are Python and Ruby, and it’s basically a toss-up for me as to which to pick.

Python seems a little more pervasive, although Ruby has the benefit of basically being the language of our systems stack. Puppet, Foreman, logstash, and several other tools are all written in Ruby, and there’s a lot to be said for being fluent in the language of your tools. That being said, I’m going to learn Python because it seems easier, and honestly, flying sounds so cool.


One of the things that a lot of intro-to-Python tutorials don’t give you is the concept of virtual environments. These are actually pretty important in a lot of ways. You don’t absolutely need them, but you’re going to make your life a lot better if you use them. There’s a really great bit on why you should use them on the Python Guide, but basically, they create an entire custom Python environment for your code, segregated away from the rest of the OS. You can use a specific version of Python, a specific set of modules, and so on (with no need for root access, since everything is installed locally).

Installing virtualenv is pretty easy. You may be able to install it with your system’s package manager, or you may need to use pip. Or you could use easy_install. Python, all by itself, has several package managers. Because of course it does.
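
Depending on your system, any one of these ought to do it (the package names here are the common ones I’ve seen; yours may differ):

sudo apt-get install python-virtualenv   # Debian/Ubuntu package
sudo yum install python-virtualenv       # RHEL/CentOS, via EPEL
sudo pip install virtualenv              # straight from PyPI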

Setting up a virtual environment is straightforward, if a little kludgy-feeling. If you find that you’re going to be moving it around, maybe from machine to machine or whatever, you should probably know about the --relocatable flag.
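
If you’re curious, you point that flag at an already-built environment, and it rewrites the internal paths to be relative (“env” here is just whatever you named your environment directory):

virtualenv --relocatable env

Fair warning: it has a reputation for being finicky, so test the result before you depend on it.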

By default, the workflow is basically: create a virtual environment, “activate” it (which mangles lots of environment variables and paths so that Python-specific stuff runs local to the environment rather than across the entire server), configure it by installing the modules you need, write/execute your code as normal, and then deactivate the environment when you’re done, which restores all of your original environment settings.

There is also a piece of software called virtualenvwrapper that is supposed to make all of this easier. I haven’t used it, but it looks interesting. If you find yourself really offended by the aforementioned workflow, give it a shot and let me know what you think.

Also as a reminder, make sure to put your virtual environment directory in your .gitignore file, because you’re definitely using version control, right? (Right?) Right.

Here’s how I use virtual environments in my workflow:


msimmons@bullpup:~/tmp > mkdir mycode
msimmons@bullpup:~/tmp > cd mycode
msimmons@bullpup:~/tmp/mycode > git init
Initialized empty Git repository in /home/msimmons/tmp/mycode/.git/
msimmons@bullpup:~/tmp/mycode > virtualenv env
New python executable in env/bin/python
Installing setuptools, pip...done.
msimmons@bullpup:~/tmp/mycode > echo "env" > .gitignore
msimmons@bullpup:~/tmp/mycode > git add .gitignore # I always forget this!
msimmons@bullpup:~/tmp/mycode > source env/bin/activate
(env)msimmons@bullpup:~/tmp/mycode >
(env)msimmons@bullpup:~/tmp/mycode > which python
/home/msimmons/tmp/mycode/env/bin/python
(env)msimmons@bullpup:~/tmp/mycode > deactivate
msimmons@bullpup:~/tmp/mycode > which python
/usr/bin/python
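
One thing that session doesn’t show is actually installing modules. While the environment is active, pip installs into env/ rather than system-wide, no root required. A quick sketch (requests is just an arbitrary example module):

pip install requests            # lands under env/lib/, not /usr/lib
pip freeze > requirements.txt   # record the module list in the repo

Anyone who clones the repo can then build their own virtualenv and run pip install -r requirements.txt to get the same module set.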

Spinning up a quick cloud instance with Digital Ocean

Date April 15, 2015

This is another in a short series of blog posts that will be brought together like Voltron to make something even cooler, but it’s useful on its own. 

I’ve written about using a couple other cloud providers before, like AWS and the HP cloud, but I haven’t actually mentioned Digital Ocean yet, which is strange, because they’ve been my go-to cloud provider for the past year or so. As you can see on their technology page, all of their instances are SSD backed, they’re virtualized with KVM, they’ve got IPv6 support, and there’s an API for when you need to automate instance creation.

To be honest, I’m not automating any of it. What I use it for is one-off tests. Spinning up a new “droplet” takes less than a minute, and unlike AWS, where there are a ton of choices, I click about three buttons and get a usable machine for whatever I’m doing.
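
That said, if you ever do want to script it, creating a droplet is a single authenticated POST to their v2 API. Something like this sketch should be in the ballpark (the token variable and the name/region/size/image values are placeholders; check their API docs for the current slugs):

curl -X POST "https://api.digitalocean.com/v2/droplets" \
  -H "Authorization: Bearer $DO_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"name":"test-droplet","region":"nyc3","size":"512mb","image":"ubuntu-14-04-x64"}'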

To get the most out of it, the first step is to generate an SSH key, if you don’t have one already. If you don’t set up key-based authentication, you’ll get the root password for your instance in your email, but ain’t nobody got time for that, so create the key using ssh-keygen (or if you’re on Windows, I conveniently covered setting up key-based authentication using Pageant the other day – it’s almost like I’d planned this out).
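
On Linux or a Mac, the whole thing is one command (the comment string is just a label, so use whatever identifies you):

ssh-keygen -t rsa -b 4096 -C "msimmons@bullpup"
# accept the default path (~/.ssh/id_rsa) and pick a decent passphrase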

Next, sign up for Digital Ocean. You can do this at DigitalOcean.com, or you can get $10 for free by using my referral link (and I’ll get $25 in credit eventually). Once you’re logged in, you can create a droplet by clicking the big friendly button:

This takes you to a relatively limited number of options – but limited in this case isn’t bad. It means you can spin up what you want without fussing about most of the details. You’ll be asked for your droplet’s hostname (which is used both to refer to the instance in the Digital Ocean interface and as the actual hostname of the created machine), and you’ll need to specify the size of the machine you want (at the current moment, here are the prices):

The $10/mo option is conveniently highlighted, but honestly, most of my test stuff runs perfectly fine on the $5/mo size, and since it rarely runs for more than an hour, 7/1000 of a dollar seems like a good deal to me. Even if you screw up and forget about it, it’s $5/mo. Just don’t set up a 64GB monster and leave that bad boy running.

Next there are several regions. For me, New York 3 is automatically selected, but I can override that default choice if I want. I just leave it, because I don’t care. You might care, especially if you’re going to be providing a service to someone in Europe or Asia.

The next options are for settings like Private Networking, IPv6, backups, and user data. Keep in mind that backups cost money (duh?), so don’t enable that feature for anything you don’t want to spend 20% of your monthly fee on.

The next option is honestly why I love Digital Ocean so much. The image selection is so painless and easy that it puts AWS to shame. Here:

You can see that the choice defaults to Ubuntu current stable, but look at the other choices! Plus, see that Applications tab? Check this out:

I literally have a GitLab install running permanently in Digital Ocean, and the sum total of my effort was 5 seconds of clicking that button, and $10/mo (it requires a gig of RAM to run the software stack). So easy.

It doesn’t matter what you pick for spinning up a test instance, so you can go with the Ubuntu default or pick CentOS, or whatever you’d like. Below that selection, you’ll see the option for adding SSH keys. By default, you won’t have any listed, but there’s a link to add a key, which pops open a text box where you can paste your public key text. The key(s) that you select will be added to the root user’s ~/.ssh/authorized_keys file, so that you can connect without knowing the password. The machine can then be configured however you want. (Alternatively, when selecting which image to spin up, you can pick a previously-saved snapshot, backup, or old droplet that you’ve pre-configured to do what you need.)

Click Create Droplet, and around a minute later, you’ll have a new entry in your droplet list that gives you the public IP to connect to. If you spun up a vanilla OS, SSH into it as the root user with one of the keys you specified, and if you selected one of the apps from the menu, try connecting to it over HTTP or HTTPS.
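
For a vanilla OS, that first connection looks something like this (the IP is a made-up example; use the one from your droplet list):

ssh root@203.0.113.10

No password prompt, because your key was baked in at creation time.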

That’s really about it. In an upcoming entry, we’ll be playing with a Digital Ocean droplet to do some cool stuff, but I wanted to get this out here so that you could start playing with it, if you don’t already. Make sure to remember, though: whenever you’re done with your machine, you need to destroy it, rather than just shut it down. Shutting it down makes it unavailable but keeps the data around, and that means you’ll keep getting billed for it. Destroying it erases the data and removes the instance, which is what actually stops the billing.

Have fun, and let me know if you have any questions!

Dealing with key-based authentication on Windows with Putty

Date April 10, 2015

I’m writing this entry because I’m going to be writing another entry soon, and I want to point to this rather than explain it in situ. 

Here lately, I’ve been using Windows on my desktop. At work, this is mostly because of the extensive administration I do with VMware, and there’s STILL no native way on Linux or Mac to do things like Update Manager, and at home, because I play lots of video games. Lots. Of. Games.

The end result is that I spend a lot of time using Putty. There are a lot of Windows-specific SSH clients, but I like Putty’s great combination of being tiny, running without any actual installation, and packing a reasonably dense feature set. If you’re on Windows and you need to deal with Linux hosts, you’re probably already using Putty, but maybe not as completely as you could be.

There is a small ecosystem of applications that work with Putty, including SFTP clients and an SSH client that runs in the Windows command prompt (plink). They’re all available on the same Putty download page. The biggest win, in my opinion, is to combine it with Pageant. Much like ssh-agent on Linux, Pageant manages your SSH keys, allowing you to log into remote hosts without typing passwords, and only typing your key’s passphrase once.
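
As a bonus, plink picks up keys from a running Pageant automatically, which means you can script against your Linux boxes from the Windows command prompt. A sketch, with a placeholder hostname:

plink msimmons@server.example.com uptime

That runs the command on the remote host and prints the output, with no password typed.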

The first step with key-based authentication is to actually generate some keys. For Pageant, the easiest way is probably to use PuttyGen, which looks like this:

Click “Generate” and move the mouse around as the directions say:

This produces your actual key:


You want to type in a “Key passphrase” that is a long-ish phrase you can remember well enough to re-type occasionally. Once you’ve done that, click “Save public key”, make a keys directory, and save it in there, then do the same with “Save private key”. You should take care that people don’t get the private key, but your passphrase should be long enough that it’s unlikely anyone could brute-force your key before you change it or lose it (or, maybe, if you like typing, until the heat death of the universe).

Copy the text at the top and save it into Notepad so we’ll have it after this window closes. We could get it again by re-running the key generator, but if you’re like me, you didn’t install it, you just kind of ran it from your downloads folder, and you’d probably have to download it again to run it again, so just keep the text in Notepad for now.

Alright, so now you want to download Pageant, and this time, you want to save it somewhere useful. I have a “Programs” directory that I made under C:\Users\msimmons\ that holds stuff like this, so I saved it there. Once it was there, I right-clicked and chose “Create Shortcut”, which I then dragged into C:\Users\msimmons\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup – this makes sure that Pageant will start when I log in. By default, that won’t actually load my key, though, so we have to edit the properties on the shortcut and add the key as an argument to the executable:
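
The end result is a shortcut Target that looks something like this (the paths are from my setup, and the key filename is whatever you saved out of PuttyGen):

"C:\Users\msimmons\Programs\pageant.exe" C:\Users\msimmons\keys\mykey.ppk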


Now, when you log in, you’ll be prompted to type the passphrase to your private key, which will allow you to put that public key into the authorized_keys of a target host and authenticate as that user without typing a password every time! Excellent!
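
Getting the public key onto a target host is just a matter of appending the text you stashed in Notepad (the key text below is abbreviated, obviously):

mkdir -p ~/.ssh && chmod 700 ~/.ssh
echo "ssh-rsa AAAAB3...snip... msimmons@bullpup" >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys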

Big Changes at USENIX LISA in the last 5-10 Years

Date April 8, 2015

We received an interesting email recently:

> Did the submissions process for LISA change
> in recent years? I recall going to submit a talk a couple years ago
> and being really put off by the requirements for talks to be
> accompanied by a long paper, and be completely original and not
> previously presented elsewhere. Now it seems more in line with other
> industry conferences.

Yes, LISA is very different from what it was years ago. If you haven’t attended LISA in a while, you may not realize how different it is!

The conference used to be focused on papers, with a few select “invited talks”. A few years ago, the conference shifted its focus to great talks. LISA still accepts “original research” papers, but they’re just one track in a much larger conference and have a separate review process. In fact, the conference now publishes both a Call for Participation and a separate Call for Research Papers and Posters.

If LISA is now “talk-centric”, what kind of talks does it look for? Quoting from the Call for Participation, “We invite industry leaders to propose topics that demonstrate the present and future state of IT operations. [Talks should] inspire and motivate attendees to take actions that will positively impact their business operations.” LISA looks for a diverse mix of speakers: not just gender diversity, but newcomers and experienced speakers alike. We have special help for first-time speakers, including assistance with rehearsals and other forms of mentoring.

What about the papers that LISA does publish? The papers have different criteria than talks. They should “describe new techniques, tools, theories, and inventions, and present case histories that extend our understanding of system and network administration.” Starting in 2014, the papers have been evaluated by a separate sub-committee of people with academic and research backgrounds. This has had an interesting side effect: the overall quality of the papers has improved, and they’ve become more research-oriented and forward-looking.

Because LISA mixes industry talks and research papers, attendees get to hear about new ideas long before they become mainstream. Researchers benefit from the opportunity to network and get feedback from actual practitioners of system administration. This gives LISA a special something you don’t find anywhere else.

Another thing that makes LISA better is its “open access” policy: posters, papers, and presentations are available online at no charge. This gives your work wider visibility, opening up the potential to have a greater impact on our industry. Not all conferences do this; not even all non-profit conferences do.

Does that make you more interested in submitting a proposal?

We hope it does!

All proposal submissions are due by April 17, 2015.

Tom Limoncelli and Matt Simmons
(volunteer content-recruiters for LISA ‘15)

P.S. LISA has a new mission statement:
LISA is the premier conference for IT operations, where systems engineers, operations professionals, and academic researchers share real-world knowledge about designing, building, and maintaining the critical systems of our interconnected world.

Connecting Apache Directory Studio to Active Directory

Date March 10, 2015

This is more of a reminder for me than anything, but you might find it useful as well. You may be aware that querying LDAP using the command-line tools in Linux is a PITA. Fortunately, the Apache Directory Project has released Apache Directory Studio (this isn’t new software, I’ve just never written about it) to help deal with LDAP.
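
For comparison, here’s roughly what a simple lookup costs you at the command line (the server, bind account, and base DN are placeholders for whatever your domain uses):

ldapsearch -x -H ldaps://dc.example.com -D "binduser@example.com" -W \
    -b "dc=example,dc=com" "(sAMAccountName=msimmons)"

Doable, but not something I want to re-derive from the man page every few months.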

I’ve had our production LDAP cluster in ADS for a while and used it to take a look around when necessary (usually because I always forget how exactly to set up a DRAC to bind to LDAP), but I realized today that I’d never configured it to look at our AD schema. I’m not “technically” the Windows guy, but I figured, hey, what’s the worst that could happen? Ahem. Anyway, nothing bad happened.

Because a couple of parts weren’t exactly straightforward, I figured I’d write it down and you and I could both get something out of it.

Step 1: Create a new LDAP Connection by clicking the yellow LDAP icon to the right of “LDAP Servers”

Step 2: Fill out the information in the box specific to your domain. These are the settings that worked for me:

Note that I enabled “Read-Only” because I’m really not into making schema changes from outside of AD, even if I knew what I was doing. Which I don’t.

Step 3: Fill out the authentication credentials

Note here that although there are several authentication methods (including GSSAPI (Kerberos)), I couldn’t get any to work except this. I don’t know why. If you can figure out how to get the connection to work with your existing Kerberos ticket, I’d be interested in knowing how to set that up.

You’ll be prompted to trust the certificate (or not), and at that point you should be able to browse the AD schema to your heart’s content.

Let me know if this worked for you.

Annoying pfSense Issue with 2.1.5 -> 2.2 Upgrade

Date March 3, 2015

I run several pfSense boxes throughout my network. Although the platform doesn’t have an API, and it can be a pain to configure manually in certain cases, it’s generally very reliable once running, and because it’s essentially a skinned FreeBSD, it’s very easy on resources. There’s also a really nice self-update feature that I use to get to the newest release when it’s available.

It’s that last feature that bit me in my butt Sunday night. I did the upgrade at midnight or so and went to bed once everything seemed to be working alright, but then this morning, I started getting reports that people couldn’t log into the captive portal that we use for our “guest” wired connections.

I thought, “That’s strange…everything seemed to work after the upgrade, but I’ll check it out”, and sure enough, as far as I could tell, all of the networks were working fine on that machine, but there was no one logged into the captive portal.

Taking a look at the logs, I found this error:

logportalauth[42471]: Zone: cpzone - Error during table cpzone creation.
Error message: file is encrypted or is not a database

Well, hrm. “Error during table cpzone creation” is strange, but “file is encrypted or is not a database” is even weirder. Doing a quick Google search, I came across this thread on the pfSense forums where someone else (maybe the only other person?) has encountered the same problem I have.

As it turns out, prior to version 2.2, pfSense was still using sqlite2, but now, it’s on sqlite3, and the database formats are incompatible. A mention of that in the upgrade notes would have been, you know, swell.

The thread on the forums suggests shutting off the captive portal service, removing the .db files, and then restarting the service. I tried that, and it didn’t work for me, so what I did after that was shut down the captive portal (to release any file locks), remove the db files, and then, from the text-mode administrative menu, force a re-installation of pfSense itself.
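
For reference, the file-removal part looked roughly like this from a shell on the box (the glob is mine, so eyeball what’s actually in /var/db before deleting anything):

ls /var/db/captiveportal*.db    # see what's there first
rm /var/db/captiveportal*.db    # clear out the old sqlite2 files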

Although I haven’t actually tested the captive portal yet (I’m at home doing this remotely, because #YOLO), a new database file has been created (/var/db/captiveportalcpzone.db) and inspecting it seems to show sqlite3 working:

[2.2-RELEASE][root@host]/var/db: sqlite3 captiveportalcpzone.db
SQLite version 3.8.7.2 2014-11-18 20:57:56
Enter ".help" for usage hints.
sqlite> .databases
seq  name             file
---  ---------------  ----------------------------------------------------------
0    main             /var/db/captiveportalcpzone.db
sqlite> .quit

This is as opposed to some of the older database files created prior to upgrade:

[2.2-RELEASE][root@host]/var/db/backup: sqlite3 captiveportalinterface.db
SQLite version 3.8.7.2 2014-11-18 20:57:56
Enter ".help" for usage hints.
sqlite> .databases
Error: file is encrypted or is not a database
sqlite>

What I don’t understand is that the normal way to convert from sqlite2 to sqlite3 is to dump and restore, but it doesn’t look like this process did that at all. It would be incredibly easy to do a database dump/restore during an upgrade, ESPECIALLY when revving major database versions like this.
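
For the record, the standard migration is a dump from the old client piped into the new one, assuming both CLI binaries are installed (sqlite is the version 2 client, sqlite3 the version 3 one; the filenames here just match my setup):

sqlite captiveportalcpzone.db .dump | sqlite3 captiveportalcpzone-new.db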

Anyway, this kind of experience is very unusual for me with pfSense. Normally it’s “set it and forget it”. Hopefully this will work and I can get back to complaining about a lack of API.