Debian Jessie Preseed – Yes, please

What could possibly draw me out of blogging dormancy? %!@^ing Debian, that’s what.

I’m working on getting a Packer job going so I can make an updated Vagrant box on a regular basis, but in order to do that, I need to get a preseed file arranged. Well, everything goes swimmingly using examples I’ve found on the internet (because the Debian-provided docs are awful) until this screen:

[Screenshot: the “Partition disks” dialog. “If you continue, the changes listed below will be written to the disks.” “Write the changes to disks?” Yes / No]

That’s weird. I’ve got the right answers in the preseed:

d-i partman-auto/method                 string lvm
d-i partman-auto-lvm/guided_size        string max
d-i partman-lvm/device_remove_lvm       boolean true
d-i partman-lvm/confirm                 boolean true
d-i partman-auto-lvm/confirm            boolean true
d-i partman-partitioning/confirm_write_new_label   boolean true
d-i partman-partitioning/confirm        boolean true
d-i partman-lvm/confirm_nooverwrite     boolean true

and I’ve even got this, as a backup:

d-i partman/confirm                     boolean true
d-i partman/confirm_nooverwrite         boolean true

The “partman-lvm” is what should be the answer, because of the “partman-auto/method” line. Well, it doesn’t work. That’s weird. So I start Googling. And nothing. And nothing. Hell, I can’t even figure out if preseed files are order-specific. The last time someone apparently asked that question on the internet was 2008.

Anyway. My coworker found a link to a Debian 6 question on ServerFault and suggested that I try something crazy: set the preseed answer for the partman RAID setting:

d-i partman-md/confirm_nooverwrite      boolean true

which makes no freaking sense. So, of course, it worked.

Why you have to do that to make the partman-lvm installer work, I have no idea. Please comment if you have any inkling. Or just want to rant against preseeds too. It’s 8:45am and I’m ready to go home.
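Since the installer gives no hint about which confirmation it is still waiting on, a small pre-flight check can at least catch a key that is missing or mistyped before a Packer run. Here is a minimal sketch in Python, assuming the usual four-column preseed line (owner, key, type, value). The function names and the list of required keys are illustrative, taken from the lines in this post, and are not part of any Debian tooling:

```python
import re

# Keys from this post. The last one is the RAID (partman-md) answer that
# unexpectedly let the LVM install proceed.
REQUIRED_KEYS = [
    "partman-lvm/confirm",
    "partman-lvm/confirm_nooverwrite",
    "partman/confirm",
    "partman/confirm_nooverwrite",
    "partman-md/confirm_nooverwrite",
]

# A preseed answer line has the form: <owner> <key> <type> <value>
LINE_RE = re.compile(r"^\s*(\S+)\s+(\S+)\s+(\S+)\s*(.*)$")


def parse_preseed(text):
    """Return {key: (type, value)} for every non-blank, non-comment line.

    Simplification: backslash-continued lines are not joined.
    """
    answers = {}
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match:
            _owner, key, kind, value = match.groups()
            answers[key] = (kind, value.strip())
    return answers


def missing_confirmations(text, required=REQUIRED_KEYS):
    """List required keys that are absent or not set to 'boolean true'."""
    answers = parse_preseed(text)
    return [key for key in required if answers.get(key) != ("boolean", "true")]
```

Calling missing_confirmations() on the contents of a preseed file returns the keys that still need an answer. An empty list means every confirmation listed above is present.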

How I approach a new python project

It’s been a long time since I’ve written a post, but that’s just because I haven’t had time. For real I haven’t had time, not like, “making excuses, I haven’t had time”. I stay busy, and I love it.

I’m in what passes for a lull starting today (and I’ve been sick to boot), so I answered a question on Reddit about Raspberry Pi development, and I got a PM from someone who read it, who was interested in learning how I approached a new project with Python. I wrote a long answer, so I figured I’d share it here. This totally counts as a blog entry ;-)

Hey there,

I saw your post on programming or rasp pi for someone learning python. My knowledge base is probably under theirs, however im still interested in learning python. Im actually in the process of making a magic mirror.

I was wondering where would be a good starting point to learn CLI and scraping data and putting together a gui. I would like to do as much as i personally could.



Here’s my answer:

Hi X!,

This is a really good guide:

But there are lots of ways to do it. Since Python is so easy to develop in, there are tons of modules out there. In general, after I figure out what I want to work on, I look at the existing libraries for dealing with it and use one of them if I can. If I can’t, I go back to first principles and look at the methods for interacting with the service I want to interface with.

For instance, let’s say that I live in Los Angeles (which I do), and I want to monitor earthquakes so that my magic mirror can tell me when I wake up whether anything happened while I was asleep. The first thing I’d do is Google “python earthquake library”, and what’s the first hit? A very in-depth guide about how to use Python to monitor for earthquakes, including maps and stuff way beyond what I’m looking for.

Since that’s a bit overwhelming and I’m not quite ready to get into matplotlib (which is the graphical stuff they’re using), let’s see what the next few links are:

So we have our choice. Since I really just care about quakes affecting the LA basin, I’ll look through the libraries and see which ones have the best interface to allow me to geographically select things, and then move on from there.
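If none of the libraries makes geographic selection easy, the "first principles" route is not much code either. Here is a minimal sketch using only the standard library. It assumes the public USGS GeoJSON summary feed as the data source (my assumption, not one of the search results above), and the radius, magnitude cutoff, and function names are made up for illustration:

```python
import json
import math
import urllib.request

# Assumption: the USGS "all earthquakes, past day" GeoJSON summary feed.
FEED_URL = "https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_day.geojson"
LOS_ANGELES = (34.05, -118.25)  # approximate (latitude, longitude)


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def quakes_near(feed, center=LOS_ANGELES, radius_km=150.0, min_mag=2.5):
    """Return (magnitude, place) for quakes within radius_km of center."""
    hits = []
    for feature in feed["features"]:
        # GeoJSON stores coordinates as longitude, latitude (then depth).
        lon, lat = feature["geometry"]["coordinates"][:2]
        mag = feature["properties"].get("mag")
        if mag is None or mag < min_mag:
            continue
        if haversine_km(center, (lat, lon)) <= radius_km:
            hits.append((mag, feature["properties"].get("place")))
    return hits


def fetch_feed(url=FEED_URL):
    """Download and decode the feed (needs network access)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)
```

On the mirror, something like quakes_near(fetch_feed()) run once in the morning would give the list to display.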

So that’s an existing library, but suppose we need to screen-scrape things? Like, what if you wanted to generate a number for the average rental prices in a zip code? Well, first search “python library rental prices”, and there’s nothing, so we’re going to have to do it the hard way.

There are almost certainly better ways, but just as an example, let’s take a look at (note: what we’re doing is probably against the terms of service, so you should probably not offer this as a public solution – just keep it for your personal use). That allows us to specify a zip code, a couple of quick filters, and when we search, it gives us most of that on the URL. Easy-peasy!

When I search for two bedroom apartments in the zipcode of my job, I get this link:

As a human, it’s easy for me to look at that URL and figure out what everything is, with the exception of the 1z141y8. I’m worried that it might be a unique ID, like a cookie almost, that makes sure that I was the one that loaded the previous page. When I look at the original page’s (very ugly) source, though, I see that it shows up here:

<meta class="pageInfo" content="2-beds-1z141y8" name="refinements">

On a hunch, I go to the results page again, and I change the zipcode to Columbus, Ohio, where I lived for a while, and I hit enter. The results for Columbus pop up, so I think we’re good to go.
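To check the same thing from code rather than by eye, the token can be pulled out of the page source. Here is a minimal sketch with the standard library’s html.parser, assuming the meta tag looks like the one quoted above (the class and function names are mine):

```python
from html.parser import HTMLParser


class RefinementsParser(HTMLParser):
    """Collect the content attribute of every <meta name="refinements"> tag."""

    def __init__(self):
        super().__init__()
        self.refinements = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and attributes.get("name") == "refinements":
            self.refinements.append(attributes.get("content", ""))


def extract_refinements(html):
    """Return the refinements strings found in an HTML document."""
    parser = RefinementsParser()
    parser.feed(html)
    return parser.refinements
```

If the string that comes back is the same for two different zip codes, that supports the hunch that it encodes the filters rather than the session.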

So, given the very first link I mentioned (remember?), I’d go through the web page results with the XML scraper and get the information I needed, then do whatever I needed with them – average, or whatever.
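The "average, or whatever" step could look like the sketch below. It assumes the scraper has already handed back the listing text and that prices appear in a form like "$1,250". That format and the function names are assumptions for illustration, since the real markup depends on the site:

```python
import re
from statistics import mean

# Assumed price format: a dollar sign followed by digits with optional commas.
PRICE_RE = re.compile(r"\$\s*(\d[\d,]*)")


def parse_prices(text):
    """Return every dollar amount in the text as an integer."""
    return [int(match.replace(",", "")) for match in PRICE_RE.findall(text)]


def average_rent(text):
    """Average of all prices found, or None if there were none."""
    prices = parse_prices(text)
    return mean(prices) if prices else None
```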

Does that make sense? Thanks!


So, generally, that’s how I approach the problem, regardless of the language. Sometimes, if I’m really into a project, I’ll actually write the library myself (like I did with libMBTA), but the world is gradually becoming one big API, and if you can program, you can take advantage of it!

I hope you enjoyed it. If so, let me know in the comments. Thanks!

Debian Jessie and Puppet

Please correct me if this blog entry is wrong, because I really, truthfully hope it is.

It seems that Debian Jessie is not going to be receiving backports of Puppet 3 (the client, anyway). The way forward is through Puppet 4 (which most of you have probably known about for a while). Like me, you probably hoped to not have to go there so soon. Well, maybe not, but I had hoped to not go there so soon, anyway.

So the first thing I started to do was build out a Puppet 4 testing Vagrant box, which was surprisingly hard. Basically, puppet 3 supported the “--manifestdir” argument, but puppet 4 doesn’t, so if you install the puppet-agent package (which is how you install puppet 4 rather than puppet 3), it dies with “Could not parse application options: invalid option: --manifestdir”. Not cool.

The solution is to redesign your Vagrant puppet directory. Instead of “puppet/init.pp” and “puppet/modules/”, you need to make “puppet/environments/environmentname/manifests/site.pp” and “puppet/environments/environmentname/modules/”, respectively. The puppet config for my box looks like this:

config.vm.provision :puppet do |puppet|
  puppet.hiera_config_path = "hiera.yaml"
  puppet.environment_path = "puppet/environments"
  puppet.environment = "vagrant"
end

The directory structure looks like this:

├── hiera
│   ├── common.yaml
│   └── nodes
├── hiera.yaml
├── puppet
│   ├── environments
│   │   └── vagrant
│   │       ├── manifests
│   │       │   └── site.pp
│   │       ├── modules
│   │       │   ├── collectd
│   │       │   ├── concat
│   │       │   ├── redis
│   │       │   ├── stdlib
│   │       │   └── sysctl
│   │       └── Puppetfile
│   └── Puppetfile.lock
└── Vagrantfile
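For anyone scripting the move to this layout, here is a minimal sketch that creates the empty skeleton for one environment. The function names are hypothetical, and it only makes the directories and an empty site.pp. It does not move existing manifests or modules:

```python
from pathlib import Path


def environment_paths(root, environment):
    """Paths used by the directory-environment layout shown above."""
    env_dir = Path(root) / "puppet" / "environments" / environment
    return {
        "manifest": env_dir / "manifests" / "site.pp",
        "modules": env_dir / "modules",
    }


def create_skeleton(root, environment="vagrant"):
    """Create the manifests/ and modules/ directories and an empty site.pp."""
    paths = environment_paths(root, environment)
    paths["modules"].mkdir(parents=True, exist_ok=True)
    paths["manifest"].parent.mkdir(parents=True, exist_ok=True)
    paths["manifest"].touch(exist_ok=True)
    return paths
```

Running create_skeleton(".") next to the Vagrantfile would produce the puppet/environments/vagrant/ part of the tree.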

Now, you should know… a lot of the puppet code will work without changing, but there were some not-small changes, including a few real head scratchers. For instance:

Empty Strings in Boolean Context are true

In previous versions of Puppet, an empty string was evaluated as a false boolean value. You would see this in variable and parameter default values where conditional checks would be used to determine if someone passed in a value or left it blank.

class empty_string_defaults (
  $parameter_to_check = ''
) {
  if $parameter_to_check {
    $parameter_to_check_real = $parameter_to_check
  } else {
    $parameter_to_check_real = 'default value'
  }
}

Puppet’s old behavior of evaluating the empty string as false would allow you to set the default based on a simple if-statement. In Puppet 4.x, this behavior is flipped and $parameter_to_check_real will be set to an empty string.

You can check your existing codebase for this behavior with a puppet-lint plugin.

See the language page on boolean values for more info.

I… um… I don’t even really know what to say to that. I mean, they completely flipped the entire logic of the test around. That’s pretty unnecessary, in my book. I understand what they’re trying to say, that even an empty value isn’t False, but it’s hard to think of a more common shortcut when first writing code. I’m not going to argue that checking against an empty string is a good way to determine true or false, but wow, to change the behavior of the language entirely like that? That’s intense.
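The puppet-lint plugin mentioned above is the proper tool for this. As a rough first pass over a pile of manifests, though, even a crude text scan can flag the pattern. The following is a minimal sketch and not a Puppet parser: it assumes the parameter default and the bare test live in the same file, and the function name is made up:

```python
import re

# Matches defaults such as:  $parameter_to_check = ''   (or "")
EMPTY_DEFAULT_RE = re.compile(r"""\$(\w+)\s*=\s*(?:''|"")""")


def risky_empty_string_checks(manifest):
    """Return names defaulted to an empty string and later tested as bare booleans."""
    risky = []
    for name in EMPTY_DEFAULT_RE.findall(manifest):
        # A bare test looks like:  if $name {   /   unless $name {   /   if !$name {
        bare_test = re.compile(
            r"\b(?:if|unless|elsif)\s+!?\s*\$" + re.escape(name) + r"\s*\{"
        )
        if bare_test.search(manifest) and name not in risky:
            risky.append(name)
    return risky
```

Anything it reports is a candidate for an explicit comparison against the empty string instead of a bare truth test.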

Another one I found that will probably kill code:

Regular Expressions Against Non-Strings

Matching a value that is not a string with a regular expression now raises an error. In 3.x, other data types were converted to string form before matching (often with surprising and undefined results).

$securitylevel = 2

case $securitylevel {
  /[1-3]/: { notify { 'security low': } }
  /[4-7]/: { notify { 'security medium': } }
  default: { notify { 'security high': } }
}

Prior to Puppet 4.0, the first regex would match, and the notify { 'security low': } resource would be put into the catalog.

Now, in Puppet 4.0, neither of the regexes would match because the value of $securitylevel is an integer, not a string, and so the default condition would match, resulting in the inclusion of notify { 'security high': } in the catalog.

So, I can kind of see that. A 2 is clearly a number. But on the other hand, the puppet code will happily cast the string '30' as a number because that’s valid. But the number 30 can easily be turned into a string, too: "30". See?

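I actually tested whether the regex would pick it up if I explicitly declared the “2” as a string. One caveat: the assignment below is spelled $securityLevel while the case tests $securitylevel, and Puppet variable names are case-sensitive, so if that mismatch was in the manifest I ran, the case was looking at an undefined variable, which also falls through to default. Anyway, nope, not at all: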

$securityLevel = "2"

case $securitylevel {
  /[1-3]/: { notify { 'security low': } }
  /[4-5]/: { notify { 'security medium': } }
  default: { notify { 'security high': } }
}

/tmp/vagrant-puppet/environments/vagrant/manifests$ sudo puppet apply --modulepath=../modules/ ./site.pp
Warning: Facter: timeout option is not supported for custom facts and will be ignored.
Notice: Compiled catalog for test-monitoring-client.spacex.corp in environment production in 4.16 seconds
Notice: security high

Because of course not.
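For what it’s worth, Python draws the same line between numbers and strings when it comes to regexes, which makes the rule a little easier to reason about. This is only an analogy in Python, not Puppet code: the re module refuses to match against an integer, and the conversion has to be spelled out. The function name is made up, and the ranges are copied from the docs example above:

```python
import re


def security_label(level):
    """Classify a level the way the case statement above does."""
    text = str(level)  # explicit conversion: re rejects non-string input
    if re.search(r"[1-3]", text):
        return "security low"
    if re.search(r"[4-7]", text):
        return "security medium"
    return "security high"
```

Calling re.search(r"[1-3]", 2) directly raises a TypeError, while security_label(2) and security_label("2") both land on the first branch.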

Anyway, I’m currently dealing with this, so I figured I’d write my first blog entry in a while so you could share in the fun. Good luck! And if you have a good technique for testing existing code against new puppet builds, let me know! I’m considering my CI options, but I’ve got a lot of new tests to write, it seems like.
