Tag Archives: howto

Installing Windows 8 on VirtualBox

Windows 8 is scheduled to be the next workstation release of Windows, following the path of Windows 7, Vista, and XP before it. The developer preview became available the other day, and I just now got time to download and play with it. If you want to, you can pick it up here. It’s free, and doesn’t require registration.

Because I had a couple of problems getting it to work on my platform of choice (VirtualBox), I thought I’d do a write-up with some screen shots.

After you download the ISO, go to VirtualBox, and create a new VM:

On the next screen, enter whatever you want as the name of your new VM, select “Microsoft Windows” as the OS, and “Other Windows” as the version:

I gave my OS 2GB of RAM, but you can adjust this as you need. There are examples of it running with under 300MB of RAM. I think I’d only do that if you want to punish the system to see what it can take. Give it at least a GB just to get a baseline.

Create a new startup disk of whichever format you want. I also make mine dynamically sized (thin provisioned) so I have some diskspace left, but you don’t have to. Basically, however you’re used to making VMs is fine for this part. I gave the system disk 20GB (but since it’s thin-provisioned, it’ll only use what it needs).

Click “Create” at the end of the process to make the shiny new VM.

Before we fire it up, we’ve got to change some settings. First we’ll add the ISO to the VM’s CDROM by clicking storage:

Then in the Storage tree, click the empty CD-ROM icon, then on the far right, click the CD icon next to “IDE Secondary”, then select “Choose a virtual CD/DVD disk file”, then browse to the DVD ISO you downloaded from Microsoft, and select it.

Now, if you are like me, and start the VM right now, you’ll have a problem. Namely this:

Status: 0xc0000225
Info: An unexpected error has occurred

Yep, that was unexpected.

If you research that error, you find that in Windows 7, you got that message when you used a RAM disk on a UEFI-enabled computer. Windows 8 is undoubtedly using a RAM disk to install itself (because there’s no ACTUAL disk available yet), so we hit the same error. To fix that, we need to alter the VirtualBox settings.

Click on the “System” box:

Then under “Extended Features”, make sure “Enable IO APIC” is enabled:

Now, if you were to boot up, it would install, but you would have no network connection. To fix that, let’s change the virtualized network card. Click the “Network” box, like you did the “System” box above, then check the “Advanced” settings on Adapter 1.

Pull down the “Adapter Type” dropdown menu, and select “Intel PRO/1000 MT Desktop (82540EM)”:

After that, boot up the VM, and you’ll be installing Windows 8 in no time!
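If you would rather script those two changes than click through the dialogs, VBoxManage can apply them. This is only a minimal sketch, assuming the VM is powered off. The function name and the RUN dry-run switch are made up for illustration, and “Win8” stands in for whatever you named your VM. By default it just prints the commands so you can look them over.

```shell
# Hypothetical helper: apply the IO APIC and network card fixes to a VM.
# Prints the commands unless RUN is set to an empty string (RUN= ...).
apply_win8_fixes() {
    vm=$1
    run=${RUN-echo}

    # Enable the IO APIC, the fix for the 0xc0000225 boot error.
    $run VBoxManage modifyvm "$vm" --ioapic on || return 1

    # Switch adapter 1 to the Intel PRO/1000 MT Desktop (82540EM) card.
    $run VBoxManage modifyvm "$vm" --nictype1 82540EM
}
```

Run `apply_win8_fixes Win8` to preview the commands, and `RUN= apply_win8_fixes Win8` to actually execute them.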

HOWTO: RedHat Cluster Suite

Alright, here it is, my writeup on RHCS. Before I continue, I need to remind you that, as I mentioned before, I had to pull the plug on it. I never got it working reliably so that a failure wouldn’t bring down the entire cluster, and from the comments in that thread, I’m not alone.

This documentation is provided for working with the RedHat Cluster Suite that shipped with RHEL/CentOS 5.2. It is important to keep this in mind, because if you are working with a newer version, there may be major changes. This has already happened before with the 4.x-5.x switch, rendering most of the documentation on the internet deprecated at best, and destructive at worst. The single most helpful document I found was this: The Red Hat Cluster Suite NFS Cookbook, and even with that, you will notice the giant “Draft Copy” watermark. I haven’t found anything to suggest that it was ever revised past “draft” form.

In my opinion, RedHat Cluster Suite is not ready for “prime time”, and even in the words of a developer from #linux-cluster, “I don’t know if I would use 5.2 stock release with production data”. That being said, you might be interested in playing around with it, or you might choose to ignore my warnings and try it on production systems. If it’s the latter, please do yourself a favor and have a backup plan. I know from experience that it’s no fun to rip out a cluster configuration and try to set up discrete fileservers.

Alright, that’s enough of a warning, I think. Let’s go over what RHCS does.

RedHat Cluster Suite is designed to allow High Availability (HA) services, as opposed to a compute cluster which gives you the benefit of parallel processing. If you’re rendering movies, you want a compute cluster. If you want to make sure that your fileserver is always available, you want an HA cluster.

The general idea of RHCS is that you have a number of servers (nodes), ideally at least 3; 2 is possible but not recommended. Each of those machines is configured identically and has the cluster configuration distributed to it. The “cluster manager” (cman) keeps track of which nodes are members of the cluster. The Cluster Configuration System (ccs) makes sure that all cluster nodes have the same configuration. The resource manager (rgmanager) makes sure that your configured resources are available on each node, the Clustered Logical Volume Manager (clvmd) makes sure that everyone agrees that disks are available to the cluster, and the lock manager (either dlm, the distributed lock manager, or the deprecated gulm, the grand unified lock manager) ensures that your filesystems’ integrity is maintained across the cluster. Sounds simple, right? Right.

Alright, so let’s make sure the suite is installed. The easiest way is to make sure the Clustering and Cluster Storage options are selected at install or in system-config-packages. Note: if you have a standard RedHat Enterprise license, you’ll need to pony up over a thousand dollars more per year per node to get the clustering options. The benefit of this is that you get support from RedHat, the value of which I have heard questioned by several people. Or you could just install CentOS, which is a RHEL clone. I can’t recommend Fedora, just because RedHat seems to test things out there, as opposed to RHEL (and CentOS), which only get “proven” software. Unless you’re talking about perl, but I digress.

So the software is installed, terrific. Let’s discuss your goals now. It is possible, though not very useful, to have a cluster configured without any resources. Typically you will want at least one shared IP address. In this case, the active node will have the IP, and whenever the active node changes, the IP will move with it. This is as good a time as any to mention that you won’t be able to see this IP when you run ‘ifconfig’. You’ve got to find it with ‘ip addr list’.

Aside from a common IP address, you’ll probably want to have a shared filesystem. Depending on what other services your cluster will be providing, it might be possible to get away with having them all mount a remote NFS share. You’ll have to determine whether your service will work reliably over NFS on your own. Here’s a hint: vmware server won’t, because of the way NFS locks files (At least I haven’t gotten it to work since I last tried a few months ago. YMMV).

Regardless, we’ll assume you’re not able to use NFS, and you’ve got to have a shared disk. This is accomplished by using a Storage Area Network (SAN), most commonly. Setting up and configuring your SAN is beyond the scope of this entry, but the key point is that all of your cluster nodes have to have equal access to the storage resources. Once you’ve assigned that access in the storage configuration, make sure that each machine can see the volumes that it is supposed to have access to.

After you’ve verified that all the volumes can be accessed by all of the servers, filesystems must be created. I cannot recommend LVM highly enough. I created an introduction to LVM last year to help explain the concept and why you want to use it. Use this knowledge and the LVM Howto to create your logical volumes. Alternatively, system-config-lvm is a viable GUI option, although the interface takes some getting used to. When creating volume groups, make sure that the clustered flag is set to yes. This will stop them from showing up when the node isn’t connected to the cluster, such as right after booting up.

To make sure that the lock manager can deal with the filesystems, on all hosts, you must also edit the LVM configuration (typically /etc/lvm/lvm.conf) to change “locking_type = 1” to “locking_type = 3”, which tells LVM to use clustered locking. Restart LVM processes with ‘service lvm2-monitor restart’.
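Editing the file by hand on every host gets old quickly, so here is a minimal sketch of the same change done with sed. The function name is made up for illustration, it assumes GNU sed (for -i), and it assumes the setting sits on its own line as “locking_type = 1”. It keeps a .bak copy next to the file in case something goes wrong.

```shell
# Hypothetical helper: switch locking_type from 1 to 3 (clustered locking)
# in an lvm.conf-style file, keeping a backup copy alongside it.
set_clustered_locking() {
    conf=${1:-/etc/lvm/lvm.conf}

    # Keep the original around before touching it.
    cp -p "$conf" "$conf.bak" || return 1

    # Only rewrite the locking_type line, and only if its value is exactly 1.
    sed -i '/^[[:space:]]*locking_type[[:space:]]*=/s/=[[:space:]]*1[[:space:]]*$/= 3/' "$conf"
}
```

Call it with no arguments to edit /etc/lvm/lvm.conf, or pass a path to try it on a copy first. After that, restart the LVM processes as described above.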

Now, let’s talk about the actual configuration file. cluster.conf is an XML file that’s separated by tags into sections. Each of these sections is housed under the “cluster” tag.

Here is the content of my file, as an example:

<cluster alias="alpha-fs" config_version="81" name="alpha-fs">
  <fence_daemon clean_start="1" post_fail_delay="30" post_join_delay="30"/>
  <clusternodes>
    <clusternode name="fs1.int.dom" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device modulename="Server-2" name="blade-enclosure"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="fs2.int.dom" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device modulename="Server-3" name="blade-enclosure"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="fs3.int.dom" nodeid="3" votes="1">
      <fence>
        <method name="1">
          <device modulename="Server-6" name="blade-enclosure"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman expected_votes="3" two_node="0"/>
  <fencedevices>
    <fencedevice agent="fence_drac" ipaddr="10.x.x.4" login="root" name="blade-enclosure" passwd="XXXXX"/>
  </fencedevices>
  <rm>
    <failoverdomains>
      <failoverdomain name="alpha-fail1">
        <failoverdomainnode name="fs1.int.dom" priority="1"/>
        <failoverdomainnode name="fs2.int.dom" priority="2"/>
        <failoverdomainnode name="fs3.int.dom" priority="3"/>
      </failoverdomain>
    </failoverdomains>
    <resources>
      <clusterfs device="/dev/vgDeploy/lvDeploy" force_unmount="0" fsid="55712" fstype="gfs" mountpoint="/mnt/deploy" name="deployFS"/>
      <nfsclient name="app1" options="ro" target="10.x.x.26"/>
      <nfsclient name="app2" options="ro" target="10.x.x.27"/>
      <clusterfs device="/dev/vgOperations/lvOperations" force_unmount="0" fsid="5989" fstype="gfs" mountpoint="/mnt/operations" name="operationsFS" options=""/>
      <clusterfs device="/dev/vgWebsite/lvWebsite" force_unmount="0" fsid="62783" fstype="gfs" mountpoint="/mnt/website" name="websiteFS" options=""/>
      <clusterfs device="/dev/vgUsr2/lvUsr2" force_unmount="0" fsid="46230" fstype="gfs" mountpoint="/mnt/usr2" name="usr2FS" options=""/>
      <clusterfs device="/dev/vgData/lvData" force_unmount="0" fsid="52227" fstype="gfs" mountpoint="/mnt/data" name="dataFS" options=""/>
      <nfsclient name="ops1" options="rw" target="10.x.x.28"/>
      <nfsclient name="ops2" options="rw" target="10.x.x.29"/>
      <nfsclient name="ops3" options="rw" target="10.x.x.30"/>
      <nfsclient name="preview" options="rw" target="10.x.x.42"/>
      <nfsclient name="ftp1" options="rw" target="10.x.x.32"/>
      <nfsclient name="ftp2" options="rw" target="10.x.x.33"/>
      <nfsclient name="sys1" options="rw" target="10.x.x.31"/>
      <script name="sshd" file="/etc/init.d/sshd"/>
    </resources>
    <service autostart="1" domain="alpha-fail1" name="nfssvc">
      <ip address="10.x.x.50" monitor_link="1"/>
      <script ref="sshd"/>
      <smb name="Operations" workgroup="int.dom"/>
      <clusterfs ref="deployFS">
        <nfsexport name="deploy">
          <nfsclient ref="app1"/>
          <nfsclient ref="app2"/>
        </nfsexport>
      </clusterfs>
      <clusterfs ref="operationsFS">
        <nfsexport name="operations">
          <nfsclient ref="ops1"/>
          <nfsclient ref="ops2"/>
          <nfsclient ref="ops3"/>
        </nfsexport>
      </clusterfs>
      <clusterfs ref="websiteFS">
        <nfsexport name="website">
          <nfsclient ref="ops1"/>
          <nfsclient ref="ops2"/>
          <nfsclient ref="ops3"/>
          <nfsclient ref="preview"/>
        </nfsexport>
      </clusterfs>
      <clusterfs ref="usr2FS">
        <nfsexport name="usr2">
          <nfsclient ref="ops1"/>
          <nfsclient ref="ops2"/>
          <nfsclient ref="ops3"/>
        </nfsexport>
      </clusterfs>
      <clusterfs ref="dataFS">
        <nfsexport name="data">
          <nfsclient ref="ops1"/>
          <nfsclient ref="ops2"/>
          <nfsclient ref="ops3"/>
          <nfsclient ref="ftp1"/>
          <nfsclient ref="ftp2"/>
          <nfsclient ref="sys1"/>
        </nfsexport>
      </clusterfs>
    </service>
  </rm>
</cluster>

If you read carefully, most of the entries are self-explanatory, but we’ll go over the broad strokes.

The first line names the cluster. It also has a “config_version” property. This config_version value is used to decide which cluster node has the most up-to-date configuration. In addition, if you edit the file and try to redistribute it without incrementing the value, you’ll get an error, because the config_versions are the same but the contents are different. Always remember to increment the config_version.
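Since forgetting to bump the number is such an easy mistake, here is a minimal sketch of a helper that does it for you. The function name is made up for illustration, it assumes GNU sed (for -i), and it assumes the attribute is written with plain straight quotes, as it would be in a real cluster.conf.

```shell
# Hypothetical helper: add 1 to config_version in a cluster.conf file
# and print the new value.
bump_config_version() {
    conf=${1:-/etc/cluster/cluster.conf}

    # Pull out the current number from the first config_version attribute.
    current=$(sed -n 's/.*config_version="\([0-9][0-9]*\)".*/\1/p' "$conf" | head -n 1)
    if [ -z "$current" ]; then
        echo "no config_version found in $conf" >&2
        return 1
    fi

    next=$((current + 1))

    # Rewrite the attribute in place with the incremented value.
    sed -i "s/config_version=\"$current\"/config_version=\"$next\"/" "$conf" || return 1
    echo "$next"
}
```

Run it right after you finish editing the file and before you redistribute it. On the 5.x tools, I believe the redistribution step is `ccs_tool update /etc/cluster/cluster.conf`, but check the documentation for your version.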

The next line is a single entry (you can tell from the trailing /) which defines the fence daemon. Fencing in a cluster is a means to disable a machine from accessing cluster resources. The reason behind this is that if a node goes rogue, detaches itself from the other cluster members, and unilaterally decides that it is going to have read-write access to the data, then the data will end up corrupt. The actual cluster master will be writing to the data at the same time the rogue node will, and that is a Very Bad Thing(tm). To prevent this, all nodes are set up so that they are able to “fence” other nodes that disconnect from the group. The post_fail_delay in my config means “wait 30 seconds before killing a node”. How the fencing is actually done is covered later, in the fencedevices section.

The post_join_delay is misnamed and should really be called post_create_delay, since the only time it is used is when the cluster is started (as in, there is no running node, and the first machine is turned on). The default action of RHCS is to wait 6 seconds after being started, and to “fence” any nodes listed in the configuration who haven’t connected yet. I’ve increased this value to 30 seconds. The best solution is to never start the cluster automatically after booting. This allows you to manually startup cluster services, which can prevent unnecessary fencing of machines.

Fencing is by far what gave me the most problems.

The next section is clusternodes. This section defines each of the nodes that will be connecting to this cluster. The name is how you refer to a node when using the command line tools, the node ID is used in the logs and for internal referencing, and “votes” has to do with an idea called “quorum”. The quorum is the number of nodes necessary to operate a cluster. Typically it’s more than 50% of the total number of nodes. In a three-node cluster, it’s 2. This is the reason that two node clusters are tricky: by dictating a quorum of 1, you are telling rogue cluster nodes that they should assume they are the active node. Not good. If you find yourself in the unenviable position of only having 2 possible nodes, you need to use a quorum disk.
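The “more than 50%” rule is just integer arithmetic, so here is a tiny sketch of it. The function name is made up for illustration, and it assumes every node has one vote, as in my config.

```shell
# Hypothetical helper: smallest number of votes that is more than half
# of the total (integer division, then add one).
quorum_for() {
    echo $(( $1 / 2 + 1 ))
}
```

`quorum_for 3` gives 2, which matches the three-node case. `quorum_for 2` also gives 2, which shows the two node problem from the other side: under the normal rule, losing either node loses quorum, and that is why two node setups end up with a quorum of 1 or a quorum disk.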

Inside each cluster node declaration, you need to specify a fence device. The fence device is the method used by fenced (the fencing daemon) to turn off the remote node. Explaining the various methods is beyond this document, but read the fencing documentation for details, and hope not much has changed in the software since they wrote the docs.

After clusternodes, the cman (cluster manager) line sets “expected_votes”, the total number of votes from which the quorum is calculated, and two_node="0", which means “this isn’t a two node cluster”.

The next section is the fencedevices declaration. Since I was using dell poweredge blades, I used the fence_drac agent, which has DRAC specific programming to turn off nodes. Check the above-linked-documentation for your solution.

<rm> stands for Resource Manager, and is where we will declare which resources exist, and where they will be assigned and deployed.

failoverdomains are the list of various groups of cluster nodes. These should be created based on the services that your clusters will share. Since I was only clustering my three file servers, I only had one failover domain. If I wanted to cluster my web servers, I would have created a 2nd failover domain (in addition to creating the nodes in the upper portion of the file, as well). You’ll see below in the services section where the failoverdomain comes into effect.

In the resources list, you create “shortcuts” to things that you’ll reference later. I’m doing NFS, so I’ve got to create resources for the filesystems I’ll be exporting (the lines that start with clusterfs), and since I want my exports to be secure, I create a list of clients that will have access to the NFS exports (all others will be blocked). I also create a script that will make changes to SSH and allow me to keep my keys stable over all three machines.

After the resources are declared, we begin the service specification. The IP address is set up, sshd is invoked, samba is started, and the various clusterfs entries are configured. All pretty straightforward here.

Now that we’ve gone through the configuration file, let’s explain some of the underlying implementation. You’ll notice that the configuration invoked the script /etc/init.d/sshd. As you probably know, that is the startup/shutdown script for sshd, which is typically started during the init for multiuser networked runlevels (3 and 5 on RH machines). Since we’re starting it now, that would seem to imply that it wasn’t running beforehand, but that is not the case. Actually, I had replaced /etc/init.d/sshd with a cluster-aware version that pointed various key files to the clustered filesystems. Here are the changes:

# Begin cluster-ssh modifications
if [ -z "$OCF_RESKEY_service_name" ]; then
    # Normal / system-wide ssh configuration
    : # (key and PID file settings not reproduced here)
else
    # Per-service ssh configuration
    # (key and PID file settings not reproduced here)
    prog="$prog ($OCF_RESKEY_service_name)"
fi
[ -n "$PID_FILE" ] && OPTIONS="$OPTIONS -o PidFile=$PID_FILE"
# End cluster-ssh modifications

I got these changes from this wiki entry, and it seemed to work stably, even if the rest of the cluster didn’t always.

You’ll also notice that I specify all the things in the services section that normally exist in /etc/exports. That file isn’t used in RHCS-clustered NFS. The equivalent of exports is generated on the fly by the cluster system. This implies that you should turn off the NFS daemon and let the cluster manager handle it.

When it comes to Samba, you’re going to need to create configurations for the cluster manager to point to, since the configs aren’t generated on the fly like NFS. The naming scheme is /etc/samba/smb.conf.SHARENAME, so in the case of Operations above, I used /etc/samba/smb.conf.Operations. I believe that rgmanager (resource group manager) automatically creates a template for you to edit, but be aware that it takes a particular naming scheme.

Assuming you’ve created cluster aware LVM volumes (you did read the howto I linked to earlier, right?), you’ll undoubtedly want to create a filesystem. GFS is the most common filesystem for RHCS, and can be made using ‘mkfs.gfs2’, but before you start making filesystems willy-nilly, you should know a few things.

First, GFS2 is a journaled filesystem, meaning that data to be written to disk is written to a scratch pad first (the scratch pad is called a journal), then copied from the scratch pad to the disk. If access to the disk is lost while writing to the filesystem, the data can be recopied from the journal.

Each node that will have write access to the GFS2 volume needs to have its own scratch pad. If you’ve got a 3 node cluster, that means you need three journals. If you’ve got 3 and you’re going to be adding 2 more, just make 5 and save yourself a headache. The number of journals can be altered later (using gfs2_jadd), but just do it right the first time.
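To make the journal count concrete, here is a minimal sketch that builds the mkfs.gfs2 command line. Making a filesystem is destructive, so it only prints the command for you to review and run yourself. The function name and the filesystem name “deploy” are made up for illustration; the cluster name and device are borrowed from my config above. The lock table is given as clustername:fsname and lock_dlm is the DLM lock protocol.

```shell
# Hypothetical helper: print (not run) a mkfs.gfs2 command with one
# journal per node that will mount the filesystem.
gfs2_mkfs_command() {
    cluster=$1    # cluster name from cluster.conf
    fsname=$2     # name for this filesystem within the cluster
    journals=$3   # one per node, plus any nodes you plan to add
    device=$4     # clustered logical volume to format

    echo "mkfs.gfs2 -p lock_dlm -t ${cluster}:${fsname} -j ${journals} ${device}"
}
```

For a three node cluster that will grow to five, `gfs2_mkfs_command alpha-fs deploy 5 /dev/vgDeploy/lvDeploy` prints the command with five journals.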

For more information on creating and managing gfs2, check the Redhat docs.

I should also throw in a note about lock managers here. Computer operating systems today are inherently multitasking. Whenever one program starts to write to a file, a lock is produced which prevents (hopefully) other programs from writing to the same file. To replicate that functionality in a cluster, you use a “lock manager”. The old standard was GULM, the Grand Unified Lock Manager. It was replaced by “DLM”, the Distributed Lock Manager. If you’re reading documentation that openly suggests GULM, you’re reading very old documentation and should probably look for something newer.

Once you’ve got your cluster configured, you probably want to start it. Here’s the order I turned things on in:

# starts the cluster manager
service cman start

# starts the clustered LVM daemon
service clvmd start

# mounts the clustered filesystems (after clvmd has been started)
mount -a

# starts the resource manager, which turns on the various services, etc
service rgmanager start

I’ve found that running these in that order will sometimes work and sometimes hang. If it hangs, it’s waiting to find other nodes. To remedy that, I try to start the cluster on all nodes at the same time. Also, if you don’t, the post_join_delay will bite your butt and fence the other nodes.
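If you start the cluster by hand a lot, it helps to wrap the sequence so that a failed step stops everything after it. This is a minimal sketch only; the function name and the RUN dry-run switch are made up for illustration, and it does nothing about hangs (a step that hangs will still hang). By default it just prints the commands in order.

```shell
# Hypothetical helper: run the startup steps in order, stopping at the
# first one that fails. Prints the commands unless RUN is set to an
# empty string (RUN= start_cluster).
start_cluster() {
    run=${RUN-echo}

    $run service cman start  || return 1   # cluster manager
    $run service clvmd start || return 1   # clustered LVM daemon
    $run mount -a            || return 1   # clustered filesystems
    $run service rgmanager start           # resource manager
}
```

Run `start_cluster` to see the sequence, and `RUN= start_cluster` as root on each node to actually start things.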

Have no false assumptions that this will work the first time. Or the second. As you can see, I made it to my 81st configuration before I gave up, and I did a fair bit of research between versions. Make liberal use of your system logs, which will point to reasons that your various cluster daemons are failing, and try to divine the reasons.

Assuming that your cluster is up and running, you can check on the status with clustat. Move the services with clusvcadm, and manually fence nodes with fence_manual. Expect to play a lot, and give yourself a lot of time to play and test. Test Test Test. Once your cluster is stable, try to break it. Unplug machines, network cables, and so on, watching logs to see what happens, when, and why. Use all the documentation you can find, but keep in mind that it may be old.

The biggest source of enlightenment (especially to how screwed I was) came from the #linux-cluster channel on IRC. There are mailing lists, as well, and if you’re really desperate, drop me a line and I’ll try to find you help.

So that’s it. A *long* time in the making, without a happy ending, but hopefully I can help someone else. Drop a comment below regaling me with stories of your great successes (or if RHCS drove you to drink, let me know that too!).

Thanks for reading!

Howto: Racks and rackmounting

I’m going to start a special feature on Fridays. It’s going to be sharing the sorts of tips that systems admins need to know, but can’t learn in a book. There are so many things that you learn on the job, figure out on your own, or run across on the net which make you realize that you’ve been doing something wrong for years. Sometimes you learn about things that you might have had no clue about. For instance, I just found out that you can do snapshots with LVM.

Anyway, this Friday, I’m going to be showing you what I know about server racks.

I started out on a network that had a bunch of tower machines on industrial shelves; the sort you pick up at Harbor Freight or Big Lots. When we moved to racks and rackmount servers, it was like a whole new world.

The first difference is form-factor. Tower servers are usually rated by the “tower” descriptive. Full tower, half tower, mid-tower. Rack Servers are sized according to ‘U’s, short for “Rack Unit”. It’s equivalent to 1 3/4 inches, so a 2U server is 3.5” tall. The standard width for rackmount servers is 19” across. Server racks vary in depth, between 23 and 36”, with deeper being more common.
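The unit math is simple enough to do in your head, but when you’re planning out a whole rack it’s handy to have it scripted. A minimal sketch, with a made-up function name, using the 1 3/4 inches per U figure from above:

```shell
# Hypothetical helper: convert a height in rack units (U) to inches,
# at 1.75 inches per U.
u_to_inches() {
    LC_ALL=C awk -v u="$1" 'BEGIN { printf "%.2f\n", u * 1.75 }'
}
```

`u_to_inches 2` prints 3.50, matching the 2U example, and `u_to_inches 42` prints 73.50 for the usable height of a 42U rack.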

Instead of shelves for each server, rack hardware holds the server in place, usually suspended by the sides of the machine. They allow the server to slide in and out, sometimes permitting the removal of the server’s cover to access internal components. Different manufacturers have different locking mechanisms to keep the servers in place, but all rack kits I’ve seen come with instructions.

To anchor the rack hardware (also known as rails) to the rack itself, a variety of methods have been implemented. There are two main types of rack. Round hole racks, seen at the left, require a special type of rack hardware. Much more common are square hole racks, which require the use of rack nuts. The rack nuts act as screw anchors to keep the hardware in place. Some server manufacturers have created specific rack hardware that fits most square hole racks and doesn’t require the use of rack nuts. Dell’s “rapidrail” system is one with which I’m very familiar. Typically you get the option of which rail system you want when you purchase the system.

Installing the rack nuts is made easier with a specialized tool. I call it the “rack tool”, but I’m sure there’s another name. The rack nut is placed with the inside edge clip in place, through the hole. The tool is inserted through the hole, grabs the outside clip, and then you pull the hook towards you. This pulls the outside clip to the front of the hole, securing the nut in place.

A typical server will require eight nuts, usually at the top and bottom of each rack unit, on the right and left sides, front and back. Each rack unit consists of three square holes, and a rack nut is put in the top and bottom of both the right and the left sides. Several pieces of networking equipment have space for four screws, but I’ve found that they stay in place fine with two. I can’t really recommend it for other people, but if you’re low on rack nuts, it’s better than letting the switches just sit there (and it almost always seems like you have fewer rack nuts than you need once your rack starts growing). If you only use two screws to hold in your networking equipment, make sure it’s the bottom two. The center of gravity of a rackmount switch is always behind the screws, so if the top screws hold it up, the bottom has a tendency to swing out, and that’s not good for your rack or your hardware.

While I’m on the subject of switches, let me give you this piece of advice. Mount your switches in the rear of the rack. It seems obvious, but you have no idea how many people mount them on the front in the beginning because “it looks cooler” and then regret it when they continually have to run cable through the rack to the front.

Once your rack starts to fill out, heat will become an issue. Aligning your rack with your air conditioner is another bit of common sense that’s frequently ignored. Air goes into the servers through the front, and hot air leaves through the back. This means that when you cool your rack, you should point the AC towards the front of your rack, not the back.

Air comes in here… and leaves back here.

It’s probably no surprise to anyone who’s used a computer, but cables seem to have a mind of their own, and nowhere is that more apparent than in a reasonably full server rack. Many higher-end solutions provide built-in cable management features, such as in-cabinet runs for power cables or network cables, swing arms for cabling runs, and various places to put tie-downs.

There is no end-all-be-all advice to rack management, but there are some tips I can give you from my own experience.

Use Velcro for cabling that is likely to change in the next year. Permanent or semi-permanent cabling can deal with plastic zipties, as long as they aren’t pulled too tight, but anytime you see yourself having to clip zipties to get access to a cable, use Velcro. It’s far too easy to accidentally snip an Ethernet cable in addition to the ziptie.

Your rackmount servers will, in many cases, come with cable management arms. Ignore them. Melt them down or throw them away, but all they’ve ever done for me is block heat from escaping out the back.

Label everything. That includes both ends of the wires. Do this for all wires, even power cables (or especially power cables). Write down which servers are powered by which power sources.

If you have a lot of similar servers, label the back of the servers too. Pulling the wrong wire from the wrong server is not my idea of a good time.

Keep your rack tool in a convenient, conspicuous spot. I ran a zip tie through the side of the rack, and hang mine there.

(Some photos were courtesy of Ronnie Garcia via Flickr)