<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Modern uptime &#8211; measured from the outside in</title>
	<atom:link href="http://www.standalone-sysadmin.com/blog/2009/09/modern-uptime-measured-from-the-outside-in/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.standalone-sysadmin.com/blog/2009/09/modern-uptime-measured-from-the-outside-in/</link>
	<description>A blog for IT Admins who do everything by an IT Admin who does everything</description>
	<lastBuildDate>Sat, 20 Mar 2010 09:57:50 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Shannon</title>
		<link>http://www.standalone-sysadmin.com/blog/2009/09/modern-uptime-measured-from-the-outside-in/comment-page-1/#comment-3330</link>
		<dc:creator>Shannon</dc:creator>
		<pubDate>Thu, 24 Sep 2009 01:54:23 +0000</pubDate>
		<guid isPermaLink="false">http://www.standalone-sysadmin.com/blog/?p=893#comment-3330</guid>
		<description>I&#039;ve been working a lot with F5 BIG-IP boxes.  They&#039;re pretty frikkin sweet.</description>
		<content:encoded><![CDATA[<p>I&#8217;ve been working a lot with F5 BIG-IP boxes.  They&#8217;re pretty frikkin sweet.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Preston de Guise</title>
		<link>http://www.standalone-sysadmin.com/blog/2009/09/modern-uptime-measured-from-the-outside-in/comment-page-1/#comment-3322</link>
		<dc:creator>Preston de Guise</dc:creator>
		<pubDate>Thu, 17 Sep 2009 10:30:13 +0000</pubDate>
		<guid isPermaLink="false">http://www.standalone-sysadmin.com/blog/?p=893#comment-3322</guid>
		<description>In the last job I worked at, we had a sales director who insisted we use the term &quot;Towards 100% Uptime!&quot; ... to me it just smacked too much of &quot;To infinity and beyond!&quot; However, the real problem I had with it was that it created too unrealistic an expectation.

The more customer sites I saw, the more I realised that the mainframe teams always had it right - uptime is not about continuous availability, but continuous &lt;I&gt;negotiated&lt;/I&gt; availability. Midrange system administrators in particular have too often become hung up on perfect uptime records, when in actual fact a bit of downtime now and again can sometimes be good - for performance, for maintenance and stability, etc.

Here&#039;s an example - a customer (a major financial institution in Australia) once had some business improvement consultants in, and they surveyed the IT people and asked them what their system availability percentages were like. The general consensus was around the 90% mark. On the other hand, the end users, when asked, presented a strikingly different average - around 60% or perhaps even less. The main reason for the difference, it was determined, was that the IT people were measuring uptime, whereas the users were measuring &lt;I&gt;responsiveness&lt;/I&gt;.

Uptime by itself isn&#039;t a good enough system metric. Negotiated uptime, on the other hand, is much better.</description>
		<content:encoded><![CDATA[<p>In the last job I worked at, we had a sales director who insisted we use the term &#8220;Towards 100% Uptime!&#8221; &#8230; to me it just smacked too much of &#8220;To infinity and beyond!&#8221; However, the real problem I had with it was that it created too unrealistic an expectation.</p>
<p>The more customer sites I saw, the more I realised that the mainframe teams always had it right &#8211; uptime is not about continuous availability, but continuous <i>negotiated</i> availability. Midrange system administrators in particular have too often become hung up on perfect uptime records, when in actual fact a bit of downtime now and again can sometimes be good &#8211; for performance, for maintenance and stability, etc.</p>
<p>Here&#8217;s an example &#8211; a customer (a major financial institution in Australia) once had some business improvement consultants in, and they surveyed the IT people and asked them what their system availability percentages were like. The general consensus was around the 90% mark. On the other hand, the end users, when asked, presented a strikingly different average &#8211; around 60% or perhaps even less. The main reason for the difference, it was determined, was that the IT people were measuring uptime, whereas the users were measuring <i>responsiveness</i>.</p>
<p>Uptime by itself isn&#8217;t a good enough system metric. Negotiated uptime, on the other hand, is much better.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Arjen Lentz</title>
		<link>http://www.standalone-sysadmin.com/blog/2009/09/modern-uptime-measured-from-the-outside-in/comment-page-1/#comment-3307</link>
		<dc:creator>Arjen Lentz</dc:creator>
		<pubDate>Sat, 12 Sep 2009 01:16:17 +0000</pubDate>
		<guid isPermaLink="false">http://www.standalone-sysadmin.com/blog/?p=893#comment-3307</guid>
		<description>Good article. I&#039;d also add software load balancing into the mix, like HAProxy. It can be wrapped into what&#039;s essentially an appliance for deployment, too.

I do think you introduce one point of failure as part of the solution, and that&#039;s the lightweight SAN you talk about. The SAN becomes a single point of failure (and they do/will fail!), and even SANs need upgrades. Because more things rely on them, both these things cause hassle.

I prefer local storage, aka shared nothing. And I make servers not just redundant but expendable. That is, if one fails completely, there&#039;ll be at least one copy (if not more) of both data and processing power elsewhere. Then you can just take out anything without a worry.</description>
		<content:encoded><![CDATA[<p>Good article. I&#8217;d also add software load balancing into the mix, like HAProxy. It can be wrapped into what&#8217;s essentially an appliance for deployment, too.</p>
<p>I do think you introduce one point of failure as part of the solution, and that&#8217;s the lightweight SAN you talk about. The SAN becomes a single point of failure (and they do/will fail!), and even SANs need upgrades. Because more things rely on them, both these things cause hassle.</p>
<p>I prefer local storage, aka shared nothing. And I make servers not just redundant but expendable. That is, if one fails completely, there&#8217;ll be at least one copy (if not more) of both data and processing power elsewhere. Then you can just take out anything without a worry.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Devin</title>
		<link>http://www.standalone-sysadmin.com/blog/2009/09/modern-uptime-measured-from-the-outside-in/comment-page-1/#comment-3292</link>
		<dc:creator>Devin</dc:creator>
		<pubDate>Tue, 08 Sep 2009 19:03:40 +0000</pubDate>
		<guid isPermaLink="false">http://www.standalone-sysadmin.com/blog/?p=893#comment-3292</guid>
		<description>Now you ruined my pride at seeing my odometer hit 66666.6</description>
		<content:encoded><![CDATA[<p>Now you ruined my pride at seeing my odometer hit 66666.6</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Bart</title>
		<link>http://www.standalone-sysadmin.com/blog/2009/09/modern-uptime-measured-from-the-outside-in/comment-page-1/#comment-3283</link>
		<dc:creator>Bart</dc:creator>
		<pubDate>Sat, 05 Sep 2009 12:57:19 +0000</pubDate>
		<guid isPermaLink="false">http://www.standalone-sysadmin.com/blog/?p=893#comment-3283</guid>
		<description>System uptime is nothing to brag about; all it says about you is that you do a lousy job at keeping your servers secure. A few months ago I had a discussion with a sysadmin who claimed patches weren&#039;t required because his servers were on a secure VLAN (whatever that may be). A few weeks later, all his servers were infected by Conficker, and the resulting outage cost millions. I haven&#039;t heard from him since; I&#039;m quite sure he got fired.

It&#039;s all about service availability. I constantly reboot servers during production hours, but thanks to good redundancy, nobody ever notices. In fact, the services have been more reliable. The actual method employed here depends on the service, but most use load balancers.

Other than security, there are many more reasons to reboot. If you never reboot a server, how will you know that it will boot correctly? I&#039;ve seen servers develop faults on parts of the disk that were only used during boot. I&#039;ve often encountered sysadmins who install some service, start it manually and forget to add it to the startup scripts. These are not the kind of things you want to be dealing with at 3 AM when a power outage causes that server to reboot.
Another advantage of redundant servers is that you need some way of keeping your configuration in sync, which often involves some kind of configuration management. This will eventually result in faster repair times when something goes wrong.


Proper load balancers are required to make many services truly redundant. But I will never let anything with a Barracuda badge inside the datacenter. If you are going to load-balance many critical services, do you really want to pull all traffic through some cheap box? I&#039;d go for one with a big red light on the front. It may be more expensive, but it&#039;s worth it.</description>
		<content:encoded><![CDATA[<p>System uptime is nothing to brag about; all it says about you is that you do a lousy job at keeping your servers secure. A few months ago I had a discussion with a sysadmin who claimed patches weren&#8217;t required because his servers were on a secure VLAN (whatever that may be). A few weeks later, all his servers were infected by Conficker, and the resulting outage cost millions. I haven&#8217;t heard from him since; I&#8217;m quite sure he got fired.</p>
<p>It&#8217;s all about service availability. I constantly reboot servers during production hours, but thanks to good redundancy, nobody ever notices. In fact, the services have been more reliable. The actual method employed here depends on the service, but most use load balancers.</p>
<p>Other than security, there are many more reasons to reboot. If you never reboot a server, how will you know that it will boot correctly? I&#8217;ve seen servers develop faults on parts of the disk that were only used during boot. I&#8217;ve often encountered sysadmins who install some service, start it manually and forget to add it to the startup scripts. These are not the kind of things you want to be dealing with at 3 AM when a power outage causes that server to reboot.<br />
Another advantage of redundant servers is that you need some way of keeping your configuration in sync, which often involves some kind of configuration management. This will eventually result in faster repair times when something goes wrong.</p>
<p>Proper load balancers are required to make many services truly redundant. But I will never let anything with a Barracuda badge inside the datacenter. If you are going to load-balance many critical services, do you really want to pull all traffic through some cheap box? I&#8217;d go for one with a big red light on the front. It may be more expensive, but it&#8217;s worth it.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: chewy_fruit_loop</title>
		<link>http://www.standalone-sysadmin.com/blog/2009/09/modern-uptime-measured-from-the-outside-in/comment-page-1/#comment-3278</link>
		<dc:creator>chewy_fruit_loop</dc:creator>
		<pubDate>Fri, 04 Sep 2009 00:50:43 +0000</pubDate>
		<guid isPermaLink="false">http://www.standalone-sysadmin.com/blog/?p=893#comment-3278</guid>
		<description>Thankfully I don&#039;t have to worry about things like that; I just have to make sure that there are no major interruptions to people&#039;s workflows.

The only real thing I need to do is keep everything up for about 12 hours a day; the rest of the time it&#039;s unlikely people will notice an outage.

But we&#039;re now starting to serve data to sites that are following the sunlight :( but it&#039;s not a big deal for the most part.</description>
		<content:encoded><![CDATA[<p>Thankfully I don&#8217;t have to worry about things like that; I just have to make sure that there are no major interruptions to people&#8217;s workflows.</p>
<p>The only real thing I need to do is keep everything up for about 12 hours a day; the rest of the time it&#8217;s unlikely people will notice an outage.</p>
<p>But we&#8217;re now starting to serve data to sites that are following the sunlight <img src='http://www.standalone-sysadmin.com/blog/wp-includes/images/smilies/icon_sad.gif' alt=':(' class='wp-smiley' />  but it&#8217;s not a big deal for the most part.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Steven Schwartz</title>
		<link>http://www.standalone-sysadmin.com/blog/2009/09/modern-uptime-measured-from-the-outside-in/comment-page-1/#comment-3276</link>
		<dc:creator>Steven Schwartz</dc:creator>
		<pubDate>Thu, 03 Sep 2009 20:45:52 +0000</pubDate>
		<guid isPermaLink="false">http://www.standalone-sysadmin.com/blog/?p=893#comment-3276</guid>
		<description>Many moons ago, we hired an electrician to help us live-move a Solaris server, since it had an uptime of over 450 days.  When we finally did shut that server down, it hadn&#039;t been rebooted in almost 700 days.  Oh, the days of wanton IT spending.</description>
		<content:encoded><![CDATA[<p>Many moons ago, we hired an electrician to help us live-move a Solaris server, since it had an uptime of over 450 days.  When we finally did shut that server down, it hadn&#8217;t been rebooted in almost 700 days.  Oh, the days of wanton IT spending.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Bob</title>
		<link>http://www.standalone-sysadmin.com/blog/2009/09/modern-uptime-measured-from-the-outside-in/comment-page-1/#comment-3275</link>
		<dc:creator>Bob</dc:creator>
		<pubDate>Thu, 03 Sep 2009 15:57:52 +0000</pubDate>
		<guid isPermaLink="false">http://www.standalone-sysadmin.com/blog/?p=893#comment-3275</guid>
		<description>&#039;A system with a multi-year uptime had been providing uninterrupted service.&#039;

I have always been sceptical about people who brag about system uptime. That a box itself is online for years never meant the services that the box provided were available to users for the same amount of time.

System uptime is just what the word implies: system uptime. I am much more interested in service availability, though it is nice to see triple digits when I type my &#039;uptime&#039; command.</description>
		<content:encoded><![CDATA[<p>&#8216;A system with a multi-year uptime had been providing uninterrupted service.&#8217;</p>
<p>I have always been sceptical about people who brag about system uptime. That a box itself is online for years never meant the services that the box provided were available to users for the same amount of time.</p>
<p>System uptime is just what the word implies: system uptime. I am much more interested in service availability, though it is nice to see triple digits when I type my &#8216;uptime&#8217; command.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Michael T</title>
		<link>http://www.standalone-sysadmin.com/blog/2009/09/modern-uptime-measured-from-the-outside-in/comment-page-1/#comment-3273</link>
		<dc:creator>Michael T</dc:creator>
		<pubDate>Thu, 03 Sep 2009 13:48:25 +0000</pubDate>
		<guid isPermaLink="false">http://www.standalone-sysadmin.com/blog/?p=893#comment-3273</guid>
		<description>I&#039;m &lt;b&gt;FINALLY&lt;/b&gt; getting to catch up on some of my reading and yes, this is one of them.

One thing that needs to be remembered in this professional and polished environment we live in: uptime, be it measured as pure uptime or available time, while important, can be completely undone by bad &lt;i&gt;time to repair&lt;/i&gt;.

Sure, it&#039;s impressive that someone&#039;s machine has 456 days of uptime. We already know that this means there&#039;s a lack of patches and &quot;maintenance&quot; done to the system. So what happens when the sprinklers go off? What? You&#039;re gonna quit? No, you&#039;re not. You&#039;re like the rest of us, and you&#039;re going to spend the next sleepless nights trying to reconstruct what &lt;i&gt;was&lt;/i&gt; running, instead of rebuilding the machine using current patch levels, copying over the customer&#039;s applications/webpages/other from the backups, looking like a genius and going home at a reasonable hour.

I used to be a projectionist and got taught that &quot;The difference between amateurs and professionals is that professionals make their mistakes quietly.&quot; I also learned that speed of recovery is the measure of a professional.</description>
		<content:encoded><![CDATA[<p>I&#8217;m <b>FINALLY</b> getting to catch up on some of my reading and yes, this is one of them.</p>
<p>One thing that needs to be remembered in this professional and polished environment we live in: uptime, be it measured as pure uptime or available time, while important, can be completely undone by bad <i>time to repair</i>.</p>
<p>Sure, it&#8217;s impressive that someone&#8217;s machine has 456 days of uptime. We already know that this means there&#8217;s a lack of patches and &#8220;maintenance&#8221; done to the system. So what happens when the sprinklers go off? What? You&#8217;re gonna quit? No, you&#8217;re not. You&#8217;re like the rest of us, and you&#8217;re going to spend the next sleepless nights trying to reconstruct what <i>was</i> running, instead of rebuilding the machine using current patch levels, copying over the customer&#8217;s applications/webpages/other from the backups, looking like a genius and going home at a reasonable hour.</p>
<p>I used to be a projectionist and got taught that &#8220;The difference between amateurs and professionals is that professionals make their mistakes quietly.&#8221; I also learned that speed of recovery is the measure of a professional.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
