Quality Assurance vs Quality Control

Are you good at finding faults in your infrastructure, or are you good at making sure there are no faults in the first place? As Jason Cohen relates, Quality Assurance is not Quality Control.

Like many other topics, this is written for programmers, but it is a good lesson for sysadmins as well.

  • Jason Cohen

    Thanks for the mention Matt!

    It’s a good point that it applies to sysadmining as well as programming. Some commenters also pointed out how the role of QC is different in manufacturing.

    What kinds of tools would you use in QC sysadmining? Do monitoring consoles count? That is, they identify failures (or at least unexpected cases?) but often don’t identify the source of the failure?

  • Matt


    Thanks for the return visit!

    Sysadmins as a general rule tend to rely heavily on monitoring. We use alerting tools to notify us of the state of various services and of network performance, and we do trend graphing to track how performance changes over time across the infrastructure. At least we should :-)

    To reduce the instances of configuration issues, it’s generally accepted that changes should be managed: planned beforehand and executed via scripts. That reduces the number of fat-finger errors.

    Overall, I really like examining the ways that software developers engineer their solutions. I’ve found that the solutions they’ve come up with often closely match situations that we sysadmins experience.

    Thanks again for the return visit. Please come back when you get a chance. Take care!

  • Jason Cohen

    Yeah, software development needs all the help it can get in process improvement. Some things you mentioned are frequently employed (e.g. scripts), but others are not (e.g. monitoring).

    I’m subscribed to your feed and Digg’ing your stuff, so I’ll be back. :-)

  • Matt


    Thanks very much, I really appreciate your support!

    Do you think that monitoring is something that would help developers? What sort of warning or failure conditions would be helpful to know of when they happen? I’m 100% sure that a Nagios check could be written to alert in that event.
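
    As a rough illustration of what such a check could look like, here is a minimal sketch in Python. It follows the standard Nagios plugin convention of exit codes 0 (OK), 1 (WARNING), 2 (CRITICAL), and 3 (UNKNOWN) plus a single line of status output. The thing being measured (a build queue length), the BUILD_QUEUE label, and the names evaluate and main are all assumptions made up for the example, not something from the discussion above.

```python
# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(value, warn, crit):
    """Map a measurement to a (status code, one-line message) pair.

    The "build queue length" metric is hypothetical; any number a
    developer cares about could be substituted.
    """
    if value is None:
        return UNKNOWN, "BUILD_QUEUE UNKNOWN - no measurement available"
    if value >= crit:
        status = CRITICAL
    elif value >= warn:
        status = WARNING
    else:
        status = OK
    # Text after the "|" is performance data: label=value;warn;crit
    message = (
        f"BUILD_QUEUE {LABELS[status]} - {value} jobs waiting"
        f" | queue={value};{warn};{crit}"
    )
    return status, message


def main(argv):
    """Parse VALUE WARN CRIT from argv, print the status line, return the exit code."""
    try:
        value, warn, crit = (int(arg) for arg in argv)
    except ValueError:
        # Wrong argument count or a non-integer argument.
        print("BUILD_QUEUE UNKNOWN - usage: check_build_queue VALUE WARN CRIT")
        return UNKNOWN
    status, message = evaluate(value, warn, crit)
    print(message)
    return status
```

    To run it as an actual plugin, a small wrapper would call sys.exit(main(sys.argv[1:])) so that Nagios sees the return value as the process exit code. Keeping the threshold logic in evaluate, separate from argument parsing, means it can be tested without spawning a process.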