Change Management and Revision Control

 

Change Management.  If you work in the computing field then you need to be able to deal with change.  According to Limoncelli, "Change management is the process that ensures effective planning, implementation, and postevent analysis of changes made to a system."

Significant changes to just about anything other than isolated desktops should be coordinated through a change review process.  Each organization needs to decide how much coordination it wants to require for each category or type of change.  This can often be laid out as a matrix, with the level of coordinated processing on one axis and the machine, service or other item being changed on the other.
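
Such a matrix can be sketched as a simple lookup table.  This is only an illustration: the review levels, system categories and change types below are made-up assumptions, and each organization would fill in its own.

```python
# Hypothetical review levels, ordered from least to most coordination.
LEVELS = ("none", "notify", "peer review", "change board")

# Rows are system categories, columns are change types.  Every entry here
# is an illustrative assumption, not a recommendation.
MATRIX = {
    "isolated desktop": {"config": "none", "software upgrade": "none"},
    "departmental server": {"config": "notify", "software upgrade": "peer review"},
    "core router": {"config": "peer review", "software upgrade": "change board"},
}


def required_review(system: str, change: str) -> str:
    """Return the review level for a change.

    Combinations that have not been classified yet fall back to the
    strictest level, so nothing slips through unreviewed.
    """
    return MATRIX.get(system, {}).get(change, LEVELS[-1])
```

Defaulting to the strictest level is a design choice: an unclassified system is treated as critical until someone decides otherwise.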

Technical Issues.  It is of the utmost importance to have documented procedures for the sys admins to use when updating system configuration files.  It is also very important to track the revisions that have been implemented and to record them at a useful level of detail.  A revision history lets anyone with appropriate authorization review what changes have been made and return to a previous version of the file.  This is particularly important if the current version has become corrupted.

Revision control software is the usual way to build these revision histories.  Make certain the software can also prevent two people from modifying the same configuration file at the same time, which is called locking.
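
To make the idea of locking concrete, here is a minimal sketch of an advisory lock file.  It is not how any particular revision control package does it; the function name edit_lock and the ".lock" suffix are assumptions made for the example.

```python
import os
from contextlib import contextmanager


@contextmanager
def edit_lock(config_path: str, owner: str):
    """Hold an exclusive advisory lock on config_path while it is edited."""
    lock_path = config_path + ".lock"  # hypothetical naming convention
    try:
        # O_EXCL makes the creation fail if the lock file already exists,
        # so only one editor can obtain the lock.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        with open(lock_path) as f:
            holder = f.read().strip() or "unknown"
        raise RuntimeError(f"{config_path} is locked by {holder}")
    try:
        os.write(fd, owner.encode())  # record who holds the lock
        os.close(fd)
        yield
    finally:
        os.remove(lock_path)  # release the lock even if the edit failed
```

A second sys admin who tries "with edit_lock(path, 'bob'):" while the first is still editing gets an error naming the current holder instead of silently overwriting the other person's work.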

There should also be automated checks to ensure that the files being changed are syntactically correct.  The software will almost certainly also need to tell the various servers that their configuration files have changed, or push files to other locations as necessary.
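
A rough sketch of "check first, then install, then notify" follows.  It assumes, purely for illustration, that the configuration file uses INI syntax; the function name install_if_valid and the on_change callback are hypothetical.

```python
import configparser
import os
import tempfile


def install_if_valid(new_text: str, target_path: str, on_change=None) -> bool:
    """Install new_text at target_path only if it parses as INI.

    on_change is an optional callback standing in for whatever tells the
    affected server that its configuration has changed.
    """
    parser = configparser.ConfigParser()
    try:
        parser.read_string(new_text)  # syntax check only
    except configparser.Error as err:
        print(f"rejected: {err}")
        return False

    # Write to a temporary file in the same directory, then swap it in,
    # so readers never see a half-written file.  A real tool would also
    # preserve the ownership and permissions of the original file.
    directory = os.path.dirname(os.path.abspath(target_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as tmp:
        tmp.write(new_text)
    os.replace(tmp_path, target_path)

    if on_change is not None:
        on_change(target_path)
    return True
```

A file that fails the check never replaces the working version, so a typo cannot take the service down.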

Communications Structure.  It is essential to develop a communications structure to inform users about sufficiently important changes being made.  If changes involve hard cutovers to new or upgraded services or software, you especially need to make certain that all the appropriate users are informed.  You also need to be able to confirm that these same people are finding the new service or software functional and useful.

At the same time it is important not to flood users with messages, which become overwhelming and distracting.  If users receive too many irrelevant messages they are very likely to assume that later ones are irrelevant too.

How these communications are best disseminated is very likely to depend on the organization's culture.  Sometimes a "push" mechanism is best, where everyone deemed important to receive a particular communiqué receives one.  Other approaches may depend much more on a "pull" type of mechanism.  This would likely involve something like a newsletter or web page where users know to find particular information.  You may also have something like user groups that have common interests.  This may be something like an early adopters group whose members are most likely to keep up on the latest services and features.  Or it may be something like a group of people who write copy for the organization.

Scheduling.  Timing is very important in change management.  Updates may be classified into three major categories.

  • Routine Updates

    • these can usually happen at just about any time and are typically invisible to the customer

      • update contents of a directory

      • update contents of a server

      • update contents of an authentication database

      • help individual users customize their computer environment

      • debug desktop problems

      • debug printer problems

       

  • Major Updates

    • these affect many systems or require a significant system, network or service outage

    • they may involve a large number of systems or even desktops

      • upgrading the authentication system

      • changing the e-mail or printer infrastructure

      • upgrading the core network infrastructure

    • these updates usually involve some sort of push mechanism

    • they are often, if not usually, performed at off peak times

     

  • Sensitive Updates

    • These may not be all that visible to users or all that large, but they can cause a significant outage if there is a problem with them

      • router configurations

      • global access policies

      • firewall configurations

      • alterations to a critical server

    • These sorts of updates are likely to occur quite frequently and so are more likely to be communicated through some sort of pull mechanism

    • It is important for the appropriate sys admins to stay around and be involved

Sometimes you actually want users around when you make changes.  Most places, though, are much more inclined to make changes when there are few or no users on the system so that the users' work isn't disrupted.

It can also be very difficult to classify updates as routine, major or sensitive.  These classifications will definitely depend on the type of organization.  If you are running a 24-7 e-commerce site, your classifications are going to be very different from those of a mom and pop camera store on the corner.

Process and Documentation.  Many sys admin groups require change control or change proposal forms to be filled out.  These forms detail

  • the changes that will be made

  • the systems and services affected

  • the reasons for the changes

  • the risks

  • the test procedures

  • the back out plans

  • how long the changes will take to implement

  • how long the back out plan takes to implement

Some sys admins are even required to write down all of the commands they will give.
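
If the form is kept online, the items above map naturally onto a small record type.  The following is only a sketch: the class name ChangeProposal and its field names are invented for the example, and treating the command list as optional is an assumption.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ChangeProposal:
    """One change proposal form; field names are illustrative only."""
    changes: str = ""
    systems_affected: str = ""
    reasons: str = ""
    risks: str = ""
    test_procedure: str = ""
    backout_plan: str = ""
    minutes_to_implement: int = 0
    minutes_to_back_out: int = 0
    # Only some sites require the exact commands to be written down.
    commands: list[str] = field(default_factory=list)

    def missing(self) -> list[str]:
        """Return the names of required fields that were left empty."""
        return [f.name for f in fields(self)
                if f.name != "commands" and not getattr(self, f.name)]
```

For example, a proposal created with only changes, systems_affected and reasons filled in would report the risks, test procedure, back out plan and both time estimates as missing, so it can be sent back before anyone has to review it.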

Obviously, the requirements to be met on such forms are going to depend on things such as the site, the level of experience of the sys admin, the machine being changed and its criticality to the site's operations.

It may also be useful to institute quiet times when minor updates can be implemented.  They can occur at a variety of prespecified times.
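
A check for whether a given moment falls inside a quiet time might look like the sketch below.  The windows listed are invented for the example; a site would substitute its own schedule.

```python
from datetime import datetime, time

# Hypothetical quiet windows as (weekday, start, end), where Monday is 0.
QUIET_WINDOWS = [
    (1, time(6, 0), time(7, 0)),  # Tuesday 06:00 to 07:00
    (3, time(6, 0), time(7, 0)),  # Thursday 06:00 to 07:00
]


def in_quiet_time(moment: datetime) -> bool:
    """Return True if moment falls inside one of the quiet windows."""
    return any(
        moment.weekday() == day and start <= moment.time() < end
        for day, start, end in QUIET_WINDOWS
    )
```

A deployment script could call in_quiet_time(datetime.now()) and refuse to apply a minor update outside those windows.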

The Icing.  Assuming there is a basic change management process in place that describes a process for configuration updates, the communication methods that are used, and how to schedule change, there are some additional techniques that can be used to improve stability at your site.

One of the main things to obtain or develop is an automated front end interface to the system files that are most likely to be changed.  Such an interface can

  • ask particular questions when first being worked on

  • check the replies to these questions for errors

  • look for omissions

  • make appropriate updates based on these answers
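
The four steps in this list can be sketched as a tiny front end for adding one entry to a hosts-style file.  Everything here is an assumption made for illustration: the questions, the validation rules, and the names check and add_host.

```python
import ipaddress
import re

# Hypothetical questions asked when a new host entry is being added.
QUESTIONS = {"hostname": "Host name: ", "address": "IP address: "}

HOSTNAME_RE = r"[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*"


def check(answers: dict) -> list[str]:
    """Return a list of problems: omissions first, then invalid replies."""
    problems = [f"missing {key}" for key in QUESTIONS if not answers.get(key)]
    name = answers.get("hostname")
    if name and not re.fullmatch(HOSTNAME_RE, name):
        problems.append("invalid hostname")
    if answers.get("address"):
        try:
            ipaddress.ip_address(answers["address"])
        except ValueError:
            problems.append("invalid address")
    return problems


def add_host(lines: list[str], ask=input) -> list[str]:
    """Ask the questions, check the replies, and return the updated lines."""
    answers = {key: ask(prompt).strip() for key, prompt in QUESTIONS.items()}
    problems = check(answers)
    if problems:
        raise ValueError("; ".join(problems))
    return lines + [f"{answers['address']}\t{answers['hostname']}"]
```

Passing the question function in as the ask parameter keeps the sketch easy to test and lets the same checks sit behind a web form instead of a terminal prompt.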

Another way to strengthen change management is to hold an appropriate set of meetings devoted to it.  A more formalized structure of meetings can help keep the sys admins as a group up to date on things such as

  • what some of them are planning to do

  • when

  • how long they anticipate it will take

  • what can go wrong

  • testing approaches

  • back out processes

  • how long the back out will take to implement

These meetings also keep others informed of proposed changes.  The people who approve, refuse or reschedule proposed changes need to be selected from across the organization.  Drawing the committee this widely helps a larger group of people, not on the committee but with contacts on it, stay aware of what's developing, and gives others an overall view of what is happening at the site.

Having said all this, it is also important to make certain that the process itself is streamlined.  Useful questions to ask include

  • Does the format of the change proposal?

  • Are there paper-based parts of the process that can be done more effectively?

  • Is there a way to make use of default settings if the forms are online?

  • What problems do people who use the system have with the current setup?