9. Configuration and Version Control

Notes by Dr. Tony Cowling.


One very practical issue which arises in the course of developing a large software system is that there will be a large number of documents created, and many of these will need to be changed as the development progresses. Thus, members of the development team who have to refer to these documents will need some way of ensuring that at any given stage they are consulting the correct version of a document (which, depending on why they need to refer to it, may not necessarily be the most up-to-date version). This problem is made worse by the fact that for a large system there may be a number of different configurations of the software which need to be produced, for instance to contain different special features for particular clients, or to run with different hardware or operating systems. Thus, as well as versions created at different times in the history of a system, there may also be different versions in parallel for different configurations of the software, and these need to be managed properly. Furthermore, as a system develops there are likely to be different versions of each configuration of it that are released to the clients at different times: these different versions are often referred to as different releases of the system. Thus, in principle there are three aspects to be managed: the different versions of components, the different configurations and the different releases. In practice, though, these are usually treated as just two separate but related problems: version management and configuration management.


Naming


An important part of both version and configuration management is to have a scheme for naming documents and components of the system which will reflect as far as possible the relationships that will exist between these components (in their different versions) and the different releases and configurations of the whole system. In terms of the configuration this usually implies some form of hierarchical structure that reflects the actual structure of the project, and then within that a sub-hierarchy for components that relate to particular configurations: usually this hierarchy will be defined in terms of the directory structure of the file store. Thus, in the case of the test data for a particular component of the system, one might have a directory containing files of test data that are common to all the configurations of the system, and then sub-directories named after different special configurations that contain the additional test data needed for those versions of that component.
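The test data layout just described can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `collect_test_data` and the idea of passing the configuration name as a string are hypothetical, and a real project would define its own conventions.

```python
from pathlib import Path
from typing import List, Optional


def collect_test_data(component_dir: Path,
                      configuration: Optional[str] = None) -> List[Path]:
    """Return the test data files that apply to one configuration.

    Assumed layout (hypothetical): files directly inside `component_dir`
    are common to all configurations, and a sub-directory named after a
    configuration holds the extra files needed only for that one.
    """
    # Files common to every configuration of the system.
    files = sorted(p for p in component_dir.iterdir() if p.is_file())

    # Additional files for the requested special configuration, if any.
    if configuration is not None:
        config_dir = component_dir / configuration
        if config_dir.is_dir():
            files += sorted(p for p in config_dir.iterdir() if p.is_file())
    return files
```

A configuration with no sub-directory of its own simply gets the common files, which matches the idea that only special configurations need additional data.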


In terms of the version control, the usual basis is that the different releases will be numbered in a linear sequence, although it is common to have a two-level scheme with major and minor releases. The idea of this is that major release numbers will change when there are major changes to the functionality of the system (i.e. as part of adaptive or perfective maintenance), and minor release numbers will change when the new version is produced primarily to cure problems identified with the previous version (i.e. corrective maintenance). It is also common that for each release there will be an alpha version (built for internal testing only), a beta version (built for release to a small number of selected clients for them to test), and then the final version which is released generally to all clients, so that the minor release numbers also need to identify whether this is an alpha, beta or full release of the system.
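One possible way of representing such a two-level scheme is sketched below. The class names, the label format (such as "2.1-beta") and the rule that a new release starts at its alpha stage are all assumptions made for illustration, not part of any standard.

```python
from dataclasses import dataclass
from enum import IntEnum


class Stage(IntEnum):
    """Stages of one release, in the order in which they are built."""
    ALPHA = 0   # internal testing only
    BETA = 1    # selected clients
    FINAL = 2   # general release


@dataclass(frozen=True, order=True)
class Release:
    # Field order matters: comparison is by major, then minor, then stage.
    major: int
    minor: int
    stage: Stage = Stage.FINAL

    def label(self) -> str:
        """Hypothetical label format, e.g. '2.1-beta' or '2.1'."""
        if self.stage is Stage.FINAL:
            return f"{self.major}.{self.minor}"
        return f"{self.major}.{self.minor}-{self.stage.name.lower()}"

    def next_minor(self) -> "Release":
        """Next release made mainly to cure problems (corrective)."""
        return Release(self.major, self.minor + 1, Stage.ALPHA)

    def next_major(self) -> "Release":
        """Next release with major changes to functionality."""
        return Release(self.major + 1, 0, Stage.ALPHA)
```

Because the fields are compared in order, the alpha of release 2.1 sorts before its beta, which sorts before the full release, and all of them sort before anything numbered 2.2.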


The scheme for naming documents (and other components of the system) is then based on this numbering scheme for releases, so that versions of these are given new numbers whenever they need to change in order to match the new release of the system. In practice, though, the naming scheme for documents will also need at least a third level of numbering, since it is likely that the updating of a document to match the development of a new release may happen in several stages, as new versions are inspected and corrected before finally being accepted.


Configuration Management


The first stage in configuration management is simply to define what different configurations of the system are likely to exist, and hence will need to be provided for in the naming scheme. In practice, of course, this is something that is likely to develop over time, and so there also needs to be a mechanism for deciding when additional configurations need to be defined, and adding these to the naming scheme. Both the initial configurations and the mechanism for establishing additional configurations then need to be documented.


Associated with the existence of different configurations for a system will be the problem of ensuring that the correct configurations will be delivered to the correct clients, and that these will have been built using the correct combinations of components. To manage this it is usual to have a configuration database that will maintain (as a minimum) records for each of the configurations (including which clients they are for), and for each of the components (including in particular which version of that component is to be used in which configurations). This database can then be used by software tools which will build the components into complete systems, although in practice this may require some form of mapping to be created between logical components of the system and the actual physical files that contain those components: such mappings can either be defined directly within the database, or by creating separate descriptions of the mappings in some form of module interconnection language, which can then be processed by the tools that build the system. Once established, a configuration database can also be used for collecting other information about configurations and components, including information about faults found and suggestions for improvements.
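As a rough illustration of the minimum content of such a database, the sketch below keeps the records in memory. All of the names here (`Configuration`, `ConfigurationDatabase`, `build_list` and so on) are hypothetical, and a real project would normally hold these records in a proper database rather than in Python dictionaries.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Configuration:
    name: str
    clients: List[str]
    # Logical component name -> version of it used in this configuration.
    component_versions: Dict[str, str] = field(default_factory=dict)


@dataclass
class ConfigurationDatabase:
    configurations: Dict[str, Configuration] = field(default_factory=dict)
    # Mapping from (logical component, version) to the physical file.
    files: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def build_list(self, config_name: str) -> List[str]:
        """Physical files a build tool would need for one configuration."""
        config = self.configurations[config_name]
        return [self.files[(component, version)]
                for component, version in sorted(config.component_versions.items())]

    def configurations_for(self, client: str) -> List[str]:
        """Names of the configurations to be delivered to a client."""
        return sorted(name for name, config in self.configurations.items()
                      if client in config.clients)


# Illustrative data only: the clients, components and paths are invented.
db = ConfigurationDatabase()
db.files[("parser", "1.2")] = "src/parser_v1_2.c"
db.files[("ui", "2.0")] = "src/ui_v2_0.c"
db.configurations["unix"] = Configuration(
    "unix", clients=["Client A"],
    component_versions={"parser": "1.2", "ui": "2.0"})
```

Here the mapping from logical components to physical files is defined directly within the database, which is the first of the two options mentioned above; the alternative would be to describe it separately in a module interconnection language.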


Version Management


The first stage in planning version management is to define which documents are sufficiently important that they need to be subjected to proper version control, and which ones are sufficiently trivial that such control is not necessary. For those documents which are to be controlled, there then needs to be a way of distinguishing between a version which has reached a state where it needs to be controlled, and one which is still only a temporary working version, and this is usually associated with the way in which the quality assurance processes will operate. Thus, once a document has reached a stage where others need to work with it, such as by inspecting it, then it needs to be given a version number which can be used subsequently to refer to it, and which ideally should be related to the version number of the release of the system that will be produced from it. This version number needs to be recorded in some form of database, and typically this will need to be integrated with the configuration database.


Although the document is now recorded in the database, it is not yet in a state where others can use it, and so it needs to be recorded that it is not yet available for use by others. It is usual also to record who is responsible for the work being done on this document, so that queries about its status can be referred to the right person. If the result of this work (e.g. an inspection) means that further changes to the document are required, then when the new version has been completed it should be given a new number, and this should be recorded in the database. Once the document has been inspected in accordance with the procedures laid down in the quality plan, and has passed those inspections, then it can be defined to be a baseline version, which implies that others can take it as a basis on which to work, and so the possibility of any future changes to it must be more strictly controlled.
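The life of a controlled document up to the point where it becomes a baseline can be sketched as a small record with two states. The names used here, and the use of a simple integer for the version number, are assumptions for illustration only.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    UNDER_INSPECTION = "under inspection"  # recorded, not yet usable by others
    BASELINE = "baseline"                  # passed inspection, changes controlled


@dataclass
class DocumentRecord:
    name: str
    owner: str  # person to whom queries about status are referred
    version: int = 1
    status: Status = Status.UNDER_INSPECTION

    def available_to_others(self) -> bool:
        return self.status is Status.BASELINE

    def record_revision(self) -> None:
        """A corrected version has been completed after an inspection."""
        self._require_not_baseline()
        self.version += 1

    def record_inspection_passed(self) -> None:
        """The document passed inspection and becomes a baseline."""
        self._require_not_baseline()
        self.status = Status.BASELINE

    def _require_not_baseline(self) -> None:
        # Once a baseline exists, changes need a separate, stricter route.
        if self.status is Status.BASELINE:
            raise PermissionError(
                "baseline versions change only through change control")
```

Refusing ordinary revisions once the record is a baseline is one simple way of expressing that later changes must be more strictly controlled.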


The usual mechanism for this is that if subsequent development work on a system suggests that changes are needed to a baseline version of a document, then this must be put into the form of a formal change request, which must then be considered by a group within the project management team known as the change control board. This board will ensure that both the reasons given for wanting the change and the likely costs of making it are properly analysed, and will then assess whether or not the change should be made. If it permits the change, then the fact that a new version of the document is being produced will be recorded in the database, and once the new version has been produced and accepted (after inspection) then those who had been using the document as a baseline will need to be informed that a new version of it has been established. Ideally this means that the database should contain enough information about the dependencies between different documents that it can identify which other documents might be affected: otherwise the information about the change has to be broadcast to all members of the development team, so that they can decide whether it affects them.
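If the database does record dependencies, finding the documents that might be affected by a change is a simple graph search. The sketch below assumes (hypothetically) that the dependencies are available as a mapping from each document to the documents that depend directly on it, and that indirect dependents should be reported as well.

```python
from typing import Dict, Set


def affected_documents(dependents: Dict[str, Set[str]], changed: str) -> Set[str]:
    """Return every document that depends, directly or indirectly, on `changed`.

    `dependents[d]` is assumed to hold the documents that use `d` as a basis.
    """
    affected: Set[str] = set()
    to_visit = [changed]
    while to_visit:
        document = to_visit.pop()
        for dependent in dependents.get(document, ()):
            if dependent not in affected:
                affected.add(dependent)
                to_visit.append(dependent)
    # A circular dependency could lead back to the changed document itself.
    affected.discard(changed)
    return affected


# Illustrative data only: the document names are invented.
example = {
    "requirements": {"design"},
    "design": {"parser source", "test plan"},
}
```

With this example data, a change to "requirements" would be reported as possibly affecting "design", "parser source" and "test plan", whereas a change to "test plan" would affect nothing else.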


System Building


Thus, by the time a release of the system is built, it should be the case that baseline versions will exist of all the documents and components, and the release will be built from these, although in practice it may be the case that at the same time as this is being done, new versions of some of the documents will already be under development for some of the components that are intended to go into the next release. This is particularly the case if there are a number of different configurations to be released, as they may not all be built together, and so the earlier ones may have got to beta or even full releases by the time the alpha versions of the later ones are produced.


To ensure that these different releases and configurations can be built correctly, it is essential that the configuration database be able to hold information about which version of a component is used to build which release of a particular configuration, not least so that if it is subsequently necessary to reconstruct an earlier release this can be done. In principle this could mean keeping a large number of versions of a document which only differ in minor details, and so in some commercial systems for maintaining versions of source program code it is common to use the approach that minor changes from a baseline version of a document are stored not as a complete new document, but as a delta: that is, as a record of the changes made from the previous version. Thus, to build the system they actually apply the deltas successively to the baseline in order to produce a temporary file containing the version of the code which is to be compiled, and then this temporary file can be discarded once the system has been built. The Unix SCCS (Source Code Control System) is an example of this approach.
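The idea of rebuilding a version by applying deltas successively to a baseline can be illustrated as follows. This is a sketch of the general principle only: the delta representation used here (a list of line-range replacements computed with Python's `difflib`) is an assumption chosen for simplicity, and is not the file format used by SCCS or by any other particular tool.

```python
from difflib import SequenceMatcher
from typing import List, Sequence, Tuple

# One edit replaces old lines [start:end] with the given new lines.
Edit = Tuple[int, int, List[str]]
Delta = List[Edit]


def make_delta(old: Sequence[str], new: Sequence[str]) -> Delta:
    """Record only the changes needed to turn `old` into `new`."""
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    return [(i1, i2, list(new[j1:j2]))
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]


def apply_delta(old: Sequence[str], delta: Delta) -> List[str]:
    """Apply one delta to the version it was computed from."""
    lines = list(old)
    # Work from the end backwards so earlier line numbers stay valid.
    for start, end, replacement in reversed(delta):
        lines[start:end] = replacement
    return lines


def rebuild(baseline: Sequence[str], deltas: Sequence[Delta]) -> List[str]:
    """Apply the deltas in order to the baseline to obtain a later version."""
    lines = list(baseline)
    for delta in deltas:
        lines = apply_delta(lines, delta)
    return lines


# Illustrative data only.
baseline = ["int main() {", "  return 0;", "}"]
version_2 = ["int main() {", "  init();", "  return 0;", "}"]
version_3 = ["int main() {", "  init();", "  return run();", "}"]
deltas = [make_delta(baseline, version_2), make_delta(version_2, version_3)]
```

Only the baseline and the list of deltas need to be stored; any intermediate version can be reconstructed by applying a prefix of the list, which is what makes it possible to rebuild an earlier release.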