Percentages of a particular order of magnitude are sometimes referred to by the number of nines, or "class of nines", in the digits. For example, electricity that is delivered without interruptions (blackouts, brownouts, or surges) 99.999% of the time would have 5 nines reliability, or class five. In particular, the term is used in connection with mainframes or enterprise computing, often as part of a service-level agreement.
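The arithmetic behind the naming is simple: n nines corresponds to an availability of 1 - 10^-n. The following Python fragment is a minimal sketch, not drawn from any particular source; the function names are invented for the example.

    import math

    def availability_from_nines(n: int) -> float:
        # n nines of availability, e.g. n=5 -> 0.99999 ("five nines")
        return 1.0 - 10.0 ** (-n)

    def nines_from_availability(availability: float) -> float:
        # Approximate class of nines for a given availability fraction.
        return -math.log10(1.0 - availability)

    print(f"{availability_from_nines(5):.5%}")       # 99.99900%
    print(round(nines_from_availability(0.99999)))   # 5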
Uptime and availability can be used synonymously as long as the items being discussed are kept consistent. That is, a system can be up, but its services not available, as in the case of a network outage. This can also be viewed as a system that is available to be worked on, but whose services are not up from a functional perspective (as opposed to a software service/process perspective). The perspective is important here: whether the item being discussed is the server hardware, the server OS, a functional service, a software service/process, etc. Keep the perspective consistent throughout a discussion, and then uptime and availability can be used synonymously.

Availability is usually expressed as a percentage of uptime in a given year. The following table shows the translation from a given availability percentage to the corresponding amount of time a system would be unavailable, presuming that the system is required to operate continuously. Service-level agreements often refer to monthly downtime or availability in order to calculate service credits to match monthly billing cycles.
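The translation is simple arithmetic: allowed downtime equals (1 - availability) multiplied by the length of the period. The sketch below recomputes the usual rows; a 365-day year and a 30-day month are assumed here, and the helper name is invented for the example. Published tables sometimes use slightly different month lengths, so the monthly figures are approximate.

    def allowed_downtime_minutes(availability_pct: float, period_days: float) -> float:
        # Allowed downtime = unavailability fraction x length of the period.
        return (1.0 - availability_pct / 100.0) * period_days * 24 * 60

    for pct in (90.0, 99.0, 99.9, 99.99, 99.999):
        per_year = allowed_downtime_minutes(pct, 365)
        per_month = allowed_downtime_minutes(pct, 30)
        print(f"{pct:7.3f}%  {per_year:9.2f} min/year  {per_month:8.2f} min/month")

    # 99.999% ("five nines") works out to about 5.26 minutes of downtime per
    # year, 99.99% to about 52.56 minutes, and 99.9% to about 8.76 hours.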
A distinction can be made between scheduled and unscheduled downtime. Typically, scheduled downtime is a result of maintenance that is disruptive to system operation and usually cannot be avoided with a currently installed system design. Scheduled downtime events might include patches to system software that require a reboot, or system configuration changes that only take effect upon a reboot. In general, scheduled downtime is usually the result of some logical, management-initiated event. Unscheduled downtime events, by contrast, typically arise from some physical event, such as a hardware or software failure or environmental anomaly. Examples of unscheduled downtime events include power outages, failed CPU or RAM components (or possibly other failed hardware components), an over-temperature-related shutdown, logically or physically severed network connections, security breaches, or various application, middleware, and operating system failures.

If users can be warned away from scheduled downtimes, then the distinction is useful. But if the requirement is for true high availability, then downtime is downtime, whether or not it is scheduled. Many computing sites exclude scheduled downtime from availability calculations, assuming that it has little or no impact upon the computing user community. By doing this, they can claim to have phenomenally high availability, which might give the illusion of continuous availability. Systems that exhibit truly continuous availability are comparatively rare and higher priced, and most have carefully implemented specialty designs that eliminate any single point of failure and allow online hardware, network, operating system, middleware, and application upgrades, patches, and replacements. For certain systems, scheduled downtime does not matter, for example system downtime at an office building after everybody has gone home for the night.

There are three principles of systems design in reliability engineering which can help achieve high availability:

1. Elimination of single points of failure. This means adding or building redundancy into the system so that failure of a component does not mean failure of the entire system.
2. Reliable crossover. In redundant systems, the crossover point itself tends to become a single point of failure. Reliable systems must provide for reliable crossover.
3. Detection of failures as they occur. If the two principles above are observed, then a user may never see a failure – but the maintenance activity must.
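To make the three principles concrete, here is a minimal, hypothetical Python sketch, not taken from any real system; the backend names and health check are stand-ins. A redundant pair of backends removes the single point of failure, a health check provides detection of failures as they occur, and the crossover logic routes requests around the failed component.

    from typing import Callable, Sequence

    Backend = Callable[[], str]

    def call_with_failover(backends: Sequence[Backend],
                           is_healthy: Callable[[Backend], bool]) -> str:
        # Detection: check each redundant backend before using it.
        for backend in backends:
            if is_healthy(backend):
                # Crossover: route the request to the first healthy backend.
                return backend()
        # The crossover itself must be reliable: fail loudly, not silently.
        raise RuntimeError("all redundant backends are down")

    def primary() -> str:
        return "response from primary"

    def secondary() -> str:
        return "response from secondary"

    # Simulate a failed health check on the primary, forcing a crossover:
    print(call_with_failover([primary, secondary],
                             lambda b: b is not primary))  # response from secondary

In a real deployment the health check and the crossover mechanism would themselves be made redundant, for example with multiple load balancers sharing a virtual IP address, since the crossover point otherwise becomes the new single point of failure.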