High Availability Clustering

These days, high availability is in demand. No one wants the services they offer to be unreachable, especially since downtime will almost certainly cost them money. High availability Linux clusters offer a solution to this need to minimize downtime.

This page will point you towards various software projects, tools, documentation, and other related sites that deal with high availability Linux clusters.

Most load sharing and balancing software also offers high availability services, so you might also want to check out our load sharing and balancing clusters page. However, load sharing and balancing systems are much more costly: a truly highly available LVS-style cluster needs a minimum of six systems, while a dedicated, truly highly available cluster needs only three.






High Availability Software

  • HA-OSCAR: HA-OSCAR is based on OSCAR, and aims to add an element of high availability support, in order to enhance a Beowulf cluster system for mission critical grade applications. HA-OSCAR incorporates a self-healing mechanism, failure detection and recovery, and automatic failover/fail-back.

  • The High Availability Linux Project: Linux-HA is the home of "heartbeat," high availability code that can be used by itself but is also used by various load sharing and balancing cluster software.

  • Kimberlite: A Kimberlite cluster provides support for two nodes on a shared SCSI or Fibre Channel storage system, in an active-active failover environment.

  • Lifekeeper: Lifekeeper, by Steeleye, is a High Availability solution for clustered servers, scalable up to 32 nodes. It can use shared storage, as well as synchronous or asynchronous data replication over LAN or WAN. It is available for Linux, x86 Solaris, and Windows.

  • Linux FailSafe: Linux FailSafe is the port of SGI's FailSafe software to the Linux platform.

  • Linux Replicated High Availability Manager: The Linux Replicated High Availability Manager tries to offer a solution for Linux, based on ENBD, similar to what the author had previously developed on the Solaris platform using shared SCSI disks and Veritas Volume Manager. It aims to provide redundancy and flexibility while running on cost-effective hardware.

  • MC/Serviceguard: Multi computer/Serviceguard, by Hewlett Packard, is a High Availability facility that can protect against a variety of hardware and software failures, while ensuring the availability of critical services. It is available for both Linux and HP-UX.

  • Red Hat Cluster Suite: Red Hat Cluster Suite is available as an add-on to the Red Hat Enterprise Linux distribution. Cluster Suite comprises two main parts: Cluster Manager, which provides application failover, and IP Load Balancing (formerly known as Piranha), which load balances IP traffic across a farm of servers.

  • RSF-1: RSF-1, from the folks at High-Availability.com, sits between the storage volume management and application layers, and also has optional monitoring agents for many popular databases, middleware and end-user software. It can maintain availability of services for failures of hardware, network, application or OS.

  • SRRD: SRRD (Service Routing Redundancy Daemon) is a high availability cluster server that can monitor and maintain services as well as shared resources. It is configurable via a web interface, supports PKI and SSL client and server authentication, and offers features such as "Service Groups," "Service Dependencies," and "Critical Services." Cluster nodes can even reside on different physical networks.

  • VERITAS Cluster Server: VERITAS Cluster Server supports up to 16 nodes in its Linux implementation, in both SAN and traditional client/server environments, and offers a variety of high availability support setups.

  • WDX: WDX is a new effort from the Shaman-X project. Its goal is to help build high availability clusters. Currently, WDX is available for beta testing on Linux and comes with a good deal of documentation.
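
Several of the packages above rely on heartbeat-style failure detection followed by failover and fail-back. The following is only a minimal sketch of that idea, not the code or API of any listed project. The names HeartbeatMonitor, choose_owner, node-a, and node-b are hypothetical, and timestamps are passed in explicitly so the logic is easy to follow.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HeartbeatMonitor:
    """Tracks the last heartbeat seen from each peer node (hypothetical helper)."""
    timeout: float  # seconds of silence before a peer is declared dead
    last_seen: Dict[str, float] = field(default_factory=dict)

    def record(self, node: str, now: float) -> None:
        # Called whenever a heartbeat message arrives from `node`.
        self.last_seen[node] = now

    def dead_nodes(self, now: float) -> List[str]:
        # A peer counts as failed once it has been silent longer than the timeout.
        return sorted(n for n, t in self.last_seen.items() if now - t > self.timeout)


def choose_owner(preferred: List[str], dead: List[str]) -> str:
    """Return the first live node in preference order.

    Moving to a later node is failover; returning to an earlier node
    once it is alive again is fail-back.
    """
    for node in preferred:
        if node not in dead:
            return node
    raise RuntimeError("no live node available to run the service")


monitor = HeartbeatMonitor(timeout=3.0)
monitor.record("node-a", now=0.0)
monitor.record("node-b", now=0.0)
monitor.record("node-b", now=4.0)  # node-a has gone silent

dead = monitor.dead_nodes(now=5.0)
print(dead)                                      # ['node-a']
print(choose_owner(["node-a", "node-b"], dead))  # node-b
```

A real deployment would send heartbeats over one or more dedicated links and guard against split-brain; this sketch leaves both out.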


High Availability Related Sites

  • DRBD: DRBD is a block device designed for building high availability clusters. It works by mirroring a whole block device over a (dedicated) network; you could see it as a network RAID 1. This is a very interesting concept. Coupled with some sort of heartbeat and an application that works on top of a block device (such as a filesystem with fsck, a journaling file system, or a database with recovery capabilities), it makes for a nice high availability setup that runs on an IP network with no special hardware.

  • High Availability RAID: From the Software RAID HOWTO

  • IEEE CS Task Force on Cluster Computing High Availability Page: The TFCC-HA page contains some basic information on high availability, as well as links to publications on HA, research projects, and some commercial HA products.

  • Linux Network Address Translation Project: This page provides an assortment of links and information related to NAT from around the Internet.

  • Linux Scalability Project: The primary goal of the Linux Scalability Project is to improve the scalability and robustness of Linux to support greater network server workloads more reliably. Specific interest is geared towards single-system scalability, performance, and reliability of network server infrastructure products, such as LDAP directory servers, IMAP e-mail servers, and web servers, among others.
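
The "network RAID 1" idea in the DRBD entry above can be illustrated with a small in-memory sketch. This is not DRBD's protocol or interface; Replica and MirroredDevice are hypothetical classes, and the "network" is just a second Python object. It shows only the core rule: a write counts as complete once both copies hold it, so the surviving node can take over with current data.

```python
class Replica:
    """In-memory stand-in for one node's block device (hypothetical)."""

    def __init__(self, blocks: int, block_size: int = 512) -> None:
        self.blocks = blocks
        self.block_size = block_size
        self.data = bytearray(blocks * block_size)

    def _offset(self, index: int) -> int:
        if not 0 <= index < self.blocks:
            raise IndexError("block index out of range")
        return index * self.block_size

    def write_block(self, index: int, payload: bytes) -> None:
        if len(payload) != self.block_size:
            raise ValueError("payload must be exactly one block")
        start = self._offset(index)
        self.data[start:start + self.block_size] = payload

    def read_block(self, index: int) -> bytes:
        start = self._offset(index)
        return bytes(self.data[start:start + self.block_size])


class MirroredDevice:
    """Mirrors every write to a primary and a secondary replica."""

    def __init__(self, primary: Replica, secondary: Replica) -> None:
        self.primary = primary
        self.secondary = secondary

    def write_block(self, index: int, payload: bytes) -> None:
        # The write returns only after both copies have been updated.
        self.primary.write_block(index, payload)
        self.secondary.write_block(index, payload)

    def read_block(self, index: int) -> bytes:
        # Reads are served from the current primary.
        return self.primary.read_block(index)

    def fail_over(self) -> None:
        # Promote the secondary, e.g. after a heartbeat timeout on the primary.
        self.primary, self.secondary = self.secondary, self.primary


device = MirroredDevice(Replica(blocks=8), Replica(blocks=8))
block = b"x" * 512
device.write_block(3, block)
device.fail_over()
print(device.read_block(3) == block)  # True
```

The application on top (a journaling file system or a database with recovery) is still responsible for making its own data consistent after the switch.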





    This site maintained by Joe Greenseid
    Direct questions or comments to webmaster@lcic.org