Oracle RAC: what it is and how it works
03/31/2008
One of the new features included in the Oracle software is the ability to create database clusters, known as Real Application Clusters (RAC). According to Oracle, this feature provides high availability, performance and scalability. It also includes Transparent Application Failover (TAF), which lets deployed applications keep sending their requests through the Oracle cluster without needing to know when one of the cluster nodes has been disconnected.
How does it work?
Conceptually, the Oracle RAC architecture is shown in the following diagram:
Database requests are generated by the application (for instance, from a database connection pool configured on the application server), and Oracle RAC is in charge of redirecting these requests to a working server. Note that there is no load balancing in this configuration, so the configuration shown is plain failover: all incoming requests reach Node 1, and only if it stops working or is disconnected are all requests redirected to Node 2.
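The plain failover behaviour described above can be sketched in a few lines. This is only an illustration under stated assumptions: `Node`, `NodeDown` and `send_with_failover` are hypothetical names, and the nodes are simulated in memory rather than reached over the network.

```python
class NodeDown(Exception):
    """Raised when a (simulated) cluster node cannot serve a request."""


class Node:
    # Hypothetical stand-in for a cluster node; a real client would
    # open a network connection to a listener instead.
    def __init__(self, name, up=True):
        self.name = name
        self.up = up

    def execute(self, request):
        if not self.up:
            raise NodeDown(self.name)
        return f"{self.name} handled {request}"


def send_with_failover(nodes, request):
    """Try each node in priority order and return the first successful reply."""
    for node in nodes:
        try:
            return node.execute(request)
        except NodeDown:
            continue  # fall through to the next (backup) node
    raise NodeDown("no node available")


node1, node2 = Node("node1"), Node("node2")
cluster = [node1, node2]  # node1 is the primary, node2 the backup

print(send_with_failover(cluster, "SELECT 1"))  # served by node1
node1.up = False                                # simulate a disconnection
print(send_with_failover(cluster, "SELECT 1"))  # served by node2
```

Because the list is always walked from the start, Node 2 only receives traffic while Node 1 is down, which is why this arrangement gives failover but no load balancing.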
However, we need to look at how Oracle RAC really works. This is easier to explain with a UML component diagram:
In fact, the Oracle client or service has Node 1 configured as the primary connection; if the request cannot be resolved on that server, the service redirects it to the backup server (in this case, Node 2). It is also important to mention that what runs on nodes 1 and 2 is the database engine (an instance with its listener), not the database itself: the information (that is, the files that make up the database) is located on a disk array with a mirror configuration to provide redundancy and, therefore, high availability. In summary:
However, RAC has two very important limitations that must be taken into account:
Once an Oracle RAC node is disconnected, whether because of a hardware or network failure or because of excessive resource demand, all transactions must be automatically redirected to the backup node. The point of this scheme is not to lose the requests that were in flight at the time of the disconnection.
The problem is that, according to Oracle (see here), using the TAF features requires the Oracle Call Interface (OCI) API. In short, we need to install an Oracle client on the application server; a simple database connection driver, such as the Java thin client (the JDBC Thin driver), is not enough.
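To make the OCI requirement more concrete: TAF is normally requested through the `FAILOVER_MODE` clause of an Oracle Net connect descriptor, which an OCI-based client reads (typically from `tnsnames.ora`). The sketch below only assembles such a descriptor as a string; the helper name `taf_descriptor`, the host names, the service name and the retry values are assumptions chosen for illustration.

```python
def taf_descriptor(hosts, service, port=1521, retries=20, delay=5):
    """Build an Oracle Net connect descriptor with connect-time failover and TAF."""
    addresses = "".join(
        f"(ADDRESS=(PROTOCOL=TCP)(HOST={host})(PORT={port}))" for host in hosts
    )
    return (
        "(DESCRIPTION="
        # FAILOVER=ON: try the next address if the first one is unreachable.
        # LOAD_BALANCE=OFF: always start with the first address (plain failover).
        f"(ADDRESS_LIST=(LOAD_BALANCE=OFF)(FAILOVER=ON){addresses})"
        f"(CONNECT_DATA=(SERVICE_NAME={service})"
        # TYPE=SELECT asks TAF to resume open queries after a failover.
        f"(FAILOVER_MODE=(TYPE=SELECT)(METHOD=BASIC)"
        f"(RETRIES={retries})(DELAY={delay}))))"
    )


# Hypothetical host and service names.
print(taf_descriptor(["node1", "node2"], "orcl"))
```

A Java application would then have to connect through the OCI-based JDBC driver (a URL of the form `jdbc:oracle:oci:@alias`) rather than the thin driver for this clause to take effect.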
Also, at least in Oracle 10g, TAF performs transparent failover only for SELECT queries. All other operation types automatically raise an error:
Therefore, we reach the following conclusion:
This limits the viability of Oracle RAC from a cost-benefit standpoint, since it is priced separately from the database engine (see here).
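Because of the SELECT-only limitation described above, an application that writes data has to deal with a failover itself, usually by rolling back and replaying the whole transaction. The following is a minimal sketch of that pattern, not Oracle's API: `FailoverError`, `FakeSession` and `run_transaction` are hypothetical names, and the session is simulated so that the example runs without a database.

```python
class FailoverError(Exception):
    """Hypothetical error raised when a node is lost in the middle of a transaction."""


class FakeSession:
    # Simulated session: the first `failures` commits fail as if the node had dropped.
    def __init__(self, failures):
        self.failures = failures
        self.pending = []
        self.committed = []

    def execute(self, statement):
        self.pending.append(statement)

    def commit(self):
        if self.failures > 0:
            self.failures -= 1
            raise FailoverError("node lost during transaction")
        self.committed.extend(self.pending)
        self.pending = []

    def rollback(self):
        self.pending = []


def run_transaction(session, statements, max_attempts=3):
    """Run all statements as one unit, replaying the whole unit after a failover."""
    for attempt in range(1, max_attempts + 1):
        try:
            for statement in statements:
                session.execute(statement)
            session.commit()
            return attempt  # number of attempts that were needed
        except FailoverError:
            session.rollback()  # discard the partial work before replaying
    raise FailoverError(f"gave up after {max_attempts} attempts")


session = FakeSession(failures=1)
attempts = run_transaction(session, ["INSERT ...", "UPDATE ..."])
print(attempts, session.committed)
```

Replaying is only safe when the statements can be executed again without side effects outside the transaction, so the retry boundary should match the transaction boundary.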
Improving Oracle RAC
However, not all is lost. Both issues (load balancing and transparent failover) can be solved with a software or hardware load-balanced cluster. The scalability of such a solution depends mostly on the available budget, but it supports the decision to implement Oracle RAC for database high availability.
The conceptual diagram is shown in the following image:
The balancer is in charge of load balancing; it can be either a software component (for instance, a web server with round-robin balancing such as Apache) or a hardware component (such as an F5 switch), configured so as to allow:
Oracle RAC is a component that provides high availability to our back end by allowing multiple instances of the database engine, each with its listener, to be deployed over a single storage unit. This in turn gives us, with the help of a balancing component, high availability and scalability for the services offered by the database.
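The role of the balancing component can be illustrated with a small round-robin sketch that also skips nodes known to be down. It is an assumption-laden toy, not the behaviour of Apache or of an F5 device: `RoundRobinBalancer` and its methods are hypothetical names, and node health is set by hand instead of being probed.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out nodes in rotation, skipping any node marked as down."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.down = set()
        self._ring = cycle(self.nodes)

    def mark_down(self, node):
        self.down.add(node)

    def mark_up(self, node):
        self.down.discard(node)

    def next_node(self):
        # One full turn of the ring is enough to find a healthy node, if any.
        for _ in range(len(self.nodes)):
            node = next(self._ring)
            if node not in self.down:
                return node
        raise RuntimeError("no healthy node available")


balancer = RoundRobinBalancer(["node1", "node2"])
print([balancer.next_node() for _ in range(4)])  # alternates between both nodes
balancer.mark_down("node1")                      # simulate a failed node
print([balancer.next_node() for _ in range(3)])  # only node2 is returned
```

Rotation spreads requests across the nodes (scalability), while skipping unhealthy nodes keeps the service reachable when one of them fails (availability).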
Update (12/12/2007)
The latest version of Oracle RAC (for Oracle 10g Release 2) includes a major overhaul of the technology behind the solution. As a result, we now have two features that were notably absent from the previous version:
As a side note, load balancing is done by RAC itself as long as the only content on the Oracle RAC nodes is the database itself; that is, RAC does not work properly if the filesystems of those nodes hold anything else, such as external log files or additional information that must be stored and synchronized. In that case, we must fall back on an additional balancing component.