Basics of SQL Server Clustering
If your AAA-critical SQL Server suffers a memory board failure, how
long will the outage last? How much will it cost your business in
lost productivity and data availability? Being a SQL Server DBA can
be demanding and stressful, because the success of your application
often depends on database uptime. As DBAs, we have some control over
the uptime of our SQL Servers, but many factors are outside our
control. There is not much a DBA can do if the motherboard fails on a
server. As you may already be aware, one way to boost your SQL
Server’s uptime is to cluster your SQL Servers. Then, should one SQL
Server in the cluster fail, another clustered server will
automatically take over, keeping downtime to minutes instead of
hours or more.
Clustering can best be described as a technology that automatically
allows one physical server to take over the tasks and
responsibilities of another physical server that has failed. The
obvious goal, given that all computer hardware and software will
eventually fail, is to ensure that users running AAA applications
experience little or no downtime when such a failure occurs.
Downtime can be very expensive, and our goal as DBAs is to reduce it
as much as possible.
More specifically, clustering refers to a group of two or more
servers, also called nodes, that work together and present
themselves to the network as a single virtual server. In other
words, when a client connects to clustered SQL Servers, it sees only
a single SQL Server, not several. When one of the nodes fails, its
responsibilities are taken over by another server in the cluster,
and the end user notices little, if any, difference before, during,
and after the failover.
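The idea of a single virtual server can be sketched with a toy model
in Python. This is only an illustration, not how the cluster service
is implemented: the class, the virtual name SQLVIRTUAL1, and the node
names are hypothetical, and the model assumes that ownership simply
moves to the first surviving node.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class FailoverCluster:
    """Toy model: clients address the virtual name, never a physical node."""

    virtual_name: str   # hypothetical network name that clients connect to
    nodes: list[str]    # physical nodes that are still healthy
    owner: str          # node currently running the SQL Server instance

    def connection_target(self) -> str:
        # Clients use the virtual name regardless of which node owns the instance.
        return self.virtual_name

    def fail_node(self, failed: str) -> None:
        # Remove the failed node; if it owned the instance, pick a survivor.
        self.nodes = [n for n in self.nodes if n != failed]
        if not self.nodes:
            raise RuntimeError("no surviving node to take over")
        if self.owner == failed:
            self.owner = self.nodes[0]  # assumption: first survivor takes over


cluster = FailoverCluster("SQLVIRTUAL1", ["NodeX", "NodeY"], owner="NodeX")
before = cluster.connection_target()
cluster.fail_node("NodeX")
after = cluster.connection_target()
print(before, after, cluster.owner)
```

The name the client uses is the same before and after the failover;
only the owning node changes.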
One very important aspect of clustering that often gets overlooked
is that it is not a complete backup system for your databases. It is
only one part of a multi-part strategy required to ensure minimum
downtime and 100% recoverability.
The main benefit that clustering provides is the ability to recover
from failed server hardware (excluding the shared disk) and from
failed software, such as failed services or a server lockup. It is
not designed to protect data, to protect against failure of the
shared disk array, to prevent hack attacks, to protect against
network failure, or to protect SQL Server from other potential
disasters, such as power outages.

Clustering is just one part of an overall strategy needed to reduce
SQL Server downtime. You will also need a shared disk array that
offers redundancy, and you will need to make tape backups. So don’t
think that clustering is all you need to create a highly available
SQL Server system. It is just one part of it.
Types of SQL Server Clustering
Once you decide to go with a clustered SQL Server, you have to
choose the cluster layout. This choice is extremely important for
architecting the clustering environment, and it should be based on
your application and business needs. Let’s look at the configuration
types.
Active / Passive
An Active/Passive, or Single Instance, cluster refers to a scenario
where only one instance of SQL Server runs on one of the physical
nodes in the cluster, and the other physical node does nothing other
than wait to take over should the primary node fail or should a
manual failover be performed for maintenance. From a performance
perspective, this is the better solution, because a failed-over
instance does not have to share hardware with another instance. On
the other hand, this option makes less productive use of your
physical hardware, which makes it more expensive.
If an active node fails and there is a passive node available,
applications and services running on the failed node can be
transferred to the passive node. Since the passive node has no
current workload, the server should be able to assume the workload
of the failed server without any problems (assuming the hardware of
the nodes is the same).
2-Node Clustering
Active / Passive Scenario
In this case, let's look at a two-node example, Node X and Node Y.
Node X is configured as the Active Node: it is the primary owner of
the SQL Server instance and has that instance running on it. As you
can see in the case below, Node Y is in passive, or standby, mode,
doing nothing. The active node communicates and works with the
shared disks.

2-Node Clustering
Active / Passive Failover Scenario
When a failover occurs on Node X, SQL Server Instance A is
transferred, with all its running processes, connections, and
responsibilities, to Passive Node Y, and Node Y becomes the Active
Node. As you can see, even after the failover, the active node
communicates and works with the shared disks as usual; there is no
change.

4-Node Clustering
Active / Passive Scenario
In this case, let's look at an example of four nodes: Node X, Node
Y, Node XX, and Node YY. Node X is configured as an Active Node, the
primary owner of SQL Server Instance A, and Node XX is also an
Active Node, the primary owner of SQL Server Instance AA. As you can
see in the case below, Nodes Y and YY are in passive, or standby,
mode, doing nothing.

4-Node Clustering
Active / Passive Failover Scenario
When failover occurs on Node X, SQL Server Instance A will get
transferred with all its running processes, connections, and
responsibilities to Passive Node Y, and now Node Y will be an Active
Node. When failover occurs on Node XX, SQL Server Instance AA will
get transferred with all its running processes, connections, and
responsibilities to Passive Node YY, and now Node YY will be an
Active Node.

Active / Active
An Active/Active SQL Server cluster means that a separate SQL Server
instance runs on each node of a two-way cluster. Each SQL Server
acts independently, and users see two different SQL Server
instances. If one of the SQL Servers in the cluster should fail, the
failed instance of SQL Server fails over to the remaining server.
This means that both instances of SQL Server then run on one
physical server instead of two. As you can imagine, if two instances
have to run on one physical server, performance can be affected,
especially if the servers have not been sized appropriately.
Remember that the two separate SQL Server instances in this
configuration are entirely isolated entities by default.
If all servers in a cluster are active and a node fails, the
applications and services running on the failed node can be
transferred to another active node. Since that server is already
active, it will have to handle the processing load of both systems.
The server must be sized to handle multiple workloads, or it may
fail as well.
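The sizing concern above can be checked with simple arithmetic. The
following Python sketch is only an illustration: the resource names,
the capacity figures, and the per-instance demands are made-up
numbers, and real sizing would need measured workloads.

```python
def combined_demand(a: dict, b: dict) -> dict:
    """Add up the per-resource demand of two instances sharing one node."""
    return {resource: a[resource] + b[resource] for resource in a}


def fits(demand: dict, capacity: dict) -> bool:
    """True if every resource demand is within the node's capacity."""
    return all(demand[resource] <= capacity[resource] for resource in capacity)


# Hypothetical figures, for illustration only.
node_capacity = {"cpu_cores": 8, "memory_gb": 32}
instance_a = {"cpu_cores": 3, "memory_gb": 12}
instance_b = {"cpu_cores": 4, "memory_gb": 16}
instance_c = {"cpu_cores": 6, "memory_gb": 24}

# A and B together need 7 cores and 28 GB, which fits on one node.
ok = fits(combined_demand(instance_a, instance_b), node_capacity)
# A and C together need 9 cores and 36 GB, which does not fit.
too_big = fits(combined_demand(instance_a, instance_c), node_capacity)
print(ok, too_big)
```

If the check fails for any pair of instances that could end up on
the same node, that node is undersized for an Active/Active layout.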
2-Node Clustering
Active / Active Scenario
In this case, let's look at an example of two nodes, Node X and Node
Y. Node X and Node Y are both configured as Active Nodes, the
primary owners of SQL Server Instances A and B, respectively. As you
can see below, Node X and Node Y are both active, and each is
running an instance of SQL Server.

2-Node Clustering
Active / Active Failover Scenario
When failover occurs on Node X, SQL Server Instance A will be
transferred with all its running processes, connections, and
responsibilities to Active Node Y, and Node Y will have to share all
its memory, CPU, and network resources between Instances A and B.

4-Node Clustering
Active / Active Scenario
In the 4-node configuration illustrated below, nodes X, Y, XX, and
YY are all configured as active, and failover can occur between
nodes X and Y or between nodes XX and YY. This could mean
configuring the servers so that they use about 25% of CPU and memory
resources under average workload. In this example, node X could fail
over to Y, or node XX could fail over to YY.

4-Node Clustering
Active / Active Failover Scenario
When failover occurs on Node Y, SQL Server Instance B will be
transferred with all its running processes, connections, and
responsibilities to Active Node X, and now Node X will have two
instances, A and B, sharing all the resources. When failover occurs
on Node YY, SQL Server Instance BB will be transferred with all its
running processes, connections, and responsibilities to Active Node
XX, and now Node XX will have two instances, AA and BB, sharing all
the resources.
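The 4-node failover scenario can be traced with a short Python
sketch. It is a simplified model under stated assumptions: each node
fails over only to its fixed partner, every instance uses 25% of a
node under average workload (the figure mentioned for this layout),
and the function names are hypothetical.

```python
# Failover partners and initial placement from the 4-node scenario.
partner = {"X": "Y", "Y": "X", "XX": "YY", "YY": "XX"}
hosted = {"X": ["A"], "Y": ["B"], "XX": ["AA"], "YY": ["BB"]}
# Assumed average utilization per instance, in percent of one node.
avg_util = {"A": 25, "B": 25, "AA": 25, "BB": 25}


def fail_node(hosted: dict, partner: dict, failed: str) -> dict:
    """Return a new placement with the failed node's instances on its partner.

    Raises KeyError if the partner has already failed (not handled here).
    """
    result = {node: list(insts) for node, insts in hosted.items() if node != failed}
    result[partner[failed]].extend(hosted[failed])
    return result


def utilization(hosted: dict, avg_util: dict) -> dict:
    """Sum the assumed average utilization of the instances on each node."""
    return {node: sum(avg_util[i] for i in insts) for node, insts in hosted.items()}


state = fail_node(hosted, partner, "Y")    # Instance B moves to Node X
state = fail_node(state, partner, "YY")    # Instance BB moves to Node XX
util = utilization(state, avg_util)
print(state, util)
```

Under these assumptions, each surviving node runs two instances at a
combined 50% average utilization, which leaves headroom for peaks.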

In a multi-node configuration where there are more active nodes than
passive nodes, the servers can be configured so that under average
workload they use a proportional percentage of CPU and memory
resources.
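One possible reading of "proportional" is to size each node for the
largest number of instances that could ever land on it. The Python
sketch below is an interpretation rather than a documented rule: it
assumes every instance has the same average workload, and the
function name is hypothetical.

```python
def max_average_utilization(worst_case_instances: int) -> float:
    """Percent of a node each instance may use under average workload.

    worst_case_instances is the largest number of instances that could
    run on a single node after failovers (assumed equal workloads).
    """
    if worst_case_instances < 1:
        raise ValueError("a node hosts at least one instance")
    return 100 / worst_case_instances


# Two-node Active/Active: one node may end up hosting both instances.
two_node = max_average_utilization(2)
# Four active nodes where a single node could end up hosting all four.
four_node = max_average_utilization(4)
print(two_node, four_node)
```

With four equally loaded instances that could all end up on a single
node, this gives 25% per instance, and 50% for the two-node case.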
An Active/Active configuration can have a multiple-instance cluster
setup, which can support up to 16 SQL Server instances. Windows NT
Server 4.0 Enterprise Edition, Windows 2000 Advanced Server, and
Windows 2003 Advanced Server all support two-node clustering.
Windows 2000 Datacenter Server supports up to four-node clustering,
and Windows 2003 supports up to eight-node clustering. However, you
are limited to four nodes if SQL Server 2000 clustering is to be
used.
SQL Server, in a clustered environment, also behaves differently
from a stand-alone named instance in relation to IP ports. During
the installation process, a dynamic port that may be something other
than 1433 is configured, and that port number is reserved for the
instance. In a failover cluster, multiple instances can be
configured to share the same port, such as 1433, because the
failover cluster listens only to the IP address assigned to the SQL
Server virtual server, and is not limited to a 1:1 ratio. However,
for security and potentially increased availability, you may want to
assign each virtual server to its own unique port of your choice, or
leave it as it was configured during installation.
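The reason several virtual servers can share port 1433 is that each
one listens on its own IP address, so the combination of address and
port stays unique. The Python sketch below checks a planned layout
for clashes. The virtual server names and the addresses (taken from
the 192.0.2.0/24 documentation range) are hypothetical, and the
check is a planning aid, not a query against a real cluster.

```python
def find_conflicts(endpoints: list) -> list:
    """Return pairs of virtual servers that share the same (ip, port)."""
    seen = {}
    conflicts = []
    for name, ip, port in endpoints:
        key = (ip, port)
        if key in seen:
            conflicts.append((seen[key], name, key))
        else:
            seen[key] = name
    return conflicts


# Same port on different virtual IP addresses: no clash.
shared_port = [
    ("VSQL1", "192.0.2.10", 1433),
    ("VSQL2", "192.0.2.11", 1433),
]
# Same port on the same IP address: the two endpoints collide.
clashing = [
    ("VSQL1", "192.0.2.10", 1433),
    ("VSQL3", "192.0.2.10", 1433),
]

no_conflicts = find_conflicts(shared_port)
conflicts = find_conflicts(clashing)
print(no_conflicts, conflicts)
```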
Scaling Clustering Resources
