The Best of Both Worlds
ScaleDB delivers the best of both worlds: the benefits of both shared-disk and shared-nothing databases. We started with a shared-disk database to get high availability and elasticity, and then designed it to deliver performance like a shared-nothing database.
The Performance Challenges
1. Data contention
2. Sharing data via the disk
3. Maximize local cache
4. Network traffic for messaging
5. Network traffic for large chunks of data
Data Contention
When two or more database nodes in a cluster want access to the
same data, that data must be moved between the nodes. This is
much slower than keeping that data in a single node.
If the data stays in a single node of the cluster, then it: (a) avoids nodes having to wait for the node holding that data to finish its work; (b) eliminates the delay of moving the data between nodes; and (c) enables each node to operate on the data in cache, instead of constantly writing the data out to facilitate sharing. Each of these improves performance significantly.
To achieve these objectives, you need locality. Locality means that database requests are sent to the specific database nodes in the cluster that specialize in the requested data. In effect, certain nodes in the cluster specialize in certain data. Unlike a shared-nothing database, where each node is hard-coded to work only with its own portion of the database, a shared-disk database, even with locality, retains the flexibility to add or remove nodes and to change which nodes handle which data on the fly. In short, ScaleDB retains the flexibility of a shared-disk database but, with locality, delivers performance that rivals shared-nothing databases.
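The idea of locality with on-the-fly reassignment can be illustrated with a minimal sketch. It is not ScaleDB's implementation: the `LocalityRouter` class, the key-range scheme, and the node names are all hypothetical, chosen only to show requests being steered to a specialist node while the mapping stays changeable.

```python
import bisect


class LocalityRouter:
    """Hypothetical router: sends each key to the node that specializes in its range."""

    def __init__(self, boundaries, nodes):
        # boundaries[i] is the exclusive upper bound of the range served by nodes[i];
        # the last node serves every key at or above the final boundary.
        if len(nodes) != len(boundaries) + 1:
            raise ValueError("need exactly one more node than boundaries")
        self.boundaries = list(boundaries)
        self.nodes = list(nodes)

    def node_for(self, key):
        # Binary search for the range that contains the key.
        return self.nodes[bisect.bisect_right(self.boundaries, key)]

    def reassign(self, range_index, node):
        # Because every node can reach all of the data, changing which node
        # specializes in a range is only a routing change, not a data migration.
        self.nodes[range_index] = node


router = LocalityRouter([1000, 2000], ["node-a", "node-b", "node-c"])
first_owner = router.node_for(42)      # "node-a"
router.reassign(0, "node-d")           # hand the first range to a new node
second_owner = router.node_for(42)     # "node-d"
```

In a shared-nothing design, the equivalent of `reassign` would also require moving the underlying data to the new node; here only the routing table changes.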
Sharing Data Via the Disk
Data locality, as described above, dramatically reduces the sharing of data between nodes, but it doesn’t always eliminate it. Since the disk is the slowest component in the computer, you want the database to use it as little as possible and to rely instead on cache, which is much faster than disk. ScaleDB has implemented a shared cache, which dramatically reduces or eliminates the need to share data via the disk. This is done through our Cache Accelerator Server (CAS).
Of course, sharing via cache can be risky, because data in the cache is lost in a power failure. We address this by mirroring the data to two or more Cache Accelerator Servers. As a result, the data is stored in two or more caches, so the failure of one server will not result in loss of data. This mirroring also allows ScaleDB to run in an optional mode in which data from the nodes is written to the mirrored cache and then flushed to disk outside of the transaction. Using this approach, the entire database runs at cache speed.
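The mirrored-cache mode described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the CAS protocol: `CacheServer`, `MirroredCache`, and the rule that a write completes once every mirror holds the value are all hypothetical, and a plain dict stands in for the disk.

```python
class CacheServer:
    """Hypothetical stand-in for one cache server: an in-memory key/value store."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True

    def put(self, key, value):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        self.data[key] = value

    def get(self, key):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return self.data[key]


class MirroredCache:
    """Writes go to every mirror; the disk is updated later, outside the write path."""

    def __init__(self, mirrors, disk):
        if len(mirrors) < 2:
            raise ValueError("mirroring needs at least two cache servers")
        self.mirrors = mirrors
        self.disk = disk      # a dict standing in for durable storage
        self.dirty = set()    # keys held in cache but not yet flushed to disk

    def write(self, key, value):
        # Assumption: a write completes once all mirrors hold the value.
        for mirror in self.mirrors:
            mirror.put(key, value)
        self.dirty.add(key)

    def read(self, key):
        # Any surviving mirror can serve the read.
        for mirror in self.mirrors:
            try:
                return mirror.get(key)
            except ConnectionError:
                continue
        raise ConnectionError("no cache server is reachable")

    def flush(self):
        # Deferred flush: copy dirty keys to disk, for example from a background task.
        for key in list(self.dirty):
            self.disk[key] = self.read(key)
        self.dirty.clear()
```

In this sketch, `write` never touches the disk, so the write path runs at cache speed, while losing a single `CacheServer` still leaves a copy that `read` and `flush` can use.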
Maximize Local Cache
ScaleDB is designed so that each database node in the cluster fully utilizes its local cache. If one node changes data that is also held in another node’s cache, that other node is alerted to invalidate its copy. As a result, when that node next goes to retrieve the data, it is no longer in its local cache, so it must go to the CAS to get the updated data. This maintains cache coherency while still using the local cache on each node.
With data locality, as described in the "Data Contention" section above, each database node in the cluster develops a cache that is specialized for the data set it is addressing at the time. This results in a much higher cache hit ratio, which in turn results in much faster performance. In essence, each node in the cluster has a local cache optimized for the data it is serving at the time, but if another node modifies that data, ScaleDB alerts the other nodes accordingly.
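The invalidate-on-write behavior can be illustrated with a small sketch. It assumes, purely for illustration, that the shared cache tracks which nodes hold a local copy of each key and notifies them directly; the text does not specify how ScaleDB delivers these alerts, and the `SharedCache` and `Node` classes are hypothetical.

```python
class SharedCache:
    """Hypothetical shared cache tier that remembers which nodes cache each key."""

    def __init__(self):
        self.data = {}
        self.holders = {}   # key -> set of nodes holding a local copy

    def read(self, key, node):
        self.holders.setdefault(key, set()).add(node)
        return self.data[key]

    def write(self, key, value, writer):
        self.data[key] = value
        # Alert every other holder to drop its now-stale copy.
        for node in self.holders.get(key, set()) - {writer}:
            node.invalidate(key)
        self.holders[key] = {writer}


class Node:
    """Hypothetical database node with its own local cache."""

    def __init__(self, name, shared):
        self.name = name
        self.shared = shared
        self.local = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.local:
            self.hits += 1
            return self.local[key]
        # Local miss: fetch the current value from the shared cache.
        self.misses += 1
        value = self.shared.read(key, self)
        self.local[key] = value
        return value

    def put(self, key, value):
        self.local[key] = value
        self.shared.write(key, value, self)

    def invalidate(self, key):
        self.local.pop(key, None)
```

A node that keeps reading data nobody else modifies stays on the `hits` path; only a write by another node forces it back to the shared cache, which is why locality and a high hit ratio go together.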
Network Traffic for Messaging
Shared-disk databases rely on messaging between nodes and a distributed lock manager to coordinate between the nodes. If multiple nodes request the same data, those requests must be scheduled so that they don’t conflict. This process generates messaging traffic that reduces overall database performance.
Data locality reduces the amount of data being shared between nodes, but the cluster still needs to make sure that two nodes aren’t locking the same data at the same time. ScaleDB has implemented a solution that enables the nodes to operate in a safe and efficient manner while dramatically reducing or eliminating synchronous messaging with the distributed lock manager. Without synchronous messaging, the nodes operate almost totally independently of each other, enabling them to perform like a shared-nothing database.
Network Traffic for Large Chunks of Data
When a shared-nothing database requests a large chunk of data, it travels from the local hard drive via the local bus. When a shared-disk system requests a large chunk of data that is not found in the local cache, the data must come from the CAS and travel over the network. Since the network, or interconnect, is slower than the local bus, this approach incurs a performance hit.
The ideal solution is to process the data at the CAS and then ship only the result set over the network. Because the result set is typically much smaller than the entire data set (e.g., a table), this approach delivers superior performance.
For example, if you wanted to find out how many blue pants were sold in May across all stores, you would otherwise need to send all sales data for all stores, plus the product catalog, to a single node to process. It is much faster if the CAS can simply process that request locally and send only the results back to that database node. This is what Oracle RAC® achieves with Exadata®. ScaleDB does this using the CAS, at a small fraction of the cost of Oracle.
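The blue-pants example can be made concrete with a toy comparison of how much data crosses the network under each approach. The data, the `Sale` type, and both functions are invented for illustration; this is not the CAS query interface, and "rows shipped" is only a rough proxy for network traffic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sale:
    store: str
    product_id: int
    month: str
    quantity: int


# Made-up catalog and sales data held at the storage tier.
CATALOG = {1: ("pants", "blue"), 2: ("pants", "black"), 3: ("shirt", "blue")}
SALES = [
    Sale("north", 1, "May", 3),
    Sale("south", 1, "May", 2),
    Sale("north", 2, "May", 4),
    Sale("south", 3, "May", 1),
    Sale("north", 1, "June", 5),
]


def count_blue_pants(sales, catalog, month):
    """Join sales to the catalog and total the blue pants sold in one month."""
    return sum(
        s.quantity
        for s in sales
        if s.month == month and catalog[s.product_id] == ("pants", "blue")
    )


def process_at_node(sales, catalog):
    # Every sales row and catalog row crosses the network before any filtering.
    rows_shipped = len(sales) + len(catalog)
    return count_blue_pants(sales, catalog, "May"), rows_shipped


def process_at_storage(sales, catalog):
    # The query runs where the data lives; only the one-row result is shipped.
    return count_blue_pants(sales, catalog, "May"), 1


node_result = process_at_node(SALES, CATALOG)
storage_result = process_at_storage(SALES, CATALOG)
```

Both functions return the same answer; they differ only in the second element of the tuple, which grows with table size in the first case and stays constant in the second.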
Conclusion
By eliminating partitioning (or sharding) and repartitioning, master-slave replication, slave promotion, the implementation of joins in the application layer, and much more, shared-disk databases significantly simplify database set-up, maintenance and tuning, thereby reducing costs. They also deliver high availability and elasticity, two benefits that are particularly important in the cloud. Historically, these advantages came at the cost of reduced performance and very high license fees. As described above, ScaleDB has addressed the traditional performance penalty of shared-disk databases, while delivering all of this for dramatically less than that other shared-disk database.