Couchbase, Key-Value Storage, Data Modelling & Transactions
Introduction To Couchbase
Couchbase is a NoSQL, distributed, multi-model database designed to handle large volumes of data
As it is multi-model, Couchbase allows users to store and retrieve data in different formats. It supports
the following models:
Document Model: Data is stored as JSON documents, allowing for varying
structures within each document
Key-Value Model: Simple data storage and retrieval using key-value pairs
which is ideal for caching and quick lookups
Query Model: Couchbase offers a powerful query language called N1QL, enabling
SQL-like queries on JSON documents for structured data handling
Geospatial Model: The database supports geospatial data, facilitating storage
and querying of location-based information
It supports real-time data processing with its in-memory caching feature, accelerating application
performance and reducing latency for frequently accessed data
Couchbase offers high availability and fault tolerance through its distributed architecture, which includes
automatic data replication and sharding, so data stays accessible and consistent even when a node fails
Below is a description of what sharding is, how it works and the problems it solves:
Sharding involves dividing the data into smaller, manageable subsets called shards
Each shard contains a subset of the dataset and the shards are distributed across
multiple servers or nodes in a cluster where each node is responsible for handling
one or more shards
By distributing data across multiple nodes, the database can distribute the workload
more evenly, preventing any single node from becoming a bottleneck when dealing with
a large volume of data or requests
Sharding enables horizontal scalability as you can add more nodes to the cluster as
data and traffic grow
Sharding also provides a level of fault tolerance: if one node or shard goes
offline, the system can continue to function by redirecting requests to the
shards that are still available
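The points above can be sketched in a few lines of Python. This is a minimal illustration, not how Couchbase is implemented: the shard count, the node names, the round-robin assignment and the single replica per shard are all assumptions chosen to keep the example small.

```python
from collections import Counter

NUM_SHARDS = 12                          # hypothetical shard count
NODES = ["node-a", "node-b", "node-c"]   # hypothetical node names


def assign_shards(nodes):
    """Give every shard an active node and a replica on the next node."""
    layout = {}
    for shard in range(NUM_SHARDS):
        active = nodes[shard % len(nodes)]
        replica = nodes[(shard + 1) % len(nodes)]
        layout[shard] = (active, replica)
    return layout


def route(shard, layout, offline=frozenset()):
    """Return the node that should serve a shard, skipping offline nodes."""
    for node in layout[shard]:
        if node not in offline:
            return node
    raise RuntimeError(f"no available copy of shard {shard}")


layout = assign_shards(NODES)
load = Counter(active for active, _ in layout.values())
print(load)                                   # each node is active for 4 shards
print(route(0, layout))                       # node-a
print(route(0, layout, offline={"node-a"}))   # node-b (the replica)
```

Calling `assign_shards(NODES + ["node-d"])` spreads the same 12 shards over four nodes, which is the horizontal scaling described above.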
Key-Value Storage
Below is an explanation of how Couchbase efficiently manages the retrieval of data from its key-value
store by using a hashmap:
Couchbase offers a key-value store data model where data is stored in key-value
pairs, with each piece of data associated with a unique key
When storing or retrieving data in Couchbase, the key is passed through a hash
function, which converts it into a fixed-size value
The hashed key determines which shard holds the requested data
Couchbase maintains a cluster map which is a metadata representation of the entire
database cluster containing information about the data distribution across shards
and their locations on different nodes
The cluster map is often represented as a hashmap data structure, which allows for
quick lookup of the shard assigned to the document containing the requested data
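The lookup path described above (key, hash, shard, cluster map, node) can be sketched as follows. This is a simplified model under stated assumptions: the shard count, the node names and the use of `zlib.crc32` are illustrative choices, and Couchbase's own shards (which it calls vBuckets) and hashing details differ from this toy version.

```python
import zlib

NUM_SHARDS = 16  # hypothetical; kept small for illustration

# Cluster map as a plain hashmap: shard id -> node that owns the shard.
cluster_map = {shard: f"node-{shard % 4}" for shard in range(NUM_SHARDS)}

# One in-memory dict per node stands in for that node's storage.
node_storage = {node: {} for node in set(cluster_map.values())}


def shard_of(key):
    """Hash the key to a fixed-size integer, then reduce it to a shard id."""
    return zlib.crc32(key.encode("utf-8")) % NUM_SHARDS


def locate(key):
    """Find the owning node with a single dictionary lookup."""
    return cluster_map[shard_of(key)]


def put(key, value):
    """Store the value on whichever node owns the key's shard."""
    node_storage[locate(key)][key] = value


def get(key):
    """Fetch the value from the owning node, or None if it is absent."""
    return node_storage[locate(key)].get(key)


put("user::alice@example.com", {"name": "Alice"})
print(get("user::alice@example.com"))   # {'name': 'Alice'}
```

Because the hash is deterministic, a client holding a copy of the cluster map can work out which node to contact without asking any other node first.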
Data Modelling
Document-Based Data Modelling:
In a document-based data model, data is represented as JSON documents which allows
for a flexible and dynamic schema design
Each JSON document can have a different structure, enabling you to store varying
data types and fields within a collection
JSON supports nested data structures, arrays, and complex relationships between
objects
Couchbase provides secondary indexes to efficiently query data within documents,
helping to create high-performance queries on specific fields
An explanation of how to design document-based databases is given in week 9,
and the same rules apply to Couchbase
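As a rough illustration of the points above, the sketch below holds three JSON-style documents with different structures in one hypothetical `users` collection and builds a simple index on one field. The document contents and the `build_index` helper are invented for this example; a real Couchbase secondary index is created and maintained by the server, not by application code.

```python
import json
from collections import defaultdict

# Documents in the same hypothetical "users" collection, each with different fields.
users = {
    "user::1": {"name": "Alice", "city": "Leeds", "emails": ["alice@example.com"]},
    "user::2": {"name": "Bob", "city": "Leeds", "address": {"postcode": "LS1 1AA"}},
    "user::3": {"name": "Cara", "city": "York"},
}


def build_index(docs, field):
    """Map each value of `field` to the keys of the documents that hold it.

    Only suitable for fields with hashable values such as strings or numbers.
    """
    index = defaultdict(list)
    for key, doc in docs.items():
        if field in doc:
            index[doc[field]].append(key)
    return index


city_index = build_index(users, "city")
print(city_index["Leeds"])                     # ['user::1', 'user::2']
print(json.dumps(users["user::2"], indent=2))  # the nested object serialises as JSON
```

The index answers "which users are in Leeds?" without scanning every document, which is the role a secondary index plays for queries on specific fields.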
Key-Value Data Modelling:
In a key-value data model, data is stored and retrieved using a unique key
Key-value modelling is efficient for simple data access since each data item is
identified by a unique key which enables direct retrieval without the need for
complex queries
The use of keys for data access results in low-latency operations, making it ideal
for caching frequently accessed data
Couchbase's key-based approach enables effective sharding and distribution of data
across nodes which supports horizontal scalability and fault tolerance
To optimise key-value models, keys should be natural (e.g. an email address or
username), human-readable, deterministic and semantic, so that the key itself
carries some meaning
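One way to follow the key recommendations above is to build every key from a document type and a natural identifier. The `make_key` helper, the `::` separator and the lower-casing step are assumptions made for this sketch rather than anything Couchbase requires.

```python
def make_key(doc_type, natural_id):
    """Build a deterministic, readable key such as 'user::alice@example.com'."""
    normalised = natural_id.strip().lower()
    if not normalised:
        raise ValueError("natural_id must not be empty")
    return f"{doc_type}::{normalised}"


cache = {}  # a plain dict stands in for the key-value store

cache[make_key("user", "Alice@Example.com")] = {"name": "Alice"}

# The same natural identifier always yields the same key, so the lookup is direct.
print(make_key("user", " alice@example.com "))        # user::alice@example.com
print(cache[make_key("user", "alice@example.com")])   # {'name': 'Alice'}
```

Because the key can be recomputed from data the application already has, no query is needed to find the document.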
N1QL Data Modelling:
N1QL is a powerful query language which resembles SQL but is designed specifically
for querying JSON documents in Couchbase
It supports JOIN operations, which allow you to combine data from multiple
documents or different collections in a single query
N1QL lets you write queries declaratively, specifying the data you want to
retrieve rather than how to retrieve it
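To make the declarative point concrete, the sketch below places a hypothetical N1QL query next to the imperative Python loop that would produce the same rows from in-memory data. The `orders` and `customers` collection names, their fields and the sample documents are assumptions, and the query text is shown only as a string; it has not been run against a cluster.

```python
# Hypothetical N1QL query; `orders` and `customers` are assumed collection names.
QUERY = """
SELECT c.name, o.total
FROM orders AS o
JOIN customers AS c ON o.customer_id = META(c).id
WHERE o.total > 50
"""

customers = {"cust::1": {"name": "Alice"}, "cust::2": {"name": "Bob"}}
orders = {
    "order::1": {"customer_id": "cust::1", "total": 80},
    "order::2": {"customer_id": "cust::2", "total": 20},
    "order::3": {"customer_id": "cust::1", "total": 60},
}


def join_orders_to_customers(orders, customers, min_total):
    """Imperative equivalent of the query: loop, filter, look up, project."""
    rows = []
    for order in orders.values():
        if order["total"] > min_total:
            customer = customers.get(order["customer_id"])
            if customer is not None:
                rows.append({"name": customer["name"], "total": order["total"]})
    return rows


print(join_orders_to_customers(orders, customers, 50))
# [{'name': 'Alice', 'total': 80}, {'name': 'Alice', 'total': 60}]
```

The query states only which rows are wanted, while the function spells out every step of how to find them. With N1QL those steps are left to the query engine.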
Transactions
Why Transactions?
Transactions ensure data integrity by grouping multiple operations into a single
unit. If any part of the transaction fails, the entire transaction is rolled
back, which prevents partial changes and keeps the database in a consistent state
In multi-user systems, transactions provide concurrency control by isolating
data from simultaneous updates by other transactions, reducing the risk of data
conflicts
Transactions are essential for handling complex operations that involve multiple
parts or interdependent data changes
A further explanation of transactions and how they work is given in week 7
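The all-or-nothing behaviour described above can be sketched with an in-memory store. This is a toy model, not the Couchbase transactions API: the `transfer` function, the account documents and the snapshot-based rollback are all assumptions chosen for illustration.

```python
import copy


class InsufficientFunds(Exception):
    """Raised when a transfer would leave the source balance negative."""


def transfer(store, source, target, amount):
    """Move `amount` between two documents, or leave `store` untouched."""
    snapshot = copy.deepcopy(store)       # state to restore on failure
    try:
        store[source]["balance"] -= amount
        if store[source]["balance"] < 0:
            raise InsufficientFunds(source)
        store[target]["balance"] += amount
    except Exception:
        store.clear()
        store.update(snapshot)            # roll back every partial change
        raise


accounts = {"acc::alice": {"balance": 100}, "acc::bob": {"balance": 20}}
transfer(accounts, "acc::alice", "acc::bob", 30)
print(accounts)   # {'acc::alice': {'balance': 70}, 'acc::bob': {'balance': 50}}

try:
    transfer(accounts, "acc::alice", "acc::bob", 500)
except InsufficientFunds:
    pass
print(accounts)   # unchanged: the failed transfer was rolled back
```

Without the rollback, the failed transfer would leave the first account debited and the second never credited, which is exactly the partial change a transaction is meant to prevent.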
Transactions On The Client-Side Or Server-Side?
Doing transactions on the server-side means only light SDK changes are required, but you need to set up
and configure a global tool for co-ordination, lock management and scheduling
Doing transactions on the client-side means no global tool needs to be set up and no new configuration
is needed on the server, which allows for quick iteration, but major SDK changes are needed and all SDKs
must use the same algorithm
What Are The ACID Properties?
Atomicity: A group of operations either all complete successfully or all fail
Consistency: Data is never in an invalid state
Isolation: Each transaction runs as if independent of other concurrent transactions
Durability: Once committed, data is safely stored and survives a system failure
Practice Question
When designing a complex data model in Couchbase, which of the following features should you consider
utilising for optimal performance and data handling?
End Of Week Quiz
Note: A score of 70% or higher means you have successfully completed the week 10 workshop