Week 10 · NoSQL in Couchbase

Introduction To Couchbase

Couchbase is a NoSQL, distributed, and multi-model database designed to handle large volumes of data
As it is multi-model, Couchbase allows users to store and retrieve data in different formats. It supports the following models:
- Document Model: Data is stored as JSON documents, allowing for varying structures within each document
- Key-Value Model: Simple data storage and retrieval using key-value pairs which is ideal for caching and quick lookups
- Query Model: Couchbase offers a powerful query language called N1QL, enabling SQL-like queries on JSON documents for structured data handling
- Geospatial Model: The database supports geospatial data, facilitating storage and querying of location-based information
It supports real-time data processing with its in-memory caching feature, accelerating application performance and reducing latency for frequently accessed data
Couchbase offers high availability and fault tolerance through its distributed architecture, which includes automatic data replication and sharding, so data remains accessible even when a node fails
Below is a description of what sharding is, how it works and the problems it solves:
- Sharding involves dividing the data into smaller, manageable subsets called shards
- Each shard contains a subset of the dataset and the shards are distributed across multiple servers or nodes in a cluster where each node is responsible for handling one or more shards
- By distributing data across multiple nodes, the database can distribute the workload more evenly, preventing any single node from becoming a bottleneck when dealing with a large volume of data or requests
- Sharding enables horizontal scalability as you can add more nodes to the cluster as data and traffic grow
- Sharding also provides a level of fault tolerance: if one node goes offline, only the shards it hosts are affected, and the system can continue to function by redirecting requests to the available shards (or, with replication, to replica copies on the remaining nodes)
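The sharding steps above can be sketched in a few lines of Python. This is a simplified illustration, not Couchbase's actual mapping (Couchbase calls its shards vBuckets and handles placement internally): the shard count, the use of CRC32 as the hash, the round-robin placement and the node names are all assumptions made for the example.

```python
import zlib

NUM_SHARDS = 1024  # assumed shard count for this sketch


def shard_for_key(key, num_shards=NUM_SHARDS):
    """Map a key to a shard number by hashing it (CRC32 is an assumption)."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def build_shard_map(nodes, num_shards=NUM_SHARDS):
    """Assign shards to nodes round-robin so each node owns a similar share."""
    return {shard: nodes[shard % len(nodes)] for shard in range(num_shards)}


def node_for_key(key, shard_map):
    """Find the node responsible for a key by looking up its shard."""
    return shard_map[shard_for_key(key, len(shard_map))]


# Hypothetical three-node cluster.
nodes = ["node-a", "node-b", "node-c"]
shard_map = build_shard_map(nodes)
print(node_for_key("user::alice", shard_map))

# Scaling out: adding a node changes only the shard-to-node map;
# the key-to-shard hash stays the same.
bigger_map = build_shard_map(nodes + ["node-d"])
```

Keeping the key-to-shard step separate from the shard-to-node step is what makes horizontal scaling practical in this sketch: when a node is added, whole shards are reassigned rather than every key being rehashed.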

Key-Value Storage

Data Modelling

Document-Based Data Modelling:
- In a document-based data model, data is represented as JSON documents which allows for a flexible and dynamic schema design
- Each JSON document can have a different structure, enabling you to store varying data types and fields within a collection
- JSON supports nested data structures, arrays, and complex relationships between objects
- Couchbase provides secondary indexes to efficiently query data within documents, helping to create high-performance queries on specific fields
- Week 9 explains how to design document-based databases, and the same rules apply to Couchbase
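The points above can be illustrated with plain Python dictionaries standing in for JSON documents. This is a minimal sketch only: the document keys, field names and the hand-built index are hypothetical, and a real Couchbase secondary index is created and maintained by the server rather than by application code.

```python
import json

# Two hypothetical documents in the same collection with different structures.
docs = {
    "user::1": {
        "type": "user",
        "name": "Alice",
        "emails": ["alice@example.com"],                      # array field
        "address": {"city": "Leeds", "postcode": "LS1 1AA"},  # nested object
    },
    "user::2": {"type": "user", "name": "Bob", "phone": "0113 496 0000"},
}


def build_city_index(documents):
    """Build a simple lookup from address.city to document keys.

    This mimics the idea of a secondary index on one field: documents
    that lack the field are simply left out of the index.
    """
    index = {}
    for key, doc in documents.items():
        city = doc.get("address", {}).get("city")
        if city is not None:
            index.setdefault(city, []).append(key)
    return index


city_index = build_city_index(docs)
print(json.dumps(docs["user::1"], indent=2))
print(city_index.get("Leeds", []))  # ['user::1']
```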
Key-Value Data Modelling:
- In a key-value data model, data is stored and retrieved using a unique key
- Key-value modelling is efficient for simple data access since each data item is identified by a unique key which enables direct retrieval without the need for complex queries
- The use of keys for data access results in low-latency operations, making it ideal for caching frequently accessed data
- Couchbase's key-based approach enables effective sharding and distribution of data across nodes which supports horizontal scalability and fault tolerance
- To optimise key-value models, keys should be natural (e.g. an email address or username), human-readable, deterministic and semantic, so that the key itself carries meaning
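As a rough sketch of the key design advice above, the following Python uses an in-memory dictionary in place of a real key-value store. The `user::` prefix, the helper names and the sample data are assumptions made for illustration and are not part of the Couchbase API.

```python
def user_key(email):
    """Build a deterministic, human-readable key from a natural identifier."""
    return "user::" + email.strip().lower()


# In-memory stand-in for a key-value store.
store = {}


def upsert(key, value):
    """Insert the value, or overwrite it if the key already exists."""
    store[key] = value


def get(key):
    """Fetch a value directly by key, with no query needed."""
    return store.get(key)


upsert(user_key("Alice@Example.com"), {"name": "Alice"})

# The same natural identifier always produces the same key,
# so the document can be fetched without searching for it.
print(get(user_key("alice@example.com")))  # {'name': 'Alice'}
```

Because the key can be recomputed from data the application already has (here, the email address), no lookup query is needed to find out which key to read.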
N1QL Data Modelling:
- N1QL is a powerful query language which resembles SQL but is designed specifically for querying JSON documents in Couchbase
- It supports JOIN operations, which allow you to combine data from multiple documents or different collections in a single query
- N1QL is declarative, so you specify the data you want to retrieve rather than how to retrieve it
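To make the declarative JOIN idea concrete, the sketch below places an illustrative N1QL query next to the kind of imperative Python loop it replaces. The `orders` and `customers` collections, their fields and the sample data are hypothetical, and the query string is shown for comparison only; it is not executed against a cluster here.

```python
# Illustrative N1QL text only; collection and field names are hypothetical.
QUERY = """
SELECT c.name, o.total
FROM orders AS o
JOIN customers AS c ON o.customer_id = META(c).id
WHERE o.total > 50
"""

# In-memory stand-ins for two collections, keyed by document ID.
customers = {"cust::1": {"name": "Alice"}, "cust::2": {"name": "Bob"}}
orders = {
    "order::1": {"customer_id": "cust::1", "total": 80},
    "order::2": {"customer_id": "cust::2", "total": 20},
    "order::3": {"customer_id": "cust::1", "total": 55},
}


def join_orders_to_customers(orders, customers, min_total):
    """Imperative equivalent: spell out HOW to combine the two collections."""
    rows = []
    for order in orders.values():
        customer = customers.get(order["customer_id"])
        if customer is not None and order["total"] > min_total:
            rows.append({"name": customer["name"], "total": order["total"]})
    return rows


print(join_orders_to_customers(orders, customers, 50))
# [{'name': 'Alice', 'total': 80}, {'name': 'Alice', 'total': 55}]
```

The query states only which data is wanted, while the Python function has to describe each step of the lookup. That difference is the point of a declarative language.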

Transactions

Why Transactions?
- Transactions ensure data integrity by grouping multiple operations into a single unit. If any part of the transaction fails, the entire transaction is rolled back, which prevents partial changes and keeps the database in a consistent state
- In multi-user systems, transactions provide concurrent access control by isolating and protecting data from simultaneous updates by other transactions thus reducing the risk of data conflicts
- Transactions are essential for handling complex operations that involve multiple parts or interdependent data changes
- Week 7 explains transactions and how they work in more detail
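The all-or-nothing behaviour described above can be sketched with an in-memory example. This is not how Couchbase implements transactions; the account data, the exception class and the snapshot-and-restore approach are assumptions chosen to keep the idea of rollback visible.

```python
import copy


class InsufficientFunds(Exception):
    """Raised when a transfer would leave the source account overdrawn."""


def transfer(accounts, source, target, amount):
    """Move `amount` between two accounts as one all-or-nothing unit."""
    snapshot = copy.deepcopy(accounts)  # state to restore if anything fails
    try:
        accounts[source]["balance"] -= amount
        if accounts[source]["balance"] < 0:
            raise InsufficientFunds(source)
        accounts[target]["balance"] += amount  # KeyError if target is missing
    except Exception:
        # Roll back: discard every partial change made so far.
        accounts.clear()
        accounts.update(snapshot)
        raise


accounts = {"acc::1": {"balance": 100}, "acc::2": {"balance": 10}}
transfer(accounts, "acc::1", "acc::2", 30)
print(accounts)  # {'acc::1': {'balance': 70}, 'acc::2': {'balance': 40}}
```

If the second step fails (for example, because the target account does not exist), the debit from the first step is undone, so the data never shows money leaving one account without arriving in the other.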
Transactions On The Client-Side Or Server-Side?
- Doing transactions on the server-side means only light SDK changes are required, but you need to set up and configure a global tool for co-ordination, lock management and scheduling
- Doing transactions on the client-side means no global tool needs to be set up and no new configuration is needed on the server, which allows for quick iteration. However, major SDK changes are needed and all SDKs must use the same algorithm
What Are The ACID Properties?
- Atomicity: A group of operations either all complete successfully or all fail
- Consistency: Data is never in an invalid state
- Isolation: Each transaction runs as if it were independent of other concurrent transactions
- Durability: Data is safely stored in case of a system failure

Practice Question

When designing a complex data model in Couchbase, which of the following features should you consider utilising for optimal performance and data handling?

End Of Week Quiz