The choice between SQL and NoSQL databases is fundamental to system design. Each has distinct characteristics, trade-offs, and ideal use cases.
+------------------------+ +------------------------+
| SQL Databases | | NoSQL Databases |
+------------------------+ +------------------------+
| MySQL | | Document: MongoDB |
| Oracle | | Key-Value: Redis |
| SQL Server | | Column: Cassandra |
| SQLite | | Graph: Neo4j |
+------------------------+ +------------------------+
SQL databases are relational databases that use Structured Query Language (SQL) for defining and manipulating data. They organize data into tables with rows and columns.
+--------+-----------+------------------+------------+
| id | name | email | created |
+--------+-----------+------------------+------------+
| 1 | John | john@email.com | 2024-01-15|
| 2 | Jane | jane@email.com | 2024-01-16|
+--------+-----------+------------------+------------+
+--------+--------+----------+-------+
| id | user_id| amount | status|
+--------+--------+----------+-------+
| 1 | 1 | 100.00 |shipped|
| 2 | 1 | 50.00 | pending|
+--------+--------+----------+-------+
Relationship: Orders.user_id -> Users.id
Characteristic Description Schema Pre-defined, structured schema ACID Atomicity, Consistency, Isolation, Durability Relationships Foreign keys, JOIN operations Query Language Standard SQL Scaling Typically vertical (scale up)
Use Case Why SQL Works Financial transactions ACID guarantees data integrity Complex queries Powerful JOINs and aggregations Structured data Fixed schema works well Reporting/Analytics Powerful query capabilities Data integrity required Foreign keys, constraints
NoSQL databases are non-relational databases designed for flexible schemas, horizontal scaling, and specific data models beyond tabular storage.
+----------------+ +----------------+ +----------------+
| Document | | Key-Value | | Column-Family|
+----------------+ +----------------+ +----------------+
| MongoDB | | Redis | | Cassandra |
| CouchDB | | DynamoDB | | HBase |
| | | Memcached | | Bigtable |
+----------------+ +----------------+ +----------------+
+----------------+ +----------------+
+----------------+ +----------------+
| Amazon Neptune | | TimescaleDB |
+----------------+ +----------------+
Document Database (MongoDB)
===========================
+--------------------------------------------------+
| "_id": "ObjectId(...)", |
| "email": "john@email.com", |
| {"id": 1, "amount": 100}, |
| {"id": 2, "amount": 50} |
+--------------------------------------------------+
Flexible schema: Each document can have different fields
SET user:1:email "john@email.com"
SET user:1:session "abc123xyz"
Column-Family Store (Cassandra)
==============================
+------------------------+
+------------------------+
+------------------------+
+------------------------+
+------------------------+
+------------------------+
Nodes: John, Jane, ProductA, ProductB
(John)-[:BOUGHT]->(ProductA)
(Jane)-[:BOUGHT]->(ProductB)
MATCH (john {name: 'John'})-[:KNOWS]->(friend)
Characteristic Description Schema Flexible or schema-less BASE Basically Available, Soft state, Eventual consistency Denormalized Data embedded in documents APIs Document/JSON, key-value APIs Scaling Typically horizontal (scale out)
Use Case Why NoSQL Works Rapid development Flexible schema Massive scale Horizontal scaling Unstructured data JSON/document support High write throughput Cassandra, DynamoDB Caching Redis, Memcached Graph relationships Neo4j
Feature SQL NoSQL Data Model Tables (rows/columns) Documents, key-value, graphs Schema Fixed Flexible Query Language SQL (standardized) API/SDK based ACID Full support Limited Scalability Vertical Horizontal Relationships Foreign keys, JOINs Embed or link Transactions Multi-statement ACID Limited Indexing Rich Varies Learning Curve Moderate Varies
+------------------------------------------+
| Table -> Row -> Columns |
| Complex Queries via JOIN |
+------------------------------------------+
+------------------------------------------+
| Collection -> Document -> Fields |
| Data embedded in documents |
+------------------------------------------+
+------------------------------------------+
+------------------------------------------+
SQL Is data highly structured?
Requirement Recommended Database Financial transactions PostgreSQL, MySQL Analytics/Reporting PostgreSQL, ClickHouse User profiles PostgreSQL, MongoDB Session storage Redis Real-time inventory Cassandra, DynamoDB Product catalog MongoDB, Elasticsearch Chat messages Cassandra, DynamoDB Social graph Neo4j Time-series data InfluxDB, TimescaleDB Caching layer Redis, Memcached
Modern Application Architecture
==============================
+---------------------------------------------------+
+---------------------------------------------------+
+-----------+ +-------+ +------+ +------+
| PostgreSQL| |Redis | |Elastic| |S3 |
| User | |Session| |search | |Files |
| Orders | |Cache | | | | |
+----------+ +------+ +------+ +------+
Each data type uses the best-fit database
Database Selection by Domain
===========================
+-------------------+--------------------------------+
+-------------------+--------------------------------+
| User accounts | PostgreSQL (ACID required) |
| Product catalog | MongoDB (flexible schema) |
| Shopping cart | Redis (fast, session-like) |
| Orders | PostgreSQL (transactional) |
| Product search | Elasticsearch (full-text) |
| Inventory | Cassandra (high write) |
| Recommendations | Neo4j (graph relationships) |
| Logs/Analytics | ClickHouse (OLAP) |
+-------------------+--------------------------------+
Consideration Details Schema Design Embed vs reference decisions Data Modeling Different approach to relationships Queries No JOINs, denormalize data Transactions May need application-level handling Migrations Plan for data transformation
Consideration Details Schema Define upfront Relationships Use foreign keys Queries Use JOINs ACID Can now use transactions
Key points for SQL vs NoSQL:
SQL for structure and transactions - Use when data integrity is critical
NoSQL for scale and flexibility - Use when scaling is priority
Consider polyglot persistence - Use multiple databases
Match database to data characteristics - Different data, different DBs
Plan schema design - SQL needs upfront, NoSQL can evolve
Think about operations - Different operational complexity
Next: Chapter 8: Database Replication