VOCABULARY

Comprehensive Reference for System Design Terms

ACID (Atomicity, Consistency, Isolation, Durability)

A set of properties of database transactions intended to guarantee data validity despite errors or power failures.

A server that acts as a single entry point for a set of microservices, handling request routing, composition, and protocol translation.

A Python library for writing concurrent code using the async/await syntax, used for I/O-bound operations.
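As a minimal sketch of the async/await style this entry describes, the snippet below runs two simulated I/O calls concurrently. The `fetch` coroutine and its delays are hypothetical stand-ins for real network requests.

```python
import asyncio


async def fetch(name: str, delay: float) -> str:
    # asyncio.sleep stands in for a network call; it yields control to the event loop.
    await asyncio.sleep(delay)
    return f"{name} done"


async def main() -> list[str]:
    # gather runs both coroutines concurrently and returns results in argument order.
    return await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01))


results = asyncio.run(main())
```

Even though `b` finishes first, `gather` returns results in the order the coroutines were passed in.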

Automatically adjusting the number of compute resources based on demand.

The percentage of time a system is operational and accessible when needed.
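To make the percentage concrete, an availability target can be converted into a downtime budget. The helper below is an illustrative sketch; the function name and the 365-day default period are assumptions.

```python
def allowed_downtime_minutes(availability_pct: float, period_days: float = 365.0) -> float:
    """Return the minutes of downtime permitted by an availability target."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


# 99.9% ("three nines") over a 365-day year allows roughly 525.6 minutes of downtime.
budget = allowed_downtime_minutes(99.9)
```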


A mechanism where a downstream system signals an upstream system to slow down when it’s unable to keep up with the incoming load.
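One simple way to get this backpressure behavior in a single process is a bounded queue: when the consumer falls behind, the producer's `put` suspends until space frees up. This is a sketch under that assumption; the function names and the `None` sentinel are illustrative choices.

```python
import asyncio


async def producer(queue: asyncio.Queue, n: int) -> None:
    for i in range(n):
        # put() suspends while the queue is full, which slows the producer down.
        await queue.put(i)
    await queue.put(None)  # sentinel: no more items


async def consumer(queue: asyncio.Queue, out: list) -> None:
    while True:
        item = await queue.get()
        if item is None:
            break
        await asyncio.sleep(0.001)  # simulate slow downstream work
        out.append(item)


async def run(n: int = 10, maxsize: int = 2) -> list:
    queue = asyncio.Queue(maxsize=maxsize)  # the bound is what creates backpressure
    out: list = []
    await asyncio.gather(producer(queue, n), consumer(queue, out))
    return out
```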

BASE (Basically Available, Soft state, Eventual consistency)

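A consistency model used in distributed systems that relaxes ACID's strict guarantees in favor of availability: the system stays basically available, its state may change over time without new input (soft state), and replicas converge eventually.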

A deployment strategy that runs two identical production environments (blue and green) and switches between them for zero-downtime releases.

In domain-driven design, a conceptual boundary within which a particular domain model is defined and applicable.

A middleman that handles message routing between producers and consumers.

A pattern that isolates resources to prevent cascading failures.


The CAP theorem states that a distributed system cannot provide all three of Consistency, Availability, and Partition Tolerance at once; when a network partition occurs, it must give up either consistency or availability.

A pattern that identifies and tracks changes to data in a database and propagates those changes to downstream systems.

A geographically distributed network of servers that delivers content to users based on their location.

A design pattern that prevents cascading failures by stopping requests to a failing service.
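The circuit breaker idea can be sketched as a small wrapper that counts consecutive failures and rejects calls while the circuit is open. The class name, thresholds, and injectable `clock` below are assumptions for illustration, not a reference implementation.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit is open; call skipped")
            # Timeout elapsed: fall through and allow one trial call (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

While the circuit is open, callers fail fast instead of piling more requests onto the failing service.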

Rules that define when changes to data become visible to readers (strong, eventual, causal, read-your-writes).

A lightweight, standalone executable package that includes everything needed to run software.

CQRS (Command Query Responsibility Segregation)

A pattern that separates read and write operations into different models.

Data structures that can be replicated across multiple nodes and updated independently without coordination.


Horizontal partitioning of data across multiple databases to distribute load.

A queue that stores messages that cannot be processed successfully.

A technique to limit the rate at which a function fires by waiting for a pause in events.

A system where components located on networked computers communicate and coordinate their actions.

A hierarchical naming system that translates domain names to IP addresses.

A platform for developing, shipping, and running applications in containers.

An approach to software design that focuses on modeling the domain and business logic.

A period during which a system is unavailable or not functioning.

Amazon’s proprietary NoSQL database that provides high availability and scalability.


A block storage service provided by AWS for use with EC2 instances.

An architecture where services communicate through events rather than direct calls.

Processing data near the source rather than in a centralized data center.

A combination of Elasticsearch, Logstash, and Kibana for logging and observability.

A pattern where state changes are stored as a sequence of events rather than just the current state.
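As a small illustration of event sourcing, current state can be derived by replaying the stored events in order. The `Event` type and the event kinds below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str    # "deposited" or "withdrawn" (illustrative event kinds)
    amount: int


def replay(events: list[Event]) -> int:
    """Rebuild an account balance by folding over the event log."""
    balance = 0
    for event in events:
        if event.kind == "deposited":
            balance += event.amount
        elif event.kind == "withdrawn":
            balance -= event.amount
        else:
            raise ValueError(f"unknown event kind: {event.kind}")
    return balance


log = [Event("deposited", 100), Event("withdrawn", 30), Event("deposited", 5)]
balance = replay(log)
```

Because the log is the source of truth, replaying a prefix of it yields the state at any earlier point in time.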

A consistency model where all replicas will eventually become consistent given no new updates.


Automatic switching to a backup system when the primary system fails.

An alternative response or action when the primary method fails.

A system’s ability to continue operating despite component failures.

Combining multiple separate organizations or systems into a single logical unit.


A high-performance, open-source framework for inter-service communication developed by Google.

Load balancing across multiple geographic locations.

A query language for APIs that allows clients to request exactly the data they need.


A system design approach that ensures a high level of uptime.

A data structure that maps data to physical nodes using consistent hashing.
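A hash ring can be sketched with a sorted list of hashed virtual nodes and a binary search. The class below is a minimal illustration; the use of SHA-256, the `vnodes` count, and the method names are assumptions.

```python
import bisect
import hashlib


class HashRing:
    def __init__(self, nodes, vnodes=100):
        # Each physical node is placed on the ring at many virtual positions.
        ring = []
        for node in nodes:
            for i in range(vnodes):
                ring.append((self._hash(f"{node}#{i}"), node))
        ring.sort()
        self._ring = ring
        self._hashes = [h for h, _ in ring]

    @staticmethod
    def _hash(value: str) -> int:
        digest = hashlib.sha256(value.encode()).digest()
        return int.from_bytes(digest[:8], "big")

    def node_for(self, key: str) -> str:
        # Walk clockwise to the first virtual node past the key's hash, wrapping around.
        idx = bisect.bisect_right(self._hashes, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]
```

When a node is removed, only the keys that mapped to it move; keys owned by the remaining nodes keep their assignment.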

An endpoint or mechanism to verify if a service is functioning correctly.

Adding more machines to a system to handle increased load.

Data classification based on access frequency (hot = frequent, cold = rare).


The property that an operation can be applied multiple times without changing the result beyond the initial application.
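One common way to make a non-idempotent operation safe to retry is an idempotency key: the first request is applied, and repeats with the same key return the stored result. The `PaymentService` class below is a hypothetical in-memory sketch of that idea.

```python
class PaymentService:
    def __init__(self):
        self.balance = 0
        self._results = {}  # idempotency key -> result of the first application

    def deposit(self, idempotency_key: str, amount: int) -> int:
        # A repeated key means a retry: return the earlier result, apply nothing.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.balance += amount
        self._results[idempotency_key] = self.balance
        return self.balance


service = PaymentService()
first = service.deposit("req-1", 50)
retry = service.deposit("req-1", 50)  # same key: no second deposit
```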

Entry point for traffic coming into a cluster from outside.

A service mesh that provides traffic management, security, and observability.

A setting that determines the degree to which concurrent transactions are isolated from each other.


A compact, URL-safe token format for securely transmitting claims between parties.


A distributed event streaming platform capable of handling trillions of events per day.

A NoSQL database that stores data as key-value pairs.

An open-source container orchestration platform for automating deployment and scaling.

The average number of recipients who receive a message and forward it to others, used to measure viral growth.


The time delay between a request and its response.

A process by which a cluster of nodes selects one node as the leader.

A device or software that distributes network traffic across multiple servers.

The practice of recording events and information during system operation.

A caching algorithm that evicts the least recently accessed items first.
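The LRU eviction rule can be sketched with `collections.OrderedDict`, which keeps keys in access order when entries are moved to the end on each use. The class and method names are illustrative.

```python
from collections import OrderedDict


class LRUCache:
    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value) -> None:
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict the least recently used entry
```

For caching function results, the standard library's `functools.lru_cache` decorator applies the same policy.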


A programming model for processing large data sets in parallel.

An architectural style that structures an application as a collection of loosely coupled services.

Software that acts as a bridge between an operating system or database and applications.

The practice of collecting and analyzing system metrics to ensure health and performance.

A software architecture where a single instance serves multiple customers.

A product with just enough features to satisfy early customers and provide feedback for future development.


A method of remapping IP addresses to preserve IP address space.

A performance anti-pattern where a query is executed for each item in a collection, causing excessive database calls.
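The N+1 pattern and one possible fix can be shown with an in-memory SQLite database. The `authors` and `books` tables are made up for this sketch; both functions return the same data but issue a different number of queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
INSERT INTO authors VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO books VALUES (1, 1, 'A1'), (2, 1, 'A2'), (3, 2, 'B1');
""")


def titles_n_plus_one(conn):
    # 1 query for the authors, then 1 more query per author.
    authors = conn.execute("SELECT id, name FROM authors ORDER BY id").fetchall()
    queries = 1
    result = {}
    for author_id, name in authors:
        rows = conn.execute(
            "SELECT title FROM books WHERE author_id = ? ORDER BY id", (author_id,)
        ).fetchall()
        queries += 1
        result[name] = [title for (title,) in rows]
    return result, queries


def titles_single_query(conn):
    # A single JOIN fetches the same data in one round trip.
    rows = conn.execute(
        "SELECT a.name, b.title FROM authors a "
        "JOIN books b ON b.author_id = a.id ORDER BY a.id, b.id"
    ).fetchall()
    result = {}
    for name, title in rows:
        result.setdefault(name, []).append(title)
    return result, 1
```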

A type of database that provides flexible schemas and scales horizontally.

A single server or instance in a distributed system.


An authorization framework that enables applications to obtain limited access to user accounts.

An open-source observability framework for generating, collecting, and exporting telemetry data.

Automating the arrangement, coordination, and management of complex systems.


The 99th percentile response time: 99% of requests complete at or below this threshold.
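A percentile such as P99 can be computed from recorded latencies in several ways. The sketch below uses the nearest-rank method, which is one common convention; monitoring tools may interpolate instead.

```python
import math


def percentile(samples, pct: float):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    if not ordered:
        raise ValueError("samples must not be empty")
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


latencies_ms = list(range(1, 101))  # illustrative data: 1 ms through 100 ms
p99 = percentile(latencies_ms, 99)
```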

The failure of data packets to reach their destination.

A system’s ability to continue operating despite network partitions.

A consensus algorithm for reaching agreement in distributed systems.

A distributed network architecture where nodes share resources directly.

The efficiency of a system in terms of speed, throughput, and resource usage.

The durability of data in storage beyond the lifetime of the process.

A message that causes repeated processing failures.

A .NET library that provides resilience and transient-fault-handling.

Initializing resources before they’re needed to avoid cold starts.

A database replication model where one node is primary and others are replicas.

A messaging pattern where senders publish messages without knowing the receivers.

A pattern where consumers actively request data from producers rather than receiving pushed updates.


A data structure that holds messages for asynchronous processing.

The minimum number of nodes that must agree for an operation to succeed.
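Two quorum rules come up often: a strict majority of `n` nodes, and the read/write overlap condition `r + w > n` used in quorum-replicated stores. The helper names below are illustrative.

```python
def majority_quorum(n: int) -> int:
    """Smallest number of nodes that forms a strict majority of n."""
    return n // 2 + 1


def quorums_overlap(n: int, r: int, w: int) -> bool:
    """True when every read quorum of size r shares a node with every write quorum of size w."""
    return r + w > n
```

With `n = 3`, choosing `r = 2` and `w = 2` guarantees that a read contacts at least one replica that saw the latest write.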


A consensus algorithm designed to be understandable and practical.

Restricting the number of requests a user or service can make in a given time.
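A token bucket is one common way to implement rate limiting: tokens refill at a steady rate up to a cap, and each request spends one. The class below is a single-process sketch; the names and the injectable `clock` are assumptions.

```python
import time


class TokenBucket:
    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self.clock()
        # Refill based on elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```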

RDBMS (Relational Database Management System)

A database system based on the relational model.

A copy of a database that serves read requests to reduce load.

An in-memory data structure store used as a database, cache, and message broker.

Copying data across multiple nodes for redundancy and performance.

An architectural style for designing networked applications around stateless requests that operate on resources, typically over HTTP.

Automatically re-attempting failed operations, optionally with a backoff delay between attempts.
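A retry helper with exponential backoff and jitter might look like the sketch below. The function name, defaults, and the injectable `sleep` parameter are assumptions; real code would usually catch only errors known to be transient.

```python
import random
import time


def retry(func, attempts: int = 3, base_delay: float = 0.1, max_delay: float = 5.0,
          sleep=time.sleep):
    """Call func, retrying on any exception with exponential backoff and full jitter."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: sleep a random time up to the backoff cap to spread out retries.
            sleep(random.uniform(0, delay))
```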

A load balancing algorithm that distributes requests sequentially.
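Round-robin selection can be sketched with `itertools.cycle`, which loops over the server list indefinitely. The class name and server labels are illustrative.

```python
import itertools


class RoundRobinBalancer:
    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._cycle = itertools.cycle(servers)

    def next_server(self) -> str:
        # Each call returns the next server in order, wrapping around at the end.
        return next(self._cycle)


balancer = RoundRobinBalancer(["s1", "s2", "s3"])
picks = [balancer.next_server() for _ in range(5)]
```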

The process of selecting a path for traffic in a network.


Amazon’s Simple Storage Service for object storage.

The ability of a system to handle growing amounts of work.

Horizontal partitioning of data across multiple databases.

Deploying auxiliary components alongside the main application container.

A commitment between a service provider and client about service standards.

A target level of reliability for a service.

A domain-specific language for managing relational databases.

Routing requests from a user to the same server.

Processing data in continuous streams rather than batches.

A consistency model where all reads see the most recent write.

A logical subdivision of an IP network.


A connection-oriented protocol for reliable communication.

Deliberately limiting the rate of requests to prevent overload.

The maximum time to wait for a response before considering a request failed.

A protocol for secure communication over a network.

Directing network traffic to appropriate services or nodes.

A protocol for achieving atomic commitment in distributed transactions.


A connectionless protocol for fast, unreliable communication.

The presentation layer of an application.


Adding more resources (CPU, RAM) to an existing machine.

An isolated virtual network within a cloud provider.


A protocol for full-duplex communication over a single TCP connection.

A technique where changes are logged before being applied.

A security tool that monitors and filters HTTP traffic to protect against web attacks.


A coordination service for distributed systems.

The ability to update or maintain a system without interrupting service.


| Acronym | Full Form |
| --- | --- |
| ACL | Access Control List |
| API | Application Programming Interface |
| ASG | Auto Scaling Group |
| CDC | Change Data Capture |
| CDN | Content Delivery Network |
| CQRS | Command Query Responsibility Segregation |
| CRUD | Create, Read, Update, Delete |
| DAG | Directed Acyclic Graph |
| DB | Database |
| DDoS | Distributed Denial of Service |
| DLQ | Dead Letter Queue |
| DNS | Domain Name System |
| EBS | Elastic Block Store |
| EDA | Event-Driven Architecture |
| ELB | Elastic Load Balancer |
| ELK | Elasticsearch, Logstash, Kibana |
| gRPC | Google Remote Procedure Call |
| HA | High Availability |
| HTTP | Hypertext Transfer Protocol |
| IAM | Identity and Access Management |
| IP | Internet Protocol |
| JSON | JavaScript Object Notation |
| LB | Load Balancer |
| LRU | Least Recently Used |
| MQ | Message Queue |
| MVP | Minimum Viable Product |
| NAT | Network Address Translation |
| NIC | Network Interface Card |
| OTel | OpenTelemetry |
| PKI | Public Key Infrastructure |
| QoS | Quality of Service |
| RBAC | Role-Based Access Control |
| SLA | Service Level Agreement |
| SLO | Service Level Objective |
| SRE | Site Reliability Engineering |
| SSL | Secure Sockets Layer |
| TCP | Transmission Control Protocol |
| TLS | Transport Layer Security |
| TTL | Time To Live |
| VPC | Virtual Private Cloud |
| WAF | Web Application Firewall |

Last Updated: February 2026