Message Brokers and Distributed Architectures

Message brokers play a crucial role in addressing the challenges of distributed architectures and microservices by providing a reliable, flexible, and scalable means of communication between services. In this post, we explore the role of message brokers and take a closer look at some of the mechanisms they use to address common challenges in distributed systems.


Here are some of the key features of message brokers that align well with the needs of distributed and microservices architectures:

Service Decoupling: Message brokers act as an intermediary layer that allows services to communicate without being directly connected. This allows services to operate, scale, and evolve independently.
Asynchronous Communication: Asynchronous messaging allows services to send and receive messages without waiting for immediate responses. This improves system responsiveness and efficiency.
Scalability: Message brokers are designed to handle high volumes of messages and can be scaled to accommodate the growing communication needs of a microservices architecture.
Reliability and Fault Tolerance: Features like message queuing, acknowledgment mechanisms, and durable storage ensure that messages are not lost and can be retried in case of service failures, contributing to the overall resilience of the system.
Load Balancing: Message brokers can distribute messages across multiple instances of a service, balancing the load and ensuring that individual services can scale out to handle high demand.
Routing and Filtering: Advanced routing capabilities allow messages to be directed to specific services based on content, priority, or other criteria. This enables more sophisticated communication patterns and ensures that messages are processed by the appropriate service instances.
Interoperability: Many message brokers support multiple messaging protocols and data formats, making it easier for services built on different stacks to communicate effectively.
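
To make the decoupling, asynchronous communication, and load balancing points above more concrete, here is a minimal sketch that uses Python's standard `queue.Queue` as a stand-in for a broker queue. The function name `run_workers` and the message names are hypothetical, and a real broker would sit in a separate process behind a network protocol rather than inside the application.

```python
import queue
import threading


def run_workers(messages, worker_count=3):
    """Push messages through a shared queue to several competing workers."""
    q = queue.Queue()  # stands in for a broker queue
    handled = {i: [] for i in range(worker_count)}

    def worker(worker_id):
        while True:
            msg = q.get()
            if msg is None:  # sentinel: no more work for this worker
                q.task_done()
                return
            handled[worker_id].append(msg)
            q.task_done()

    threads = [
        threading.Thread(target=worker, args=(i,)) for i in range(worker_count)
    ]
    for t in threads:
        t.start()

    # The producer only enqueues. It never calls a worker directly and
    # does not wait for any individual message to be processed.
    for msg in messages:
        q.put(msg)
    for _ in threads:
        q.put(None)  # one sentinel per worker

    q.join()
    for t in threads:
        t.join()
    return handled
```

Calling `run_workers([f"order-{i}" for i in range(20)])` returns a dictionary showing which worker handled which message. The exact split varies from run to run, but every message is handled by exactly one worker.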

Use cases

Because message brokers decouple components, manage message queues, ensure reliable delivery, and support asynchronous communication, they suit a wide range of use cases. Some common scenarios are listed below.

Event-Driven Architecture: In an event-driven architecture, message brokers are used to distribute events to different parts of a system in real-time. This is crucial for applications that rely on immediate data processing, such as real-time analytics, monitoring systems, and IoT applications.
Figure: Event-Driven Architecture (EDA) example.
Microservices Communication: In microservices architectures, message brokers facilitate loose coupling and asynchronous communication between services. This setup enhances scalability, reliability, and resilience, allowing individual microservices to be developed, deployed, and scaled independently.
Integration of Heterogeneous Systems: Message brokers facilitate the integration of diverse applications, systems, or services, potentially written in different programming languages or running on different platforms. They provide a common platform for data exchange, simplifying the process of connecting disparate systems.
Workflow Processing and Business Process Automation: Message brokers can support complex workflows or business processes by routing messages between the services responsible for each step. This capability is crucial for automating and managing long-running, multi-step business operations.
Data Streaming and Real-Time Processing: For applications that require real-time data processing, such as streaming analytics, message brokers can facilitate the continuous flow of data between producers and consumers, enabling timely analysis and decision-making.
IoT and Device Communication: In IoT ecosystems, message brokers manage communication between a vast number of devices and backend systems. They handle massive volumes of messages generated by devices, supporting use cases like remote monitoring, smart homes, and industrial automation.
Notification and Alert Systems: Message brokers are used to distribute notifications or alerts to users or systems. This is common in applications that need to send real-time updates, such as stock trading apps, weather alerts, or system monitoring tools.

Core components

The core components of a message broker include the following.

Producer and Consumer APIs: These interfaces allow applications to send (produce) and receive (consume) messages. Producers send messages to the broker, specifying the destination, while consumers subscribe to queues or topics to receive messages. The APIs abstract the complexities of the underlying network protocols, providing a simple way for applications to interact with the message broker.
Queues and Topics: These are the destinations to which messages are sent. Queues store messages until they are consumed, typically supporting a point-to-point communication model where each message is consumed by a single consumer. Topics, on the other hand, support a publish-subscribe model, where messages published to a topic can be received by multiple subscribers. This allows for broadcasting messages to multiple consumers simultaneously.
Message Storage: This component is responsible for persisting messages until they are processed by consumers. It ensures that messages are not lost in case of failures, providing various levels of durability and reliability depending on the system's requirements.
Routing and Filtering: Many message brokers offer advanced routing and filtering capabilities, allowing messages to be directed to specific queues or topics based on rules or content. This enables more sophisticated message distribution strategies, such as load balancing or prioritization.
Management and Monitoring Tools: These tools help administrators manage the message broker, configure queues and topics, monitor message flow, and troubleshoot issues. They are essential for maintaining the health and performance of the system.
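
The relationship between the producer and consumer APIs, queues, and topics can be sketched with a tiny in-memory broker. This is an illustration only: the class `InMemoryBroker` and its method names are hypothetical, messages live in memory with no storage or monitoring, and topic subscribers are plain callbacks invoked synchronously.

```python
from collections import defaultdict, deque


class InMemoryBroker:
    """Toy broker: queues are point-to-point, topics are publish-subscribe."""

    def __init__(self):
        self._queues = defaultdict(deque)      # queue name -> pending messages
        self._subscribers = defaultdict(list)  # topic name -> callbacks

    # --- Producer API ---
    def send(self, queue_name, message):
        """Store a message on a queue until one consumer takes it."""
        self._queues[queue_name].append(message)

    def publish(self, topic, message):
        """Deliver a copy of the message to every current subscriber."""
        for callback in self._subscribers[topic]:
            callback(message)

    # --- Consumer API ---
    def receive(self, queue_name):
        """Take the oldest message off a queue, or None if it is empty."""
        pending = self._queues[queue_name]
        return pending.popleft() if pending else None

    def subscribe(self, topic, callback):
        """Register a callback that receives every message on the topic."""
        self._subscribers[topic].append(callback)
```

A message sent to a queue is returned by exactly one `receive` call, while a message published to a topic reaches every callback registered with `subscribe`.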

Messaging models and patterns

Message brokers support various messaging models to cater to different communication patterns and application requirements. Each model serves specific use cases, depending on the nature of the interaction between the sender and receiver, the need for message persistence, the requirement for direct or pattern-based routing, and other factors. Here are the primary messaging models used by brokers, along with their typical use cases.

Point-to-Point (Queue). In the point-to-point model, messages are sent to a queue, and each message is processed by exactly one consumer. If multiple consumers are listening on the same queue, the message broker ensures that each message is processed by only one of them, typically on a first-come, first-served basis. Common use cases include task distribution and job scheduling, where each task needs to be processed by only one worker.
Publish-Subscribe (Pub/Sub). In the publish-subscribe model, messages are published to a topic, and all subscribers to that topic receive the messages. This model supports one-to-many communication, allowing for broad dissemination of messages. Common use cases include event notification and real-time updates, where multiple consumers need to receive the same message.
Fan-out (Broadcast), Topic-Based Routing, and Direct Routing. These models are variations of the publish-subscribe model, offering more granular control over message routing and delivery. In the fan-out model, messages are broadcast to all subscribers, regardless of routing rules. This is useful for scenarios where messages need to be sent to all available receivers, such as logging and monitoring. In topic-based routing, messages are published with a routing key, and subscribers receive only the messages whose key matches the topic pattern they subscribed to. This allows for more selective notification and multicast routing. In direct routing, messages are sent with a specific routing key, and the message broker routes them to the queues bound with exactly that key. This model allows for direct, yet flexible, routing of messages to specific consumers.
Request-Reply. This model extends the basic messaging patterns to support synchronous interactions, where a sender sends a message (request) and waits for a response. Though message brokers are inherently asynchronous, this pattern can be implemented with a reply queue (often a temporary one) and a correlation ID that matches each response to its request. It is useful for remote procedure calls (RPC) and synchronous interactions between services.
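
The three routing variations described above can be sketched in a few lines. The sketch borrows the AMQP-style convention for topic patterns, where routing keys are dot-separated words, `*` matches exactly one word, and `#` matches zero or more words. The `Exchange` class and `topic_matches` function are hypothetical names, and queues are plain Python lists.

```python
def topic_matches(pattern, routing_key):
    """AMQP-style match: '*' is exactly one word, '#' is zero or more words."""

    def match(p, k):
        if not p:
            return not k
        if p[0] == "#":
            # '#' either absorbs nothing, or absorbs one word and stays active.
            return match(p[1:], k) or (bool(k) and match(p, k[1:]))
        if not k:
            return False
        return (p[0] == "*" or p[0] == k[0]) and match(p[1:], k[1:])

    return match(pattern.split("."), routing_key.split("."))


class Exchange:
    """Routes published messages to bound queues according to its kind."""

    def __init__(self, kind):
        if kind not in ("fanout", "direct", "topic"):
            raise ValueError(f"unknown exchange kind: {kind}")
        self.kind = kind
        self.bindings = []  # list of (binding_key, queue) pairs

    def bind(self, queue, binding_key=""):
        self.bindings.append((binding_key, queue))

    def publish(self, routing_key, message):
        for binding_key, queue in self.bindings:
            if self.kind == "fanout":
                deliver = True  # every bound queue gets a copy
            elif self.kind == "direct":
                deliver = binding_key == routing_key  # exact key match
            else:
                deliver = topic_matches(binding_key, routing_key)
            if deliver:
                queue.append(message)
```

For example, a queue bound to a topic exchange with the pattern `orders.eu.#` receives a message published with the key `orders.eu.created` but not one published with `orders.us.created`.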

Messaging protocols

Here are some common messaging protocols implemented by message brokers.

AMQP (Advanced Message Queuing Protocol). AMQP is an open standard messaging protocol known for its reliability, flexibility, and wide range of features. It provides acknowledgment-based reliable delivery, supports various routing patterns (point-to-point, pub/sub), and works well in enterprise systems where cross-platform communication is essential. AMQP is often used in financial systems needing transactional support and in microservice architectures to promote robust yet loosely coupled communication.
MQTT (Message Queuing Telemetry Transport). MQTT is a lightweight publish/subscribe protocol designed for devices with constrained resources or unreliable networks. It prioritizes minimal overhead and offers different levels of Quality of Service for message delivery. MQTT is ideal for IoT applications, mobile messaging, and systems where data is collected from remote locations with potentially limited bandwidth.
STOMP (Streaming Text Orientated Messaging Protocol). STOMP is a simple, text-based protocol focusing on reliable queuing and basic publish/subscribe functionality. Its simplicity makes it easy to implement and interoperable with various systems. STOMP is a good option for integrating legacy systems or when advanced messaging features are not a primary requirement.
Broker-Specific Protocols. Some message brokers, like Apache Kafka, use their own binary protocol, optimized for high throughput and low latency, rather than a standard such as AMQP. These protocols suit high-throughput data streaming use cases such as real-time analytics or specialized financial trading systems. Kafka's protocol is openly documented, so the potential downside is less about licensing than about lock-in: clients and tooling written for one broker's protocol do not work with another broker.
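
To show what "simple, text-based" means in practice, here is a rough sketch of how a STOMP frame is laid out on the wire: a command line, `key:value` header lines, a blank line, the body, and a terminating NUL byte. The helper names `encode_frame` and `decode_frame` are hypothetical, and the sketch deliberately skips parts of the specification such as header escaping and the `content-length` header, so it should not be used against a real broker as is.

```python
def encode_frame(command, headers, body=""):
    """Build a STOMP-style frame: command, headers, blank line, body, NUL."""
    lines = [command] + [f"{key}:{value}" for key, value in headers.items()]
    return ("\n".join(lines) + "\n\n" + body + "\x00").encode("utf-8")


def decode_frame(data):
    """Split a frame produced by encode_frame back into its three parts."""
    text = data.decode("utf-8").rstrip("\x00")
    head, _, body = text.partition("\n\n")
    command, *header_lines = head.split("\n")
    headers = dict(line.split(":", 1) for line in header_lines)
    return command, headers, body


# A SEND frame names its target with the "destination" header.
frame = encode_frame("SEND", {"destination": "/queue/orders"}, "hello")
```

Because the frame is plain text, it can be inspected or even typed by hand, which is a large part of why the protocol is easy to implement in almost any language.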

Message delivery, persistence, and durability

Message delivery, persistence and durability choices play an important role in reliability and consistency. Here's a brief overview.

Message Delivery Guarantees

Message delivery guarantees determine how reliably messages are transferred from producers to consumers. The choice of delivery guarantee impacts system performance, complexity, and reliability. It must align with overall goals regarding data accuracy and processing idempotency. There are typically three levels of delivery guarantees:

At-most-once: Messages are delivered once or not at all, prioritizing speed over reliability. This approach risks message loss but minimizes latency and resource usage.
At-least-once: Messages are guaranteed to be delivered at least once, ensuring no message loss but potentially producing duplicates if acknowledgments fail or are lost. Consumers must therefore either detect and discard duplicates or process messages idempotently.
Exactly-once: Ensures each message is delivered and processed exactly once, eliminating the risk of message loss or duplication. This is the most complex and resource-intensive guarantee to implement, involving transactional or coordinated mechanisms between producers, brokers, and consumers.
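
The interplay between at-least-once delivery and idempotent processing can be sketched as follows. The classes `AtLeastOnceQueue` and `IdempotentConsumer` are hypothetical, redelivery is triggered by an explicit method call rather than a real acknowledgment timeout, and the set of seen message IDs is kept in memory, which a production consumer would need to persist.

```python
from collections import deque


class AtLeastOnceQueue:
    """Keeps each message until it is acknowledged, redelivering if needed."""

    def __init__(self):
        self._ready = deque()  # messages waiting to be delivered
        self._unacked = {}     # delivered but not yet acknowledged

    def send(self, msg_id, payload):
        self._ready.append((msg_id, payload))

    def receive(self):
        if not self._ready:
            return None
        msg_id, payload = self._ready.popleft()
        self._unacked[msg_id] = payload
        return msg_id, payload

    def ack(self, msg_id):
        self._unacked.pop(msg_id, None)

    def redeliver_unacked(self):
        """Simulate an ack timeout: requeue every unacknowledged message."""
        for msg_id, payload in self._unacked.items():
            self._ready.append((msg_id, payload))
        self._unacked.clear()


class IdempotentConsumer:
    """Applies each message ID at most once, so duplicates are harmless."""

    def __init__(self):
        self.seen = set()
        self.balance = 0

    def handle(self, msg_id, amount):
        if msg_id in self.seen:
            return False  # duplicate delivery, already applied
        self.seen.add(msg_id)
        self.balance += amount
        return True
```

If the consumer processes a message but its acknowledgment is lost, the queue delivers the message again. Because the consumer remembers the message ID, the second delivery leaves `balance` unchanged, which gives the effect of exactly-once processing on top of at-least-once delivery.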

Ordering: Ensuring messages are delivered in the order they were sent can be critical for certain applications. This can be challenging in distributed systems, especially when scaling out or dealing with failures. Brokers offer different ordering guarantees, including FIFO (First-In-First-Out), partial ordering, causal ordering, and total ordering.
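
One way a consumer can restore FIFO order when messages arrive out of order is to buffer early arrivals until the gap is filled. The sketch below assumes the producer attaches a sequence number to every message, which is an assumption for illustration and not something every broker provides. The class name `Resequencer` is hypothetical.

```python
class Resequencer:
    """Releases messages in sequence order, buffering any that arrive early."""

    def __init__(self, first_seq=0):
        self._next = first_seq  # next sequence number we may release
        self._buffer = {}       # early arrivals, keyed by sequence number

    def accept(self, seq, payload):
        """Return the payloads that can now be delivered, in order."""
        if seq < self._next or seq in self._buffer:
            return []  # duplicate of something already seen
        self._buffer[seq] = payload
        ready = []
        while self._next in self._buffer:
            ready.append(self._buffer.pop(self._next))
            self._next += 1
        return ready
```

If message 1 arrives before message 0, `accept(1, "b")` returns an empty list and the later `accept(0, "a")` returns both payloads in their original order.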

Message Persistence

Persistence refers to the capability of the messaging system to store messages on a stable storage medium like a disk until they have been successfully delivered to and acknowledged by the consumer. This ensures that messages are not lost in case of system failures. Persistence is crucial for applications where message loss cannot be tolerated, ensuring that messages can survive broker restarts or crashes. Enabling persistence can affect throughput due to the overhead of disk I/O operations. Systems must balance the need for message durability with performance requirements. Implementing message persistence also requires strategies for managing storage, including considerations for disk space and message retention policies.
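
As a rough illustration of persistence, the sketch below journals every send and acknowledgment to an append-only file and replays that file on startup to recover unacknowledged messages. The class `JournaledQueue` and its JSON-lines record format are hypothetical. The `os.fsync` call on every write is the kind of disk I/O overhead mentioned above, and the journal grows without bound because compaction and retention are left out.

```python
import json
import os


class JournaledQueue:
    """Queue whose pending messages survive a restart via an on-disk journal."""

    def __init__(self, path):
        self._path = path
        self._pending = {}  # msg_id -> payload, sent but not yet acknowledged
        if os.path.exists(path):
            self._replay()

    def _append(self, record):
        with open(self._path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())  # force the record to disk before returning

    def _replay(self):
        """Rebuild the pending set by re-reading every journal record."""
        with open(self._path, encoding="utf-8") as f:
            for line in f:
                record = json.loads(line)
                if record["op"] == "send":
                    self._pending[record["id"]] = record["payload"]
                else:  # "ack"
                    self._pending.pop(record["id"], None)

    def send(self, msg_id, payload):
        self._append({"op": "send", "id": msg_id, "payload": payload})
        self._pending[msg_id] = payload

    def ack(self, msg_id):
        self._append({"op": "ack", "id": msg_id})
        self._pending.pop(msg_id, None)

    def pending(self):
        return dict(self._pending)
```

Creating a second `JournaledQueue` on the same path simulates a broker restart: messages that were sent but never acknowledged reappear in `pending()`, while acknowledged ones do not.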

Message Durability

Durability goes beyond persistence by ensuring that once a message is reported as successfully sent or committed, it will not be lost even in the event of failures. This often involves replication of messages across multiple nodes or brokers. Durability is essential for maintaining data integrity and system reliability, particularly in systems where the accuracy and completeness of message delivery are critical. Durable systems should also include mechanisms for recovering from failures, such as transaction logs that can be replayed to rebuild the state after a crash.
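
Replication can be sketched with a simple majority rule: a send is reported as successful only if more than half of the replicas stored the message. This is one illustrative policy chosen for the example, not the scheme used by any particular broker, and the `Replica` class and `replicated_send` function are hypothetical.

```python
class Replica:
    """In-memory stand-in for a broker node that stores a copy of each message."""

    def __init__(self, name):
        self.name = name
        self.log = []
        self.up = True  # flip to False to simulate a failed node

    def append(self, message):
        if not self.up:
            return False
        self.log.append(message)
        return True


def replicated_send(replicas, message):
    """Report success only if a majority of replicas stored the message."""
    acks = sum(1 for replica in replicas if replica.append(message))
    # Note: on failure the message may still sit on a minority of replicas;
    # a real system needs a recovery step to reconcile that.
    return acks > len(replicas) // 2
```

With three replicas, the send still succeeds when one node is down, because two copies exist. With two nodes down, the producer is told the send failed and can retry.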

Common message broker implementations

Feature / Broker	ActiveMQ	Kafka	Azure Service Bus	Amazon SQS	RabbitMQ
Primary Use Cases	Enterprise Integration, IoT, General messaging	High-throughput event streaming, Log aggregation	Enterprise cloud integration, Decoupling applications in Azure	Decoupling cloud applications, Simple message queuing without management overhead	General messaging, Complex routing, High-speed processing
Protocols Supported	AMQP, MQTT, STOMP, OpenWire (with JMS as the client API)	Kafka binary protocol over TCP (client libraries for multiple languages)	AMQP, HTTP, SBMP	HTTP, HTTPS	AMQP, MQTT, STOMP, HTTP
Licensing	Apache License 2.0	Apache License 2.0	Proprietary (Microsoft)	Proprietary (Amazon)	Mozilla Public License
Adoption & Community Support	High, with a broad community and extensive documentation	Very high, widely used in industries for real-time data processing	High, especially among enterprises using Azure services	Very high, widely adopted for cloud-based applications	High, with a strong community and comprehensive documentation
Throughput & Scalability	Supports clustering for higher throughput and scalability	Designed for high throughput and scalability, with a distributed architecture	Scalable with Azure infrastructure, but can be limited by pricing tiers	Automatically scales with demand but can have higher latencies	Supports clustering and is highly scalable with efficient routing capabilities

A note about streaming platforms

The terms streaming platform and message broker often appear together in discussions of distributed systems and data processing. The two overlap in functionality, but they are designed with different primary objectives. Message brokers are designed to handle asynchronous messaging and support various messaging models, while streaming platforms focus on processing and analyzing continuous streams of data. Streaming platforms, such as Apache Kafka, are built for high-throughput, real-time data processing and event-driven architectures, which makes them well suited to use cases like log aggregation, real-time analytics, and event sourcing.
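
One design difference that is often associated with streaming platforms, and that the paragraph above does not spell out, is that records are kept in an append-only log and each consumer tracks its own read position (offset), so reading does not remove anything. The sketch below illustrates that idea under this assumption. The `EventLog` and `LogConsumer` classes are hypothetical and leave out partitions, retention, and persistence.

```python
class EventLog:
    """Append-only log: reading a record does not remove it."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1  # offset of the new record

    def read(self, offset, max_records=10):
        return self._records[offset:offset + max_records]


class LogConsumer:
    """Reads from the log at its own pace by remembering its offset."""

    def __init__(self, log, offset=0):
        self.log = log
        self.offset = offset

    def poll(self):
        batch = self.log.read(self.offset)
        self.offset += len(batch)
        return batch
```

Two consumers polling the same log each see every record, and a consumer created later with `offset=0` can replay the full history, which is what makes this model a natural fit for event sourcing.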

Message brokers play a pivotal role in distributed systems and microservices architectures, providing a flexible, reliable, and scalable foundation for communication between services. By decoupling components, supporting asynchronous messaging, and ensuring message delivery, they enable the development of resilient, responsive, and efficient systems.