Reference Architectures
Production-ready patterns for enterprise OSS.
Battle-tested high-availability architectures for RabbitMQ, Kafka, and PostgreSQL — designed by engineers who operate these systems at Fortune 500 scale and validated against real production topologies.
Need a custom architecture for your environment? Book an architecture review.
RabbitMQ 3-Node HA Cluster
Production-grade RabbitMQ high availability using quorum queues (Raft consensus), HAProxy load balancing, and NVMe-backed storage for the quorum queue write-ahead log. Tolerates one node failure without losing confirmed messages.
Architecture specifications
- 3 RabbitMQ nodes; each quorum queue has 1 leader and 2 follower replicas
- Quorum Queues for guaranteed delivery
- HAProxy for connection distribution
- NVMe storage for the quorum queue write-ahead log (WAL)
- Erlang/OTP 26.x patched builds
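The one-node failure tolerance of a 3-node cluster follows from Raft majority arithmetic. A minimal sketch in Python; the function names are illustrative and not part of RabbitMQ:

```python
def majority(replicas: int) -> int:
    """Smallest number of replicas that forms a Raft majority."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """Replicas that can be lost while a majority is still reachable."""
    return replicas - majority(replicas)


if __name__ == "__main__":
    # Print cluster size, majority size, and tolerated failures.
    for n in (3, 4, 5):
        print(n, majority(n), tolerated_failures(n))
```

Under this arithmetic a fourth node adds no extra failure tolerance over three, which is why odd cluster sizes are the usual choice.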
OSSeva coverage for this architecture
All components in this reference architecture receive active CVE patches from OSSeva. Patched builds are drop-in replacements for upstream artifacts — delivered via Helm, Docker, or your existing artifact repository. Compliance attestation letters are included with every patch release.
Kafka Multi-DC Active–Active
Bidirectional Kafka replication across two datacenters using MirrorMaker 2. Both DCs serve producers and consumers independently, while replication runs asynchronously in the background with cross-DC replication latency under 80 ms.
Architecture specifications
- 3 brokers per datacenter (6 total)
- MirrorMaker 2 for bidirectional replication
- ZooKeeper ensemble (3 nodes per DC)
- Consumer group offsets mirrored separately from topic data (MM2 checkpoints)
- Kafka 2.8–3.7 patched builds
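With bidirectional MirrorMaker 2, each datacenter ends up holding its own topics plus prefixed copies of the other side's topics. The sketch below models the naming used by MM2's default replication policy (source alias, a dot, then the topic name) and the prefix check that keeps a record from being mirrored back to where it came from. The aliases `dc-a` and `dc-b`, the topic `orders`, and the function names are assumptions for illustration; this is a simplified model, not MM2 code.

```python
from __future__ import annotations

SEPARATOR = "."  # default separator in MM2's DefaultReplicationPolicy


def remote_topic(source_alias: str, topic: str) -> str:
    """Name a topic receives on the target cluster after mirroring."""
    return f"{source_alias}{SEPARATOR}{topic}"


def would_cycle(topic: str, target_alias: str) -> bool:
    """True if the topic name shows it already passed through the target."""
    prefixes = topic.split(SEPARATOR)[:-1]
    return target_alias in prefixes


def subscriptions(topic: str, remote_aliases: list[str]) -> list[str]:
    """Topics a consumer reads to see records produced in every DC."""
    return [topic] + [remote_topic(alias, topic) for alias in remote_aliases]


if __name__ == "__main__":
    # A consumer in dc-a reads the local topic and the mirror from dc-b.
    print(subscriptions("orders", ["dc-b"]))
    # dc-a.orders (living in dc-b) must not be mirrored back into dc-a.
    print(would_cycle("dc-a.orders", "dc-a"))
```

In an active-active layout, consumers that need a global view subscribe to both the local topic and the prefixed mirror, while producers only ever write to the unprefixed local topic.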
OSSeva coverage for this architecture
All components in this reference architecture receive active CVE patches from OSSeva. Patched builds are drop-in replacements for upstream artifacts — delivered via Helm, Docker, or your existing artifact repository. Compliance attestation letters are included with every patch release.
PostgreSQL HA with Patroni & etcd
Automated PostgreSQL failover using Patroni as the cluster manager and etcd as the distributed consensus store. HAProxy routes writes to the primary and reads to standbys.
Architecture specifications
- 1 primary + 2 streaming standbys
- Patroni for automated leader election
- etcd 3-node cluster for DCS
- HAProxy for read/write routing
- WAL-based streaming replication
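Read/write routing of this kind is commonly driven by HTTP health checks against Patroni's REST API, where `/primary` answers 200 only on the current leader and `/replica` answers 200 only on a healthy standby. The sketch below models that behaviour in plain Python to show how the two HAProxy backends end up populated. The node names, role labels, and function names are hypothetical, and no real HTTP calls are made.

```python
from __future__ import annotations


def patroni_status(role: str, endpoint: str) -> int:
    """Model the HTTP status a node returns for a health-check endpoint."""
    expected_role = {"/primary": "primary", "/replica": "replica"}
    if endpoint not in expected_role:
        raise ValueError(f"unsupported endpoint: {endpoint}")
    return 200 if role == expected_role[endpoint] else 503


def build_pools(roles: dict[str, str]) -> tuple[list[str], list[str]]:
    """Split nodes into (write pool, read pool) based on health checks."""
    writers = sorted(
        node for node, role in roles.items()
        if patroni_status(role, "/primary") == 200
    )
    readers = sorted(
        node for node, role in roles.items()
        if patroni_status(role, "/replica") == 200
    )
    return writers, readers


if __name__ == "__main__":
    # Hypothetical three-node cluster: one primary, two standbys.
    print(build_pools({"pg1": "primary", "pg2": "replica", "pg3": "replica"}))
```

After a failover the same checks move the promoted standby into the write pool without any change to the load balancer configuration.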
OSSeva coverage for this architecture
All components in this reference architecture receive active CVE patches from OSSeva. Patched builds are drop-in replacements for upstream artifacts — delivered via Helm, Docker, or your existing artifact repository. Compliance attestation letters are included with every patch release.
Frequently asked questions
What does a production-ready RabbitMQ HA cluster look like?
A production-grade RabbitMQ HA cluster typically consists of 3 or 5 nodes (an odd number, so the cluster keeps a clear majority) running behind a layer-4 load balancer (HAProxy or AWS NLB). Queues are configured as quorum queues (Raft-based replication, recommended for all new deployments) rather than classic mirrored queues. TLS is enabled for AMQP 0-9-1, AMQP 1.0, MQTT, and the management API. Each node runs on its own VM or bare-metal host for failure domain isolation. OSSeva's reference architecture covers this topology with specific configuration recommendations for Kubernetes (RabbitMQ Cluster Operator) and traditional deployments.
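A queue becomes a quorum queue through arguments supplied when it is declared. The sketch below builds such an argument table; the helper name and default values are assumptions for illustration, while the `x-queue-type`, `x-quorum-initial-group-size`, and `x-delivery-limit` keys are the standard RabbitMQ queue arguments.

```python
from __future__ import annotations


def quorum_queue_arguments(
    initial_group_size: int = 3,
    delivery_limit: int | None = None,
) -> dict[str, object]:
    """Build the declare-time argument table for a quorum queue."""
    if initial_group_size < 1:
        raise ValueError("initial_group_size must be at least 1")
    args: dict[str, object] = {
        "x-queue-type": "quorum",
        "x-quorum-initial-group-size": initial_group_size,
    }
    if delivery_limit is not None:
        # Caps redelivery attempts for messages that keep failing.
        args["x-delivery-limit"] = delivery_limit
    return args


if __name__ == "__main__":
    print(quorum_queue_arguments(delivery_limit=5))
    # With the pika client (not imported here) the table would be passed as:
    # channel.queue_declare(queue="orders", durable=True,
    #                       arguments=quorum_queue_arguments())
```

Quorum queues must be declared durable and non-exclusive, so the declaration in the trailing comment sets `durable=True`; the queue name `orders` is a placeholder.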
What is the recommended Kafka multi-datacenter architecture?
For multi-datacenter Kafka, OSSeva recommends MirrorMaker 2 (MM2) for active-passive replication (MM2 can also run bidirectionally, as in the active-active reference architecture above), or the stretch cluster pattern (3 DCs with a KRaft controller quorum) for active-active with automatic failover. The stretch cluster requires low inter-DC latency (< 30 ms RTT). For most enterprises, active-passive with MM2 and a controlled failover runbook is more operationally practical. OSSeva provides reference configurations for both patterns, including offset translation and consumer group migration procedures.
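The selection rule in this answer can be written down as a small decision function. This is only a sketch of the guidance as stated here (three datacenters and an RTT under 30 ms for a stretch cluster, MM2 active-passive otherwise); the function name and return labels are hypothetical, and a real decision would weigh more factors than these two.

```python
STRETCH_MAX_RTT_MS = 30.0  # RTT threshold quoted in the answer above
STRETCH_MIN_DATACENTERS = 3  # stretch cluster pattern spans 3 DCs


def recommend_pattern(datacenters: int, inter_dc_rtt_ms: float) -> str:
    """Pick a multi-DC Kafka pattern from DC count and inter-DC RTT."""
    if datacenters < 2:
        raise ValueError("multi-datacenter patterns need at least 2 DCs")
    if inter_dc_rtt_ms < 0:
        raise ValueError("RTT cannot be negative")
    if (datacenters >= STRETCH_MIN_DATACENTERS
            and inter_dc_rtt_ms < STRETCH_MAX_RTT_MS):
        return "stretch-cluster"
    return "mm2-active-passive"


if __name__ == "__main__":
    print(recommend_pattern(3, 12.0))
    print(recommend_pattern(2, 12.0))
    print(recommend_pattern(3, 70.0))
```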
What is the PostgreSQL failover architecture OSSeva recommends?
OSSeva recommends Patroni for PostgreSQL HA, using etcd or Consul for distributed consensus on leader election. A typical setup includes 1 primary and 2 standby nodes with synchronous replication to at least one standby, plus PgBouncer for connection pooling, repointed automatically at the new primary after failover. HAProxy gives applications a single stable endpoint. This architecture achieves RPO ≈ 0 (no loss of synchronously committed transactions) and an RTO of 20–40 seconds for automated failover.
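How quickly Patroni reacts to a failed primary depends in part on three timing settings: `ttl` (lifetime of the leader key), `loop_wait`, and `retry_timeout`. Patroni's documentation asks that `loop_wait + 2 * retry_timeout <= ttl`. The sketch below checks that rule; the class name is hypothetical, and the defaults shown (30, 10, 10) are taken to be Patroni's documented defaults and should be verified against the version in use. It does not derive the RTO figures quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatroniTiming:
    """Timing settings, in seconds, from Patroni's dynamic configuration."""

    ttl: int = 30            # leader key lifetime in the DCS
    loop_wait: int = 10      # pause between HA loop runs
    retry_timeout: int = 10  # retry budget for DCS and PostgreSQL calls

    def is_consistent(self) -> bool:
        """Check the documented rule: loop_wait + 2 * retry_timeout <= ttl."""
        return self.loop_wait + 2 * self.retry_timeout <= self.ttl


if __name__ == "__main__":
    print(PatroniTiming().is_consistent())
    # Lowering ttl alone breaks the rule unless the other values shrink too.
    print(PatroniTiming(ttl=20).is_consistent())
    print(PatroniTiming(ttl=20, loop_wait=5, retry_timeout=5).is_consistent())
```

Lowering `ttl` shortens the time a dead primary's leader key can linger, but it has to be reduced together with the other two values to keep the rule satisfied.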
Need a custom architecture for your environment?
OSSeva Assure and Operate customers get reference architectures tailored to their topology, cloud provider, and compliance requirements.