Bottled Water: PostgreSQL to Kafka replication - Mailing list pgsql-announce

From Martin Kleppmann
Subject Bottled Water: PostgreSQL to Kafka replication
Msg-id 797DF957-CE33-407F-99DB-7C7125E37ACE@kleppmann.com
List pgsql-announce
Hi PostgreSQL world,

I'd like to announce a new open source project, called "Bottled Water", for getting data from PostgreSQL into Kafka:
http://blog.confluent.io/2015/04/23/bottled-water-real-time-integration-of-postgresql-and-kafka/
https://github.com/confluentinc/bottledwater-pg/

In case you're not aware of Kafka (http://kafka.apache.org/), it's an open source message broker that was originally
developed at LinkedIn and is now a lively Apache project. Unlike many other messaging systems (AMQP, JMS, etc.), it is
structured as a commit log, which makes it well suited for replicating data from one system to another.
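
To make the commit-log idea concrete, here is a minimal in-memory sketch. It is not Kafka's API: the names CommitLog, Consumer, append, read_from and poll are made up for illustration, and the convention that a None value means "row deleted" is an assumption of this sketch. The point is only that each consumer tracks its own offset into an append-only log, so a consumer that starts late can replay from the beginning and end up with the same state.

```python
class CommitLog:
    """Append-only log: records are only ever added at the end."""

    def __init__(self):
        self._records = []

    def append(self, record):
        """Store a record and return its offset (its position in the log)."""
        self._records.append(record)
        return len(self._records) - 1

    def read_from(self, offset):
        """Return all records at or after the given offset."""
        return self._records[offset:]


class Consumer:
    """Hypothetical downstream system that rebuilds a key/value view."""

    def __init__(self, log):
        self.log = log
        self.offset = 0   # next offset this consumer has not yet seen
        self.state = {}   # replicated copy of the data

    def poll(self):
        """Apply every record not yet seen, in log order."""
        for key, value in self.log.read_from(self.offset):
            if value is None:
                # Assumption of this sketch: None marks a deletion.
                self.state.pop(key, None)
            else:
                self.state[key] = value
            self.offset += 1


log = CommitLog()
log.append(("a", 1))
log.append(("b", 2))

early = Consumer(log)
early.poll()              # sees the first two records

log.append(("a", None))   # delete key "a"
early.poll()              # picks up only the new record

late = Consumer(log)      # starts from offset 0 and replays everything
late.poll()
```

Both consumers end up with the same state because they apply the same records in the same order, regardless of when they started reading.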

Bottled Water uses PostgreSQL 9.4's logical decoding feature to extract a consistent snapshot of a database, plus an
ongoing stream of logical changes. Data is encoded in Avro (http://avro.apache.org/), a language-independent
serialization format, with schemas that are automatically derived from the PostgreSQL table schemas. Once the data is in
Kafka, it's easier to import into downstream systems, such as full-text search indexes, caches, data warehouses, stream
analytics systems, auditing and monitoring tools, etc.
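
As a rough illustration of what deriving an Avro schema from a table schema can look like, here is a small sketch. It is not Bottled Water's code: the function name avro_schema_for_table, the PG_TO_AVRO type mapping, the fallback to "string" for unmapped types, and the example "users" table are all assumptions made for this example. The only Avro facts relied on are that a record schema is a JSON object with "type", "name" and "fields", and that a nullable field is expressed as a union with "null".

```python
import json

# Assumed mapping from a few PostgreSQL type names to Avro primitive types.
# The real project may map types differently; this is illustrative only.
PG_TO_AVRO = {
    "boolean": "boolean",
    "smallint": "int",
    "integer": "int",
    "bigint": "long",
    "real": "float",
    "double precision": "double",
    "text": "string",
    "bytea": "bytes",
}


def avro_schema_for_table(table_name, columns):
    """Build an Avro record schema (as a dict) for one table.

    columns is a list of (column_name, pg_type, nullable) tuples.
    Unmapped PostgreSQL types fall back to "string" (an assumption).
    """
    fields = []
    for name, pg_type, nullable in columns:
        avro_type = PG_TO_AVRO.get(pg_type, "string")
        if nullable:
            # Nullable column: union with "null" and a null default.
            fields.append(
                {"name": name, "type": ["null", avro_type], "default": None}
            )
        else:
            fields.append({"name": name, "type": avro_type})
    return {"type": "record", "name": table_name, "fields": fields}


# Hypothetical table: users(id bigint NOT NULL, email text NOT NULL, bio text)
schema = avro_schema_for_table(
    "users",
    [
        ("id", "bigint", False),
        ("email", "text", False),
        ("bio", "text", True),
    ],
)
print(json.dumps(schema, indent=2))
```

In a real setup the column list would come from the database catalog rather than being written by hand.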

The blog post above has more detail on the design and the rationale behind it. This is an alpha release that is not yet
fit for production use, but it's ready for experimentation. Feedback and contributions welcome!

Martin


