Dmitriy Ursegov's Blog

Our blog is where our Postgres Pro experts share their knowledge with the community. Blog posts cover a variety of topics, including Postgres internals, extensions and monitoring solutions.


September 2, 2022   •   PostgreSQL

There are several ways to build a sharded database on top of distributed Postgres instances. One of the most interesting and general approaches is built-in support for sharding. Historically, Postgres has had FDW (foreign data wrapper) and partitioning features that can be used together to build a sharded database, though there were concerns in the past about adopting them for a complete solution. We will review the current state of postgres_fdw along with patches that fix some significant bottlenecks in the planner, and demonstrate the latest TPC-C results in comparison with existing sharding solutions.
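The fdw-plus-partitioning combination mentioned above can be sketched in plain SQL using stock Postgres features. The server names, host names, and table layout below are illustrative, not taken from the article:

```sql
-- A minimal sharding sketch with stock postgres_fdw and declarative
-- partitioning; all names here are illustrative.
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

-- One foreign server per shard node.
CREATE SERVER shard1 FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'node1', dbname 'app');
CREATE SERVER shard2 FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'node2', dbname 'app');
CREATE USER MAPPING FOR CURRENT_USER SERVER shard1 OPTIONS (user 'app');
CREATE USER MAPPING FOR CURRENT_USER SERVER shard2 OPTIONS (user 'app');

-- A hash-partitioned table whose partitions are foreign tables,
-- so each partition's data lives on a different node.
CREATE TABLE orders (id bigint, customer_id bigint, total numeric)
    PARTITION BY HASH (customer_id);
CREATE FOREIGN TABLE orders_p0 PARTITION OF orders
    FOR VALUES WITH (MODULUS 2, REMAINDER 0) SERVER shard1;
CREATE FOREIGN TABLE orders_p1 PARTITION OF orders
    FOR VALUES WITH (MODULUS 2, REMAINDER 1) SERVER shard2;
```

Queries against `orders` are then routed by the planner to the relevant foreign partitions, which is exactly the path where the planner bottlenecks discussed here show up.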

Despite the promising results, our experiments with postgres_fdw revealed fundamental issues that still exist and make it hard to build an efficient system for most workloads. We'll discuss these issues and show a general approach that solves them for a cluster of homogeneous Postgres instances. At the same time, it is based on FDW and partitioning, and most of the changes are implemented as an extension. It consists of two components.

The first component is a transport that uses only a single connection between each pair of nodes. This leads to M+N connections in the cluster in total instead of M*N, where M is the number of client connections and N is the number of nodes. We'll show the implementation of such a multiplexing transport, which achieves a throughput of 1 million pings per second between nodes while running as a single background worker process.

The second component is the integration of postgres_fdw, the execution model, and transaction support with the new transport. The implementation provides more than two times lower latency for short transactional queries. We achieved single-instance performance on a two-node cluster for simple queries, with near-linear scalability; in the same test, the unmodified postgres_fdw setup reaches single-instance performance only on an eight-node cluster. Our approach also allows efficient transfers of data in binary format that do not require conversion into an intermediate representation. Scalability test results for these transport-heavy cases will be shown in the article.
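The connection-count argument is easy to make concrete. Taking, purely for illustration, M = 100 client connections and N = 8 nodes, the per-backend scheme multiplies while the multiplexed transport only adds:

```sql
-- Illustrative arithmetic only: M clients, N nodes (values are examples).
SELECT m, n,
       m * n AS conns_without_multiplexing,  -- each client backend opens a connection to every node
       m + n AS conns_with_multiplexing      -- one shared inter-node connection per node, plus the clients
FROM (VALUES (100, 8)) AS v(m, n);
-- 100 * 8 = 800 connections versus 100 + 8 = 108
```

The gap widens linearly in both M and N, which is why the single-connection transport matters most for large clusters with many short-lived client sessions.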