Design and development of mission-critical high-performance systems based on PostgreSQL
PGConf.Online wrapped up 10 days ago, and we want to share some important stats with you. We were glad to have welcomed 2000+ registrants this year with 700-900 of them joining us online each day! The total attendance/registration rate for all three days was near 75%, which is amazing!
In early March, before International Women’s Day, we asked our female tech professionals a few questions and were surprised by some of them providing us answers :) Let’s celebrate women who code working here at Postgres Pro and have a look at what they are up to!
Postgres Professional has become one of the main sponsors backing the development of the Psycopg library, the most popular PostgreSQL adapter for the Python programming language.
It’s this time of the year again, and FOSDEM is coming! In 2021, Postgres Professional is really well-represented in the PostgreSQL devroom, as we have 5 talks accepted by the Committee. Let’s take a look at all presentations to be given by our team at this conference.
Last time we got acquainted with the structure of an important component of the shared memory — the buffer cache. A risk of losing information from RAM is the main reason why we need techniques to recover data after failure. Now we will discuss these techniques.
Sadly, there's no such thing as miracles: to survive the loss of information in RAM, everything needed must be duly saved to disk (or other nonvolatile media).
Therefore, the following was done. Along with changing data, the log of these changes is maintained. When we change something on a page in the buffer cache, we create a record of this change in the log. The record contains the minimum information sufficient to redo the change if the need arises.
For this to work, the log record must obligatory get to disk before the changed page gets there. And this explains the name: write-ahead log (WAL).
In case of failure, the data on disk appear to be inconsistent: some pages were written earlier, and others later. But WAL remains, which we can read and redo the operations that were performed before the failure but their result was late to reach the disk.
The previous series addressed isolation and multiversion concurrency control, and now we start a new series: on write-ahead logging. To remind you, the material is based on training courses on administration that Pavel Luzanov and I are creating (mostly in Russian, although one course is available in English), but does not repeat them verbatim and is intended for careful reading and self-experimenting.
This series will consist of four parts:
Many thanks to Elena Indrupskaya for the translation of these articles into English.
Part of the data that a DBMS works with is stored in RAM and gets written to disk (or other nonvolatile storage) asynchronously, i. e., writes are postponed for some time. The more infrequently this happens the less is the input/output and the faster the system operates.
But what will happen in case of failure, for example, power outage or an error in the code of the DBMS or operating system? All the contents of RAM will be lost, and only data written to disk will survive (disks are not immune to certain failures either, and only a backup copy can help if data on disk are affected). In general, it is possible to organize input/output in such a way that data on disk are always consistent, but this is complicated and not that much efficient (to my knowledge, only Firebird chose this option).
Usually, and specifically in PostgreSQL, data written to disk appear to be inconsistent, and when recovering after failure, special actions are required to restore data consistency. Write-ahead logging (WAL) is just a feature that makes it possible.
Now we've reached the last topic of this series. We will talk on the transaction id wraparound and freezing.
PostgreSQL uses 32-bit transaction IDs. This is a pretty large number (about 4 billion), but with intensive work of the server, this number is not unlikely to get exhausted. For example: with the workload of 1000 transactions a second, this will happen as early as in one month and a half of continuous work.
But we've mentioned that multiversion concurrency control relies on the sequential numbering, which means that of two transactions the one with a smaller number can be considered to have started earlier. Therefore, it is clear that it is not an option to just reset the counter and start the numbering from scratch.
Functions for defining and using variables in client sessions.
Pg_probackup is a utility to manage backup and recovery of PostgreSQL database clusters. It is designed to perform periodic backups of the PostgreSQL instance that enable you to restore the server in case of a failure.
An active monitoring agent for Postgres Pro. Based on zabbix, mamonsu provides an extensible cross-platform solution that can collect and visualize multiple Postgres Pro and system metrics.
JsQuery – is a language to query jsonb data type, introduced in PostgreSQL release 9.4. It's primary goal is to provide an additional functionality to jsonb (currently missing in PostgreSQL), such as a simple and effective way to search in nested objects and arrays, more comparison operators with indexes support.
PGLZ compression of individual values is used now in PostgreSQL. That’s good, but sometimes it’s possible to achieve significant compression only when compressing multiple values together. This is why we’re considering page-level data compression.
pg_pathman extension had been developed by Postgres Professional several years ago. It provides the most effective way to partition tables in a database, capable of having thousands of partitions per table. Later a similar partitioning functionality had appeared in community PostgreSQL, being easier in administration, but remarkably less effective. Now we are working on the unification of both partitioning engines.
We achieved a significant progress in PostgreSQL extendability: FDWs, custom access methods, generic WAL. And we’re not so far from having pluggable storage engines. Concept of API was presented at PGCon 2016.
Postgres Professional is the Russian PostgreSQL company founded by Russian PostgreSQL contributors. Company has 50+ employees, among them three Major PostgreSQL Contributors.
Postgres Professional is an active member of international PostgreSQL community, developers had committed 93 patches to the latest release of PostgreSQL 10.0.
Our company had successfully performed large PostgreSQL projects including database migration projects for well-known Russian and international companies. We provide industrial PostgreSQL services: vendor technical support, migration, custom extensions and core patches development, migration-related consulting, training and certification.