Re: [PATCH] Compression dictionaries for JSONB - Mailing list pgsql-hackers
From: Matthias van de Meent
Subject: Re: [PATCH] Compression dictionaries for JSONB
Msg-id: CAEze2WjyxVnKEDv8s5eP=dwYgtBKrQzVF4wvWk2Nb1FRgGGmTw@mail.gmail.com
In response to: Re: [PATCH] Compression dictionaries for JSONB (Aleksander Alekseev <aleksander@timescale.com>)
List: pgsql-hackers
On Mon, 6 Feb 2023 at 15:03, Aleksander Alekseev <aleksander@timescale.com> wrote:
>
> Hi,
>
> I see your point regarding the fact that creating dictionaries on a
> training set is too beneficial to neglect it. Can't argue with this.
>
> What puzzles me though is: what prevents us from doing this on a page
> level as suggested previously?

The complexity of page-level compression is significant, as pages are currently a base primitive of our persistency and consistency scheme. TOAST builds on top of these low-level primitives and has access to the catalogs; smgr does neither, because it needs to be accessible and usable without catalog access during replay in recovery.

I would like to know how you envision providing consistency if page-level compression were implemented: wouldn't it increase WAL overhead (and WAL synchronization overhead), since an updated page would have to be written out to a new location whenever its compressed size changes?

> More similar data you compress the more space and disk I/O you save.
> Additionally you don't have to compress/decompress the data every time
> you access it. Everything that's in shared buffers is uncompressed.
> Not to mention the fact that you don't care what's in pg_attribute,
> the fact that schema may change, etc. There is a table and a
> dictionary for this table that you refresh from time to time. Very
> simple.

You cannot "just" refresh a dictionary that was used to compress an object, because you need that same dictionary to decompress the object too.

Additionally, I don't think block-level compression is related to this thread in a meaningful way: TOAST and datatype-level compression reduce the on-page size of attributes, and would benefit from improved compression regardless of the size of pages when stored on disk, but a page will always use 8kB when read into memory.
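The dictionary point above can be illustrated with zlib's preset-dictionary support (a minimal sketch, not PostgreSQL code; the dictionary contents are made up for illustration): once a value has been compressed against a dictionary, that exact dictionary is required again at decompression time, so replacing the dictionary would leave previously compressed values unreadable.

```python
import zlib

# Hypothetical "training set"-derived dictionary: byte patterns we
# expect to recur in the stored values (invented JSON key names).
dictionary = b'{"customer_id": "order_status": "shipping_address":'

data = b'{"customer_id": 42, "order_status": "shipped"}'

# Compress using the preset dictionary.
comp = zlib.compressobj(zdict=dictionary)
compressed = comp.compress(data) + comp.flush()

# Decompression succeeds only with the same dictionary ...
decomp = zlib.decompressobj(zdict=dictionary)
assert decomp.decompress(compressed) == data

# ... and fails without it: the stream demands the original
# dictionary, so "refreshing" it would break stored values.
try:
    zlib.decompressobj().decompress(compressed)
except zlib.error:
    print("cannot decompress without the original dictionary")
```

This is why a dictionary, once used, effectively becomes part of the stored data: any refresh scheme has to keep every dictionary version that live tuples still reference.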
A tuple that uses less space on pages will thus always be the better option when optimizing for memory usage, while also reducing storage size.

> Of course the disadvantage here is that we are not saving the memory,
> unlike the case of tuple-level compression. But we are saving a lot of
> CPU cycles

Do you have any indication of how attribute-level compression compares against page-level compression in CPU cycles?

> and doing less disk IOs.

Less I/O bandwidth, but I doubt it uses fewer operations, as each page would still need to be read, which currently happens as a page-by-page I/O operation: 10 page reads use 10 syscalls to read data from disk - 10 I/O ops.

> I would argue that saving CPU
> cycles is generally more preferable. CPUs are still often a bottleneck
> while the memory becomes more and more available, e.g there are
> relatively affordable (for a company, not an individual) 1 TB RAM
> instances, etc.

But not all systems have that 1 TB of RAM, and we cannot expect all users to increase their RAM.

> So it seems to me that doing page-level compression would be simpler
> and more beneficial in the long run (10+ years). Don't you agree?

Page-level compression cannot compress patterns longer than one page. TOAST is often used to store values larger than 8kB, which we'd prefer to compress to the greatest extent possible, so a value-level compression method specialized to the type of the value makes a lot of sense, too.

I'm not trying to say that compressing pages doesn't make sense or is useless; I just don't think we should ignore attribute-level compression merely because page-level compression could at some point be implemented too.

Kind regards,

Matthias van de Meent