Re: Physical append-only tables - Mailing list pgsql-hackers

From Magnus Hagander
Subject Re: Physical append-only tables
Date
Msg-id CABUevEyVp5GTc+dPAZtTDajhFP6-_-8QOdF-3D1MxmsD=JR=Yw@mail.gmail.com
In response to Re: Physical append-only tables  (Greg Stark <stark@mit.edu>)
List pgsql-hackers
On Mon, Nov 14, 2016 at 9:43 PM, Greg Stark <stark@mit.edu> wrote:
> On Sun, Nov 13, 2016 at 3:45 PM, Magnus Hagander <magnus@hagander.net> wrote:
>> For a scenario like this, would it make sense to have an option that could
>> be set on an individual table making it physical append only? Basically
>> VACUUM would run as normal and clean up the old space when rows are deleted
>> back in history, but when new space is needed for a row the system would
>> never look at the old blocks, and only append to the end.
>
> I don't think "appending" is the right way to think about this. It
> happens to address the problem but only accidentally and only
> partially. More generally what you have is two different kinds of data
> with two different access patterns and storage requirements in the
> same table. They're logically similar but have different practical
> requirements.
>
> If there was some way to teach the database that your table is made of
> two different types of data and how to distinguish the two types then
> when the update occurs it could move the row to the right section of
> storage... This might be something the new partitioning could handle
> or it might need something more low-level and implicit.

Agreed, though in the cases I've looked at, the split between the two kinds of data has not been static. Some of it might be driven by dynamic data stored somewhere else, and some of it may be "this data has to be deleted for regulatory reasons", which can show up on a case-by-case basis. I don't think partitioning can really be *that* flexible, though it might be able to handle some of these cases.

The other problem with partitioning is that it only works well if you can ensure the partitioning key is in every query, which often isn't possible with these workloads.
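
To make the partitioning-key point concrete, here is a toy sketch in plain Python. It is not PostgreSQL code: the `partitions` dict, the `month` key and the `query` helper are made up for illustration. It only counts how many partitions a lookup has to visit with and without the key in the predicate.

```python
# Toy model of partition pruning; not PostgreSQL code. All names are hypothetical.
from collections import defaultdict

partitions = defaultdict(list)  # partition key (month) -> rows in that partition


def insert(row):
    partitions[row["month"]].append(row)


def query(customer, month=None):
    # With the partition key in the predicate only one partition is visited.
    # Without it, every partition has to be checked.
    targets = [month] if month is not None else list(partitions)
    visited = 0
    hits = []
    for key in targets:
        visited += 1
        hits += [r for r in partitions.get(key, []) if r["customer"] == customer]
    return hits, visited


# Twelve monthly partitions, two customers in each.
for m in range(1, 13):
    for c in ("a", "b"):
        insert({"month": m, "customer": c})

_, with_key = query("a", month=3)
_, without_key = query("a")
print(with_key, without_key)
```

In this sketch the lookup that carries the key visits a single partition, while the one that does not has to visit all twelve.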

 
> That said, I don't think the "maintain clustering a bit better using
> BRIN" is a bad idea. It's just the bit about turning a table
> append-only to deal with update-once data that I think is overreach.

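In the use cases I've had, it's really the DELETE that's the problem. In all of those cases UPDATEs only happen on fairly recent data, so they don't really disturb the BRIN index. It's the DELETEs of old data that are the big issue: the space they free in old blocks gets reused for new rows, which breaks the physical ordering that BRIN relies on.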
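
As a rough illustration of the DELETE problem, here is a toy simulation in plain Python. It is not PostgreSQL internals: the `Heap` class, the `append_only` flag and the four-row block size are all assumptions made for the sketch, and it keeps a min/max per block, whereas real BRIN summarizes ranges of pages. It loads old rows, deletes one row per block, inserts recent rows, and then counts how many blocks a min/max summary cannot skip for a query on the recent data.

```python
# Toy model only: not PostgreSQL internals. All names here are hypothetical.
ROWS_PER_BLOCK = 4


class Heap:
    def __init__(self, append_only):
        self.append_only = append_only
        self.blocks = []  # each block is a list of slots; None marks a freed slot

    def insert(self, value):
        if not self.append_only:
            # Default-like behaviour: reuse the first freed slot found anywhere.
            for block in self.blocks:
                for i, slot in enumerate(block):
                    if slot is None:
                        block[i] = value
                        return
        # Otherwise only the last block is considered; extend the heap if it is full.
        if self.blocks and len(self.blocks[-1]) < ROWS_PER_BLOCK:
            self.blocks[-1].append(value)
        else:
            self.blocks.append([value])

    def delete(self, predicate):
        for block in self.blocks:
            for i, slot in enumerate(block):
                if slot is not None and predicate(slot):
                    block[i] = None

    def blocks_to_scan(self, lo, hi):
        # BRIN-style check: a block must be visited if the [min, max] of its
        # live rows overlaps the query range [lo, hi].
        count = 0
        for block in self.blocks:
            live = [v for v in block if v is not None]
            if live and min(live) <= hi and max(live) >= lo:
                count += 1
        return count


def run(append_only):
    heap = Heap(append_only)
    for ts in range(40):               # old data: 10 full blocks, ts 0..39
        heap.insert(ts)
    heap.delete(lambda ts: ts % 4 == 0)  # delete one old row in every block
    for ts in range(40, 50):           # new, recent rows
        heap.insert(ts)
    return heap.blocks_to_scan(40, 49)  # query touching only recent data


print("reuse free space:", run(append_only=False))
print("append only:     ", run(append_only=True))
```

With free-space reuse, each recent row lands in a different old block, so all 10 blocks overlap the query range. In the append-only variant the recent rows stay together in 3 new blocks and the old blocks can be skipped.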
 