Re: Why we lost Uber as a user - Mailing list pgsql-hackers

From Mark Kirkwood
Subject Re: Why we lost Uber as a user
Date
Msg-id 3be7f8ea-f978-3c50-df70-c70a0c1c9cc0@catalyst.net.nz
In response to Re: Why we lost Uber as a user  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Why we lost Uber as a user  (Alfred Perlstein <alfred@freebsd.org>)
List pgsql-hackers
On 03/08/16 02:27, Robert Haas wrote:
>
> Personally, I think that incremental surgery on our current heap
> format to try to fix this is not going to get very far.  If you look
> at the history of this, 8.3 was a huge release for timely cleanup of
> dead tuples.  There was also significant progress in 8.4 as a result of
> 5da9da71c44f27ba48fdad08ef263bf70e43e689.   As far as I can recall, we
> then made no progress at all in 9.0 - 9.4.  We made a very small
> improvement in 9.5 with 94028691609f8e148bd4ce72c46163f018832a5b, but
> that's pretty niche.  In 9.6, we have "snapshot too old", which I'd
> argue is potentially a large improvement, but it was big and invasive
> and will no doubt pose code maintenance hazards in the years to come;
> also, many people won't be able to use it or won't realize that they
> should use it.  I think it is likely that further incremental
> improvements here will be quite hard to find, and the amount of effort
> will be large relative to the amount of benefit.  I think we need a
> new storage format where the bloat is cleanly separated from the data
> rather than intermingled with it; every other major RDBMS works that
> way.  Perhaps this is a case of "the grass is greener on the other
> side of the fence", but I don't think so.
>
Yeah, I think this is a good summary of the state of play.

The only other new db development to use a non-overwriting design like
ours that I know of was Jim Starkey's Falcon engine for (ironically)
MySQL 6.0. Not sure if anyone is still working on that now.

I do wonder if Uber could have successfully tamed dead tuple bloat with
aggressive per-table autovacuum settings (and if in fact they tried),
but as I think Robert said earlier, it is pretty easy to come up with an
update-heavy (or insert + delete) workload that makes for a pretty ugly
bloat component even with really aggressive autovacuuming.
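
To make "aggressive per-table autovacuum settings" a bit more concrete,
here is a rough sketch (not anything Uber is known to have run). It uses
the documented trigger rule, threshold + scale_factor * reltuples, with
the stock defaults of 50 and 0.2. The table name "trips", the row count
and the override values are made up purely for illustration.

```python
# Sketch only: table name, row count and override values are hypothetical.

def vacuum_trigger(reltuples, threshold=50, scale_factor=0.2):
    """Dead tuples needed before autovacuum will vacuum a table.

    Documented rule: threshold + scale_factor * reltuples.
    The defaults mirror autovacuum_vacuum_threshold and
    autovacuum_vacuum_scale_factor.
    """
    return threshold + scale_factor * reltuples


def per_table_settings_sql(table, threshold, scale_factor):
    """Build an ALTER TABLE statement setting per-table storage parameters."""
    return (
        f"ALTER TABLE {table} SET ("
        f"autovacuum_vacuum_threshold = {threshold}, "
        f"autovacuum_vacuum_scale_factor = {scale_factor});"
    )


rows = 100_000_000  # hypothetical large, heavily updated table

default_trigger = vacuum_trigger(rows)
aggressive_trigger = vacuum_trigger(rows, threshold=1000, scale_factor=0.0)

print(default_trigger)     # dead tuples tolerated with the defaults
print(aggressive_trigger)  # dead tuples tolerated with the override
print(per_table_settings_sql("trips", 1000, 0.0))
```

This only changes how soon a vacuum is triggered; it says nothing about
whether vacuum can keep up with the workload, which is the concern above.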

Cheers

Mark



