Re: Further pg_upgrade analysis for many tables - Mailing list pgsql-hackers

From: Jeff Janes
Subject: Re: Further pg_upgrade analysis for many tables
Date:
Msg-id: CAMkU=1zTFEyUN7J4kOb3BTHqMAEAQ7vok+eDP80q6rWwMcCxJg@mail.gmail.com
In response to: Re: Further pg_upgrade analysis for many tables (Simon Riggs <simon@2ndQuadrant.com>)
List: pgsql-hackers
On Wednesday, January 9, 2013, Simon Riggs wrote:
On 23 November 2012 22:34, Jeff Janes <jeff.janes@gmail.com> wrote:

> I got rid of need_eoxact_work entirely and replaced it with a short
> list that fulfills the functions of indicating that work is needed,
> and suggesting which rels might need that work.  There is no attempt
> to prevent duplicates, nor to remove invalidated entries from the
> list.   Invalid entries are skipped when the hash entry is not found,
> and processing is idempotent so duplicates are not a problem.
>
> Formally speaking, if MAX_EOXACT_LIST were 0, so that the list
> overflowed the first time it was accessed, then it would be identical
> to the current behavior of having only a flag.  So formally all I did
> was increase the max from 0 to 10.

...

> It is not obvious what value to set the MAX list size to.

A few questions that may help you...

Why did you pick 10, when your create temp table example needs 110?

The 110 is the size of the RelationIdCache at the end of my augmented pgbench transaction.  But only one of those entries needs any work, so for that example a MAX of 1 would suffice.  Since 1 seems to be cutting it rather close, I picked the next power of 10.
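
To make the mechanism concrete, the bounded list amounts to something like the sketch below.  This is only an illustration of the idea, not the patch itself; identifiers like eoxact_list, eoxact_list_len, eoxact_list_overflowed, and EOXactListAdd are names I'm using here just for the example.

    #include "postgres.h"
    #include "utils/rel.h"          /* Relation, rd_id */

    #define MAX_EOXACT_LIST 10      /* the MAX being discussed */

    static Oid  eoxact_list[MAX_EOXACT_LIST];
    static int  eoxact_list_len = 0;
    static bool eoxact_list_overflowed = false;

    /*
     * Note that a relcache entry may need end-of-transaction work.
     * Duplicates and entries that later become invalid are fine; once the
     * list overflows we simply fall back to scanning all of
     * RelationIdCache, i.e. the old behavior with MAX effectively 0.
     */
    #define EOXactListAdd(rel) \
        do { \
            if (eoxact_list_len < MAX_EOXACT_LIST) \
                eoxact_list[eoxact_list_len++] = (rel)->rd_id; \
            else \
                eoxact_list_overflowed = true; \
        } while (0)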

The downsides of making the MAX larger are:

1) For ordinary workloads, each backend needs a little bit more memory for the static array.  (This would change if we wanted to extend this to EOSubXACT as well as EOXACT, because there can be only 1 XACT but an unlimited number of SubXACTs.)

2) Pathological workloads that add the same relation to the list over and over again, thousands of times, have to grovel through that whole list at EOX, which in theory could be more work than going through the entire non-redundant RelationIdCache hash.  (I have no idea what a pathological workload might actually look like in practice, but it seems like a good idea to assume that one might exist.)

We could prevent duplicates from being added to the list in the first place, but the overhead needed to do that seems like a sure loser for ordinary workloads.
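
To see why duplicates and stale entries are cheap to tolerate, the end-of-transaction pass can be structured roughly like this (again only a sketch, using the illustrative names from the sketch above; RelIdCacheEnt/reldesc refer to the relcache's internal hash entries, and AtEOXact_cleanup stands in for whatever per-relation cleanup actually runs):

    #include "utils/hsearch.h"      /* hash_seq_init, hash_search */

    static void
    AtEOXact_RelationCacheSketch(bool isCommit)
    {
        if (eoxact_list_overflowed)
        {
            /* Fall back to the old behavior: scan the whole hash. */
            HASH_SEQ_STATUS status;
            RelIdCacheEnt *idhentry;

            hash_seq_init(&status, RelationIdCache);
            while ((idhentry = (RelIdCacheEnt *) hash_seq_search(&status)) != NULL)
                AtEOXact_cleanup(idhentry->reldesc, isCommit);
        }
        else
        {
            int         i;

            for (i = 0; i < eoxact_list_len; i++)
            {
                RelIdCacheEnt *idhentry;

                idhentry = (RelIdCacheEnt *) hash_search(RelationIdCache,
                                                         (void *) &eoxact_list[i],
                                                         HASH_FIND, NULL);
                /* Entry may have been dropped since it was listed: skip it. */
                if (idhentry != NULL)
                    AtEOXact_cleanup(idhentry->reldesc, isCommit);
            }
        }

        /* Reset for the next transaction. */
        eoxact_list_len = 0;
        eoxact_list_overflowed = false;
    }

Because the cleanup is idempotent, hitting the same OID twice just does the same (cheap) check twice, and an OID whose entry has vanished costs one failed hash lookup.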

By making the list overflowable, we fix a demonstrated pathological workload (restore of huge schemas); we impose no detectable penalty on normal workloads; and we fail to improve, but also fail to make worse, a hypothetical pathological workload.  All at the expense of a few bytes per backend.

If the list overflowed at 100 rather than 10, the only cost would probably be the extra bytes used per process.  (The minimum realistic size of RelationIdCache is 100, and I assume iterating over 100 hash keys, which may or may not exist and/or be redundant, is about the same amount of work as iterating over a hash table which has at least 100 entries.)

If we increase the overflow point above 100, we might be making things worse for some pathological workload whose existence is entirely hypothetical--but the workload that benefits from setting it above 100 is also hypothetical.  So while 10 might be too small, going above 100 doesn't seem defensible in the absence of known cases.
 

Why does the list not grow as needed?

It would increase the code complexity for no concretely-known benefit.

If we are concerned about space, the extra bytes of compiled code needed to implement dynamic growth would certainly exceed the bytes needed to just jack the MAX up to a static setting of 100 or 500 or so.
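
For comparison, even a minimal growable list means something like the following (purely hypothetical sketch; the doubling strategy and allocating in TopMemoryContext are just assumptions I'm making for the example):

    #include "postgres.h"
    #include "utils/memutils.h"     /* TopMemoryContext, repalloc */

    static Oid *eoxact_dynlist = NULL;
    static int  eoxact_dynlist_len = 0;
    static int  eoxact_dynlist_size = 0;

    static void
    eoxact_dynlist_add(Oid relid)
    {
        if (eoxact_dynlist_len >= eoxact_dynlist_size)
        {
            /* Double the array, keeping it in a long-lived context. */
            int     newsize = (eoxact_dynlist_size == 0) ? 16 :
                              eoxact_dynlist_size * 2;

            if (eoxact_dynlist == NULL)
                eoxact_dynlist = (Oid *)
                    MemoryContextAlloc(TopMemoryContext, newsize * sizeof(Oid));
            else
                eoxact_dynlist = (Oid *)
                    repalloc(eoxact_dynlist, newsize * sizeof(Oid));
            eoxact_dynlist_size = newsize;
        }
        eoxact_dynlist[eoxact_dynlist_len++] = relid;
    }

And that still needs matching reset logic at end of transaction, which is more machinery than simply raising the static MAX.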

For dynamic growth to be a win, we would have to have a workload that satisfies all of these conditions:

1) It would have to have some transactions that cause more than 10 (or more than 100) relations to need cleanup.
2) It would have to have even more hundreds of relations in RelationIdCache which don't need cleanup (otherwise, if most of RelationIdCache needs cleanup, then iterating over that hash would be just as efficient as iterating over a list which contains most of it).
3) The transaction described above would have to happen over and over again, because if it only happens once there is no point in worrying about a little inefficiency.

Cheers,

Jeff 
