Re: Hash id in pg_stat_statements - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: Hash id in pg_stat_statements
Msg-id 20121001170557.GW1267@tamriel.snowman.net
In response to Re: Hash id in pg_stat_statements  (Peter Geoghegan <peter@2ndquadrant.com>)
Responses Re: Hash id in pg_stat_statements  (Peter Geoghegan <peter@2ndquadrant.com>)
List pgsql-hackers
Peter,

* Peter Geoghegan (peter@2ndquadrant.com) wrote:
> That won't really help matters. There'd still be duplicate entries,
> from before and after the change, even if we make it immediately
> obvious which is which. The only reasonable solution in that scenario
> is to bump PGSS_FILE_HEADER, which will cause all existing entries to
> be invalidated.

You're going to have to help me here, 'cause I don't see how there can
be duplicates if we include the PGSS_FILE_HEADER as part of the hash,
unless we're planning to keep PGSS_FILE_HEADER constant while we change
what the hash value is for a given query, yet that goes against the
assumptions that were laid out, aiui.

If there's a change that results in a given query X no longer hashing to
a value A, we need to change PGSS_FILE_HEADER to invalidate statistics
which were collected for value A (or else we risk an independent query Y
hashing to value A and ending up with completely invalid stats).
Provided we apply that consistently and don't reuse PGSS_FILE_HEADER
values along the way, a combination of PGSS_FILE_HEADER and the hash
value for a given query should be unique over time.
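
To make that concrete, here is a minimal Python sketch of the argument. It is
only an illustration: the header values, the toy hash tables, and the names
HEADER_OLD, HEADER_NEW, HASH_OLD, HASH_NEW and record() are all made up for
this example and are not taken from pg_stat_statements. It assumes a hashing
change after which query Y lands on the value A that query X used to have.

```python
from collections import Counter

# Hypothetical header values, for illustration only (not the real constants).
HEADER_OLD = 1
HEADER_NEW = 2

# Toy lookup tables standing in for the hashing logic before and after a change.
HASH_OLD = {"X": 0xA, "Y": 0xB}
HASH_NEW = {"X": 0xC, "Y": 0xA}  # after the change, Y lands on X's old value A


def record(stats, key):
    """Count one execution under the given statistics key."""
    stats[key] += 1


hash_only = Counter()  # entries keyed by the hash value alone
combined = Counter()   # entries keyed by (file header, hash value)

# Before the change: query X runs three times.
for _ in range(3):
    record(hash_only, HASH_OLD["X"])
    record(combined, (HEADER_OLD, HASH_OLD["X"]))

# After the change: query Y runs twice and hashes to X's old value.
for _ in range(2):
    record(hash_only, HASH_NEW["Y"])
    record(combined, (HEADER_NEW, HASH_NEW["Y"]))

# Keyed by hash alone, X's and Y's executions are merged into one entry.
print(hash_only[0xA])
# Keyed by (header, hash), the two queries stay in separate entries.
print(combined[(HEADER_OLD, 0xA)], combined[(HEADER_NEW, 0xA)])
```

With the hash alone as the key, Y's executions get folded into the entry
collected for X. With the header included in the key, the two never share an
entry, provided header values are never reused.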

We do need to document that the hash value for a given query may
change.

Thanks,
    Stephen
