Re: Sample rate added to pg_stat_statements - Mailing list pgsql-hackers

From Sami Imseih
Subject Re: Sample rate added to pg_stat_statements
Date
Msg-id CAA5RZ0v+o0cmebUALqxf5ocamgztiGcJqb14HZMFTDWHh855fg@mail.gmail.com
In response to Re: Sample rate added to pg_stat_statements  (Ilia Evdokimov <ilya.evdokimov@tantorlabs.com>)
Responses Re: Sample rate added to pg_stat_statements
List pgsql-hackers
> Unfortunately, these changes do not achieve the intended sampling goal.
> I looked into this more deeply: while the sampled-out queries do not
> appear in pg_stat_statements, an entry is still allocated in the hash
> table after normalization, which, in my view, should not happen when
> sampling is in effect. Therefore, patch v9 is unlikely to meet our needs.

pg_stat_statements creates entries as "sticky" initially to give them
more time to survive in the hash table before their first execution
completes. It's not perfect, but it works for the majority of cases.
So what you are observing is how pg_stat_statements currently works.

If an entry is popular enough, we will need it anyway (even
with the proposed sampling). An entry that's not popular will
eventually be aged out.
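
To illustrate the aging behavior described above, here is a minimal
sketch loosely modeled on pg_stat_statements' usage-based eviction.
The constants, struct, and function names are illustrative assumptions,
not the extension's actual code: the point is only that a sticky entry's
usage decays faster than a normal entry's, so an entry whose first
execution never completes quickly becomes the cheapest to evict.

```c
#include <stdbool.h>

/*
 * Illustrative sketch only -- not the actual pg_stat_statements code.
 * Sticky (not-yet-executed) entries decay much faster than normal
 * entries during each dealloc pass.
 */
#define NORMAL_DECREASE_FACTOR 0.99 /* assumed decay for executed entries */
#define STICKY_DECREASE_FACTOR 0.50 /* assumed faster decay for sticky ones */

typedef struct DemoEntry
{
	double		usage;			/* eviction priority; lower = evicted sooner */
	bool		sticky;			/* true until first execution completes */
} DemoEntry;

/* Apply one round of decay, as an eviction (dealloc) pass would. */
static void
demo_decay(DemoEntry *e)
{
	e->usage *= e->sticky ? STICKY_DECREASE_FACTOR : NORMAL_DECREASE_FACTOR;
}

/* Run n decay rounds and return the resulting usage. */
static double
demo_usage_after(double start, bool sticky, int rounds)
{
	DemoEntry	e = {start, sticky};

	for (int i = 0; i < rounds; i++)
		demo_decay(&e);
	return e.usage;
}
```

After a handful of passes a sticky entry's usage has collapsed while a
normal entry's is nearly unchanged, which is why unpopular sticky
entries age out first.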

From my understanding, what the proposed sampling will do is
reduce the overhead of incrementing counters for popular entries,
which comes from the spinlock taken to update the counters. This is
particularly the case with high concurrency on large machines (high
CPU count), and especially when there is a small set of popular
entries. IMO, this patch should also include a benchmark proving
that a user can benefit from sampling in those types of workloads.
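
As a sketch of the kind of per-execution sampling decision being
discussed (hedged: the variable name mirrors the proposed
pg_stat_statements.sample_rate GUC, but the code and RNG call here are
assumptions, not the patch's implementation), each execution draws a
random number once and skips the counter update, and therefore the
entry's spinlock, when the draw falls outside the sample rate:

```c
#include <stdbool.h>
#include <stdlib.h>

/*
 * Illustrative sketch -- not the actual patch. 1.0 means record every
 * execution; 0.0 means record none.
 */
static double pgss_sample_rate = 1.0;

/*
 * Decide whether this execution should update its entry's counters.
 * Executions that return false never take the entry's spinlock, which
 * is where the claimed win on high-CPU-count machines would come from.
 */
static bool
pgss_sample_this_execution(void)
{
	if (pgss_sample_rate >= 1.0)
		return true;
	if (pgss_sample_rate <= 0.0)
		return false;
	/* rand() stands in here for PostgreSQL's pg_prng_double() */
	return ((double) rand() / (double) RAND_MAX) < pgss_sample_rate;
}
```

A benchmark for this would want to compare transactions per second with
the rate at 1.0 versus a low rate, on a many-core machine running a
small set of hot statements, since that is where the spinlock
contention concentrates.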

Regards,

Sami



