Re: sudden drop in statement turnaround latency -- yay!. - Mailing list pgsql-performance

From Merlin Moncure
Subject Re: sudden drop in statement turnaround latency -- yay!.
Date
Msg-id 6EE64EF3AB31D5448D0007DD34EEB3412A758D@Herge.rcsinc.local
Whole thread Raw
Responses Re: sudden drop in statement turnaround latency -- yay!.  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Tom Lane wrote:
> "Merlin Moncure" <merlin.moncure@rcsonline.com> writes:
> > I took advantage of the holidays to update a production server (dual
> > Opteron on win2k) from an 11/16 build (about beta5 or so) to the
latest
> > release candidate.  No configuration changes were made, just a
binary
> > swap and a server stop/start.
>
> > I was shocked to see that statement latency dropped by a fairly
large
> > margin.
>
> Hmm ... I trawled through the CVS logs since 11/16, and did not see
very
> many changes that looked like they might improve performance (list
> attached) --- and even of those, hardly any looked like the change
would
> be significant.  Do you know whether the query plans changed?  Are you
> running few enough queries per connection that backend startup
overhead
> might be an issue?

No, everything is run over persistent connections and prepared
statements.  All queries boil down to an index scan of some sort, so the
planner is not really a factor.  It's all strictly execution times, and
data is almost always read right off of the cache.  The performance of
the ISAM driver is driven by 3 factors (in order).
1. network latency (including o/s overhead context switches, etc.)
2. i/o factors (data read from cache, disk, etc).
3. overhead for pg to execute trivial transaction.

#1 & #2 are well understood problems.  It's #3 that improved
substantially and without warning.  See my comments below:

>             regards, tom lane
>
>
> 2004-12-15 14:16  tgl
>
>     * src/backend/access/nbtree/nbtutils.c: Calculation of
>     keys_are_unique flag was wrong for cases involving redundant
>     cross-datatype comparisons.  Per example from Merlin Moncure.

Not likely to have a performance benefit.

> 2004-12-02 10:32  momjian
>
>     * configure, configure.in, doc/src/sgml/libpq.sgml,
>     doc/src/sgml/ref/copy.sgml, src/interfaces/libpq/fe-connect.c,
>     src/interfaces/libpq/fe-print.c,
src/interfaces/libpq/fe-secure.c,
>     src/interfaces/libpq/libpq-fe.h,
src/interfaces/libpq/libpq-int.h:
>     Rework libpq threaded SIGPIPE handling to avoid interference
with
>     calling applications.  This is done by blocking sigpipe in the
>     libpq thread and using sigpending/sigwait to possibily discard
any
>     sigpipe we generated.

Doubtful.

> 2004-12-01 20:34  tgl
>
>     * src/: backend/optimizer/path/costsize.c,
>     backend/optimizer/util/plancat.c,
>     test/regress/expected/geometry.out,
>     test/regress/expected/geometry_1.out,
>     test/regress/expected/geometry_2.out,
>     test/regress/expected/inherit.out,
test/regress/expected/join.out,
>     test/regress/sql/inherit.sql, test/regress/sql/join.sql: Make
some
>     adjustments to reduce platform dependencies in plan selection.
In
>     particular, there was a mathematical tie between the two
possible
>     nestloop-with-materialized-inner-scan plans for a join (ie, we
>     computed the same cost with either input on the inside),
resulting
>     in a roundoff error driven choice, if the relations were both
small
>     enough to fit in sort_mem.  Add a small cost factor to ensure we
>     prefer materializing the smaller input.  This changes several
>     regression test plans, but with any luck we will now have more
>     stability across platforms.

No.  The planner is not a factor.

> 2004-12-01 14:00  tgl
>
>     * doc/src/sgml/catalogs.sgml, doc/src/sgml/diskusage.sgml,
>     doc/src/sgml/perform.sgml, doc/src/sgml/release.sgml,
>     src/backend/access/nbtree/nbtree.c, src/backend/catalog/heap.c,
>     src/backend/catalog/index.c, src/backend/commands/vacuum.c,
>     src/backend/commands/vacuumlazy.c,
>     src/backend/optimizer/util/plancat.c,
>     src/backend/optimizer/util/relnode.c,
src/include/access/genam.h,
>     src/include/nodes/relation.h,
src/test/regress/expected/case.out,
>     src/test/regress/expected/inherit.out,
>     src/test/regress/expected/join.out,
>     src/test/regress/expected/join_1.out,
>     src/test/regress/expected/polymorphism.out: Change planner to
use
>     the current true disk file size as its estimate of a relation's
>     number of blocks, rather than the possibly-obsolete value in
>     pg_class.relpages.  Scale the value in pg_class.reltuples
>     correspondingly to arrive at a hopefully more accurate number of
>     rows.  When pg_class contains 0/0, estimate a tuple width from
the
>     column datatypes and divide that into current file size to
estimate
>     number of rows.  This improved methodology allows us to jettison
>     the ancient hacks that put bogus default values into pg_class
when
>     a table is first created.  Also, per a suggestion from Simon,
make
>     VACUUM (but not VACUUM FULL or ANALYZE) adjust the value it puts
>     into pg_class.reltuples to try to represent the mean tuple
density
>     instead of the minimal density that actually prevails just after
>     VACUUM.  These changes alter the plans selected for certain
>     regression tests, so update the expected files accordingly.  (I
>     removed join_1.out because it's not clear if it still applies;
we
>     can add back any variant versions as they are shown to be
needed.)

doesn't seem like this would apply.

> 2004-11-21 17:57  tgl
>
>     * src/backend/utils/hash/dynahash.c: Fix rounding problem in
>     dynahash.c's decision about when the target fill factor has been
>     exceeded.  We usually run with ffactor == 1, but the way the
test
>     was coded, it wouldn't split a bucket until the actual fill
factor
>     reached 2.0, because of use of integer division.  Change from >
to
>     >= so that it will split more aggressively when the table starts
to
>     get full.

Hmm. Not likely.

> 2004-11-21 17:48  tgl
>
>     * src/backend/utils/mmgr/portalmem.c: Reduce the default size of
>     the PortalHashTable in order to save a few cycles during
>     transaction exit.  A typical session probably wouldn't have as
many
>     as half a dozen portals open at once, so the original value of
64
>     seems far larger than needed.

Strong possibility...'few cycles' seems pretty small tho :).

> 2004-11-20 15:19  tgl
>
>     * src/backend/utils/cache/relcache.c: Avoid scanning the
relcache
>     during AtEOSubXact_RelationCache when there is nothing to do,
which
>     is most of the time.  This is another simple improvement to cut
>     subtransaction entry/exit overhead.

Not clear from the comments: does this apply to every transaction, or
only ones with savepoints?  If all transactions, it's a contender.

> 2004-11-20 15:16  tgl
>
>     * src/backend/storage/lmgr/lock.c: Reduce the default size of
the
>     local lock hash table.    There's usually no need for it to be
nearly
>     as big as the global hash table, and since it's not in shared
>     memory it can grow if it does need to be bigger.  By reducing
the
>     size, we speed up hash_seq_search(), which saves a significant
>     fraction of subtransaction entry/exit overhead.

Same comments as above.


> 2004-11-19 19:48  tgl
>
>     * src/backend/tcop/postgres.c: Move pgstat_report_tabstat() call
so
>     that stats are not reported to the collector until the
transaction
>     commits.  Per recent discussion, this should avoid confusing
>     autovacuum when an updating transaction runs for a long time.

Not likely.

> 2004-11-16 22:13  neilc
>
>     * src/backend/access/: hash/hash.c, nbtree/nbtree.c:
>     Micro-optimization of markpos() and restrpos() in btree and hash
>     indexes.  Rather than using ReadBuffer() to increment the
reference
>     count on an already-pinned buffer, we should use
>     IncrBufferRefCount() as it is faster and does not require
acquiring
>     the BufMgrLock.

Another contender...maybe the cost of acquiring the lock is higher on
some platforms than others.

> 2004-11-16 19:14  tgl
>
>     * src/: backend/main/main.c, backend/port/win32/signal.c,
>     backend/postmaster/pgstat.c, backend/postmaster/postmaster.c,
>     include/port/win32.h: Fix Win32 problems with signals and
sockets,
>     by making the forkexec code even uglier than it was already :-(.
>     Also, on Windows only, use temporary shared memory segments
instead
>     of ordinary files to pass over critical variable values from
>     postmaster to child processes.    Magnus Hagander

As I understand it, this only affects backend startup time, so, no.
I'll benchmark some more until I get a better answer.

Merlin

pgsql-performance by date:

Previous
From: Mitch Pirtle
Date:
Subject: Re: Hardware purchase question
Next
From: "Adam Palmblad"
Date:
Subject: Bad Index Choices with user defined data type