Re: [HACKERS] [WIP] Zipfian distribution in pgbench - Mailing list pgsql-hackers

From	Peter Geoghegan
Subject	Re: [HACKERS] [WIP] Zipfian distribution in pgbench
Date	July 8, 2017 00:19:44
Msg-id	CAH2-WznMsUdEvQbpHjox+Gqjz8bn7DgxkEjUGp6iL-0O6UTzGg@mail.gmail.com
In response to	Re: [HACKERS] [WIP] Zipfian distribution in pgbench (Robert Haas <robertmhaas@gmail.com>)
List	pgsql-hackers

On Fri, Jul 7, 2017 at 5:17 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> How is that possible?  In a Zipfian distribution, no matter how big
> the table is, almost all of the updates will be concentrated on a
> handful of rows - and updates to any given row are necessarily
> serialized, or so I would think.  Maybe MongoDB can be fast there
> since there are no transactions, so it can just lock the row, slam in
> the new value and unlock the row, all (I suppose) without writing WAL
> or doing anything hard.

If you're not using the WiredTiger storage engine, then the locking
is at the collection level, which means that a Zipfian distribution is
no worse than any other as far as lock contention goes. That's one
possible explanation. Another is that index-organized tables
naturally have much better locality, which matters at every level of
the memory hierarchy.

> I'm more curious about why we're performing badly than I am about a
> general-purpose random_zipfian function.  :-)

I'm interested in both. I think that a random_zipfian function would
be quite helpful for modeling certain kinds of performance problems,
like CPU cache misses incurred at the page level.
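
To give a rough sense of how skewed such a workload is, here is a minimal
Python sketch. It is not the pgbench patch under discussion; the names
zipfian_weights, top_share, and sample_ranks are made up for illustration,
and the exponent s = 1.0 is an assumed parameter. It weights rank k by
1 / k**s and reports what share of accesses lands on the hottest rows.

```python
import random
from itertools import accumulate


def zipfian_weights(n, s=1.0):
    """Unnormalized Zipfian weights: rank k (1-based) gets weight 1 / k**s."""
    return [1.0 / (k ** s) for k in range(1, n + 1)]


def top_share(n, top, s=1.0):
    """Fraction of all accesses expected to hit the `top` hottest ranks."""
    weights = zipfian_weights(n, s)
    return sum(weights[:top]) / sum(weights)


def sample_ranks(n, count, s=1.0, seed=0):
    """Draw `count` ranks in 1..n with probability proportional to 1 / k**s."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    cumulative = list(accumulate(zipfian_weights(n, s)))
    return rng.choices(range(1, n + 1), cum_weights=cumulative, k=count)


if __name__ == "__main__":
    n = 100_000  # hypothetical table size
    print("share of accesses on the 10 hottest rows:", top_share(n, 10))
    draws = sample_ranks(n, 10_000)
    print("draws that hit rank 1:", draws.count(1))
```

With s = 1.0 and 100,000 rows, somewhere between a fifth and a third of all
accesses should fall on the ten hottest rows, which is the kind of
concentration Robert describes above.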

-- 
Peter Geoghegan
