Hello Mark,
> What if we consider using ascii (utf8?) text file sizes as a reference
> point, something independent from the database?
Why not.
TPC-B basically specifies that rows (accounts, tellers, branches) are all
padded to 100 bytes, so we could consider (i.e. document) that
--scale=SIZE refers to the amount of useful data held, and warn that
actual storage adds various overheads for page and row headers, free
space at the end of pages, indexes...
Then one scale step is 100,000 accounts, 10 tellers and 1 branch, i.e.
100,011 rows * 100 bytes ~ 9.5 MiB of useful data per scale step.
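
Just to make the arithmetic concrete, here is a small illustrative sketch
(in Python, names made up, not a patch proposal):

  ROWS_PER_STEP = 100_000 + 10 + 1   # accounts + tellers + branches
  ROW_BYTES = 100                    # TPC-B pads each row to 100 bytes

  def useful_bytes(scale: int) -> int:
      """Useful data (no storage overhead) for a given --scale."""
      return scale * ROWS_PER_STEP * ROW_BYTES

  def scale_for_size(target_mib: float) -> int:
      """Scale whose useful data is closest to target_mib MiB."""
      step_mib = ROWS_PER_STEP * ROW_BYTES / 2**20   # ~9.54 MiB per step
      return max(1, round(target_mib / step_mib))

  print(useful_bytes(1) / 2**20)     # ~9.54 MiB per scale step
  print(scale_for_size(50 * 1024))   # ~5368, for ~50 GiB of useful data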
> I realize even if a flat file size can be used as a more consistent
> reference across platforms, it doesn't help with accurately determining
> the final data file sizes due to any architecture specific nuances or
> changes in the backend. But perhaps it might still offer a little more
> meaning to be able to say "50 gigabytes of bank account data" rather
> than "10 million rows of bank accounts" when thinking about size over
> cardinality.
Yep.
Now the overhead is really 60-65%. Although the specification is
unambiguous, we still need some maths to know whether it fits in
buffers or memory... The point of Karel's regression is to take this into
account.
Also, whether this option would be acceptable to Tom is still an open
question. Tom?
--
Fabien.