Re: TPC-R benchmarks - Mailing list pgsql-performance

From Oleg Lebedev
Subject Re: TPC-R benchmarks
Date
Msg-id 993DBE5B4D02194382EC8DF8554A52731E782A@postoffice.waterford.org
Whole thread Raw
In response to TPC-R benchmarks  (Oleg Lebedev <oleg.lebedev@waterford.org>)
Responses Re: TPC-R benchmarks
Re: TPC-R benchmarks
List pgsql-performance
I left my TPC-R query #17 working over the weekend and it took 3988 mins
~ 10 hours to complete. And this is considering that I am using a TPC-R
database created with a scale factor of 1, which corresponds to ~1 GB of
data. I am running RedHat 8.0 on a dual 1 GHz processor, 512 MB RAM.

Here is an excerpt from my postgresql.conf file (the rest of the
settings are commented out):

#
#    Shared Memory Size
#
shared_buffers = 16384        # 2*max_connections, min 16, typically
8KB each

#
#    Non-shared Memory Sizes
#
sort_mem = 32768

#
#    Optimizer Parameters
#
effective_cache_size = 32000    # typically 8KB each

Any suggestions on how to optimize these settings?

I agree with Jenny that declaring additional indexes on the TPC-R tables
may alter the validity of the benchmarks. Are there any official TPC
benchmarks submitted by PostgreSQL?

Thanks.

Oleg

-----Original Message-----
From: Mary Edie Meredith [mailto:maryedie@osdl.org]
Sent: Friday, September 26, 2003 10:12 AM
To: Tom Lane
Cc: Oleg Lebedev; Jenny Zhang; pgsql-performance
Subject: Re: [PERFORM] TPC-R benchmarks


The TPC-H/R rules allow only minor changes to the SQL that are necessary
due to SQL implementation differences. They do not allow changes made to
improve performance.  It is their way to test optimizer's ability to
recognize an inefficient SQL statement and do the rewrite.

The rule makes sense for the TPC-H, which is supposed to represent
ad-Hoc query.  One might argue that for TPC-R, which is suppose to
represent "Reporting" with pre-knowledge of the query, that re-write
should be allowed. However, that is currently not the case. Since the
RDBMS's represented on the TPC council are competing with TPC-H, their
optimizers already do the re-write, so (IMHO) there is no motivation to
relax the rules for the TPC-R.


On Thu, 2003-09-25 at 21:28, Tom Lane wrote:
> Oleg Lebedev <oleg.lebedev@waterford.org> writes:
> > Seems like in your case postgres uses an i_l_partkey index on
> > lineitem table. I have a foreign key constraint defined between the
> > lineitem and part table, but didn't create an special indexes. Here
> > is my query plan:
>
> The planner is obviously unhappy with this plan (note the large cost
> numbers), but it can't find a way to do better.  An index on
> lineitem.l_partkey would help, I think.
>
> The whole query seems like it's written in a very inefficient fashion;

> couldn't the estimation of '0.2 * avg(l_quantity)' be amortized across

> multiple join rows?  But I dunno whether the TPC rules allow for
> significant manual rewriting of the given query.
>
>             regards, tom lane
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
>       subscribe-nomail command to majordomo@postgresql.org so that
your
>       message can get through to the mailing list cleanly
--
Mary Edie Meredith <maryedie@osdl.org>
Open Source Development Lab

*************************************

This e-mail may contain privileged or confidential material intended for the named recipient only.
If you are not the named recipient, delete this message and all attachments.
Unauthorized reviewing, copying, printing, disclosing, or otherwise using information in this e-mail is prohibited.
We reserve the right to monitor e-mail sent through our network.

*************************************


pgsql-performance by date:

Previous
From: Gaetano Mendola
Date:
Subject: Re: avoiding seqscan?
Next
From: Shridhar Daithankar
Date:
Subject: Re: TPC-R benchmarks