Re: Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets - Mailing list pgsql-hackers

From Lawrence, Ramon
Subject Re: Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets
Msg-id 6EEA43D22289484890D119821101B1DF2C16C1@exchange20.mercury.ad.ubc.ca
In response to Re: Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> What alternatives are there for people who do not run Windows?
>
>             regards, tom lane

The TPC-H generator is a standard code base provided at
http://www.tpc.org/tpch/.  We have been able to compile this code on
Linux.

However, we were unable to get the Microsoft modifications to this code
to compile on Linux (although they are supposed to be portable).  So, we
just ran the Windows version under Wine on our test Debian machine.

I have also posted the text files for the TPC-H 1G 1Z data set at:

http://people.ok.ubc.ca/rlawrenc/tpch1g1z.zip

Note that you need to trim the extra characters at the end of the lines
for PostgreSQL to read them properly.
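
A minimal sketch of one way to do that trimming, in Python.  It assumes
the extra characters are a single trailing '|' field delimiter plus a
possible Windows-style line ending; that is an assumption about these
files, so check a few lines first.  The function names and file paths
are hypothetical.

```python
def trim_line(line: str, delimiter: str = "|") -> str:
    """Drop the line terminator and at most one trailing delimiter.

    Assumption: the "extra characters" are one trailing delimiter and
    possibly a carriage return. Adjust if the files differ.
    """
    body = line.rstrip("\r\n")
    if body.endswith(delimiter):
        body = body[: -len(delimiter)]
    return body


def trim_file(src_path: str, dst_path: str, delimiter: str = "|") -> None:
    """Write a trimmed copy of src_path to dst_path, one row per line."""
    # newline="" keeps the original line endings on read so that
    # trim_line can strip them explicitly; output always uses "\n".
    with open(src_path, "r", newline="") as src, \
         open(dst_path, "w", newline="\n") as dst:
        for line in src:
            dst.write(trim_line(line, delimiter) + "\n")


# Example call (hypothetical file names):
# trim_file("lineitem.tbl", "lineitem.trimmed.tbl")
```

Only one trailing delimiter is removed per line, so an empty last
column (two delimiters in a row at the end) is preserved.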

Since the data takes a while to generate and load, we can also provide a
compressed version of the PostgreSQL data directory of the databases
with the data already loaded.

--
Ramon Lawrence

