Thread: The need for clustered indexes to boost TPC-V performance

The need for clustered indexes to boost TPC-V performance

From
Reza Taheri
Date:

Following the earlier email introducing the TPC-V benchmark, and that we are developing an industry standard benchmarking kit for TPC-V using PostgreSQL, here is a specific performance issue we have run into.

 

In running a TPC-E prototype of the benchmark on an 8-core Nehalem blade and a disk array with 14 SSDs and 90 spinning drives, we noticed that we are doing a lot more I/O than the TPC-E benchmark is supposed to produce.  Digging deeper, we noticed that the I/O rate (around 28K IOPS) was not unreasonable for our combination of SQL queries/table and index sizes/buffer pool size. What was unreasonable was the large size of the tables, and especially, of the indexes.

 

To put this in perspective, let us compare our situation to a published TPC-E result on MS SQL at http://bit.ly/QeWXhE. This was run on a similar server, and the database size is close to ours. Our table and index sizes should be 32.5% of the MS SQL size (for those who care, we populated the database with 300,000 customers and 125 Initial Trade Days; they built with 385,000 customers and 300 ITD). Look at page 34 of the disclosure for the table and index sizes, and focus on the large tables. For our large tables, this is what I am seeing:

 

 

                           List of relations

Schema |        Name        | Type  | Owner |    Size    | Description

--------+--------------------+-------+-------+------------+-------------

public | trade              | table | tpce  | 402 GB     |

 public | cash_transaction   | table | tpce  | 309 GB     |

 public | trade_history      | table | tpce  | 291 GB     |

 public | settlement         | table | tpce  | 203 GB     |

 public | holding_history    | table | tpce  | 183 GB     |

 public | daily_market       | table | tpce  | 21 GB      |

 public | holding            | table | tpce  | 15 GB      |

 

                               List of relations

Schema |  Name   | Type  | Owner |       Table        |  Size   | Description

--------+---------+-------+-------+--------------------+---------+-------------

public | idx_th  | index | tpce  | trade_history      | 186 GB  |

 public | idx_hh2 | index | tpce  | holding_history    | 133 GB  |

 public | idx_hh  | index | tpce  | holding_history    | 126 GB  |

 public | idx_t2  | index | tpce  | trade              | 119 GB  |

 public | idx_t3  | index | tpce  | trade              | 110 GB  |

 public | idx_se  | index | tpce  | settlement         | 63 GB   |

 public | idx_t   | index | tpce  | trade              | 62 GB   |

 public | idx_ct  | index | tpce  | cash_transaction   | 55 GB   |

 public | idx_h2  | index | tpce  | holding            | 12 GB   |

 

I don’t know Dell’s exact I/O rate, but judging by their storage configuration and what’s expected of the benchmark, we are several times too high. (Even after cutting the database size by a factor of 10, we are around twice the IOPS rate we should be at.)

 

Comparing the table sizes, we are close to 2X larger (more on this in a later note). But the index size is what stands out. Our overall index usage (again, after accounting for different numbers of rows) is 4.8X times larger. 35% of our I/Os are to the index space. I am guessing that the 4.8X ballooning has something to do with this, and that in itself explains a lot about our high I/O rate, as well as higher CPU/tran cycles compared to MS SQL (we are  2.5-3 times slower).

 

So I looked more closely at the indexes. I chose the CASH_TRANSACTION table since it has a single index, and we can compare it more directly to the Dell data. If you look at page 34 of http://bit.ly/QeWXhE, the index size of CT is 1,278,720KB for 6,120,529,488 rows. That’s less than one byte of index per data row!  How could that be?  Well, MS SQL used a “clustered index” for CT, i.e., the data is held in the leaf pages of the index B-Tree. The data and index are in one data structure. Once you lookup the index, you also have the data at zero additional cost. For PGSQL, we had to create a regular index, which took up 55GB. Once you do the math, this works out to around 30 bytes per row. I imagine we have the 15-byte key along with a couple of 4-byte or 8-byte pointers.

 

So MS SQL beats PGSQL by a) having a lower I/O rate due to no competition for the buffer pool from indexes (except for secondary indexes); and b) by getting the data with a free lookup, whereas we have to work our way down both the index and the data trees.

 

Dell created a clustered index for every single one of the 33 tables. Folks, past experiences with relational databases and TPC benchmarks tells me this could affect the bottom line performance of the benchmark by as much as 2X.

 

Checking online, the subject of clustered indexes for PostgreSQL comes up often. PGSQL does have a concept called “clustered table”, which means a table has been organized in the order of an index. This would help with sequential accesses to a table, but has nothing to do with this problem. PGSQL folks sometimes refer to what we want as “integrated index”.

 

Is the PGSQL community willing to invest in a feature that a) has been requested by many others already; and b) can make a huge difference in a benchmark that can lend substantial credibility to PGSQL performance?


Thanks,
Reza

 

Re: The need for clustered indexes to boost TPC-V performance

From
Craig Ringer
Date:
On 07/04/2012 07:13 AM, Reza Taheri wrote:

Following the earlier email introducing the TPC-V benchmark, and that we are developing an industry standard benchmarking kit for TPC-V using PostgreSQL, here is a specific performance issue we have run into.


Which version of PostgreSQL are you using?

How has it been tuned beyond the defaults - autovacuum settings, shared_buffers, effective_cache_size, WAL settings, etc?

How much RAM is on the blade? What OS and version are on the blade?

Comparing the table sizes, we are close to 2X larger (more on this in a later note). But the index size is what stands out. Our overall index usage (again, after accounting for different numbers of rows) is 4.8X times larger. 35% of our I/Os are to the index space. I am guessing that the 4.8X ballooning has something to do with this, and that in itself explains a lot about our high I/O rate, as well as higher CPU/tran cycles compared to MS SQL (we are  2.5-3 times slower).

This is making me wonder about bloat issues and whether proper vacuuming is being done. If the visibility map and free space map aren't maintained by proper vaccum operation everything gets messy, fast.

Well, MS SQL used a “clustered index” for CT, i.e., the data is held in the leaf pages of the index B-Tree. The data and index are in one data structure. Once you lookup the index, you also have the data at zero additional cost.

[snip]

Is the PGSQL community willing to invest in a feature that a) has been requested by many others already; and b) can make a huge difference in a benchmark that can lend substantial credibility to PGSQL performance?


while PostgreSQL doesn't support covering indexes or clustered indexes at this point, 9.2 has added support for index-only scans, which are a half-way point of sorts. See:

  http://rhaas.blogspot.com.au/2011/10/index-only-scans-weve-got-em.html
  http://rhaas.blogspot.com.au/2010/11/index-only-scans.html
  http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=a2822fb9337a21f98ac4ce850bb4145acf47ca27

If at all possible please see how your test is affected by this PostgreSQL 9.2 enhancement. It should make a big difference, and if it doesn't it's important to know why.

(CC'd Robert Haas)

I'm not sure what the best option for getting a 9.2 beta build for Windows is.


As for the "invest" side - that's really a matter for EnterpriseDB, Command Prompt, Red Hat, and the other backers who're employing people to work on the DB. Consider asking on pgsql-hackers, too; if nothing else you'll get a good explanation of the current state and progress toward clustered indexes.

Some links that may be useful to you are:

  http://wiki.postgresql.org/wiki/Todo
  Things that it'd be good to support/implement at some point. Surprisingly, covering/clustered indexes aren't on there or at least aren't easily found. It's certainly a much-desired feature despite its apparent absence from the TODO.

  http://wiki.postgresql.org/wiki/PostgreSQL_9.2_Development_Plan
  http://wiki.postgresql.org/wiki/PostgreSQL_9.2_Open_Items

--
Craig Ringer

Re: The need for clustered indexes to boost TPC-V performance

From
Thomas Kellerer
Date:
Craig Ringer, 04.07.2012 07:43:

> I'm not sure what the best option for getting a 9.2 beta build for Windows is.

Download the ZIP from here:

http://www.enterprisedb.com/products-services-training/pgbindownload

Unzip, initdb, pg_ctl start

Regards
Thomas



Re: The need for clustered indexes to boost TPC-V performance

From
Craig Ringer
Date:
On 07/04/2012 03:57 PM, Thomas Kellerer wrote:
> Craig Ringer, 04.07.2012 07:43:
>
>> I'm not sure what the best option for getting a 9.2 beta build for
>> Windows is.
>
> Download the ZIP from here:
>
> http://www.enterprisedb.com/products-services-training/pgbindownload

Gah, I'm blind. I looked at that page twice and failed to see the
entries for the beta. Sorry.

--
Craig Ringer

Re: The need for clustered indexes to boost TPC-V performance

From
Daniel Farina
Date:
On Tue, Jul 3, 2012 at 10:43 PM, Craig Ringer <ringerc@ringerc.id.au> wrote:
> On 07/04/2012 07:13 AM, Reza Taheri wrote:
>
> Following the earlier email introducing the TPC-V benchmark, and that we are
> developing an industry standard benchmarking kit for TPC-V using PostgreSQL,
> here is a specific performance issue we have run into.
>
>
> Which version of PostgreSQL are you using?
>
> How has it been tuned beyond the defaults - autovacuum settings,
> shared_buffers, effective_cache_size, WAL settings, etc?
>
> How much RAM is on the blade? What OS and version are on the blade?
>
>
> Comparing the table sizes, we are close to 2X larger (more on this in a
> later note). But the index size is what stands out. Our overall index usage
> (again, after accounting for different numbers of rows) is 4.8X times
> larger. 35% of our I/Os are to the index space. I am guessing that the 4.8X
> ballooning has something to do with this, and that in itself explains a lot
> about our high I/O rate, as well as higher CPU/tran cycles compared to MS
> SQL (we are  2.5-3 times slower).
>
> This is making me wonder about bloat issues and whether proper vacuuming is
> being done. If the visibility map and free space map aren't maintained by
> proper vaccum operation everything gets messy, fast.
>
> Well, MS SQL used a “clustered index” for CT, i.e., the data is held in the
> leaf pages of the index B-Tree. The data and index are in one data
> structure. Once you lookup the index, you also have the data at zero
> additional cost.
>
> [snip]
>
>
>
> Is the PGSQL community willing to invest in a feature that a) has been
> requested by many others already; and b) can make a huge difference in a
> benchmark that can lend substantial credibility to PGSQL performance?
>
>
> while PostgreSQL doesn't support covering indexes or clustered indexes at
> this point, 9.2 has added support for index-only scans, which are a half-way
> point of sorts. See:
>
>   http://rhaas.blogspot.com.au/2011/10/index-only-scans-weve-got-em.html
>   http://rhaas.blogspot.com.au/2010/11/index-only-scans.html
>
> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=a2822fb9337a21f98ac4ce850bb4145acf47ca27
>
> If at all possible please see how your test is affected by this PostgreSQL
> 9.2 enhancement. It should make a big difference, and if it doesn't it's
> important to know why.
>
> (CC'd Robert Haas)
>
> I'm not sure what the best option for getting a 9.2 beta build for Windows
> is.
>
>
> As for the "invest" side - that's really a matter for EnterpriseDB, Command
> Prompt, Red Hat, and the other backers who're employing people to work on
> the DB. Consider asking on pgsql-hackers, too; if nothing else you'll get a
> good explanation of the current state and progress toward clustered indexes.
>
> Some links that may be useful to you are:
>
>   http://wiki.postgresql.org/wiki/Todo
>   Things that it'd be good to support/implement at some point. Surprisingly,
> covering/clustered indexes aren't on there or at least aren't easily found.
> It's certainly a much-desired feature despite its apparent absence from the
> TODO.

I think there is, deservingly, a lot of hesitation to implement a
strictly ordered table construct.  A similar feature that didn't quite
get finished -- but maybe can be beaten into shape -- is the
grouped-index-tuple implementation:

http://community.enterprisedb.com/git/

It is mentioned on the TODO page.  It's under the category that is
perhaps poorly syntactically overloaded in the world "cluster".

--
fdr

Re: The need for clustered indexes to boost TPC-V performance

From
Claudio Freire
Date:
On Tue, Jul 3, 2012 at 8:13 PM, Reza Taheri <rtaheri@vmware.com> wrote:
> So I looked more closely at the indexes. I chose the CASH_TRANSACTION
> table since it has a single index, and we can compare it more directly to the
> Dell data. If you look at page 34 of http://bit.ly/QeWXhE, the index size of CT
> is 1,278,720KB for 6,120,529,488 rows. That’s less than one byte of index
> per data row!  How could that be?  Well, MS SQL used a “clustered index”
> for CT, i.e., the data is held in the leaf pages of the index B-Tree.
> The data and index are in one data structure. Once you lookup the index,
> you also have the data at zero additional cost. For PGSQL, we had to create
> a regular index, which took up 55GB. Once you do the math, this works out
> to around 30 bytes per row. I imagine we have the 15-byte key along with a
> couple of 4-byte or 8-byte pointers.
...
> So MS SQL beats PGSQL by a) having a lower I/O rate due to no competition
> for the buffer pool from indexes (except for secondary indexes); and b) by
> getting the data with a free lookup, whereas we have to work our way down
> both the index and the data trees.

15-byte key?

What about not storing the keys, but a hash, for leaf nodes?

Assuming it can be made to work for both "range" and "equality" scans,
holding only hashes on leaf nodes would reduce index size, but how
much?

I think it's doable, and I could come up with a spec if it's worth it.
It would have to scan the heap for only two extra index pages (the
extremes that cannot be ruled out) and hash collisions, which doesn't
seem like a big loss versus the reduced index.

Re: The need for clustered indexes to boost TPC-V performance

From
Robert Klemme
Date:
On Wed, Jul 4, 2012 at 1:13 AM, Reza Taheri <rtaheri@vmware.com> wrote:

> Checking online, the subject of clustered indexes for PostgreSQL comes up
> often. PGSQL does have a concept called “clustered table”, which means a
> table has been organized in the order of an index. This would help with
> sequential accesses to a table, but has nothing to do with this problem.
> PGSQL folks sometimes refer to what we want as “integrated index”.

I do understand this correctly that we are speaking about the concept
which is known under the term "index organized table" (IOT) in Oracle
land, correct?

http://docs.oracle.com/cd/E11882_01/server.112/e25789/indexiot.htm#CBBJEBIH

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Re: The need for clustered indexes to boost TPC-V performance

From
Reza Taheri
Date:
Hi Daniel,
Yes, it sounds like GIT will take us half the way there by getting rid of much of the index I/O if we cluster the
tables.We can set the fillfactor parameter to keep tables sorted after updates. I am not sure what impact inserts will
havesince the primary key keeps growing with new inserts, so perhaps the table will maintain the cluster order and the
benefitsof GIT for new rows, too. GIT won't save CPU cycles the way a clustered/integrated index would, and actually
addsto the CPU cost since the data page has to be searched for the desired tuple. 

Thanks,
Reza

> -----Original Message-----
> From: Daniel Farina [mailto:daniel@heroku.com]
> Sent: Wednesday, July 04, 2012 6:40 AM
> To: Craig Ringer
> Cc: Reza Taheri; pgsql-performance@postgresql.org; Robert Haas
> Subject: Re: [PERFORM] The need for clustered indexes to boost TPC-V
> performance
>
> On Tue, Jul 3, 2012 at 10:43 PM, Craig Ringer <ringerc@ringerc.id.au> wrote:
> > On 07/04/2012 07:13 AM, Reza Taheri wrote:
> >
> > Following the earlier email introducing the TPC-V benchmark, and that
> > we are developing an industry standard benchmarking kit for TPC-V
> > using PostgreSQL, here is a specific performance issue we have run into.
> >
> >
> > Which version of PostgreSQL are you using?
> >
> > How has it been tuned beyond the defaults - autovacuum settings,
> > shared_buffers, effective_cache_size, WAL settings, etc?
> >
> > How much RAM is on the blade? What OS and version are on the blade?
> >
> >
> > Comparing the table sizes, we are close to 2X larger (more on this in
> > a later note). But the index size is what stands out. Our overall
> > index usage (again, after accounting for different numbers of rows) is
> > 4.8X times larger. 35% of our I/Os are to the index space. I am
> > guessing that the 4.8X ballooning has something to do with this, and
> > that in itself explains a lot about our high I/O rate, as well as
> > higher CPU/tran cycles compared to MS SQL (we are  2.5-3 times slower).
> >
> > This is making me wonder about bloat issues and whether proper
> > vacuuming is being done. If the visibility map and free space map
> > aren't maintained by proper vaccum operation everything gets messy,
> fast.
> >
> > Well, MS SQL used a "clustered index" for CT, i.e., the data is held
> > in the leaf pages of the index B-Tree. The data and index are in one
> > data structure. Once you lookup the index, you also have the data at
> > zero additional cost.
> >
> > [snip]
> >
> >
> >
> > Is the PGSQL community willing to invest in a feature that a) has been
> > requested by many others already; and b) can make a huge difference in
> > a benchmark that can lend substantial credibility to PGSQL performance?
> >
> >
> > while PostgreSQL doesn't support covering indexes or clustered indexes
> > at this point, 9.2 has added support for index-only scans, which are a
> > half-way point of sorts. See:
> >
> >   http://rhaas.blogspot.com.au/2011/10/index-only-scans-weve-got-
> em.html
> >   http://rhaas.blogspot.com.au/2010/11/index-only-scans.html
> >
> > http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=a2822fb9
> > 337a21f98ac4ce850bb4145acf47ca27
> >
> > If at all possible please see how your test is affected by this
> > PostgreSQL
> > 9.2 enhancement. It should make a big difference, and if it doesn't
> > it's important to know why.
> >
> > (CC'd Robert Haas)
> >
> > I'm not sure what the best option for getting a 9.2 beta build for
> > Windows is.
> >
> >
> > As for the "invest" side - that's really a matter for EnterpriseDB,
> > Command Prompt, Red Hat, and the other backers who're employing
> people
> > to work on the DB. Consider asking on pgsql-hackers, too; if nothing
> > else you'll get a good explanation of the current state and progress toward
> clustered indexes.
> >
> > Some links that may be useful to you are:
> >
> >   http://wiki.postgresql.org/wiki/Todo
> >   Things that it'd be good to support/implement at some point.
> > Surprisingly, covering/clustered indexes aren't on there or at least aren't
> easily found.
> > It's certainly a much-desired feature despite its apparent absence
> > from the TODO.
>
> I think there is, deservingly, a lot of hesitation to implement a strictly
> ordered table construct.  A similar feature that didn't quite get finished --
> but maybe can be beaten into shape -- is the grouped-index-tuple
> implementation:
>
> http://community.enterprisedb.com/git/
>
> It is mentioned on the TODO page.  It's under the category that is perhaps
> poorly syntactically overloaded in the world "cluster".
>
> --
> fdr

Re: The need for clustered indexes to boost TPC-V performance

From
Reza Taheri
Date:
Hi Robert,
Yes, the same concept. Oracle's IOT feature is used often with TPC benchmarks.

Thanks,
Reza

> -----Original Message-----
> From: Robert Klemme [mailto:shortcutter@googlemail.com]
> Sent: Thursday, July 05, 2012 5:30 AM
> To: Reza Taheri
> Cc: pgsql-performance@postgresql.org; Andy Bond (abond@redhat.com);
> Greg Kopczynski; Jignesh Shah
> Subject: Re: [PERFORM] The need for clustered indexes to boost TPC-V
> performance
>
> On Wed, Jul 4, 2012 at 1:13 AM, Reza Taheri <rtaheri@vmware.com> wrote:
>
> > Checking online, the subject of clustered indexes for PostgreSQL comes
> > up often. PGSQL does have a concept called "clustered table", which
> > means a table has been organized in the order of an index. This would
> > help with sequential accesses to a table, but has nothing to do with this
> problem.
> > PGSQL folks sometimes refer to what we want as "integrated index".
>
> I do understand this correctly that we are speaking about the concept which
> is known under the term "index organized table" (IOT) in Oracle land,
> correct?
>
> http://docs.oracle.com/cd/E11882_01/server.112/e25789/indexiot.htm#CBB
> JEBIH
>
> Kind regards
>
> robert
>
> --
> remember.guy do |as, often| as.you_can - without end
> http://blog.rubybestpractices.com/

Re: The need for clustered indexes to boost TPC-V performance

From
Samuel Gendler
Date:


On Thu, Jul 5, 2012 at 12:13 PM, Reza Taheri <rtaheri@vmware.com> wrote:
Hi Robert,
Yes, the same concept. Oracle's IOT feature is used often with TPC benchmarks.


Reza, it would be very helpful if you were to provide the list with a lot more information about your current software and hardware configuration before coming to the conclusion that the only possible way forward is with a significant architectural change to the db engine itself.  Not only is it not at all clear that you are extracting maximum performance from your current hardware and software, but I doubt anyone is particularly interested in doing a bunch of development purely to game a benchmark.  There has been significant discussion of the necessity and viability of the feature you are requesting in the past, so you should probably start where those discussions left off rather than starting the discussion all over again from the beginning.  Of course, if vmware were to sponsor development of the feature in question, it probably wouldn't require nearly as much buy-in from the wider community.

Getting back to the current performance issues -  I have little doubt that the MS SQL benchmark was set up and run by people who were intimately familiar with MS SQL performance tuning.  You stated in your earlier email that your team doesn't have significant postgresql-specific experience, so it isn't necessarily surprising that your first attempt at tuning didn't get the results that you are looking for. You stated that you have 14 SSDs and 90 spinning drives, but you don't specify how they are combined and how the database is laid out on top of them.  There is no mention of how much memory is available to the system. We don't know how you've configured postgresql's memory allocation or how your config weights the relative costs of index lookups, sequential scans, etc.  The guidelines for this mailing list include instructions for what information should be provided when asking about performance improvements.  http://archives.postgresql.org/pgsql-performance/  Let's start by ascertaining how your benchmark results can be improved without engaging in a significant development effort on the db engine itself.


 

Re: The need for clustered indexes to boost TPC-V performance

From
Reza Taheri
Date:

Hi Samuel,

The SSDs were used as a cache for the spinning drives. Here is a 30-second iostat sample representative of the whole run:

 

avg-cpu:  %user   %nice %system %iowait  %steal   %idle

          24.87    0.00   12.54   62.39    0.00    0.20

 

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util

sdd               0.00   137.37 3058.40  106.17 34691.60   974.13    22.54    15.75    4.98   0.32 100.00

sde               0.00   136.07 3063.37  107.70 35267.07   975.07    22.86    15.58    4.92   0.32 100.00

sdf               0.00   135.37 3064.23  109.53 35815.60   979.60    23.19    15.82    4.99   0.32 100.00

sdg               0.00   136.97 3066.57  116.67 35196.53  1014.53    22.75    15.87    4.99   0.31 100.00

sdi               0.00  2011.03    0.00   87.90     0.00  8395.73   191.03     0.13    1.45   1.42  12.51

sdk               0.00   136.63 3066.83  107.53 35805.07   976.67    23.17    16.01    5.04   0.32 100.00

sdm               0.00   138.50 3054.40  111.10 34674.27   998.40    22.54    15.52    4.91   0.32 100.00

sdj               0.00   136.73 3058.70  118.20 35227.20  1019.73    22.82    15.81    4.98   0.31 100.00

sdl               0.00   137.53 3044.97  109.33 34448.00   987.47    22.47    15.78    5.00   0.32 100.00

 

The data and index tablespaces were striped across the 8 LUNs, and saw an average 5ms response. We can beef up the storage to handle more I/Os so that our utilization doesn’t stay below 40%, but that misses the point: we have an I/O rate twice the commercial database because they used clustered indexes.


I provided more config details in an earlier email.

 

As for asking for development to game a benchmark, no one is asking for benchmark specials. The question of enhancements in response to benchmark needs is an age old question. We can get into that, but it’s really a different discussion. Let me just expose the flip side of it: are we willing to watch people use other databases to run benchmarks but feel content that no features were developed specifically in response to benchmark results?

 

I am trying to engage with the community. We can drown the mailing list with details. So I decided to open the discussion with the high level points, and we will give you all the details that you want as we move forward.


Thanks,
Reza

 

From: Samuel Gendler [mailto:sgendler@ideasculptor.com]
Sent: Thursday, July 05, 2012 12:46 PM
To: Reza Taheri
Cc: Robert Klemme; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] The need for clustered indexes to boost TPC-V performance

 

 

On Thu, Jul 5, 2012 at 12:13 PM, Reza Taheri <rtaheri@vmware.com> wrote:

Hi Robert,
Yes, the same concept. Oracle's IOT feature is used often with TPC benchmarks.

 

Reza, it would be very helpful if you were to provide the list with a lot more information about your current software and hardware configuration before coming to the conclusion that the only possible way forward is with a significant architectural change to the db engine itself.  Not only is it not at all clear that you are extracting maximum performance from your current hardware and software, but I doubt anyone is particularly interested in doing a bunch of development purely to game a benchmark.  There has been significant discussion of the necessity and viability of the feature you are requesting in the past, so you should probably start where those discussions left off rather than starting the discussion all over again from the beginning.  Of course, if vmware were to sponsor development of the feature in question, it probably wouldn't require nearly as much buy-in from the wider community.

 

Getting back to the current performance issues -  I have little doubt that the MS SQL benchmark was set up and run by people who were intimately familiar with MS SQL performance tuning.  You stated in your earlier email that your team doesn't have significant postgresql-specific experience, so it isn't necessarily surprising that your first attempt at tuning didn't get the results that you are looking for. You stated that you have 14 SSDs and 90 spinning drives, but you don't specify how they are combined and how the database is laid out on top of them.  There is no mention of how much memory is available to the system. We don't know how you've configured postgresql's memory allocation or how your config weights the relative costs of index lookups, sequential scans, etc.  The guidelines for this mailing list include instructions for what information should be provided when asking about performance improvements.  http://archives.postgresql.org/pgsql-performance/  Let's start by ascertaining how your benchmark results can be improved without engaging in a significant development effort on the db engine itself.

 

 

 

Re: The need for clustered indexes to boost TPC-V performance

From
Samuel Gendler
Date:


On Thu, Jul 5, 2012 at 1:37 PM, Reza Taheri <rtaheri@vmware.com> wrote:


I provided more config details in an earlier email.

 



I hate to disagree, but unless I didn't get a message sent to the list, you haven't provided any details about your postgresql config or otherwise adhered to the guidelines for starting a discussion of a performance problem around here.  I just searched my mailbox and no email from you has any such details.  Several people have asked for them, including myself.  You say you will give any details we want, but this is at least the 3rd or 4th request for such details and they have not yet been forthcoming.

Re: The need for clustered indexes to boost TPC-V performance

From
Andy Colson
Date:
On 07/05/2012 03:52 PM, Samuel Gendler wrote:
>
>
> On Thu, Jul 5, 2012 at 1:37 PM, Reza Taheri <rtaheri@vmware.com <mailto:rtaheri@vmware.com>> wrote:
>
>
>     I provided more config details in an earlier email.____
>
>     __ __
>
>
>
> I hate to disagree, but unless I didn't get a message sent to the list, you haven't provided any details about your
postgresqlconfig or otherwise adhered to the guidelines for starting a discussion of a performance problem around here.
I just searched my mailbox and no email from you has any such details.  Several people have asked for them, including
myself. You say you will give any details we want, but this is at least the 3rd or 4th request for such details and
theyhave not yet been forthcoming. 


Reza, I went back and looked myself.  I see no specs on OS, or hardware.... unless you mean this:


> http://bit.ly/QeWXhE. This was run on a similar server, and the database size is close to ours.


You're running on windows then?  Server is 96Gig ram, 8 cores, (dell poweredge T610).
with two powervault MD1120 NAS's?

But then I assume you were not running on that, were you.  You were running vmware on it, probably?


-Andy

Re: The need for clustered indexes to boost TPC-V performance

From
Craig Ringer
Date:
First: Please do try 9.2 beta if you're upgrading from 8.4. It'll be out as a final release soon enough, and index only scans may make a big difference for the problem you're currently having.

Looking at your configuration I have a few comments, but it's worth noting that I don't work with hardware at that scale, and I'm more used to tuning I/O bottlenecked systems with onboard storage rather than CPU-bottlenecked ones on big SANs. Hopefully now that you've posted your configuration and setup there might be interest from others.

If you're able to post an EXPLAIN ANALYZE or two for a query you feel is slow that certainly won't hurt. Using http://explain.depesz.com/ saves you the hassle of dealing with word-wrapping when posting them, btw.

As for your config:

I notice that your autovacuum settings are at their defaults. With heavy UPDATE / DELETE load this'll tend to lead to table and index bloat, so the DB has to scan more useless data to get what it needs. It also means table stats won't be maintained as well, potentially leading to poor planner decisions. The following fairly scary query can help identify bloat, as the database server doesn't currently have anything much built in to help you spot such issues:

   http://wiki.postgresql.org/wiki/Show_database_bloat

It might be helpful to set effective_cache_size and effective_io_concurrency so Pg has more idea of the scale of your hardware. The defaults are very conservative - it's supposed to be easy for people to use for simple things without melting their systems, and it's expected that anyone doing bigger work will tune the database.

http://www.postgresql.org/docs/9.1/static/runtime-config-resource.html

It looks like you've already tweaked many of the critical points for big installs - your checkpoint_segments, wal_buffers, shared_buffers, etc. I lack the big hardware experience to know if they're appropriate, but they're not the extremely conservative defaults, which is a start.

Your random_page_cost and seq_page_cost are probably dead wrong for a SAN with RAM and SSD cache in front of fast disks. Their defaults are for local uncached spinning HDD media where seeks are expensive. The typical advice on such hardware is to set them to something more like seq_page_cost = 0.1  random_page_cost  = 0.15 - ie cheaper relative to the cpu cost, and with random I/O only a little more expensive than sequential I/O. What's right for your situation varies a bit based on DB size vs hardware size, etc; Greg discusses this more in his book.

What isolation level do your transactions use? This is significant because of the move to true serializable isolation with predicate locking in 9.0; it made serializable transactions a bit slower in some circumstances in exchange for much stronger correctness guarantees. The READ COMMITTED default was unchanged.



It also looks like you might not have seen the second part of my earlier reply:


while PostgreSQL doesn't support covering indexes or clustered indexes at this point, 9.2 has added support for index-only scans, which are a half-way point of sorts. See:

  http://rhaas.blogspot.com.au/2011/10/index-only-scans-weve-got-em.html
  http://rhaas.blogspot.com.au/2010/11/index-only-scans.html
  http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=a2822fb9337a21f98ac4ce850bb4145acf47ca27

If at all possible please see how your test is affected by this PostgreSQL 9.2 enhancement. It should make a big difference, and if it doesn't it's important to know why.

(CC'd Robert Haas)



As for the "invest" side - that's really a matter for EnterpriseDB, Command Prompt, Red Hat, and the other backers who're employing people to work on the DB. Consider asking on pgsql-hackers, too; if nothing else you'll get a good explanation of the current state and progress toward clustered indexes.

Some links that may be useful to you are:

  http://wiki.postgresql.org/wiki/Todo
  Things that it'd be good to support/implement at some point. Surprisingly, covering/clustered indexes aren't on there or at least aren't easily found. It's certainly a much-desired feature despite its apparent absence from the TODO.

  http://wiki.postgresql.org/wiki/PostgreSQL_9.2_Development_Plan
  http://wiki.postgresql.org/wiki/PostgreSQL_9.2_Open_Items

--
Craig Ringer




Re: The need for clustered indexes to boost TPC-V performance

From
Craig Ringer
Date:
On 07/06/2012 04:52 AM, Samuel Gendler wrote:


On Thu, Jul 5, 2012 at 1:37 PM, Reza Taheri <rtaheri@vmware.com> wrote:


I provided more config details in an earlier email.

 



I hate to disagree, but unless I didn't get a message sent to the list

It looks like that might be the case. I got a message with Message-ID 66CE997FB523C04E9749452273184C6C137CB88CDD@exch-mbx-113.vmware.com sent at Thu, 5 Jul 2012 11:33:46 -0700 that contained the basic info, postgresql.conf, etc. Belated, but it was sent. I can't find this message in the archives and the copy I got came direct to me via cc, so I suspect our friendly mailing list system has silently held it for moderation due to size/attachment.


I'll reproduce the content below, followed by an inline copy of the postgresql.conf with only changed lines:

On 07/06/2012 02:33 AM, Reza Taheri wrote:

OK, some config details.

We are using:

 

·         Two blades of an HP BladeSystem c-Class c7000 with 2-socket Intel E5520 (Nehalem-EP) processors and 48GB of memory per blade

o   8 cores, 16 threads per blade

o   48GB of RAM per blade

·         Storage was an EMC VNX5700 with 14 SSDs fronting 32 15K RPM drives

·         The Tier B database VM was alone on a blade with 16 vCPUs, 40GB of memory, 4 virtual drives with various RAID levels

·         The driver and Tier A VMs were on the second blade

o   So we set PGHOST on the client system to point to the server

·         RHEL 6.1

·         PostgreSQL 8.4

·         unixODBC 2.3.2

 

We stuck with PGSQL 8.4 since it is the stock version shipped with RHEL 6. I am building a new, larger testbed, and will switch to PGSQL 9 with that.

 


postgresql.conf:

[craig@ayaki ~]$ egrep -v '(^\s*#)|(^\s*$)' /tmp/postgresql2.conf  | cut -d '#' -f 1
listen_addresses = '*'           
max_connections = 320           
shared_buffers = 28GB           
temp_buffers = 200MB           
work_mem = 10MB               
maintenance_work_mem = 10MB       
bgwriter_delay = 10ms           
bgwriter_lru_maxpages = 20       
wal_buffers = 16MB           
checkpoint_segments = 128       
checkpoint_timeout = 30min       
checkpoint_completion_target = 0.9   
default_statistics_target = 10000   
logging_collector = on           
log_directory = 'pg_log'       
log_filename = 'postgresql-%a.log'   
log_truncate_on_rotation = on       
log_rotation_age = 1d           
log_rotation_size = 0           
datestyle = 'iso, mdy'
lc_messages = 'en_US.UTF-8'           
lc_monetary = 'en_US.UTF-8'           
lc_numeric = 'en_US.UTF-8'           
lc_time = 'en_US.UTF-8'               
default_text_search_config = 'pg_catalog.english'


Re: The need for clustered indexes to boost TPC-V performance

From
Reza Taheri
Date:
Well, I keep failing to send an email with an attachment. Do I need a moderator's approval?

Yes, running on VMs and a lower bin processor. With the virtualization overhead, etc., I figure we would be running
rightaround 2/3 of the Dell throughput if we were running the same DBMS. 

I sent the following message twice today with attachments (postgresql.conf, etc.), and it hasn't been posted yet. Here
itis without an attachment. 

****************************************************

From: Reza Taheri
Sent: Thursday, July 05, 2012 11:34 AM
To: 'Craig Ringer'
Cc: pgsql-performance@postgresql.org; Robert Haas
Subject: RE: [PERFORM] The need for clustered indexes to boost TPC-V performance

OK, some config details.
We are using:

*    Two blades of an HP BladeSystem c-Class c7000 with 2-socket Intel E5520 (Nehalem-EP) processors and 48GB of memory
perblade 
o    8 cores, 16 threads per blade
o    48GB of RAM per blade
*    Storage was an EMC VNX5700 with 14 SSDs fronting 32 15K RPM drives
*    The Tier B database VM was alone on a blade with 16 vCPUs, 40GB of memory, 4 virtual drives with various RAID
levels
*    The driver and Tier A VMs were on the second blade
o    So we set PGHOST on the client system to point to the server
*    RHEL 6.1
*    PostgreSQL 8.4
*    unixODBC 2.3.2

We stuck with PGSQL 8.4 since it is the stock version shipped with RHEL 6. I am building a new, larger testbed, and
willswitch to PGSQL 9 with that. 

Postgres.conf is attached.

Thanks,
Reza

> -----Original Message-----
> From: Andy Colson [mailto:andy@squeakycode.net]
> Sent: Thursday, July 05, 2012 5:42 PM
> To: Samuel Gendler
> Cc: Reza Taheri; Robert Klemme; pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] The need for clustered indexes to boost TPC-V
> performance
>
> On 07/05/2012 03:52 PM, Samuel Gendler wrote:
> >
> >
> > On Thu, Jul 5, 2012 at 1:37 PM, Reza Taheri <rtaheri@vmware.com
> <mailto:rtaheri@vmware.com>> wrote:
> >
> >
> >     I provided more config details in an earlier email.____
> >
> >     __ __
> >
> >
> >
> > I hate to disagree, but unless I didn't get a message sent to the list, you
> haven't provided any details about your postgresql config or otherwise
> adhered to the guidelines for starting a discussion of a performance
> problem around here.  I just searched my mailbox and no email from you
> has any such details.  Several people have asked for them, including myself.
> You say you will give any details we want, but this is at least the 3rd or 4th
> request for such details and they have not yet been forthcoming.
>
>
> Reza, I went back and looked myself.  I see no specs on OS, or hardware....
> unless you mean this:
>
>
> > http://bit.ly/QeWXhE. This was run on a similar server, and the database
> size is close to ours.
>
>
> You're running on windows then?  Server is 96Gig ram, 8 cores, (dell
> poweredge T610).
> with two powervault MD1120 NAS's?
>
> But then I assume you were not running on that, were you.  You were
> running vmware on it, probably?
>
>
> -Andy

Re: The need for clustered indexes to boost TPC-V performance

From
Craig Ringer
Date:
On 07/06/2012 08:41 AM, Andy Colson wrote:

You're running on windows then?  Server is 96Gig ram, 8 cores, (dell poweredge T610).
with two powervault MD1120 NAS's?

Thankfully they're running Pg on Linux (RHEL 6) . It seems that tests to date have been run against 8.4 which is pretty elderly, but hopefully it'll be brought up to 9.1 or 9.2beta soon.

While the original poster should've given a reasonable amount of information to start with when asking performance questions - as per the mailing list guidance and plain common sense - more info was sent later on but the lists.postgresql.org mailman ate it - or held it for moderation, anyway. The OP can't be blamed when Pg's mailing list manager eats mesages with attachments! Also, remember that not everyone uses community mailing lists regularly; it takes a little learning to get used to keeping track of conversations, to inline reply style, etc.

--
Craig Ringer

Re: The need for clustered indexes to boost TPC-V performance

From
Greg Smith
Date:
On 07/03/2012 07:13 PM, Reza Taheri wrote:
> Is the PGSQL community willing to invest in a feature that a) has been
> requested by many others already; and b) can make a huge difference in a
> benchmark that can lend substantial credibility to PGSQL performance?

Larger PostgreSQL features usually get built because companies sponsor
their development, they pass review as both useful & correct, and then
get committed.  Asking the community to invest in a new feature isn't
quite the right concept.  Yes, everyone would like one of the smaller
index representations.  I'm sure we can find reviewers willing to look
at such a feature and committers who would also be interested enough to
commit it, on a volunteer basis.  But a feature this size isn't going to
spring to life based just on volunteer work.  The most useful questions
would be "who would be capable of writing that feature?" and "how can we
get them sponsored to focus on it?"  I can tell from your comments yet
what role(s) in that process VMWare wants to take on internally, and
which it's looking for help with.  The job of convincing people it's a
useful feature isn't necessary--we know that's true.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com

Re: The need for clustered indexes to boost TPC-V performance

From
Craig Ringer
Date:
On 07/06/2012 09:00 AM, Reza Taheri wrote:
> Well, I keep failing to send an email with an attachment. Do I need a moderator's approval?

Probably. If so, it's really annoying that mailman isn't telling you
this via a "held for moderation" auto-reply. It should be.

> We stuck with PGSQL 8.4 since it is the stock version shipped with RHEL 6. I am building a new, larger testbed, and
willswitch to PGSQL 9 with that. 

Just so you know, as per PostgreSQL versioning policy major releases are
x.y, eg "8.4", "9.0" and "9.1" are distinct major releases.

http://www.postgresql.org/support/versioning/

I've always found that pretty odd and wish major releases would just
increment the first version part, but the policy states how it's being
done. It's important to realize this when you're talking about Pg
releases, because 8.4, 9.0, 9.1 and 9.2 are distinct releases with
different feature sets, so "postgresql 9" doesn't mean much.

--
Craig Ringer



Re: The need for clustered indexes to boost TPC-V performance

From
Reza Taheri
Date:
Just to be clear, we have a number of people from different companies working on the kit. This is not a VMware project,
itis a TPC project. But I hear you regarding coming in from the cold and asking for a major db engine feature. I know
thatI have caused a lot of rolling eyes. Believe me, I have had the same (no, worse!) reaction from every one of the
commercialdatabase companies in response to similar requests over the past 25 years. 

We have our skin in the game, and as long as the community values the benchmark and wants to support us, we will figure
outthe details as we go forward. 

Thanks,
Reza

> -----Original Message-----
> From: Greg Smith [mailto:greg@2ndQuadrant.com]
> Sent: Thursday, July 05, 2012 6:42 PM
> To: Reza Taheri
> Cc: pgsql-performance@postgresql.org; Andy Bond (abond@redhat.com);
> Greg Kopczynski; Jignesh Shah
> Subject: Re: [PERFORM] The need for clustered indexes to boost TPC-V
> performance
>
> On 07/03/2012 07:13 PM, Reza Taheri wrote:
> > Is the PGSQL community willing to invest in a feature that a) has been
> > requested by many others already; and b) can make a huge difference in
> > a benchmark that can lend substantial credibility to PGSQL performance?
>
> Larger PostgreSQL features usually get built because companies sponsor
> their development, they pass review as both useful & correct, and then get
> committed.  Asking the community to invest in a new feature isn't quite the
> right concept.  Yes, everyone would like one of the smaller index
> representations.  I'm sure we can find reviewers willing to look at such a
> feature and committers who would also be interested enough to commit it,
> on a volunteer basis.  But a feature this size isn't going to spring to life based
> just on volunteer work.  The most useful questions would be "who would
> be capable of writing that feature?" and "how can we get them sponsored
> to focus on it?"  I can tell from your comments yet what role(s) in that
> process VMWare wants to take on internally, and which it's looking for help
> with.  The job of convincing people it's a useful feature isn't necessary--we
> know that's true.
>
> --
> Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
> PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com

Re: The need for clustered indexes to boost TPC-V performance

From
Reza Taheri
Date:

Hi Craig,

I used the tool at depesz.com extensively during our early prototyping. It helped uncover ~10 problems that we solved by fixing issues in the code, adding or changing indexes, etc. Right now, I believe all our query plans look like what I would expect.

 

Yes, you are right, I did miss the link to the index-only scans. From what I can tell, it will do exactly what we want, but only as long as the index has all the columns in the query. I don’t know what percentage of our queries have this property. But it does help.

 

The two main kit developers are out this week. We’ll put our heads together next week to see what version to use when I switch to a larger testbed I am preparing.

 

Thanks,

Reza

 

From: Craig Ringer [mailto:ringerc@ringerc.id.au]
Sent: Thursday, July 05, 2012 5:46 PM
To: Reza Taheri
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] The need for clustered indexes to boost TPC-V performance

 

First: Please do try 9.2 beta if you're upgrading from 8.4. It'll be out as a final release soon enough, and index only scans may make a big difference for the problem you're currently having.

Looking at your configuration I have a few comments, but it's worth noting that I don't work with hardware at that scale, and I'm more used to tuning I/O bottlenecked systems with onboard storage rather than CPU-bottlenecked ones on big SANs. Hopefully now that you've posted your configuration and setup there might be interest from others.

If you're able to post an EXPLAIN ANALYZE or two for a query you feel is slow that certainly won't hurt. Using http://explain.depesz.com/ saves you the hassle of dealing with word-wrapping when posting them, btw.

As for your config:

I notice that your autovacuum settings are at their defaults. With heavy UPDATE / DELETE load this'll tend to lead to table and index bloat, so the DB has to scan more useless data to get what it needs. It also means table stats won't be maintained as well, potentially leading to poor planner decisions. The following fairly scary query can help identify bloat, as the database server doesn't currently have anything much built in to help you spot such issues:

   http://wiki.postgresql.org/wiki/Show_database_bloat

It might be helpful to set effective_cache_size and effective_io_concurrency so Pg has more idea of the scale of your hardware. The defaults are very conservative - it's supposed to be easy for people to use for simple things without melting their systems, and it's expected that anyone doing bigger work will tune the database.

http://www.postgresql.org/docs/9.1/static/runtime-config-resource.html

It looks like you've already tweaked many of the critical points for big installs - your checkpoint_segments, wal_buffers, shared_buffers, etc. I lack the big hardware experience to know if they're appropriate, but they're not the extremely conservative defaults, which is a start.

Your random_page_cost and seq_page_cost are probably dead wrong for a SAN with RAM and SSD cache in front of fast disks. Their defaults are for local uncached spinning HDD media where seeks are expensive. The typical advice on such hardware is to set them to something more like seq_page_cost = 0.1  random_page_cost  = 0.15 - ie cheaper relative to the cpu cost, and with random I/O only a little more expensive than sequential I/O. What's right for your situation varies a bit based on DB size vs hardware size, etc; Greg discusses this more in his book.

What isolation level do your transactions use? This is significant because of the move to true serializable isolation with predicate locking in 9.0; it made serializable transactions a bit slower in some circumstances in exchange for much stronger correctness guarantees. The READ COMMITTED default was unchanged.



It also looks like you might not have seen the second part of my earlier reply:


while PostgreSQL doesn't support covering indexes or clustered indexes at this point, 9.2 has added support for index-only scans, which are a half-way point of sorts. See:

  http://rhaas.blogspot.com.au/2011/10/index-only-scans-weve-got-em.html
  http://rhaas.blogspot.com.au/2010/11/index-only-scans.html
  http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=a2822fb9337a21f98ac4ce850bb4145acf47ca27

If at all possible please see how your test is affected by this PostgreSQL 9.2 enhancement. It should make a big difference, and if it doesn't it's important to know why.

(CC'd Robert Haas)



As for the "invest" side - that's really a matter for EnterpriseDB, Command Prompt, Red Hat, and the other backers who're employing people to work on the DB. Consider asking on pgsql-hackers, too; if nothing else you'll get a good explanation of the current state and progress toward clustered indexes.

Some links that may be useful to you are:

  http://wiki.postgresql.org/wiki/Todo
  Things that it'd be good to support/implement at some point. Surprisingly, covering/clustered indexes aren't on there or at least aren't easily found. It's certainly a much-desired feature despite its apparent absence from the TODO.

  http://wiki.postgresql.org/wiki/PostgreSQL_9.2_Development_Plan
  http://wiki.postgresql.org/wiki/PostgreSQL_9.2_Open_Items

--
Craig Ringer

 

 

Re: The need for clustered indexes to boost TPC-V performance

From
Merlin Moncure
Date:
On Thu, Jul 5, 2012 at 10:33 PM, Reza Taheri <rtaheri@vmware.com> wrote:
> Just to be clear, we have a number of people from different companies working on the kit. This is not a VMware
project,it is a TPC project. But I hear you regarding coming in from the cold and asking for a major db engine feature.
Iknow that I have caused a lot of rolling eyes. Believe me, I have had the same (no, worse!) reaction from every one of
thecommercial database companies in response to similar requests over the past 25 years. 

No rolling of eyes from me.  Clustered indexes work and if your table
access mainly hits the table through that index you'll see enormous
reductions in i/o.  Index only scans naturally are a related
optimization in the same vein.  Denying that is just silly.  BTW,
putting postgres through a standard non trivial benchmark suite over
reasonable hardware, reporting results, identifying bottlenecks, etc.
is incredibly useful.  Please keep it up, and don't be afraid to ask
for help here.  (one thing I'd love to see is side by side results
comparing 8.4 to 9.1 to 9.2).

merlin

Re: The need for clustered indexes to boost TPC-V performance

From
Reza Taheri
Date:
Hi Merlin,
We are moving up to a larger testbed, and are planning to use 9.2. But the results will not comparable to our 8.4
resultsdue to differences in hardware. But that comparison is a useful one. I'll try for a quick test on the new
hardwarewith 8.4 before moving to 9.2. 

Thanks,
Reza

> -----Original Message-----
> From: Merlin Moncure [mailto:mmoncure@gmail.com]
> Sent: Tuesday, July 10, 2012 12:06 PM
> To: Reza Taheri
> Cc: Greg Smith; pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] The need for clustered indexes to boost TPC-V
> performance
>
> On Thu, Jul 5, 2012 at 10:33 PM, Reza Taheri <rtaheri@vmware.com> wrote:
> > Just to be clear, we have a number of people from different companies
> working on the kit. This is not a VMware project, it is a TPC project. But I
> hear you regarding coming in from the cold and asking for a major db engine
> feature. I know that I have caused a lot of rolling eyes. Believe me, I have
> had the same (no, worse!) reaction from every one of the commercial
> database companies in response to similar requests over the past 25 years.
>
> No rolling of eyes from me.  Clustered indexes work and if your table access
> mainly hits the table through that index you'll see enormous reductions in
> i/o.  Index only scans naturally are a related optimization in the same vein.
> Denying that is just silly.  BTW, putting postgres through a standard non
> trivial benchmark suite over reasonable hardware, reporting results,
> identifying bottlenecks, etc.
> is incredibly useful.  Please keep it up, and don't be afraid to ask for help
> here.  (one thing I'd love to see is side by side results comparing 8.4 to 9.1 to
> 9.2).
>
> merlin