Re: TB-sized databases - Mailing list pgsql-performance

From	Russell Smith
Subject	Re: TB-sized databases
Date	November 30, 2007 02:42:38
Msg-id	474FB0B1.6070900@pws.com.au
In response to	Re: TB-sized databases (Simon Riggs <simon@2ndquadrant.com>)
Responses	Re: TB-sized databases
List	pgsql-performance

Simon Riggs wrote:
> On Tue, 2007-11-27 at 18:06 -0500, Pablo Alcaraz wrote:
>
>> Simon Riggs wrote:
>>
>>> All of those responses have cooked up quite a few topics into one. Large
>>> databases might mean text warehouses, XML message stores, relational
>>> archives and fact-based business data warehouses.
>>>
>>> The main thing is that TB-sized databases are performance critical. So
>>> it all depends upon your workload really as to how well PostgreSQL, or
>>> any other RDBMS vendor, can handle them.
>>>
>>>
>>> Anyway, my reason for replying to this thread is that I'm planning
>>> changes for PostgreSQL 8.4+ that will allow us to get bigger and
>>> faster databases. If anybody has specific concerns then I'd like to hear
>>> them so I can consider those things in the planning stages.
>>>
>> It would be nice to do something with selects so we can retrieve a rowset
>> from huge tables using indexed criteria without falling back to a full
>> scan.
>>
>> In my opinion, by definition, a huge database sooner or later will have
>> tables far bigger than RAM available (same for their indexes). I think
>> the queries need to be solved using indexes smart enough to be fast on disk.
>>
>
> OK, I agree with this one.
>
> I'd thought that index-only plans were only for OLTP, but now I see they
> can also make a big difference with DW queries. So I'm very interested
> in this area now.
>
>
If that's true, then you want to get behind the work Gokulakannan
Somasundaram has done on thick indexes
(http://archives.postgresql.org/pgsql-hackers/2007-10/msg00220.php).
I would have thought that concept particularly useful in DW.  Only
having to scan indexes on a number of join tables would be a huge win
for some of these types of queries.
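
To make the "only having to scan indexes" point concrete, here is a toy
Python sketch.  It is not how the linked patch works and it is not
PostgreSQL code; the names (HeapTuple, build_table, plain_index_scan,
thick_index_scan) are made up for illustration, and "visible" is a crude
stand-in for MVCC visibility.  It only counts how many heap fetches a
key-only range query needs when the index does or does not carry
visibility information.

```python
from dataclasses import dataclass


@dataclass
class HeapTuple:
    key: int
    payload: str
    visible: bool  # crude stand-in for MVCC visibility of this row version


def build_table(n_rows, dead_every=10):
    """Toy heap: every `dead_every`-th row version is invisible (e.g. deleted)."""
    return [
        HeapTuple(key=i, payload=f"row-{i}", visible=(i % dead_every != 0))
        for i in range(n_rows)
    ]


def plain_index_scan(heap, lo, hi):
    """Plain index: entries hold (key, heap position) only, so every match
    costs one heap fetch just to learn whether the row is visible."""
    index = [(t.key, pos) for pos, t in enumerate(heap)]
    heap_fetches = 0
    keys = []
    for key, pos in index:
        if lo <= key < hi:
            heap_fetches += 1  # must visit the heap to check visibility
            if heap[pos].visible:
                keys.append(key)
    return keys, heap_fetches


def thick_index_scan(heap, lo, hi):
    """Hypothetical thick index: entries also carry visibility, so a query
    that needs only the key never touches the heap."""
    index = [(t.key, t.visible) for t in heap]
    keys = [key for key, visible in index if lo <= key < hi and visible]
    return keys, 0


if __name__ == "__main__":
    heap = build_table(1000)
    plain_keys, plain_fetches = plain_index_scan(heap, 100, 200)
    thick_keys, thick_fetches = thick_index_scan(heap, 100, 200)
    print(len(plain_keys), plain_fetches, thick_fetches)
```

Both scans return the same keys; the difference is that the plain index
pays one heap fetch per matching entry.  The cost the sketch leaves out is
that every update or delete would also have to maintain the visibility
flag inside the index.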

My tiny point of view would say that is a much better investment than
setting up the proposed parameter.  I can see the use of the parameter
though.  Most of the complaints about indexes carrying visibility
information are about update/delete contention.  I would expect that in
a DW those things aren't in the critical path like they are in many
other applications.  Especially with partitioning, where older
partitions don't get many updates, I would think there could be great
benefit.  I would think that
many of Pablo's requests up-thread would get significant performance
benefit from this type of index.  But as I mentioned at the start,
that's my tiny point of view and I certainly don't have the resources to
direct what gets looked at for PostgreSQL.

Regards

Russell Smith
