Re: Question about indexes - Mailing list pgsql-hackers
From | Simon Riggs |
---|---|
Subject | Re: Question about indexes |
Date | |
Msg-id | 003701c3e5ec$44306250$efb887d9@LaptopDellXP Whole thread Raw |
In response to | Re: Question about indexes (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
Some potentially helpful background comments on the discussion so far... >Tom Lane writes >>Greg Stark writes >> Note that the space saving of bitmap indexes is still a substantial >> factor. >I think you are still confusing what I'm talking about with a bitmap index, >ie, a persistent structure on-disk. It's not that at all, but a transient >structure built in-memory during an index scan. Oracle allows the creation of bitmap indices as persistent data structures. The "space saving" of bitmap indices is only a saving when compared with btree indices. If you don't have them at all because they are built dynamically when required, as Tom is suggesting, then you "save" even more space. Maintaining the bitmap index is a costly operation. You tend to want to build them on "characteristic" columns, of which there tends to be more of in a database than "partial/full identity" columns on which you build btrees (forgive the vagueness of that comment), so you end up with loads of the damn things, so the space soon adds up. It can be hard to judge which ones are the important ones, especially when each is used by a different user/group. Building them dynamically is a good way of solving the question "which ones are needed?". Ever seen 58 indices on a table? Don't go there. My vote would be implement the dynamic building capability, then return to implement a persisted structure later if that seems like it would be a further improvement. [The option would be nice] If we do it dynamically, as Tom suggests, then we don't have to code the index maintenance logic at all and the functionality will be with us all the sooner. Go Tom! >Tom Lane writes > In any case, this discussion is predicated on the assumption that the > operations involving the bitmap are a significant fraction of the total > time, which I think is quite uncertain. Until we build it and profile > it, we won't know that. Dynamically building the bitmaps has been the strategy in use by Teradata for nearly a decade on many large datawarehouses. I can personally vouch for the effectiveness of this approach - I was surprised when Oracle went for the persistent option. Certainly in that case building the bitmaps adds much less time than is saved overall by the better total query strategy. >Greg Stark writes > > To build the in-memory bitmap you effectively have to do a sort. Not sure on this latter point: I think I agree with Greg on that point, but want to believe Tom because requiring a sort will definitely add time. To shed some light in this area, some other major implementations are: In Teradata, tables are stored based upon a primary index, which is effectively an index-organised table. The index pointers are stored in sorted order lock step with the blocks of the associated table - No sort required. (The ordering is based upon a hashed index, but that doesn't change the technique). Oracle's tables/indexes use heaps/btrees also, though they do provide an index-organised table feature similar to Teradata. Maybe the lack of heap/btree consistent ordering in Oracle and their subsequent design choice of persistent bitmap indices is an indication for PostgreSQL too? In Oracle, bitmap indices are an important precursor to the star join technique. AFAICS it is still possible to have a star join plan without having persistent bitmap indices. IMHO, the longer term goal of a good star join plan is an important one - that may influence the design selection for this discussion. Hope some of that helps, Best regards, Simon Riggs
pgsql-hackers by date: