Re: 9.3 Pre-proposal: Range Merge Join - Mailing list pgsql-hackers
From | Greg Stark |
---|---|
Subject | Re: 9.3 Pre-proposal: Range Merge Join |
Date | |
Msg-id | CAM-w4HPM5ScSy3wGfQLXS8t6mjzfH6rj80i_haWykBMUiZFYUw@mail.gmail.com Whole thread Raw |
In response to | Re: 9.3 Pre-proposal: Range Merge Join (Simon Riggs <simon@2ndQuadrant.com>) |
List | pgsql-hackers |
On Mon, Apr 16, 2012 at 10:20 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > The thing I like most about temp indexes is that they needn't be temporary. > > I'd like to see something along the lines of demand-created optional > indexes, that we reclaim space/maintenance overhead on according to > some cache management scheme. More space you have, the more of the > important ones hang around. The rough same idea applies to > materialised views. I find this a) really scary, b) a huge increase in scope, and c) kind of pointless. a) DDL and shared caches require all kinds of additional synchronization that would be really annoying to be incurring in the middle of DML queries unexpectedly. If these indexes are to be useful for other transactions they would also impose a hidden and uncontrollable cost on all updates and inserts on the table. b) I think it would be a lot harder to get working cleanly. You would need to invent a UI to control the lifetime and resources used by these objects and deal with duplication between manually created and dynamically created objects. It's fundamentally a huge shift in the design of the database changing what things the user maintains as part of their schema and what things the database maintains transparently. This is a big project on its own aside from the technical side of things. c) If these indexes are useful for many queries then the user can choose to create them themself. These indexes you're proposing would be just as expensive as any those indexes and offer few advantages aside from their automaticness. The same could be accomplished with some external demon that just looked at the query logs and determined which indexes to create or not. The main advantage of creating dynamic indexes as part of the query would be lost. Namely that these would be local data structures that don't need to be synchronized with other transactions and don't need to be updated by other transactions. They're just part of the query execution the way spilled hash tables and tuplesort tapes are. You could build indexes on materialized data resulting from earlier joins or aggregates and so on. The point is that if you make them equivalent to normal indexes just dynamically maintained then all you've done is change the way normal indexes work but haven't really changed the set of queries they're useful for. That might be a neat UI Feature but If you want to change the set of queries postgres can handle efficiently at all then you need something that's fundamentally different from a table index. As an aside I think some of what you're looking for could be better handled with some kind of query result cache that could keep around the materialized data from plan nodes until an event happens that changes the data. This might be related to the underlying infrastructure needed for materialized views but I haven't thought too deeply about it. It seems like a lot of work and a big change from the current very isolated per-process execution model. -- greg
pgsql-hackers by date: