Re: Sequential scans - Mailing list pgsql-hackers
| From | Jeff Davis |
| --- | --- |
| Subject | Re: Sequential scans |
| Date | |
| Msg-id | 1178219087.28383.286.camel@dogma.v10.wvs |
| In response to | Re: Sequential scans (Heikki Linnakangas <heikki@enterprisedb.com>) |
| List | pgsql-hackers |
On Thu, 2007-05-03 at 19:27 +0100, Heikki Linnakangas wrote:
> I understand that the data structure keeps track of relations being
> scanned, with one entry per such relation. I think that's very good
> design, and I'm not suggesting to change that.
>
> But what's the right size for that? We don't know how many large
> tables there are in the database, so we can't use that. A hard-coded
> constant maybe? What would it be? I figured we could use NBackends,
> because there can't realistically be more than one large seq scan per
> backend going on at any point in time. That's an upper bound; in any
> real-world application there are far fewer than that.

Ok, I understand you now. We'll set the maximum size of the table/list
to max_connections at startup.

> > Part of the reason I did this is because, with a Sync Scan, it
> > seems like there may be possible reasons to do multiple seq scans
> > concurrently in the same backend. This is completely contrived, but
> > consider:
> >
> > SELECT * FROM bigtable UNION ALL SELECT * FROM bigtable;
> >
> > I'm not trying to argue for such a plan to exist right now, but if
> > there is a possibility that we'd want to create such plans in the
> > future, I think it's better to track by relation rather than by
> > backend.
>
> Those two seqscans would be executed one after the other, not in
> parallel, so you would still have just one seq scan at any given
> point in time.

Right, I was just throwing out a hypothetical (if highly contrived)
situation in which it might help to do multiple scans at once. I think
this can be safely ignored for now, though, since the planner would
never generate such a plan and I can't think of a reasonable use case.

> >>> 100 is of course arbitrary, and that could grow or shrink
> >>> automatically at runtime.
> >> Except that it can't shrink, because we don't have any convenient
> >> moment to remove an entry from the list, other than dropping the
> >> least recently used entry out of the list when inserting a new
> >> one. And since it's in shared mem, the allocated size is fixed at
> >> boot up, so we can't grow it either.
> >
> > It can shrink when a new scan looks through the list and passes by
> > entries that don't need to be in the list.
>
> How do you identify an entry that doesn't need to be there?

Some type of clock-like algorithm might work, where a counter is
decremented when an element is passed over and incremented when it's
chosen. It's probably easier to just use a hash table, though.

> > I don't want to over-complicate things, so if you think having a
> > hash table is a better, simpler, or more readable design, I'll go
> > with that instead of the list-only approach. The only reason I
> > suggested the list-only approach was to see if the hash table was
> > really gaining anything for us. In practice, it will be quite rare
> > that more than 10 different big relations are being scanned at
> > once, and I don't know how much of a gain a hash table is when
> > there are (in all but exceptional cases) only a few entries.
>
> How about implementing the list-only approach first, since you're
> going to need the LRU list anyway, and we can consider adding the
> hash table later?

Sounds good. Will do.

Regards,
Jeff Davis
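To make the list-only approach agreed on above concrete, here is a minimal editorial sketch in C of a fixed-size, most-recently-used list of scan positions, one entry per relation, with its capacity set from max_connections at startup. All names (ScanLocEntry, ScanLocReport, and so on) are illustrative assumptions, not identifiers from the actual patch or from PostgreSQL, and locking is omitted for brevity:

```c
/*
 * Editorial sketch, not code from the patch.  A fixed-size MRU list of
 * scan positions, one entry per relation.  Capacity is set once at
 * startup from max_connections, since there can't be more concurrent
 * large seq scans than backends.  In a real version this array would
 * live in shared memory and every function would take a lock.
 */
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;
typedef unsigned int BlockNumber;

#define InvalidOid 0

typedef struct ScanLocEntry
{
    Oid         relid;      /* relation being scanned */
    BlockNumber location;   /* last reported block of the scan */
} ScanLocEntry;

static ScanLocEntry *scan_locs;      /* shared memory in reality */
static int           scan_locs_size; /* fixed at startup */
static int           scan_locs_used;

/* Called once at startup; max_connections bounds concurrent scans. */
void
ScanLocInit(int max_connections, ScanLocEntry *shmem)
{
    scan_locs = shmem;
    scan_locs_size = max_connections;
    scan_locs_used = 0;
}

/* A scan that is just starting looks up a hint for 'relid', if any. */
bool
ScanLocLookup(Oid relid, BlockNumber *location)
{
    int     i;

    for (i = 0; i < scan_locs_used; i++)
    {
        if (scan_locs[i].relid == relid)
        {
            *location = scan_locs[i].location;
            return true;
        }
    }
    return false;
}

/*
 * Report the current block of a scan of 'relid'.  The list is kept in
 * most-recently-used order: a hit is moved to the front, and when the
 * list is full the last (least recently used) entry is dropped.  A
 * linear search is fine because in practice only a handful of big
 * relations are scanned at once.
 */
void
ScanLocReport(Oid relid, BlockNumber location)
{
    int     i;

    for (i = 0; i < scan_locs_used; i++)
    {
        if (scan_locs[i].relid == relid)
        {
            ScanLocEntry hit = scan_locs[i];

            hit.location = location;
            /* shift earlier entries down to move the hit to the front */
            while (i > 0)
            {
                scan_locs[i] = scan_locs[i - 1];
                i--;
            }
            scan_locs[0] = hit;
            return;
        }
    }

    /* not found: evict the LRU entry (the last slot) if full */
    if (scan_locs_used == scan_locs_size)
        scan_locs_used--;

    /* insert the new entry at the front */
    for (i = scan_locs_used; i > 0; i--)
        scan_locs[i] = scan_locs[i - 1];
    scan_locs[0].relid = relid;
    scan_locs[0].location = location;
    scan_locs_used++;
}
```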
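The clock-like aging idea floated above could look roughly like the following, reusing the types from the previous sketch. Again this is a hedged illustration, not the patch's code; USAGE_MAX and the function name are made up. Each entry's counter is decremented when a search passes it over and incremented when the entry is chosen, so entries nobody uses decay to zero and their slots are reclaimed in place, which is how the list would shrink:

```c
typedef struct AgedScanLocEntry
{
    Oid         relid;      /* InvalidOid marks a reclaimable slot */
    BlockNumber location;
    int         usage;      /* clock-style counter */
} AgedScanLocEntry;

#define USAGE_MAX 5         /* cap so hot entries can still decay later */

/*
 * Search for 'relid'.  Every live entry we pass over without choosing
 * it is decremented; the chosen entry is incremented.  Entries that
 * decay to zero are marked free, so the list shrinks in place.
 */
AgedScanLocEntry *
AgedScanLocLookup(AgedScanLocEntry *list, int nentries, Oid relid)
{
    int     i;

    for (i = 0; i < nentries; i++)
    {
        if (list[i].relid == relid)
        {
            if (list[i].usage < USAGE_MAX)
                list[i].usage++;            /* chosen: reward */
            return &list[i];
        }
        if (list[i].relid != InvalidOid && list[i].usage > 0 &&
            --list[i].usage == 0)
            list[i].relid = InvalidOid;     /* decayed: reclaim slot */
    }
    return NULL;    /* not found; caller inserts into a free slot */
}
```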