Re: [HACKERS] What about LIMIT in SELECT ? - Mailing list pgsql-hackers
From | Oleg Bartunov |
---|---|
Subject | Re: [HACKERS] What about LIMIT in SELECT ? |
Date | |
Msg-id | Pine.GSO.3.96.SK.981015193853.19322D-100000@ra Whole thread Raw |
In response to | Re: [HACKERS] What about LIMIT in SELECT ? (jwieck@debis.com (Jan Wieck)) |
Responses |
Re: [HACKERS] What about LIMIT in SELECT ?
|
List | pgsql-hackers |
This is a little bit off-topic, I did some timings with latest cvs on my real database ( all output redirected to /dev/null ), table contains 8798 records, 31 columns, order key have indices. 1.select count(*) from work_flats; 0.02user 0.00system 0:00.18elapsed 10%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (131major+21minor)pagefaults 0swaps 2.select * from work_flats order by rooms, metro_id; 2.35user 0.25system 0:10.11elapsed 25%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (131major+2799minor)pagefaults 0swaps 3.set query_limit to '150'; SET VARIABLE select * from work_flats order by rooms, metro_id; 0.06user 0.00system 0:02.75elapsed 2%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (131major+67minor)pagefaults 0swaps 4.begin; declare tt cursor for select * from work_flats order by rooms, metro_id; fetch 150 in tt; end; 0.05user 0.01system 0:02.76elapsed 2%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (131major+67minor)pagefaults 0swaps As you can see timings for query_limit and cursor are very similar, I didn't expected this. So, in principle, enhanced version of fetch (with offset) would cover all we need from LIMIT, but query_limit would be still useful, for example to restrict loadness of server. Will all enhancements you discussed go to the 6.4 ? I'm really interested in testing this stuff because I begin new project and everything we discussed here are badly needed. Regards, Oleg On Thu, 15 Oct 1998, Jan Wieck wrote: > Date: Thu, 15 Oct 1998 14:23:43 +0200 (MET DST) > From: Jan Wieck <jwieck@debis.com> > To: t-ishii@sra.co.jp > Cc: jwieck@debis.com, oleg@sai.msu.su, hackers@postgreSQL.org > Subject: Re: [HACKERS] What about LIMIT in SELECT ? > > Tatsuo Ishii wrote: > > > I think we have understanded your point. set query_limit is just a > > easy alternative of using cursor and fetch. > > > > > I haven't looked at Tatsuo's patch very well. But if it > > > limits the amount of data going into the sort (on ORDER BY), > > > it will break it! The requested ordering could be different > > > from what the choosen index might return. The used index is > > > choosen by the planner upon the qualifications given, not the > > > ordering wanted. > > > > I think it limits the final result. When query_limit is set, > > the arg "numberTuples" of ExecutePlan() is set to it instead of 0 > > (this means no limit). > > > > Talking about "offset," it shouldn't be very difficult. I guess all we > > have to do is adding a new arg "offset" to ExecutePlan() then making > > obvious modifications. (and of course we have to modify set > > query_limit syntax but it's trivial) > > The offset could become > > FETCH n IN cursor [OFFSET n]; > > and > > SELECT ... [LIMIT offset,count]; > > The FETCH command already calls ExecutorRun() with the given > count (the tuple limit). Telling it the offset too is really > simple. And ExecutorRun() could check if the toplevel > executor node is an index scan. Skipping tuples during the > index scan requires, that all qualifications are in the > indexqual, thus any tuple returned by it will become a final > result row (as it would be in the simple 1-table-queries we > discussed). If that isn't the case, the executor must > fallback to skip the final result tuples and that is after an > eventually processed sort/merge of the complete result set. > That would only reduce communication to the client and memory > required there to buffer the result set (not a bad thing > either). > > ProcessQueryDesc() in tcop/pquery.c also calls ExecutorRun() > but with a constant 0 tuple count. Having offset and count in > the parsetree would make it without any state variables or > SET command. And it's the only clean way to restrict LIMIT to > SELECT queries. Any thrown in LIMIT to ExecutorRun() from > another place could badly hurt the rewrite system. Remember > that non-instead actions on insert/update/delete are > processed before the original query! And what about SQL > functions that get processed during the evaluation of another > query (view using an SQL function for count(*))? > > A little better would it be to make the LIMIT values able to > be parameter nodes. C or PL functions use the prepared plan > feature of the SPI manager for performance reasons. > Especially the offset value might there need to be a > parameter that the executor has to pick out first. If we > change the count argument of ExecutorRun to a List *limit, > this one could be NIL (to mean the old 0 count 0 offset > behaviour) or a list of two elements that both can be either > a Const or a Param of type int4. Easy for the executor to > evaluate. > > The only places where ExecutorRun() is called are > tcop/pquery.c (queries from frontend), commands/command.c > (FETCH command), executor/functions.c (SQL functions) and > executor/spi.c (SPI manager). So it is easy to change the > call interface too. > > > Jan > > -- > > #======================================================================# > # It's easier to get forgiveness for being wrong than for being right. # > # Let's break this rule - forgive me. # > #======================================== jwieck@debis.com (Jan Wieck) # > > _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
pgsql-hackers by date: