Thread: Advice Wanted on Selecting Multi-row Data Requests in 10-Row Blocks
What I want to do is present the results of a query in a web page, but only 10 rows at a time. My PostgreSQL table has millions of records and if I don't add a LIMIT 10 to the SQL selection, the request can take too long. The worst case scenario is when the user requests all records without adding filtering conditions (e.g. SELECT * FROM MyTable;). That can take 10-15 minutes, which won't work on a web application.

What I'm wondering is how in PostgreSQL do you select only the first 10 records from a selection, then the next 10, then the next, and possibly go back to a previous 10? Or do you do the full selection into a temporary table once, adding a row number to the columns and then performing sub-selects on that temporary table using the row id? Or do you run the query with LIMIT 10 set and then run another copy with no limit into a temporary table while you let the user gaze thoughtfully at the first ten records?

I know how to get records from the database into a web page, and I know how to sense user actions (PageDown, PageUp, etc.), so I'm basically looking for techniques to extract the data quickly.

Also, if this isn't the best forum to ask this sort of question, I'd appreciate being pointed to a more appropriate one.

TIA,

- Bill Thoen
Bill Thoen wrote:
> What I'm wondering is how in PostgreSQL do you select only the first 10
> records from a selection, then the next 10, then the next, and possibly
> go back to a previous 10? [...]

In addition to LIMIT, PostgreSQL has an OFFSET clause:
http://www.postgresql.org/docs/8.3/interactive/sql-select.html#SQL-LIMIT

So if you want to show the records in pages of 10, your queries would look like this:

SELECT * FROM MyTable ORDER BY my_key LIMIT 10 OFFSET 0;
SELECT * FROM MyTable ORDER BY my_key LIMIT 10 OFFSET 10;
SELECT * FROM MyTable ORDER BY my_key LIMIT 10 OFFSET 20;

The "offset" clause tells PostgreSQL how many rows to skip. Note that you always need an "order by" clause in there as well to get meaningful results.
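If the page size and starting row come from the web app, the same idea can be wrapped in a server-side prepared statement. A rough sketch (the table and column names are just placeholders):

  -- Prepare once per session; $1 = page size, $2 = rows to skip.
  PREPARE page_query (int, int) AS
      SELECT *
      FROM MyTable
      ORDER BY my_key          -- a stable ordering is required for stable pages
      LIMIT $1 OFFSET $2;

  EXECUTE page_query(10, 0);   -- page 1
  EXECUTE page_query(10, 10);  -- page 2
  EXECUTE page_query(10, 20);  -- page 3

  DEALLOCATE page_query;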
Use a dynamic select in the web page:

$1 = 10
$2 = 5
SELECT * FROM mytable LIMIT $1 OFFSET $2;

--- On Fri, 6/27/08, Bill Thoen <bthoen@gisnet.com> wrote:
Re: Advice Wanted on Selecting Multi-row Data Requests in 10-Row Blocks
From: "Olexandr Melnyk"
On 6/27/08, Bill Thoen <bthoen@gisnet.com> wrote:
Also note that a huge OFFSET value can slow down the query as much as if you weren't using LIMIT at all.
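A rough way to see this for yourself (a sketch; table and column names are placeholders) is to compare the plans for a small and a large offset. The server still has to produce, and then throw away, every skipped row:

  -- Cheap: only the first 10 rows are produced and returned.
  EXPLAIN ANALYZE
  SELECT * FROM MyTable ORDER BY my_key LIMIT 10 OFFSET 0;

  -- Expensive: about 1,000,010 rows are produced and 1,000,000 discarded.
  EXPLAIN ANALYZE
  SELECT * FROM MyTable ORDER BY my_key LIMIT 10 OFFSET 1000000;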
--
Sincerely yours,
Olexandr Melnyk <><
http://omelnyk.net/
Hello

Apart from the increasing OFFSET method, if you only need to traverse the results sequentially, you can do a variant of this.

Let us assume your resultset has a unique column pk, and is ordered on column o.

Initial select:
select * from foo order by o limit 10;

Next page:
select * from foo where (o,pk)>(o,?) order by o limit 10;
(where the ? is the last pk value in previous select)

This method will be able to make use of an index on (o,pk).

gnari

On fös, 2008-06-27 at 14:14 -0600, Bill Thoen wrote:
> What I want to do is present the results of a query in a web page, but
> only 10 rows at a time. My PostgreSQL table has millions of records and
> if I don't add a LIMIT 10 to the SQL selection, the request can take too
> long. [...]
Thanks for the tip on OFFSET. That's just what I needed. It's so easy when you know the command you're looking for, and so hard when you know what you want to do but don't know what the command is called!

Thanks,

- Bill Thoen
Re: Advice Wanted on Selecting Multi-row Data Requests in 10-Row Blocks
From: "Richard Broersma"
On Fri, Jun 27, 2008 at 2:09 PM, Bill Thoen <bthoen@gisnet.com> wrote:
> Thanks for the tip on OFFSET. That's just what I needed. It's so easy when you
> know the command you're looking for, and so hard when you know what you want
> to do but don't know what the command is called!

I would strongly suggest taking a second look at Ragnar's suggestion. It may be a bit more difficult to implement, but it is a clear performance winner when you start to deal with large datasets and the OFFSET gets really big.

--
Regards,
Richard Broersma Jr.

Visit the Los Angeles PostgreSQL Users Group (LAPUG)
http://pugs.postgresql.org/lapug
Bill Thoen wrote:
> What I'm wondering is how in PostgreSQL do you select only the first 10
> records from a selection, then the next 10, then the next, and possibly
> go back to a previous 10?

LIMIT with OFFSET has already been mentioned. There's another option if your web app is backed by an application server or some other environment that can retain resources across client queries: you can use a scrollable database cursor to access the results.

This won't do you much (any?) good if your web app has to establish a connection or get one from the pool for every request. It's only really useful if you can store the connection in the user's session information. Using cursors probably isn't very good for very high user counts, because abandoned sessions will hold their database connections until the session times out and is destroyed. For more complex apps with fewer users, though, cursors could be a big win.

Note that OFFSET isn't free either. The database server must still execute all of the query up to OFFSET+LIMIT results. With a high offset, that can get very slow. A cursor will be OK here if you still start from the beginning, but if you ever begin with a high offset you'll want to look into using one of the methods suggested in other replies that permit you to use an index.

-- Craig Ringer
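A sketch of what that could look like, assuming the same connection and open transaction survive between page requests (the cursor, table and column names are illustrative):

  BEGIN;

  -- SCROLL lets you fetch backwards as well as forwards.
  DECLARE page_cur SCROLL CURSOR FOR
      SELECT * FROM MyTable ORDER BY my_key;

  FETCH FORWARD 10 FROM page_cur;   -- rows 1-10 (first page)
  FETCH FORWARD 10 FROM page_cur;   -- rows 11-20 (next page)

  -- PageUp: move back over the 20 rows already seen, then
  -- read 10 forward again to redisplay rows 1-10.
  MOVE BACKWARD 20 IN page_cur;
  FETCH FORWARD 10 FROM page_cur;

  CLOSE page_cur;
  COMMIT;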
On Fri, Jun 27, 2008 at 08:22:35PM +0000, Ragnar wrote:
> let us assume your resultset has a unique column pk, and is ordered on
> column o:
>
> next page
> select * from foo where (o,pk)>(o,?) order by o limit 10;
> (where the ? is the last pk value in previous select)
>
> this method will be able to make use of an index on (o,pk)

Hum, I think I must be missing something. I'm not sure why you're comparing 'o' to itself and you're not putting any ordering constraint on the primary key. I think the query should look closer to:

SELECT * FROM foo WHERE (o,pk)>($1,$2) ORDER BY o,pk LIMIT 10;

Or am I going mad? I'm expecting a table structure somewhat like:

CREATE TABLE foo (
    pk    TEXT PRIMARY KEY,
    value TEXT,
    o     INT NOT NULL
);

CREATE INDEX foo_ord_idx ON foo (o,pk);

Sam
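With that schema, the values from the last row of one page feed straight into the next page's query. A small sketch with made-up values:

  -- First page: no starting point yet.
  SELECT * FROM foo ORDER BY o, pk LIMIT 10;

  -- Suppose the last row shown had o = 42 and pk = 'abc'.
  -- Next page: everything strictly after that row, using the (o,pk) index.
  SELECT * FROM foo WHERE (o, pk) > (42, 'abc') ORDER BY o, pk LIMIT 10;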
On Mon, Jun 30, 2008 at 8:51 AM, Sam Mason <sam@samason.me.uk> wrote:
>>
>> select * from foo where (o,pk)>(o,?) order by o limit 10;
>
> Hum, I think I must be missing something. I'm not sure why you're
> comparing 'o' to itself and you're not putting any ordering constraint
> on the primary key. I think the query should look closer to:
>
> SELECT * FROM foo WHERE (o,pk)>($1,$2) ORDER BY o,pk LIMIT 10;
>
> Or am I going mad?

Yes, you are correct. You need to supply at least one value for each ordered field. I think this is what the OP was trying to say. Usually it's much simpler than this:

select * from foo where pk > $1 order by pk limit 10;

This will pull up the table in pk order, which is usually fine. Any ordering will do as long as the combination of fields being ordered is unique. Adding pk as the second criterion is only needed if you want to order by a field that can contain duplicates. If 'o' is a candidate key this is not required.

btw, the use of OFFSET for this type of problem is actually fairly terrible... it's almost never a good idea.

merlin
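For the original "go back a previous 10" requirement, this style can also page backwards by flipping the ordering and then reversing the rows. A sketch (not from the thread), where $1 is the last pk on the current page and $2 is the first:

  -- Forward: the next 10 rows after the last pk shown.
  SELECT * FROM foo WHERE pk > $1 ORDER BY pk LIMIT 10;

  -- Backward: the 10 rows immediately before the first pk shown,
  -- fetched in reverse and then flipped back into ascending order.
  SELECT *
  FROM (SELECT * FROM foo WHERE pk < $2 ORDER BY pk DESC LIMIT 10) AS prev_page
  ORDER BY pk;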