Re: Select where id in (LARGE LIST) ? - Mailing list pgsql-jdbc
From | Csaba Nagy |
---|---|
Subject | Re: Select where id in (LARGE LIST) ? |
Date | |
Msg-id | 1122885611.2837.209.camel@coppola.muc.ecircle.de Whole thread Raw |
In response to | Re: Select where id in (LARGE LIST) ? (Jasper Potts <jasper@jasperpotts.com>) |
Responses |
Re: Select where id in (LARGE LIST) ?
|
List | pgsql-jdbc |
I don't quite understand what you're doing here, but smells to me for something which shouldn't be interactive in the first place. If some query expectedly exceeds a few seconds, we make the operation asynchronous, i.e. the user starts it, gets a page saying the thing is currently being done, and then he receives some kind of notification when the thing is finished, either on the web page (by periodically reloading) or by email for really long lasting things. Now the chunking method does have an overhead indeed, but it has a set of good properties too: it's easily interruptible, it won't bug your database down (it will let some other things go too between the chunks), and you won't get connection timeouts for really long operations... it's just more manageable in many respects. And if you do have some long lasting things, make sure you won't let the user hit reload 50 times... an asynchronous way of doing it will help in this regard too, cause then you know what the user did and reload will only show the status. HTH, Csaba. On Fri, 2005-07-29 at 21:18, Jasper Potts wrote: > I have been trying the chunk method and it wins in cases where the > number of ids is not too high. I was comparing: > > (1) joining the main table with a table that had a column with ids and a > boolean column for selected, both id columns have indexes. > (2) select over main table where id in (...) done in 100s with prepared > statement and results accumulated in java > > with a select that did a sum() of another column the results were: > > No. of Selected Items | Join Select Time in sec | Chunk Select > Time in sec > 30 | 0.4 > | 0.007 > 4000 | 0.5 > | 0.24 > 30000 | 0.7 | > 1.12 > > All of these were with 30,000 rows in main table from JDBC. > > These results don't take into account the time it takes to clear then > populate the select table with selection data from client. At the moment > this can take 1-2 seconds which is far from interactive for the user. > Which is why I am looking for a better method. > > Any other suggestions? :-) > > I would like to push the data size up to a million, which means the user > could in theory select a million rows and apply operation. This is going > to get very slow with the chunk method. > > Many Thanks > > Jasper > > Csaba Nagy wrote: > > >Jasper, > > > >You can chunk your operation. That means to only use 100 entries in one > >run, and repeat it until all ids were processed. Use a prepared > >statement, that will save you some overhead. The last chunk will have > >less entries than the parameter placeholders, so you will have to build > >a special last statement, or to set the superfluous parameters to null > >or to one of the values from the last chunk, depends on what kind of > >query you have. We do all our data import/export this way, and it works > >fine. > > > >Cheers, > >Csaba. > > > > > >On Fri, 2005-07-29 at 17:30, Jasper Potts wrote: > > > > > >>I am working on a gui application which has a list of ids of selected > >>items. To perform an operation on the selected items I do a > >>"select/update .. where id in(...)". There seems to be a limit of > >>100-200 items in the list. Is there a way of doing this with large > >>lists, maybe 10,000, 100,000 long? > >> > >>The best solution I have so far is to create a selection table and write > >>the select out to that and perform a join but the 100,000 inserts are slow. > >> > >>The ids are int8(long), so not talking large amounts of data, couple Mb > >>at most. Database and client running on same machine over localhost. > >> > >>Many Thanks > >> > >>Jasper > >> > >>---------------------------(end of broadcast)--------------------------- > >>TIP 9: In versions below 8.0, the planner will ignore your desire to > >> choose an index scan if your joining column's datatypes do not > >> match > >> > >> > > > > > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 6: explain analyze is your friend
pgsql-jdbc by date: