Thread: server side resultset search (for performance reasons)
I have a requirement that don't know how to accomplish: The application is returning 300,000 records in some SELECT clause (a ResultSet) this finish the client's JVM memory, so my solution to this was to use the SELECT's LIMIT clause returning 600 records and making the internal logic that when the user gets past the 600th record the application automatically issues another query with LIMIT and OFFSET clauses causing a little delay but this is better than finishing the memory in the client's machine (if you know a better solution or have some advices, please let me know). My request here is, that if the user makes the record 250,000 the current record, because in the grphical interface (a JTable by the way), and selects to ORDER BY a different column, the record number 250,000 will be completely a different record, and the application have to point to the same record before the order by. Because the JTable is always showing a postgres table, and my postgres tables always have a primary key, I can identify (by means of the primary key) one record, I can get the Primary Key of the actual record, and after the ORDER BY, I can search (secuentially in the resultset) the record pointer (this is actually the row index in the rresultset) until the PK matches. Before I change to the LIMIT version of my SELECTs, this was no problem, the search was in memory within the entire ResultSet, but this is prohibitive in very large resultsets, because of the memory constraints. I have looked that in the server side the query is made very efficiently, because the response time for the first 600 records is very fast, ( the last 600 records are no as fasts as the firsts). So I was trying to find some way to make the search of the record index in the server side if possible, because actually, I make the search on the client side, but it is very slow, the client gets 600 records, make the search and if the record is not found gets another 600 records (other SELECT clause) and searches again, and if not found, gets another 600 records...... and so on, making this process very slow. Some ideas that came to me reading the docs, are a Stored Procedure that can reach the resultset produced in the server side, that receives the target record primary key, and makes the search returning -1 or the record index. Other idea is send with the SELECT a function like "SELECT *, getIndex(currRecordPK) FROM mytable WHERE mycondition ORDER BY micolumn" in this case getIndex is a function that compares the Primary Key of the current record with the one in the parameter (I have never made a Postgresql function so maybe the way I say this works is wrong), and put the index of the record that matches (if in the resultset is the record) in somewhere that I can retrieve to make the current Record of the new ResultSet. I hope someone can help me further in this. Thanks in advance Jesus Sandoval
Why don't you count the records where the sort column is smaller/bigger (depending on sort order) than the current record's sort column ? Something like: select count(*) from mytable where mycondition and sortcolumn < crtvalue (or > in case of descending ordering) This of course will still possibly leave you multiple records with the same value of the sort column as the value of the current record, but in this case you anyway should select all of those records I guess... HTH, Csaba. On Fri, 2003-08-01 at 13:12, Jesus Sandoval wrote: > I have a requirement that don't know how to accomplish: > > The application is returning 300,000 records in some SELECT clause (a > ResultSet) this finish the client's JVM memory, so my solution to this > was to use the SELECT's LIMIT clause returning 600 records and making > the internal logic that when the user gets past the 600th record the > application automatically issues another query with LIMIT and OFFSET > clauses causing a little delay but this is better than finishing the > memory in the client's machine (if you know a better solution or have > some advices, please let me know). > > My request here is, that if the user makes the record 250,000 the > current record, because in the grphical interface (a JTable by the way), > and selects to ORDER BY a different column, the record number 250,000 > will be completely a different record, and the application have to point > to the same record before the order by. > Because the JTable is always showing a postgres table, and my postgres > tables always have a primary key, I can identify (by means of the > primary key) one record, I can get the Primary Key of the actual record, > and after the ORDER BY, I can search (secuentially in the resultset) the > record pointer (this is actually the row index in the rresultset) until > the PK matches. > > Before I change to the LIMIT version of my SELECTs, this was no problem, > the search was in memory within the entire ResultSet, but this is > prohibitive in very large resultsets, because of the memory constraints. > > I have looked that in the server side the query is made very > efficiently, because the response time for the first 600 records is very > fast, ( the last 600 records are no as fasts as the firsts). > > So I was trying to find some way to make the search of the record index > in the server side if possible, because actually, I make the search on > the client side, but it is very slow, the client gets 600 records, make > the search and if the record is not found gets another 600 records > (other SELECT clause) and searches again, and if not found, gets another > 600 records...... and so on, making this process very slow. > > Some ideas that came to me reading the docs, are a Stored Procedure that > can reach the resultset produced in the server side, that receives the > target record primary key, and makes the search returning -1 or the > record index. > > Other idea is send with the SELECT a function like > "SELECT *, getIndex(currRecordPK) FROM mytable WHERE mycondition ORDER > BY micolumn" > > in this case getIndex is a function that compares the Primary Key of the > current record with the one in the parameter (I have never made a > Postgresql function so maybe the way I say this works is wrong), and put > the index of the record that matches (if in the resultset is the record) > in somewhere that I can retrieve to make the current Record of the new > ResultSet. > > I hope someone can help me further in this. > > Thanks in advance > > Jesus Sandoval > > > > ---------------------------(end of broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly >
Csaba Nagy escribió: > Why don't you count the records where the sort column is smaller/bigger > (depending on sort order) than the current record's sort column ? > Something like: > select count(*) from mytable where mycondition and sortcolumn < crtvalue > (or > in case of descending ordering) > This of course will still possibly leave you multiple records with the > same value of the sort column as the value of the current record, but in > this case you anyway should select all of those records I guess... > > HTH, > Csaba. > > On Fri, 2003-08-01 at 13:12, Jesus Sandoval wrote: > > I have a requirement that don't know how to accomplish: > > > > The application is returning 300,000 records in some SELECT clause (a > > ResultSet) this finish the client's JVM memory, so my solution to this > > was to use the SELECT's LIMIT clause returning 600 records and making > > the internal logic that when the user gets past the 600th record the > > application automatically issues another query with LIMIT and OFFSET > > clauses causing a little delay but this is better than finishing the > > memory in the client's machine (if you know a better solution or have > > some advices, please let me know). > > > > My request here is, that if the user makes the record 250,000 the > > current record, because in the grphical interface (a JTable by the way), > > and selects to ORDER BY a different column, the record number 250,000 > > will be completely a different record, and the application have to point > > to the same record before the order by. > > Because the JTable is always showing a postgres table, and my postgres > > tables always have a primary key, I can identify (by means of the > > primary key) one record, I can get the Primary Key of the actual record, > > and after the ORDER BY, I can search (secuentially in the resultset) the > > record pointer (this is actually the row index in the rresultset) until > > the PK matches. > > > > Before I change to the LIMIT version of my SELECTs, this was no problem, > > the search was in memory within the entire ResultSet, but this is > > prohibitive in very large resultsets, because of the memory constraints. > > > > I have looked that in the server side the query is made very > > efficiently, because the response time for the first 600 records is very > > fast, ( the last 600 records are no as fasts as the firsts). > > > > So I was trying to find some way to make the search of the record index > > in the server side if possible, because actually, I make the search on > > the client side, but it is very slow, the client gets 600 records, make > > the search and if the record is not found gets another 600 records > > (other SELECT clause) and searches again, and if not found, gets another > > 600 records...... and so on, making this process very slow. > > > > Some ideas that came to me reading the docs, are a Stored Procedure that > > can reach the resultset produced in the server side, that receives the > > target record primary key, and makes the search returning -1 or the > > record index. > > > > Other idea is send with the SELECT a function like > > "SELECT *, getIndex(currRecordPK) FROM mytable WHERE mycondition ORDER > > BY micolumn" > > > > in this case getIndex is a function that compares the Primary Key of the > > current record with the one in the parameter (I have never made a > > Postgresql function so maybe the way I say this works is wrong), and put > > the index of the record that matches (if in the resultset is the record) > > in somewhere that I can retrieve to make the current Record of the new > > ResultSet. > > > > I hope someone can help me further in this. > > > > Thanks in advance > > > > Jesus Sandoval > > > > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 3: if posting/reading through Usenet, please send an appropriate > > subscribe-nomail command to majordomo@postgresql.org so that your > > message can get through to the mailing list cleanly > > > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org Interesting approach, I was thinking about it but, and did some tests, but I think that is innaplicable because, before knowing the relative position, the resultset must be ordered. I think that maybe elaborating more the query is possible to know it but I think that the solution may be too complicated. I think that making a server side function will be the solution, but I don't know if somebody can discuss with me this kind of solution, maybe this is subject of other list, if it is so, please let me know to start posting in. I have been reading the docs, and it appears that you never cah get what you need to solve the problem, I have the following ideas: The easiest way to implement a server side function is SQL but is too restrictive, the next guess is PL/pgSQL or PL/Tcl (as I know a little bit of Tcl). I am trying PL/pgSQL but I don't know how to do this. I want to declare a CURSOR with only primary-key columns with the condition and order by, the same as the DATA CURSOR (this is because I expect that the user queries the contents of this cursor many times, and I don't want to make the SELECT every time I am searching for a key value, I don't know if this is possibly, anyway) CREATE a FUNCTION that starts a counter with 1, Start the CURSOR at the beginning, and makes a LOOP until the value received (as a parameter) matches the primary-key column in the CURSOR (and, ofcourse increment the counter with 1 in each iteration). Return the value of the counter or -1 if counter > totalrows. As far as I have reached, if I declare a cursor outside the PL/pgSQL function, I can't see it from inside the function, besides the PL/pgSQL CURSOR instructions does not include MOVE and I don't know how to make the cursor start at the beginning each time the function is called. Definitely this is not the rigth list to discuss this, so point me where can I post this with better luck. Thanks Jesus Sandoval