Thread: using CURSOR with PHP
Does anyone have a solution for using CURSORs with PHP? The main problem, in my opinion, is that the CURSOR must be closed, and if the database is accessed from a web browser, this cannot be ensured.

--
--//--\\--  Eckhard Hoeffner  e-hoeffner@fifoost.org  Tal 44  D-80331 München
Eckhard Hoeffner wrote:
> Does anyone have a solution for using CURSORs with PHP? The main
> problem, in my opinion, is that the CURSOR must be closed, and if the
> database is accessed from a web browser, this cannot be ensured.

I don't use cursors in web applications myself, since PostgreSQL supports the excellent "SELECT ... LIMIT <n> OFFSET <m>" syntax in all of the cases where I might otherwise have done so. That's my experience to date, anyway - I'm prepared to find somewhere a cursor might be a requirement - I know that if you wanted to write cross-database portable code you would probably need to work with them.

Regardless, unless you open your database connection as a persistent connection, the database connection will be closed when the script finishes, and that should clean up all that is necessary.

Regards,
					Andrew.
--
--------------------------------------------------------------------
Andrew @ Catalyst .Net.NZ Ltd, PO Box 11-053, Manners St,  Wellington
WEB: http://catalyst.net.nz/        PHYS: Level 2, 150-154 Willis St
DDI: +64(4)916-7201    MOB: +64(21)635-694    OFFICE: +64(4)499-2267
      Are you enrolled at http://schoolreunions.co.nz/ yet?
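To make the LIMIT/OFFSET approach concrete, here is a minimal PHP sketch. It is not from the thread itself: the database name (mydb), table (mytable), and column names are placeholder assumptions, and the query is only attempted when the pgsql extension and a server are actually reachable.

```php
<?php
// Minimal LIMIT/OFFSET paging sketch. Database name, table name and
// column names are placeholders; adjust them to your own schema.

// Pure helper: convert a 1-based page number into an OFFSET value.
function page_offset($page, $per_page)
{
    $page = max(1, (int)$page);
    return ($page - 1) * $per_page;
}

$per_page = 20;
$page     = isset($_GET['page']) ? (int)$_GET['page'] : 1;
$offset   = page_offset($page, $per_page);

// Only run the query when the pgsql extension and a server are
// available; otherwise this sketch degrades to a no-op.
if (function_exists('pg_connect') && ($conn = @pg_connect("dbname=mydb"))) {
    $res = pg_exec($conn,
        "SELECT id, name FROM mytable ORDER BY id " .
        "LIMIT $per_page OFFSET $offset");
    for ($i = 0; $i < pg_numrows($res); $i++) {
        $row = pg_fetch_array($res, $i);
        echo htmlspecialchars($row['name']), "<br>\n";
    }
    pg_close($conn);
}
?>
```

Each request re-runs the query with a different OFFSET - exactly the extra bookkeeping the original poster wanted to avoid - but it needs no state held open on the server between requests.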
* Andrew McMillan <andrew@catalyst.net.nz> [18 05 02 12:36]:

>I don't use cursors in web applications myself, since PostgreSQL
>supports the excellent "SELECT ... LIMIT <n> OFFSET <m>" syntax in all
>of the cases where I might otherwise have done so.

That's what I wanted to avoid, as it results in a little more work.

>Regardless, unless you open your database connection as a persistent
>connection, the database connection will be closed when the script
>finishes, and that should clean up all that is necessary.

I used persistent connections; however, I got lots of <defunct> processes. If the CURSOR is closed when the script has finished (pg_connect), it makes no sense (for me) to use it, because I cannot move within the results.
On Sat, 2002-05-18 at 23:03, Eckhard Hoeffner wrote:
> I used persistent connections; however, I got lots of <defunct>
> processes. If the CURSOR is closed when the script has finished
> (pg_connect), it makes no sense (for me) to use it, because I cannot
> move within the results.

Right, so your problem is that you're trying to use a CURSOR, which you are leaving open on the off-chance your visitor will return and want to see the 'next record' or something.

I think you really should just bite the bullet and use the better syntax on SELECT. You may have a little more programming work to do, but it has little impact on performance as far as I can see.

PostgreSQL will optimise queries differently when they have LIMIT clauses too.

Regards,
					Andrew.
on 5/18/02 6:09 AM, andrew@catalyst.net.nz purportedly said:

> Right, so your problem is that you're trying to use a CURSOR, which
> you are leaving open on the off-chance your visitor will return and
> want to see the 'next record' or something.
>
> I think you really should just bite the bullet and use the better
> syntax on SELECT. You may have a little more programming work to do,
> but it has little impact on performance as far as I can see.

I'm afraid Andrew is correct here--with persistent connections, you have no control over when the connection is closed, or over which connection you are using at any one point. Thus you would still have to keep track of where the cursor is, and furthermore you would have to check for the existence of the cursor on every script invocation, which I don't believe there is a command for, so you would have to try to use the cursor and then capture any error. In the case of an error, you would need to re-define the cursor and move to the current point according to the stored cursor location info.

Depending on your program logic, this could be more difficult than using LIMIT. Using LIMIT has the added benefit of avoiding the overhead of many unused cursors, which can build up quickly, as you also have little or no control over the cursors, since they are tied to the connection.

The drawback to using LIMIT is that you are executing the query on every call, and Postgres cannot optimize on a LIMIT, as was mentioned, because the entire result has to be collected before the LIMIT can be applied. However, IIRC, Postgres does query result caching, so subsequent calls on the same query will tend to be faster, provided there is enough memory allocated to support it.

Keary Suska
Esoteritech, Inc.
"Leveraging Open Source for a better Internet"
* Keary Suska <hierophant@pcisys.net> [18 05 02 20:32]:

>However, IIRC, Postgres does query result caching, so subsequent calls
>on the same query will tend to be faster, provided there is enough
>memory allocated to support it.

That's what I assumed, because the HTTP protocol does not allow persistent connections to be established. The web server sends the requested document and closes the connection. Each new request is a new connection, as far as the HTTP protocol is concerned. If you are using non-persistent database connections, the CURSOR is closed when the script has finished. If you are using persistent database connections, you can establish an individual CURSOR for each user - using, for example, the user_id as the name for the CURSOR (or a random number). Then, however, the user may shut down the browser and we still have the CURSOR.

Thanks for your help.
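If one did go the per-user-cursor route sketched above, the user_id would first have to be turned into a legal SQL identifier. A small hypothetical helper (the "cur_" prefix is an arbitrary choice, and, per the rest of this thread, such a cursor still will not survive past the end of the transaction):

```php
<?php
// Hypothetical helper: derive a valid per-user cursor name from a
// user id, as suggested above. Note that (per the rest of this
// thread) such a cursor still disappears at transaction end.
function user_cursor_name($user_id)
{
    // Keep only characters that are legal in an SQL identifier.
    $safe = preg_replace('/[^A-Za-z0-9_]/', '_', (string)$user_id);
    return 'cur_' . $safe;
}
?>
```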
On Sun, 2002-05-19 at 06:32, Keary Suska wrote:
> The drawback to using LIMIT is that you are executing the query on
> every call, and Postgres cannot optimize on a LIMIT, as was mentioned,
> because the entire result has to be collected before the LIMIT can be
> applied. However, IIRC, Postgres does query result caching, so
> subsequent calls on the same query will tend to be faster, provided
> there is enough memory allocated to support it.

PostgreSQL will certainly take LIMIT into account when planning queries.

EXPLAIN SELECT * FROM mybigtable WHERE x = y;

and

EXPLAIN SELECT * FROM mybigtable WHERE x = y LIMIT n;

will regularly show different query plans. For more complex queries, however, the plans are less likely to differ by much.

OTOH PostgreSQL does _not_ do result caching, unless you have applied some patches that were floating around last year some time.

Regards,
					Andrew.
Andrew McMillan writes:
> PostgreSQL will certainly take LIMIT into account when planning queries.

I would think most queries using LIMIT would also include an ORDER BY clause, which would require PostgreSQL to sort all the results before applying the LIMIT.

> OTOH PostgreSQL does _not_ do result caching, unless you have applied
> some patches that were floating around last year some time.

PostgreSQL might not do 'result caching', but if you have an adequate number of buffers configured, would it not be likely that the results from the last query are still in memory?

Michael J. Upchurch
Partner2Partner Communications
mike@partner2partner.com
615.286.2199
On Sun, 2002-05-19 at 11:18, mike wrote:
> I would think most queries using LIMIT would also include an ORDER BY
> clause, which would require PostgreSQL to sort all the results before
> applying the LIMIT.

Then you would be wrong. ORDER BY can also be handled by an index, if there is an appropriate one, and this is where PostgreSQL will switch query plans, because traversing an index to grab a few hundred records is much faster than a sequential scan and sort of many thousands. With 7.2 PostgreSQL is getting pretty smart with its query planning, but recognising limits and doing index scans is something that has been around since at least 7.0, possibly earlier.

> PostgreSQL might not do 'result caching', but if you have an adequate
> number of buffers configured, would it not be likely that the results
> from the last query are still in memory?

Yep. Or the tables could be in Linux's file cache too, but neither of those provides the same huge performance boost that result caching does in those few narrow situations in which it applies. I'd really like to see result caching incorporated into PostgreSQL at some point, but there are many features like that. Being able to natively create full-text indexes remains my most wished-for feature.

Cheers,
					Andrew.
on 5/18/02 4:24 PM, andrew@catalyst.net.nz purportedly said:

> PostgreSQL will certainly take LIMIT into account when planning queries.
>
> EXPLAIN SELECT * FROM mybigtable WHERE x = y;
>
> and
>
> EXPLAIN SELECT * FROM mybigtable WHERE x = y LIMIT n;
>
> will regularly show different query plans.

LIMIT *is* taken into account, and a different query plan may be used depending on the LIMIT and OFFSET values. However, as you will see below, the query plans are all identical (aside from the reported LIMIT cost, and the use of an index because of the sorting). Here is a real example with a table of 32k records:

milewiz=> explain select * from activity;
Seq Scan on activity  (cost=0.00..20.00 rows=1000 width=39)

milewiz=> explain select * from activity limit 200;
Limit  (cost=0.00..4.00 rows=200 width=39)
  ->  Seq Scan on activity  (cost=0.00..20.00 rows=1000 width=39)

Here a sequential scan is used, but because we want only the first 200 tuples, getting results is much faster. Identical results would likely appear from adding a WHERE clause. However, using LIMIT is next to useless unless the results are ordered, which would be necessary for the application in question:

milewiz=> explain select * from activity order by date;
Index Scan using act_date_idx on activity  (cost=0.00..59.00 rows=1000 width=39)

milewiz=> explain select * from activity order by date limit 200;
Limit  (cost=0.00..11.80 rows=200 width=39)
  ->  Index Scan using act_date_idx on activity  (cost=0.00..59.00 rows=1000 width=39)

Here the cost is much higher, partly because of the sort, and partly because more results must be gathered to satisfy the ordering. But only the first call can use LIMIT n alone--subsequent calls must use an OFFSET as well. Consider:

milewiz=> explain select * from activity order by date limit 200 offset 200;
Limit  (cost=11.80..23.60 rows=200 width=39)
  ->  Index Scan using act_date_idx on activity  (cost=0.00..59.00 rows=1000 width=39)

milewiz=> explain select * from activity order by date limit 200 offset 800;
Limit  (cost=47.20..59.00 rows=200 width=39)
  ->  Index Scan using act_date_idx on activity  (cost=0.00..59.00 rows=1000 width=39)

milewiz=> explain select * from activity order by date limit 200 offset 10000;
Limit  (cost=59.00..59.00 rows=1 width=39)
  ->  Index Scan using act_date_idx on activity  (cost=0.00..59.00 rows=1000 width=39)

The optimizer assumes a complete table scan will return only 1000 rows, when in fact it will return 32k rows, so as we progress through the calls, at a certain point the optimizer will execute a plan with the same cost as a full table scan, regardless of the limit. Frequent VACUUM ANALYZE will help this somewhat.

And as you can see, the query plans are all identical. The performance will depend mostly on the distribution of field values, which will affect how many records must be scanned in order to fulfill the LIMIT. At a certain point, this becomes the same as a full scan. With smaller table results, this will happen sooner in the game.

> For more complex queries, however, the plans are less likely to differ
> by much.

Actually, as you can see from my examples above, the reverse is true, as the simple query produced the same plan. The more complex the query, the more choices the optimizer has for picking the best case, which unfortunately is not always the truly best case.

> OTOH PostgreSQL does _not_ do result caching, unless you have applied
> some patches that were floating around last year some time.

Yes, I recalled wrong. Postgres does not cache results, but caches tables. So subsequent queries on particular tables will improve, as long as there is enough shared memory to accommodate the tables; otherwise results may vary. If only a partial table can be kept in cache, then as long as Postgres can retrieve just from the cached part, queries will be optimized.

Keary Suska
Esoteritech, Inc.
"Leveraging Open Source for a better Internet"
On Sun, 2002-05-19 at 13:07, Keary Suska wrote:
> Actually, as you can see from my examples above, the reverse is true,
> as the simple query produced the same plan. The more complex the
> query, the more choices the optimizer has for picking the best case,
> which unfortunately is not always the truly best case.

No, I meant _complex_ plans:

pcno=# explain select * from vwmbradr where centre_id = 56;
NOTICE:  QUERY PLAN:

Subquery Scan vwmbradr  (cost=636.68..636.68 rows=1 width=418)
  ->  Sort  (cost=636.68..636.68 rows=1 width=418)
        ->  Nested Loop  (cost=0.00..636.67 rows=1 width=418)
              ->  Nested Loop  (cost=0.00..630.67 rows=1 width=340)
                    ->  Nested Loop  (cost=0.00..624.68 rows=1 width=254)
                          ->  Index Scan using primary_centre_id_index on constituents  (cost=0.00..622.35 rows=1 width=212)
                          ->  Seq Scan on centres  (cost=0.00..1.59 rows=59 width=42)
                    ->  Index Scan using constituents_pkey on constituents  (cost=0.00..5.98 rows=1 width=86)
              ->  Index Scan using constituents_pkey on constituents  (cost=0.00..5.98 rows=1 width=78)

EXPLAIN
pcno=# explain select * from vwmbradr where centre_id = 56 limit 1;
NOTICE:  QUERY PLAN:

Limit  (cost=636.68..636.68 rows=1 width=418)
  ->  Subquery Scan vwmbradr  (cost=636.68..636.68 rows=1 width=418)
        ->  Sort  (cost=636.68..636.68 rows=1 width=418)
              ->  Nested Loop  (cost=0.00..636.67 rows=1 width=418)
                    ->  Nested Loop  (cost=0.00..630.67 rows=1 width=340)
                          ->  Nested Loop  (cost=0.00..624.68 rows=1 width=254)
                                ->  Index Scan using primary_centre_id_index on constituents  (cost=0.00..622.35 rows=1 width=212)
                                ->  Seq Scan on centres  (cost=0.00..1.59 rows=59 width=42)
                          ->  Index Scan using constituents_pkey on constituents  (cost=0.00..5.98 rows=1 width=86)
                    ->  Index Scan using constituents_pkey on constituents  (cost=0.00..5.98 rows=1 width=78)

EXPLAIN

:-)

Cheers,
					Andrew.
On 19 May 2002, Andrew McMillan wrote:
> Then you would be wrong. ORDER BY can also be handled by an index, if
> there is an appropriate one, and this is where PostgreSQL will switch
> query plans, because traversing an index to grab a few hundred records
> is much faster than a sequential scan and sort of many thousands.

It's not so easy. Often a single seq. scan (assuming it turns into very few sequential read(2) calls on a file) provides MUCH more bandwidth than a set of random reads, given how modern disks perform (I've seen >40MB/s sequential read rates, paired with <2MB/s for random reads). Say you have a 40MB table with 10000 records. Accessing 500 records in random order may take longer than reading all 10000 sequentially.

This is especially true for small tables, of course: the various cache systems (whole tracks cached by the disk firmware, plus the OS's read-ahead and page cache) may end up reading the whole table anyway, even if you're doing an index scan for very few records. It could even be a good idea to drop indexes entirely on small tables. But I agree that, on big tables, an index may speed up an ORDER BY clause a lot.

.TM.
--
      ____/  ____/   /
     /      /       /			Marco Colombo
    ___/  ___  /   /		      Technical Manager
   /          /   /			 ESI s.r.l.
 _____/ _____/  _/		       Colombo@ESI.it
On Sat, 18 May 2002, Eckhard Hoeffner wrote:
> I used persistent connections; however, I got lots of <defunct>
> processes. If the CURSOR is closed when the script has finished
> (pg_connect), it makes no sense (for me) to use it, because I cannot
> move within the results.

AFAIK, a CURSOR may be used only inside a single transaction, not a single session. So it can't be used to 'page' the output of a query, for example.

And, also AFAIK, even with pconnect, each connect creates a different session. Only the (TCP) connection is re-used. I mean, nothing changes from the PHP script's standpoint when you switch from pg_connect() to pg_pconnect(). Quoting the PHP manual:

"People who aren't thoroughly familiar with the way web servers work and distribute the load may mistake persistent connects for what they're not. In particular, they do not give you an ability to open 'user sessions' on the same SQL link, they do not give you an ability to build up a transaction efficiently, and they don't do a whole lot of other things. In fact, to be extremely clear about the subject, persistent connections don't give you any functionality that wasn't possible with their non-persistent brothers."

As I read it, the SQL session is reset anyway, no matter whether you use pg_connect() or pg_pconnect(). And any state saved on the DB backend is lost, even if you're lucky enough that your HTTP client connects again to the same HTTP server process. This includes losing any pending transaction (I guess they are rolled back) and so any cursor state. I strongly suspect this includes any TEMP tables as well (but I haven't tried it).

.TM.
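To illustrate the point that a cursor lives only inside a single transaction, here is a sketch in which DECLARE, FETCH and CLOSE all happen within one script run. The connection string and table name are placeholders, not from the thread, and the database work is skipped when no server is reachable.

```php
<?php
// Sketch: a cursor is only valid inside one transaction, so all of
// BEGIN / DECLARE / FETCH / CLOSE / COMMIT must happen in one script
// run. Connection parameters and table name are placeholders.

// Pure helper: build a FETCH statement for a named cursor.
function fetch_sql($cursor, $n)
{
    return "FETCH " . (int)$n . " FROM " . $cursor;
}

if (function_exists('pg_connect') && ($conn = @pg_connect("dbname=mydb"))) {
    pg_exec($conn, "BEGIN");
    pg_exec($conn, "DECLARE mycur CURSOR FOR SELECT * FROM mytable");
    $res = pg_exec($conn, fetch_sql("mycur", 10));  // first 10 rows
    // ... render the fetched rows here ...
    pg_exec($conn, "CLOSE mycur");
    pg_exec($conn, "COMMIT");  // after this, the cursor is gone
    pg_close($conn);
}
?>
```

Once COMMIT runs (or the script ends and the connection state is reset), the cursor no longer exists, which is exactly why it cannot be used to page results across separate HTTP requests.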
On 19 May 2002, Andrew McMillan wrote:
> No, I meant _complex_ plans:
>
> [two EXPLAIN plans for "select * from vwmbradr where centre_id = 56",
> with and without LIMIT 1, snipped]

:-)

Remove the 'where centre_id = 56' clause, please, and post the output. I'm curious. In your example I guess both selects return one single row... It's rows=1 almost everywhere - which means no cost for many operations.

.TM.
* Marco Colombo <marco@esi.it> [28 05 02 17:32]:

>AFAIK, a CURSOR may be used only inside a single transaction, not a
>single session. So it can't be used to 'page' the output of a query,
>for example.
>
>And, also AFAIK, even with pconnect, each connect creates a different
>session. Only the (TCP) connection is re-used. I mean, nothing changes
>from the PHP script's standpoint when you switch from pg_connect() to
>pg_pconnect().

I think you are right. The connection between browser and web server is cut after the web server has sent all TCP/IP packets, no matter how long it takes until they are received on the client side (browser). If there is a request for a new database connection, the web server does not know whether this is the former user or a new user. For this reason, the result I wanted to achieve is not possible.
On Tue, 28 May 2002, Eckhard Hoeffner wrote:
> I think you are right. The connection between browser and web server
> is cut after the web server has sent all TCP/IP packets, no matter
> how long it takes until they are received on the client side
> (browser). If there is a request for a new database connection, the
> web server does not know whether this is the former user or a new
> user. For this reason, the result I wanted to achieve is not possible.

Well, with HTTP keepalive, it may even keep the HTTP session open. But the problem is that you can't rely on it. Moreover, I suspect that PHP resets the SQL session when it performs a pconnect (that is, it asks the SQL backend to abort uncommitted transactions, drop TEMP relations, and so on).

And, no matter what the actual behaviour is, the docs are clear:

"An important summary. Persistent connections were designed to have one-to-one mapping to regular connections. That means that you should always be able to replace persistent connections with non-persistent connections, and it won't change the way your script behaves. It may (and probably will) change the efficiency of the script, but not its behavior!"

So even if it worked, it's an undocumented feature... well, it's a bug, really. pconnect() is *documented* to have no side effects. B-)

.TM.
> AFAIK, a CURSOR may be used only inside a single transaction, not a > single session. So it can't be used to 'page' the output of a query, > for example. > > And, also AFAIK, even with pconnect, each connect creates a different > session. Only the (TCP) connection is re-used. I mean, nothing changes > from the PHP script standpoint when you switch from pg_connect() to > pg_pconnect(). > > Quoting the PHP manual: > > " People who aren't thoroughly familiar with the way web servers > work and distribute the load may mistake persistent connects for what > they're not. In particular, they do not give you an ability to open > 'user sessions' on the same SQL link, they do not give you an ability > to build up a transaction efficently, and they don't do a whole lot > of other things. In fact, to be extremely clear about the subject, > persistent connections don't give you any functionality that wasn't > possible with their non-persistent brothers. " > Funny, I am now writing an analysis for myself about this problem. I have experience in databases, and I want to use my experience to build the same kinds of applications that I built on minicomputers. For this I got my requirements together : - I like postgreSQL a lot and includes transactions, this was an easy choice - I wanted to use HTML to build my front-ends (cross-platform!) - Because PHP is neatly integrated with HTML, it is a better choice than Perl or Python However, the only thing that seems to be missing is a layer between CGI and the database. If someone logs in to the system, this layer should return a session key and provide a process which opens a connection to the database and keeps this open until the user logs out. This should be transparent to the application programmer. 
Since normally all traffic between PHP and PostgreSQL goes via the pgsql
library, should it not be possible to add an interface which acts as the
pgsql library, handles these sessions, and communicates with the real
pgsql library, which would then be part of the running program? In this
way it should be possible to open transactions across CGI invocations
and to work with cursors.

Does anyone know if something like this exists as Free/OSS?

Regards,

Jurgen Defurne
* Marco Colombo <marco@esi.it> [28 05 02 18:26]:

>On Tue, 28 May 2002, Eckhard Hoeffner wrote:
>
>> * Marco Colombo <marco@esi.it> [28 05 02 17:32]:
>
>> >AFAIK, a CURSOR may be used only inside a single transaction, not a
>> >single session. So it can't be used to 'page' the output of a query,
>> >for example.
>> >
>> >And, also AFAIK, even with pconnect, each connect creates a different
>> >session. Only the (TCP) connection is re-used. I mean, nothing changes
>> >from the PHP script standpoint when you switch from pg_connect() to
>> >pg_pconnect().
>>
>> I think you are right. The connection browser<->web server is cut
>> after the web server has sent all TCP/IP packets, no matter how long
>> it takes until they are received on the client side (browser). If
>> there is a request for a new database connection, the web server does
>> not know whether this is the former user or a new user. For this
>> reason the result I wanted to achieve is not possible.
>
>Well, with HTTP keepalive, it may even keep the HTTP session open.
>But the problem is that you can't rely on it.
>Moreover, I suspect that PHP resets the SQL session when it performs
>a pconnect (that is, it asks the SQL backend to abort uncommitted
>transactions, drop TEMP relations, and so on).
>
>And, no matter what the actual behaviour is, the docs are clear:
>
> "An important summary. Persistent connections were designed to have
> one-to-one mapping to regular connections. That means that you should
> always be able to replace persistent connections with non-persistent
> connections, and it won't change the way your script behaves. It may
> (and probably will) change the efficiency of the script, but not its
> behavior!"
>
>so even if it worked, it's an undocumented feature... well, it's a bug,

I did not claim that it worked, and after having read the thread I also
think it will not work. However, I will try it at home and, if it works,
I will send the results.
--
--//--\\--
Eckhard Hoeffner
e-hoeffner@fifoost.org
Tal 44
D-80331 München
on 5/28/02 10:26 AM, marco@esi.it purportedly said:

> Well, with HTTP keepalive, it may even keep the HTTP session open.
> But the problem is that you can't rely on it.
> Moreover, I suspect that PHP resets the SQL session when it performs
> a pconnect (that is, it asks the SQL backend to abort uncommitted
> transactions, drop TEMP relations, and so on).

Actually, no. According to the docs:

"Warning
There are a couple of additional caveats to keep in mind when using
persistent connections. One is that when using table locking on a
persistent connection, if the script for whatever reason cannot release
the lock, then subsequent scripts using the same connection will block
indefinitely and may require that you either restart the httpd server or
the database server. Another is that when using transactions, a
transaction block will also carry over to the next script which uses
that connection if script execution ends before the transaction block
does. In either case, you can use register_shutdown_function() to
register a simple cleanup function to unlock your tables or roll back
your transactions. Better yet, avoid the problem entirely by not using
persistent connections in scripts which use table locks or transactions
(you can still use them elsewhere)."

I had heard that this auto-cleanup functionality was supposed to be
added, and there is an undocumented directive,
"pgsql.auto_reset_persistent", that perhaps implements it, but it may be
considered experimental. Unless the docs simply aren't up to date, which
is not uncommon.

Keary Suska
Esoteritech, Inc.
"Leveraging Open Source for a better Internet"
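The cleanup the quoted manual text recommends could be sketched as below. The connection string and the roll-back-on-exit policy are my assumptions of a reasonable default, not something the manual prescribes, and the script needs a live PostgreSQL server.

```php
<?php
$db = pg_pconnect("dbname=test");   // illustrative connection string

// If the script dies before COMMIT, roll back so the open transaction
// cannot carry over to the next script that reuses this persistent link.
// A ROLLBACK with no transaction open only produces a warning.
function cleanup_connection()
{
    global $db;
    @pg_query($db, "ROLLBACK");
}
register_shutdown_function('cleanup_connection');

pg_query($db, "BEGIN");
// ... work that might fail or die() before reaching the COMMIT ...
pg_query($db, "COMMIT");
?>
```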
On Wed, 2002-05-29 at 04:29, Jurgen Defurne wrote:
>
> However, the only thing that seems to be missing is a layer between CGI
> and the database. If someone logs in to the system, this layer should
> return a session key and provide a process which opens a connection to
> the database and keeps this open until the user logs out.
>
> This should be transparent to the application programmer. Since normally
> all traffic between PHP and postgreSQL goes via the pgsql library,
> should it not be possible to add an interface which acts as pgsql and
> handles these sessions and is able to communicate with the real pgsql
> library, which will then be part of the running program ?
>
> In this way it should then be possible to open transactions across CGI
> invocations and to work with cursors.
>
> Does anyone know if something like this exists as Free/OSS ?

PHP implements a bunch of session handling functions. If you wish, you
can use the session ID from these to retrieve a record from the
database.

For myself, I have developed my own session handling rather than use
PHP's. This was perhaps not a good choice, but it is a path I went down
some years ago now, and it is hard to change :-)

PHP sessions do not require the database to be there, of course,
allowing you to stuff information into an in-memory session structure,
but this limits the amount of information that can be associated easily.

My own session management does use the database to hold session- and
user-related information, retrieving (and possibly updating) it for each
page the user visits during their session.

Regards,
					Andrew.
--
--------------------------------------------------------------------
Andrew @ Catalyst .Net.NZ Ltd, PO Box 11-053, Manners St, Wellington
WEB: http://catalyst.net.nz/        PHYS: Level 2, 150-154 Willis St
DDI: +64(4)916-7201    MOB: +64(21)635-694    OFFICE: +64(4)499-2267
Are you enrolled at http://schoolreunions.co.nz/ yet?
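The built-in session handling Andrew mentions works roughly as in this sketch; the `users` table, the login value, and the database name are invented for the example, and a live PostgreSQL server is assumed.

```php
<?php
session_start();   // reads/sets the session cookie; call before any output

if (!isset($_SESSION['user_id'])) {
    $_SESSION['user_id'] = 42;   // e.g. set once after a successful login
}

// Each request re-opens the database and uses the small value kept in
// the session to fetch the heavier per-user data, instead of trying to
// keep a backend open between requests.
$db  = pg_connect("dbname=test");
$res = pg_query($db, "SELECT name FROM users WHERE id = "
                     . (int)$_SESSION['user_id']);
if ($row = pg_fetch_array($res)) {
    echo "Hello, ", $row['name'], "\n";
}
pg_close($db);
?>
```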
On Tue, 28 May 2002, Keary Suska wrote:

> on 5/28/02 10:26 AM, marco@esi.it purportedly said:
>
> > Well, with HTTP keepalive, it may even keep the HTTP session open.
> > But the problem is that you can't rely on it.
> > Moreover, I suspect that PHP resets the SQL session when it performs
> > a pconnect (that is, it asks the SQL backend to abort uncommitted
> > transactions, drop TEMP relations, and so on).
>
> Actually, no. According to the docs:
>
> "Warning
> There are a couple of additional caveats to keep in mind when using
> persistent connections. One is that when using table locking on a persistent
> connection, if the script for whatever reason cannot release the lock, then
> subsequent scripts using the same connection will block indefinitely and may
> require that you either restart the httpd server or the database server.
> Another is that when using transactions, a transaction block will also carry
> over to the next script which uses that connection if script execution ends
> before the transaction block does. In either case, you can use
> register_shutdown_function() to register a simple cleanup function to unlock
> your tables or roll back your transactions. Better yet, avoid the problem
> entirely by not using persistent connections in scripts which use table
> locks or transactions (you can still use them elsewhere)."

Looks like that paragraph is missing from my local copy of the manual:
time to update it. B-)

Anyway, if PHP isn't able to reset an existing connection to its initial
state (just like a new one), that is a bug. Scripts shouldn't rely on
that (mis)behaviour, IMHO.

> I had heard that this auto-cleanup functionality was supposed to be added,
> and there is a directive "pgsql.auto_reset_persistent" that is undocumented,
> that perhaps implements this, but it may be considered experimental. Unless
> the docs aren't properly up to date, which is not uncommon.
>
> Keary Suska
> Esoteritech, Inc.
> "Leveraging Open Source for a better Internet" > > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org > .TM. -- ____/ ____/ / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _____/ _____/ _/ Colombo@ESI.it
Andrew McMillan wrote:

> PHP implements a bunch of session handling functions. If you desire you
> can use the session ID from these to retrieve a record from the
> database.
>
> For myself, I have developed my own session handling, rather than use
> PHP's. This was perhaps not a good choice, but it is a path I went down
> some years ago now and it is hard to change :-)
>
> PHP sessions do not require the database to be there, of course,
> allowing you to stuff information into an in-memory session structure,
> but this limits the amount of information that can be associated easily.
>
> My own session management does use the database to hold session- and
> user- related information, retrieving (and possibly updating) for each
> page the user visits during their session.

I know what you mean, and I have also written this kind of session
handling in Perl, storing session data between CGI invocations in the
file system. One should even be able to define a generic interface and
provide drivers to plug in one's favourite storage back-end (e.g. a
filesystem driver, a DBM driver, a MySQL driver, a PostgreSQL driver).

However, that is not what I meant. The type of session management I was
referring to should create a process for every user logged on to the
system. This process should keep the PostgreSQL backend alive and act as
a gateway from PHP to PostgreSQL, so that across CGI invocations you
could do the following:

- Start up your connection
- Enter a screen which starts a transaction
- Change data in several screens (still within the same transaction)
- Finally, in the last screen, COMMIT or ROLLBACK your transaction

Starting a process, getting a socket, and passing the TCP/IP address and
port back as a session key should not be that hard (I have written
servers in Perl). The interface from PHP should then be able to pass its
pgsql requests to the process using that address and port. It should
even be possible to do this synchronously or asynchronously.
It is of course an itch to scratch, but for the moment I am planning the
IT infrastructure of a school (we can start from nil, a good time to put
everything under Linux and OOo).

Regards,

Jurgen Defurne
Andrew McMillan wrote:
>
> Sorry, I must have done too much web-development :-)
>
> I would always architect my application in such a way as for this not to
> be needed, desired or able to happen!
>
> Hmmmm... I surely hope that I never itch in that way!
>
> After 20-odd years of application development I find the web-based
> programs a great encouragement to stateless applications without
> long-running transactions. Yeehar! At last! :-)

What do you do when there is a possibility that several people at once
are trying to modify the same data?

Regards,
Jurgen
> >
> > What do you do when there is a possibility that several people at once
> > are trying to modify the same data?
>
> Restructure the application to avoid the conflict.

That is what I understood from your previous message, and it is probably
what I would also do, if possible. But your next answer ...

> Failing that I will get the application to re-read the record prior to
> saving, confirm that the existing record has not been modified in the
> interim and present the situation to the user if it has.

... is really what I thought it would be: you duplicate a function which
is already present in the database. I know that the PostgreSQL
transaction system works this way. On other systems you have locks and
blocking transactions. If you could keep these up outside CGI, then as
an application programmer you wouldn't have to reinvent the wheel.

> Obviously "present the situation" can cover a number of approaches.
>
> Although I have written systems to do tricks like merge records where
> they are changed by multiple people, I have actually never seen this
> happen outside of testing!

Well, I do not know about web applications, but I have worked in three
different administrative environments (a bank, transport, and quality
control), and in each of them I have seen two people trying to change
the same data at the same time.

> Systems I have worked on with few users have usually managed to have
> narrow and separated responsibilities, and systems with many records
> have always managed to have different people dealing with different sets
> of them. The only time there is crossover is when people are off sick,
> and of course there ends up being no conflict then either.

Regards,
Jurgen
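Andrew's re-read-and-compare approach is usually called optimistic locking, and it can lean on the database with a single UPDATE rather than a separate re-read. The `version` column and table names below are my additions for illustration, not something from the thread, and a live PostgreSQL server is assumed.

```php
<?php
// Optimistic locking sketch: each row carries a version counter that
// the client loads into the edit form together with the data.
$db = pg_connect("dbname=test");

$id      = 7;   // record the user edited (illustrative values)
$version = 3;   // version number originally loaded into the form

// The UPDATE only matches if nobody bumped the version in the meantime.
$res = pg_query($db,
    "UPDATE items SET name = 'new value', version = version + 1" .
    " WHERE id = $id AND version = $version");

if (pg_affected_rows($res) == 0) {
    // Someone else changed (or deleted) the row in the interim:
    // re-read it and present the conflict to the user.
}
pg_close($db);
?>
```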