Thread: Performance of the listen command
I cache a lot of data in the application; to invalidate that cache I
have triggers that use notify to let new transactions eject the objects
from the cache.

My problem with listen is that when I start a new connection I run a
listen command for each of my tables; that loop takes around 10 seconds
to complete, and I have around 150 tables.

The strange thing is that just about any query runs in 1-4ms, but the
listen command, which shouldn't even have to touch the disk, takes
around 60ms.

Is it normal for listen to be so slow? ... and what can be done to help
the situation?

-- 
Regards Flemming Frandsen - http://dion.swamp.dk - YAPH
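For readers following along, here is a minimal sketch of the kind of setup
being described (the table, function, and trigger names are made up for
illustration; the original poster's actual schema is not shown): a
statement-level trigger fires a NOTIFY for its table on every change, and
each new connection then issues one LISTEN per cached table.

-- Hypothetical cache-invalidation trigger for one cached table.
CREATE OR REPLACE FUNCTION notify_customers() RETURNS trigger AS $$
BEGIN
    NOTIFY customers;   -- one notification channel per cached table
    RETURN NULL;        -- return value is ignored for AFTER triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER customers_cache_invalidate
    AFTER INSERT OR UPDATE OR DELETE ON customers
    FOR EACH STATEMENT EXECUTE PROCEDURE notify_customers();

-- On connect, the application registers interest in every cached table:
LISTEN customers;
LISTEN orders;
-- ... repeated for all ~150 tables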
Flemming Frandsen wrote:

> I cache a lot of data in the application; to invalidate that cache I
> have triggers that use notify to let new transactions eject the
> objects from the cache.
>
> My problem with listen is that when I start a new connection I run a
> listen command for each of my tables; that loop takes around 10
> seconds to complete, and I have around 150 tables.

Do you regularly vacuum the pg_listener table?

> The strange thing is that just about any query runs in 1-4ms, but the
> listen command, which shouldn't even have to touch the disk, takes
> around 60ms.

Actually, it does write to disk, because the listen information is kept
in a table.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
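The table Alvaro refers to is the pg_catalog.pg_listener system catalog
(on servers of this era): roughly speaking, each LISTEN inserts a row
there, UNLISTEN deletes it, and NOTIFY updates the rows of the listening
backends, which is where the disk traffic comes from. A quick way to see
it from psql (the channel name "foo" is just an example):

LISTEN foo;

-- Every active registration shows up as a row in the catalog:
SELECT relname, listenerpid, notification FROM pg_listener;

UNLISTEN foo;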
On Fri, Jul 28, 2006 at 10:09:00PM -0400, Alvaro Herrera wrote:
> Flemming Frandsen wrote:
> > The strange thing is that just about any query runs in 1-4ms, but the
> > listen command, which shouldn't even have to touch the disk, takes
> > around 60ms.
>
> Actually, it does write to disk, because the listen information is kept
> in a table.

Also, based on a 60ms-per-listen time I suspect you're not doing the
listens in a transaction, so each listen is its own transaction that has
to be committed, resulting in a disk hit.  Try doing them all in one
transaction.

-- 
Michael Fuhr
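The difference is easy to picture at the SQL level: with autocommit,
every LISTEN is its own transaction and pays its own commit (and
therefore a disk flush), whereas wrapping the whole loop in one
transaction pays that cost once. A minimal sketch (channel names are
made up):

-- Autocommit: one commit per LISTEN, ~150 separate disk hits.
LISTEN customers;
LISTEN orders;
-- ...

-- One explicit transaction: a single commit at the end.
BEGIN;
LISTEN customers;
LISTEN orders;
-- ... the rest of the tables
COMMIT;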
Michael Fuhr wrote:
> Also, based on a 60ms-per-listen time I suspect you're not doing
> the listens in a transaction, so each listen is its own transaction
> that has to be committed, resulting in a disk hit.  Try doing them
> all in one transaction.

I think I am doing the listens in a transaction, as I connect via DBI
with AutoCommit=>0, unless there is some bug that causes the listens to
not start a new transaction when using DBI.

I also see a problem with the first query I run in a transaction: it
takes a very long time, even if it's simply a "select 6*7".  I'm
guessing this is because a new transaction is started; is there any way
to improve performance of that?

Alvaro Herrera wrote:
> Do you regularly vacuum the pg_listener table?

No, but this is on a system that has been running for a couple of days.

-- 
Regards Flemming Frandsen - http://dion.swamp.dk - YAPH
On Sat, Jul 29, 2006 at 12:44:14PM +0200, Flemming Frandsen wrote:
> Michael Fuhr wrote:
> > Also, based on a 60ms-per-listen time I suspect you're not doing
> > the listens in a transaction, so each listen is its own transaction
> > that has to be committed, resulting in a disk hit.  Try doing them
> > all in one transaction.
>
> I think I am doing the listens in a transaction, as I connect via DBI
> with AutoCommit=>0, unless there is some bug that causes the listens
> to not start a new transaction when using DBI.

Disabling autocommit should put all the listens in a single transaction
unless you're committing each one, but it sounds like you aren't doing
that.  We can probably reject my hypothesis.

> I also see a problem with the first query I run in a transaction: it
> takes a very long time, even if it's simply a "select 6*7".  I'm
> guessing this is because a new transaction is started; is there any
> way to improve performance of that?

How long is "a very long time"?  Does the first query's time include
the time to connect or do you start timing after the connection has
been made?

> Alvaro Herrera wrote:
> > Do you regularly vacuum the pg_listener table?
>
> No, but this is on a system that has been running for a couple of days.

How often are you doing the listens?  I just did a test in which I
connected to a database 100 times and issued 150 listens in each
connection.  By the 100th connection the time to execute the listens
had increased by an order of magnitude due to bloat in pg_listener.
Vacuuming pg_listener brought the times down again.

What's the output of "VACUUM VERBOSE pg_listener"?  If you vacuum
pg_listener do the listens run faster?

-- 
Michael Fuhr
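For anyone who wants to check for this kind of bloat on their own
system, a rough indicator is the planner's page and tuple estimates for
pg_listener (they are only refreshed by VACUUM/ANALYZE, so treat them as
approximate); a page count far larger than what the number of live rows
needs suggests dead-row bloat. A sketch:

-- Approximate size estimates kept in the catalogs:
SELECT relpages, reltuples
FROM pg_class
WHERE relname = 'pg_listener';

-- The authoritative picture (may require superuser privileges, since
-- pg_listener is a system catalog):
VACUUM VERBOSE pg_listener;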
Michael Fuhr wrote:
> How long is "a very long time"?

More than 2ms, typically 50-150ms.  The other queries (that actually
fetch data from tables and do real work) execute in 1-4ms.

The funny thing is that if I run the select 6*7 after ping (which does a
select 'somestring') then the select 6*7 runs in normal time and the
ping is slow, so it seems to me that it's the first query that's being
hit.  Maybe it's the overhead of starting a new transaction?

> Does the first query's time include the time to connect or do you
> start timing after the connection has been made?

This is on a used connection that I just pulled out of my connection
pool, so it's simply the first query in the new transaction.

> How often are you doing the listens?

Every time an apache process connects to the DB.

> I just did a test in which I connected to a database 100 times and
> issued 150 listens in each connection.  By the 100th connection the
> time to execute the listens had increased by an order of magnitude
> due to bloat in pg_listener.  Vacuuming pg_listener brought the times
> down again.

Thank you *so* much for confirming that I'm not out of my mind.  I just
did the vacuum and it did indeed fix the problem; listens are now done
in less than 1ms.

I just looked at the pg_listener table:

zepong=> \d+ pg_listener
         Table "pg_catalog.pg_listener"
    Column    |  Type   | Modifiers | Description
--------------+---------+-----------+-------------
 relname      | name    | not null  |
 listenerpid  | integer | not null  |
 notification | integer | not null  |
Has OIDs: no

... and noticed the complete lack of indexen, surely this must be a bug?

When trying to create the index I get told off by pg:

create unique index pg_listeners on pg_listener (relname, listenerpid);
ERROR:  permission denied: "pg_listener" is a system catalog

Any ideas, other than run VACUUM pg_listener every 10 minutes?

> What's the output of "VACUUM VERBOSE pg_listener"?  If you vacuum
> pg_listener do the listens run faster?

zepong=> VACUUM VERBOSE pg_listener;
INFO:  vacuuming "pg_catalog.pg_listener"
INFO:  "pg_listener": removed 243467 row versions in 3290 pages
DETAIL:  CPU 0.24s/0.38u sec elapsed 8.61 sec.
INFO:  "pg_listener": found 243467 removable, 113 nonremovable row versions in 3290 pages
DETAIL:  5 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.27s/0.40u sec elapsed 8.65 sec.
VACUUM

zepong=> VACUUM VERBOSE pg_listener;
INFO:  vacuuming "pg_catalog.pg_listener"
INFO:  "pg_listener": removed 1207 row versions in 17 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_listener": found 1207 removable, 108 nonremovable row versions in 3290 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 242413 unused item pointers.
0 pages are entirely empty.
CPU 0.02s/0.00u sec elapsed 0.03 sec.
VACUUM

-- 
Regards Flemming Frandsen - http://dion.swamp.dk - YAPH
Michael Fuhr wrote:
> What's the output of "VACUUM VERBOSE pg_listener"?  If you vacuum
> pg_listener do the listens run faster?

Actually, now that you mentioned vacuuming, I tried doing a VACUUM and
after it was done request time was down to around a tenth of what it was
before.  In other words, requests that took 500ms before are now down to
around 25ms.

I will now hang my head in shame, please forgive me.

-- 
Regards Flemming Frandsen - http://dion.swamp.dk - YAPH
Flemming Frandsen <ff@partyticket.net> writes:
> I just looked at the pg_listener table:
> ... and noticed the complete lack of indexen, surely this must be a bug?

No, that was intentional.  It's been a long time but I think the
argument was that the cost of updating the indexes would outweigh their
usefulness.  The listen/notify code is really not designed for a great
deal of "flap" in the set of active listeners :-(

I'm not particularly concerned about trying to add indexes --- would
much rather get rid of the table approach entirely.  There have been
prior discussions of this but nothing's been done.

			regards, tom lane
A long time ago, in a galaxy far, far away, ff@partyticket.net (Flemming Frandsen) wrote:
> I just looked at the pg_listener table:
>
> zepong=> \d+ pg_listener
>          Table "pg_catalog.pg_listener"
>     Column    |  Type   | Modifiers | Description
> --------------+---------+-----------+-------------
>  relname      | name    | not null  |
>  listenerpid  | integer | not null  |
>  notification | integer | not null  |
> Has OIDs: no
>
> ... and noticed the complete lack of indexen, surely this must be a bug?

There has been intent to change pg_listener to be an in-memory structure
rather than a table; that would have two benefits:

a) No more disk access...

b) Listen values all forcibly become obsolete any time the postmaster
   restarts; an in-memory structure would properly disappear at those
   times rather than persisting.

There's a demerit:

c) If there are a LOT of events, that might not fit in memory nicely.

> When trying to create the index I get told off by pg:
>
> create unique index pg_listeners on pg_listener (relname, listenerpid);
> ERROR:  permission denied: "pg_listener" is a system catalog

No point in adding an index if this is to be replaced.  That being said,
if it's not going away just yet, it might be a good addition...

You can't add the index; altering system tables isn't permissible...

> Any ideas, other than run VACUUM pg_listener every 10 minutes?

Remarkably enough, that is exactly the solution that Slony-I, which
makes use of LISTEN/NOTIFY, uses.  The Slony-I cleanup thread vacuums
pg_listener (and some other tables) every 10 minutes.

-- 
select 'cbbrowne' || '@' || 'gmail.com';
http://linuxfinances.info/info/emacs.html
Instead of talking to your plants, if you yelled at them would they
still grow, only to be troubled and insecure?
The world rejoiced as tgl@sss.pgh.pa.us (Tom Lane) wrote:
> Flemming Frandsen <ff@partyticket.net> writes:
>> I just looked at the pg_listener table:
>> ... and noticed the complete lack of indexen, surely this must be a bug?
>
> No, that was intentional.  It's been a long time but I think the
> argument was that the cost of updating the indexes would outweigh
> their usefulness.  The listen/notify code is really not designed
> for a great deal of "flap" in the set of active listeners :-(
>
> I'm not particularly concerned about trying to add indexes --- would
> much rather get rid of the table approach entirely.  There have been
> prior discussions of this but nothing's been done.

It would be a pretty easy change to add the obvious index; that could
*somewhat* help things for people who use events a lot in the interim.

Consider it stipulated that if lots of NOTIFY requests go through, that
kills a lot of tuples both in the table and in the index...

-- 
(reverse (concatenate 'string "moc.liamg" "@" "enworbbc"))
http://linuxdatabases.info/info/languages.html
Signs of a Klingon Programmer - 9. "I have challenged the entire
ISO-9000 quality assurance team to a Bat-Leth contest on the holodeck.
They will not concern us again."
Christopher Browne wrote:
> There's a demerit:
> c) If there are a LOT of events, that might not fit in memory nicely.

If you have that many events then the current implementation is going to
suck hard as well :)

Keeping the listener structure in memory wouldn't be too bad.  I'd guess
that each connection that listens only needs a single transaction
pointer (a 64-bit int or some such?) to point to the transaction that it
has caught up with.

Each transaction will need some way of recording what events were seen,
like:

* a bitmap of the events seen during that transaction (to be useful
  you'd need a global string->bit table; this needs some way of
  sharing/locking that global data structure though).
* an ordered list of events seen, to allow a simple binary search.
* a more complicated data structure like a JudySL array.

If you really want to save space then a lump of the per-transaction
bitmaps can be merged if there are no listeners that point to those
transactions.

> > Any ideas, other than run VACUUM pg_listener every 10 minutes?
>
> Remarkably enough, that is exactly the solution that Slony-I, which
> makes use of LISTEN/NOTIFY, uses.  The Slony-I cleanup thread vacuums
> pg_listener (and some other tables) every 10 minutes.

Damnit, I thought I was kidding :)

-- 
Regards Flemming Frandsen - http://dion.swamp.dk - YAPH
Hi,

I've stumbled over the following magic:
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
              Table "public.users"
  Column  |     Type     |                      Modifiers
----------+--------------+----------------------------------------------------
 id       | integer      | not null default nextval('users_id_seq'::regclass)
 username | text         | not null
 firma    | integer      |
 email    | text         |
 state    | character(1) |
Indexes:
    "users_pkey" PRIMARY KEY, btree (id)
    "users_username_key" UNIQUE, btree (username)
Foreign-key constraints:
    "users_firma_fkey" FOREIGN KEY (firma) REFERENCES firmy(id)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
              Table "public.ludzie"
  Column  |     Type     |                      Modifiers
----------+--------------+-----------------------------------------------------
 id       | integer      | not null default nextval('ludzie_id_seq'::regclass)
 username | text         | not null
 firma    | integer      |
 email    | text         |
 state    | character(1) |
Indexes:
    "ludzie_pkey" PRIMARY KEY, btree (id)
    "username_id_key" UNIQUE, btree (username)
Foreign-key constraints:
    "$1" FOREIGN KEY (firma) REFERENCES firmy(id)
++++++++++++++++++++++++++++++++++++++++++++++++++++++
INSERT INTO users SELECT * from ludzie;
INSERT 0 14
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
strop=# EXPLAIN ANALYZE SELECT * from users where username = current_user;
                                                         QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Index Scan using users_username_key on users  (cost=0.00..5.83 rows=1 width=80) (actual time=0.061..0.061 rows=0 loops=1)
   Index Cond: (username = ("current_user"())::text)
 Total runtime: 0.193 ms
(3 rows)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
strop=# EXPLAIN ANALYZE SELECT * from ludzie where username = current_user;
                                           QUERY PLAN
-------------------------------------------------------------------------------------------------
 Seq Scan on ludzie  (cost=0.00..1.19 rows=1 width=80) (actual time=0.096..0.096 rows=0 loops=1)
   Filter: (username = ("current_user"())::text)
 Total runtime: 0.185 ms
(3 rows)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

As far as I can tell, both TABLE users and TABLE ludzie are identical:
1. they have the same structure
2. they have the same contents (users was empty before the INSERT)

Still, the query plans on them differ, and quite significantly - one
uses an index, while the other doesn't.

Can someone shed some light on why that is so?

Note: I created "TABLE users" after noticing that "TABLE ludzie" is
seq-scanned over username.  But I'm not able to reproduce the seq-scan
on the freshly created table, which is *identical* to "TABLE ludzie".
On the other hand, I have REINDEXed the database, to no avail.
ludzie(username) is still seq-scanned.

Any ideas?
-- 
Rafal Pietrak <rafal@poczta.homelinux.com>
Rafal Pietrak wrote:
> Hi,
>
> I've stumbled over the following magic:
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>               Table "public.users"
>   Column  |     Type     |                      Modifiers
> ----------+--------------+----------------------------------------------------
>  id       | integer      | not null default nextval('users_id_seq'::regclass)
>  username | text         | not null
>  firma    | integer      |
>  email    | text         |
>  state    | character(1) |
> Indexes:
>     "users_pkey" PRIMARY KEY, btree (id)
>     "users_username_key" UNIQUE, btree (username)
> Foreign-key constraints:
>     "users_firma_fkey" FOREIGN KEY (firma) REFERENCES firmy(id)
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>               Table "public.ludzie"
>   Column  |     Type     |                      Modifiers
> ----------+--------------+-----------------------------------------------------
>  id       | integer      | not null default nextval('ludzie_id_seq'::regclass)
>  username | text         | not null
>  firma    | integer      |
>  email    | text         |
>  state    | character(1) |
> Indexes:
>     "ludzie_pkey" PRIMARY KEY, btree (id)
>     "username_id_key" UNIQUE, btree (username)
> Foreign-key constraints:
>     "$1" FOREIGN KEY (firma) REFERENCES firmy(id)
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++
> INSERT INTO users SELECT * from ludzie;
> INSERT 0 14
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> strop=# EXPLAIN ANALYZE SELECT * from users where username = current_user;
>                                                          QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------------------
>  Index Scan using users_username_key on users  (cost=0.00..5.83 rows=1 width=80) (actual time=0.061..0.061 rows=0 loops=1)
>    Index Cond: (username = ("current_user"())::text)
>  Total runtime: 0.193 ms
> (3 rows)
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> strop=# EXPLAIN ANALYZE SELECT * from ludzie where username = current_user;
>                                            QUERY PLAN
> -------------------------------------------------------------------------------------------------
>  Seq Scan on ludzie  (cost=0.00..1.19 rows=1 width=80) (actual time=0.096..0.096 rows=0 loops=1)
>    Filter: (username = ("current_user"())::text)
>  Total runtime: 0.185 ms
> (3 rows)
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> As far as I can tell, both TABLE users and TABLE ludzie are identical:
> 1. they have the same structure
> 2. they have the same contents (users was empty before the INSERT)
>
> Still, the query plans on them differ, and quite significantly - one
> uses an index, while the other doesn't.
>
> Can someone shed some light on why that is so?

Firstly, always start a new thread instead of replying to an existing
one; it is a lot easier to follow and not so easily missed.

After your insert you need to do an 'analyze users' to get postgresql
statistics up to date.

Next - why does it need to use an index?  In these examples your queries
are finishing in less than 2ms, so they are very quick.  Postgresql
doesn't always choose to use an index because it doesn't need to.  In
this case there are only 14 rows in the table, so it could be quicker
for the db to look at each row rather than using an index.

Do you have an example where it's taking a lot longer?  Post those
results and you might get more of a response.

-- 
Postgresql & php tutorials
http://www.designmagick.com/
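To make the suggestion concrete, this is roughly what it looks like (a
sketch; the exact plans and costs will of course differ on your system):

-- Refresh planner statistics for the freshly loaded table...
ANALYZE users;

-- ...then re-run the queries; with up-to-date statistics both tables
-- should be planned the same way.
EXPLAIN ANALYZE SELECT * FROM users  WHERE username = current_user;
EXPLAIN ANALYZE SELECT * FROM ludzie WHERE username = current_user;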
On Mon, 2006-07-31 at 12:55 +1000, Chris wrote:
> Rafal Pietrak wrote:
> > strop=# EXPLAIN ANALYZE SELECT * from users where username = current_user;
> >                                                          QUERY PLAN
> > ---------------------------------------------------------------------------------------------------------------------------
> >  Index Scan using users_username_key on users  (cost=0.00..5.83 rows=1 width=80) (actual time=0.061..0.061 rows=0 loops=1)
> >    Index Cond: (username = ("current_user"())::text)
> >  Total runtime: 0.193 ms
> > (3 rows)
> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > strop=# EXPLAIN ANALYZE SELECT * from ludzie where username = current_user;
> >                                            QUERY PLAN
> > -------------------------------------------------------------------------------------------------
> >  Seq Scan on ludzie  (cost=0.00..1.19 rows=1 width=80) (actual time=0.096..0.096 rows=0 loops=1)
> >    Filter: (username = ("current_user"())::text)
> >  Total runtime: 0.185 ms
> > (3 rows)
>
> Firstly, always start a new thread instead of replying to an existing
> one; it is a lot easier to follow and not so easily missed.

Very sorry for that; 20-year-old habits are really deep in your bones.
I thought I had adjusted the subject accordingly, but I'll try to
remember that for my future posts anyway.

> After your insert you need to do an 'analyze users' to get postgresql
> statistics up to date.

OK.  That did it.  They are identical now.

> Next - why does it need to use an index?

To explain: with any 'programming exercise' I do, I 'start small' and
try to see program behavior on a small scale (both datasets and number
of involved modules) before I roll out any larger setup for testing.

In this case, the DB will be used with 'TABLE ludzie' populated with
close to a million entries, so when I noticed the seq-scan I became
worried.  But this is a DBMS, as I can see now, and I should have
trusted it a little more.  After your suggestion I generated some more
rows, and the results are OK:

+++++++++++++++++++++++++++++++++++++++++++
test_db# \i file-with-10-rows.sql
test_db# EXPLAIN ANALYZE SELECT * from users where username = current_user;
                                                        QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
 Index Scan using username_id_key on ludzie  (cost=0.00..5.99 rows=1 width=80) (actual time=0.194..0.194 rows=0 loops=1)
   Index Cond: (username = ("current_user"())::text)
 Total runtime: 0.344 ms
(3 rows)
++++++++++++++++++++++++++++++++++++++++++
test_db# \i file-with-10000-rows.sql
test_db# EXPLAIN ANALYZE SELECT * from users where username = current_user;
                                           QUERY PLAN
------------------------------------------------------------------------------------------------
 Seq Scan on users  (cost=0.00..1.25 rows=1 width=60) (actual time=0.177..0.177 rows=0 loops=1)
   Filter: (username = ("current_user"())::text)
 Total runtime: 0.288 ms
(3 rows)
+++++++++++++++++++++++++++++++++++++++++

So I don't have to worry about the omission of index use in this case.

My real worry was the discrepancy of 'TABLE users' v/s 'TABLE ludzie'
results - this smelled like an uncontrollable, unpredictable result.
But obviously, not being too proficient with DBMSes, I didn't realise
the query plan is built from transient estimates of access cost.  I had
never before felt the necessity to ANALYZE a table, relying only on the
self-estimates the DBMS gathers along with the use of the system.
Obviously that's totally wrong for pre-production system evaluation,
where datasets are cooked up and swapped faster than any DB
self-estimates have a chance to get collected.

BTW: I'd really love to be able to do:
"CREATE TABLE users (id oid references pg_catalog.pg_authid(oid), info text);"
- why is that not possible?

Or, by a similar token, I long for the ability of a database (meaning a
DB dataset/application, as opposed to the DBMS) to be able to extend
"SET <parameter> = <value>" with application-specific parameters/values,
so that a DB application could have its private parameters (like user
phone/address) in one place, and avoid kludges like the "TABLE users".

Thanks for the help.

-- 
-R
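One mechanism that comes close to the "application-specific SET
parameter" wish is the custom GUC facility; on servers of this era it
requires declaring a variable class in postgresql.conf (later releases
relaxed this requirement). A sketch, with a made-up class name "myapp":

-- postgresql.conf (older releases): custom_variable_classes = 'myapp'

-- Store a per-session, application-specific setting...
SET myapp.phone = '+48 123 456 789';

-- ...and read it back later in the same session:
SELECT current_setting('myapp.phone');

-- set_config() does the same and can be used inside queries; the third
-- argument makes the value local to the current transaction if true.
SELECT set_config('myapp.email', 'someone@example.org', false);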
On Jul 31, 2006, at 15:22, Rafal Pietrak wrote:

> Very sorry for that; 20-year-old habits are really deep in your bones.
> I thought I had adjusted the subject accordingly, but I'll try to
> remember that for my future posts anyway.

Just a reminder: just changing the subject is not sufficient to start a
new thread.  Please start with a completely new message.  Some mail
clients use the In-Reply-To: header for threading, and this does not get
changed just by changing the subject.

Thanks!

Michael Glaesemann
grzm seespotcode net
> To explain: with any 'programming exercise' I do, I 'start small' and
> try to see program behavior on a small scale (both datasets and number
> of involved modules) before I roll out any larger setup for testing.
>
> In this case, the DB will be used with 'TABLE ludzie' populated with
> close to a million entries, so when I noticed the seq-scan I became
> worried.

That's different to what you posted.  5 rows in a table won't make a
database use an index, so there's no point testing that if you're
expecting millions of rows.

> My real worry was the discrepancy of 'TABLE users' v/s 'TABLE ludzie'
> results - this smelled like an uncontrollable, unpredictable result.
> But obviously, not being too proficient with DBMSes, I didn't realise
> the query plan is built from transient estimates of access cost.  I
> had never before felt the necessity to ANALYZE a table, relying only
> on the self-estimates the DBMS gathers along with the use of the
> system.  Obviously that's totally wrong for pre-production system
> evaluation, where datasets are cooked up and swapped faster than any
> DB self-estimates have a chance to get collected.

Postgres has 'autovacuum', which keeps table stats up to date, in most
cases enough for you not to worry about this sort of thing.

http://www.postgresql.org/docs/8.1/static/maintenance.html#AUTOVACUUM

It won't fit every single case, but give it a go.

-- 
Postgresql & php tutorials
http://www.designmagick.com/
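For anyone on an 8.1-era server wanting to try this, a quick way to
check whether autovacuum can actually run (these parameter names are the
8.1-era ones; in that release autovacuum also depends on row-level
statistics collection being enabled):

SHOW autovacuum;            -- must be on for the daemon to run
SHOW stats_start_collector; -- must be on
SHOW stats_row_level;       -- must be on for autovacuum to have data

-- If any of these are off, they are set in postgresql.conf (a reload or
-- restart is needed, depending on the parameter); they are not meant to
-- be changed per session.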
Flemming Frandsen wrote:
> Christopher Browne wrote:
> > There's a demerit:
> > c) If there are a LOT of events, that might not fit in memory nicely.
>
> If you have that many events then the current implementation is going
> to suck hard as well :)

The difference is that the current implementation *works* regardless of
the number of active listeners.  With the in-memory idea, you might have
to drop some listeners in order to make them fit in memory, which makes
that a non-starter unless you find a solution to shave some stuff to
disk.  That's where the efficiency argument kicks in.

This is a bit worse than it sounds, because the memory we are talking
about is shared memory, which cannot be grown after the server has
started (unlike the kind of memory served by malloc()).

Also, some people want the ability to stash messages with each NOTIFY.
This makes the whole idea a lot more complicated.

What this means is that nobody has tried *really hard* to make it work
(really hard meaning enough so that it actually works).  Neil Conway had
some nice ideas, but I don't think they were ever fully realized.

If you want to contribute, you're more than welcome.  You're far from
alone in wanting this thing "fixed."

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support