Thread: Query performance on session table
Hello,

I am having performance issues with a table that holds session_data.
This table is heavily updated and daily vacuumed. Please see the
information below. I was not able to pinpoint the issue, so any help
would be appreciated.

Burak

#######################
### Table Structure ###
#######################

CREATE TABLE session (
    session_id char(40) NOT NULL PRIMARY KEY,
    session_expires integer NOT NULL,
    session_data text
);

##################
### Sample Row ###
##################

1987b8db3ab36c18d0da7f9c2915194092f7bdf7 | 1183066733 | hkmTZblHHF+cY8g
6Dx/K0ioEc98QdmOKST ocRd P2gUfsTsMeMBV wGiGbhTJ0CuimDVpv hH8TdWjqMc3rJW7dHJ
wjdsrNaqXUpEG+9HvnbKgngG9cqa p2IxjeTD7k8G/5ZIDrvk3+DSoFu2FB47qvacNmH+hzM U
d1Fn8oKERa1qc+AcuLxLQKdwQUV H8ZE7WXNG etkGq/LSlgIOpTyb44oy5C5evlWSiT1
A2iwCIT8kxrCn5+Avrrg/zLQ muZkBqXd5 vvPcL5 mXEhnu4b96Zy/YpyEnUcV
8coPBw1p0s1i5lwjWyMHYo7H 64HfON8prMizrEoTNyTeMt7jbxo3v I0Ds+xP9QxY8hpv+4Hc+GN
........ encrypted data continues....

######################################
### EXPLAIN ANALYZE before VACUUM ###
######################################

# explain analyze select * from session;
                                                       QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Seq Scan on session  (cost=0.00..373745.36 rows=736 width=269) (actual time=32824.964..75997.206 rows=710 loops=1)
 Total runtime: 75998.003 ms
(2 rows)

#####################################
### EXPLAIN ANALYZE after VACUUM ###
#####################################

# explain analyze select * from session;
                                                       QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Seq Scan on session  (cost=0.00..373753.47 rows=747 width=282) (actual time=1357.697..60344.110 rows=731 loops=1)
 Total runtime: 60344.971 ms
(2 rows)

60 seconds for 747 rows!

####################
### Other tables ###
####################

Other tables in the database are performing just fine.
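[For context, a hypothetical sketch of the kind of heavy-update traffic described above; the application's actual statements are not shown in the thread. Under PostgreSQL's MVCC, every UPDATE leaves a dead row version behind until VACUUM reclaims it, which is how a table with ~700 live rows can come to occupy hundreds of thousands of pages if vacuuming can't keep up.]

```sql
-- Hypothetical session-refresh statement, modeled on the table
-- structure and sample row above (values are illustrative only).
-- Each execution creates one new row version and one dead one:
UPDATE session
   SET session_data = '...new encrypted state...',
       session_expires = 1183066733
 WHERE session_id = '1987b8db3ab36c18d0da7f9c2915194092f7bdf7';
```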
Please see the below query plan for reference.

# explain analyze select * from user;
                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Seq Scan on user  (cost=0.00..2650.49 rows=25849 width=254) (actual time=22.087..3946.991 rows=25866 loops=1)
 Total runtime: 3971.146 ms
(2 rows)

~4 secs for 25849 rows.

##################
### More tests ###
##################

# explain analyze select * from session order by session_expires desc limit 1;
                                                            QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=373789.12..373789.12 rows=1 width=282) (actual time=49065.062..49065.064 rows=1 loops=1)
   ->  Sort  (cost=373789.12..373790.99 rows=747 width=282) (actual time=49065.058..49065.058 rows=1 loops=1)
         Sort Key: session_expires
         ->  Seq Scan on session  (cost=0.00..373753.47 rows=747 width=282) (actual time=5514.580..49063.397 rows=729 loops=1)
 Total runtime: 49065.152 ms
(5 rows)

# explain analyze select * from session where session_id='1987b8db3ab36c18d0da7f9c2915194092f7bdf7';
                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Index Scan using session_pkey on ce_session  (cost=0.00..12.05 rows=1 width=282) (actual time=28.113..28.115 rows=1 loops=1)
   Index Cond: (session_id = '1987b8db3ab36c18d0da7f9c2915194092f7bdf7'::bpchar)
 Total runtime: 28.144 ms
(3 rows)

#####################
### Configuration ###
#####################

RedHat EL3, PostgreSQL 7.4

# - Memory -
shared_buffers = 30000
sort_mem = 5120
vacuum_mem = 32768

# - Free Space Map -
max_fsm_pages = 20000
max_fsm_relations = 1000

# - Kernel Resource Usage -
max_files_per_process = 1000
preload_libraries = ''

# - Settings -
fsync = true
wal_sync_method = fsync
wal_buffers = 8

# - Checkpoints -
checkpoint_segments = 20
checkpoint_timeout = 300
checkpoint_warning = 30
commit_delay = 0
commit_siblings = 5

# - Planner Cost Constants -
effective_cache_size = 40000
random_page_cost = 4
cpu_tuple_cost = 0.01
cpu_index_tuple_cost = 0.001
cpu_operator_cost = 0.0025
On Jun 28, 2007, at 15:26 , Burak Seydioglu wrote:

> I am having performance issues with a table that holds session_data.
> This table is heavily updated and daily vacuumed.

If it's heavily updated, vacuuming once a day may not be enough. You
can elect to vacuum a single table.

What's the output of VACUUM VERBOSE session? It should tell you how
many dead tuples were removed.

You may also want to consider using autovacuum to help keep your
session table fairly clean.

Michael Glaesemann
grzm seespotcode net
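[A sketch of what Michael is suggesting, using the table name from the thread. Note that on 7.4, autovacuum is not a server setting but the separate contrib/pg_autovacuum daemon, started alongside the postmaster; its exact startup options are in its README.]

```sql
-- Vacuum only the hot table; VERBOSE reports how many dead row
-- versions were found and removed, and how many pages they occupied:
VACUUM VERBOSE session;
```

Running that statement from cron every few minutes is the manual alternative to running the pg_autovacuum daemon.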
Here are the results of VACUUM VERBOSE and a query executed right after
the vacuum completed. Still 30 sec for 700 rows. Do I need to do VACUUM
FULL?

Thanks,
Burak

# VACUUM VERBOSE session;
INFO:  vacuuming "public.session"
INFO:  index "session_pkey" now contains 854 row versions in 5255 pages
DETAIL:  9212 index row versions were removed.
5028 index pages have been deleted, 5004 are currently reusable.
CPU 0.09s/0.03u sec elapsed 4.65 sec.
INFO:  "session": removed 9212 row versions in 3086 pages
DETAIL:  CPU 0.08s/0.15u sec elapsed 1.47 sec.
INFO:  "session": found 9212 removable, 793 nonremovable row versions in 373746 pages
DETAIL:  34 dead row versions cannot be removed yet.
There were 2938041 unused item pointers.
0 pages are entirely empty.
CPU 2.28s/0.82u sec elapsed 68.88 sec.
INFO:  "session": truncated 373746 to 147910 pages
DETAIL:  CPU 2.65s/0.51u sec elapsed 37.76 sec.
INFO:  vacuuming "pg_toast.pg_toast_123221198"
INFO:  index "pg_toast_123221198_index" now contains 0 row versions in 1 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.02 sec.
INFO:  "pg_toast_123221198": found 0 removable, 0 nonremovable row versions in 0 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.02 sec.
VACUUM

citizenre=# EXPLAIN ANALYZE SELECT * FROM session;
                                                     QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Seq Scan on session  (cost=0.00..147917.93 rows=793 width=282) (actual time=27260.391..29186.760 rows=708 loops=1)
 Total runtime: 29187.556 ms
(2 rows)

On 6/28/07, Michael Glaesemann <grzm@seespotcode.net> wrote:
>
> On Jun 28, 2007, at 15:26 , Burak Seydioglu wrote:
>
> > I am having performance issues with a table that holds session_data.
> > This table is heavily updated and daily vacuumed.
>
> If it's heavily updated, vacuuming once a day may not be enough. You
> can elect to vacuum a single table.
>
> What's the output of VACUUM VERBOSE session? It should tell you how
> many dead tuples were removed.
>
> You may also want to consider using autovacuum to help keep your
> session table fairly clean.
>
> Michael Glaesemann
> grzm seespotcode net
[Please don't top post as it makes the discussion more difficult to
follow.]

On Jun 28, 2007, at 18:05 , Burak Seydioglu wrote:

> Here are the results of VACUUM VERBOSE and a query executed right after
> the vacuum completed. Still 30 sec for 700 rows.

That does seem like a long time. One thing to note is that SELECT *
FROM session is most likely *not* what your application is doing: it's
looking up one session at a time. The times you're probably interested
in are of the sort

SELECT *
FROM session
WHERE session_id = ?;

> Do I need to do VACUUM FULL?

It might help, but AIUI, all VACUUM FULL does compared to VACUUM is
reclaim space more aggressively. Perhaps VACUUM FULL will mean that the
sequential scan will have fewer pages to visit. (A sequential scan is
used because there's no advantage to using an index, as SELECT * FROM
session needs to return every row anyway. However, SELECT * FROM
session WHERE session_id = ? will most likely use an index, as it's
just returning a single row.) Others on the list probably have more
experience diagnosing this than I.

> # VACUUM VERBOSE session;
> INFO:  vacuuming "public.session"
> INFO:  index "session_pkey" now contains 854 row versions in 5255 pages
> DETAIL:  9212 index row versions were removed.
> 5028 index pages have been deleted, 5004 are currently reusable.

Looks like the index had quite a few dead row versions.

> INFO:  "session": removed 9212 row versions in 3086 pages
> DETAIL:  CPU 0.08s/0.15u sec elapsed 1.47 sec.
> INFO:  "session": found 9212 removable, 793 nonremovable row versions
> in 373746 pages
> DETAIL:  34 dead row versions cannot be removed yet.
> There were 2938041 unused item pointers.
> 0 pages are entirely empty.

This relatively huge number of pages (373,746) makes me think that the
rows that are still valid are spread out across a large number of
pages. Perhaps VACUUM FULL will condense them. I don't think you will
need to (or even should) VACUUM FULL frequently. If you're VACUUMing
the table frequently enough (such as by using autovacuum with settings
appropriate for your situation), you shouldn't need to VACUUM FULL much
at all.

> INFO:  "session": truncated 373746 to 147910 pages

And it looks like VACUUM was able to significantly reduce the number of
pages used for session (by over 60%, if I'm reading this correctly).

> citizenre=# EXPLAIN ANALYZE SELECT * FROM session;
>  Seq Scan on session  (cost=0.00..147917.93 rows=793 width=282)
> (actual time=27260.391..29186.760 rows=708 loops=1)
>  Total runtime: 29187.556 ms
> (2 rows)

You might want to throw in an ANALYZE session as well, as the
statistics are slightly off (793 estimated rows, but only 708 actual
rows), but I don't think this is a significant part of your performance
issue.

So, my (uneducated) recommendation is to try a VACUUM FULL VERBOSE
session, and to VACUUM session more frequently than once a day,
probably with autovacuum. Also, you should be looking at your app to
see what queries are actually being run against session and how they
perform, rather than using SELECT * FROM session as a benchmark (unless
SELECT * FROM session *is* a critical query for your app).

I'm very interested to hear from others if I've misread this, as I'm
pretty inexperienced at doing so. Comments?

Hope this helps.

Michael Glaesemann
grzm seespotcode net
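[Michael's recommendation, sketched as statements. One caveat worth noting: VACUUM FULL takes an exclusive lock on the table while it runs, so it should be done in a quiet window.]

```sql
-- One-off compaction of the bloated table, with progress output:
VACUUM FULL VERBOSE session;

-- Refresh planner statistics afterwards:
ANALYZE session;
```

For the routine vacuums, something like a cron entry of the form `*/10 * * * * psql -d citizenre -c 'VACUUM session;'` (the 10-minute interval is purely illustrative) would keep dead rows from piling up between the daily runs, if autovacuum isn't an option.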
> [Please don't top post as it makes the discussion more difficult to
> follow.]

.... and I thought I was an experienced web user... My apologies.

> The times you're probably
> interested in are of the sort
>
> SELECT *
> FROM session
> WHERE session_id = ?;

This query is really fast, as you can see in my original post.

> Also, you should be looking at your app to see what
> queries are actually being run against session and how they perform
> rather than using SELECT * FROM session as a benchmark (unless SELECT
> * FROM session *is* a critical query for your app)

The garbage collector (DELETE FROM session WHERE session_expires <
timestamp) has to do the same sequential scan.

# EXPLAIN ANALYZE DELETE FROM session WHERE session_expires<0;
                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Seq Scan on session  (cost=0.00..147919.91 rows=1 width=6) (actual time=1473.284..1473.284 rows=0 loops=1)
   Filter: (session_expires < 0)
 Total runtime: 1473.315 ms
(3 rows)

On a side note, however, without me doing anything the results have
changed, even though the pg_stat_activity table shows the same amount
of activity.

# EXPLAIN ANALYZE SELECT * FROM session;
                                                      QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
 Seq Scan on session  (cost=0.00..147917.93 rows=793 width=282) (actual time=1519.959..1612.596 rows=672 loops=1)
 Total runtime: 1613.248 ms
(2 rows)

672 rows in 1.6 secs... I guess I am looking in the wrong place.

Thank you for your input.

Burak
On Jun 28, 2007, at 22:28 , Burak Seydioglu wrote:

> This query is really fast, as you can see in my original post.

Ah. I should have referred to your original email.

> The garbage collector (DELETE FROM session WHERE session_expires <
> timestamp) has to do the same sequential scan.

You may find benefit from putting an index on the session_expires
column, though of course, to maintain the index with a lot of updates,
you may see update performance degrade, because each update requires
the index to be updated as well. The only way to find out is to
benchmark your app and see what happens. With such a small number of
rows, the planner may determine that a sequential scan is still faster
than an index scan, so you'd end up with the same performance on DELETE
that you have now.

> Thank you for your input.

Well, we're both learning together :)

Michael Glaesemann
grzm seespotcode net
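[A sketch of the index Michael suggests; the index name is arbitrary. Note that 7.4 predates CREATE INDEX CONCURRENTLY, so the index build briefly blocks writes to the table.]

```sql
-- Index to support the expiry-based garbage collection:
CREATE INDEX session_expires_idx ON session (session_expires);

-- The cleanup can then use an index scan whenever the planner judges
-- it cheaper than scanning the whole (bloated) table. The cutoff value
-- here is taken from the sample row and is purely illustrative:
DELETE FROM session WHERE session_expires < 1183066733;
```

Whether the planner actually picks the index depends on the table's statistics, so it's worth re-running the EXPLAIN ANALYZE on the DELETE after creating it.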