Thread: Query performance on session table

Query performance on session table

From
"Burak Seydioglu"
Date:
Hello,

I am having performance issues with a table that holds session_data.
This table is heavily updated and daily vacuumed. Please see the
information below. I was not able to pinpoint the issue so any help
would be appreciated.

Burak


####################
### Table Structure ###
####################

CREATE TABLE session (
session_id char(40) NOT NULL PRIMARY KEY,
session_expires integer NOT NULL,
session_data text
);

##################
### Sample Row ###
##################

1987b8db3ab36c18d0da7f9c2915194092f7bdf7 |      1183066733 |
hkmTZblHHF+cY8g 6Dx/K0ioEc98QdmOKST ocRd P2gUfsTsMeMBV
wGiGbhTJ0CuimDVpv hH8TdWjqMc3rJW7dHJ wjdsrNaqXUpEG+9HvnbKgngG9cqa
p2IxjeTD7k8G/5ZIDrvk3+DSoFu2FB47qvacNmH+hzM U
d1Fn8oKERa1qc+AcuLxLQKdwQUV H8ZE7WXNG etkGq/LSlgIOpTyb44oy5C5evlWSiT1
A2iwCIT8kxrCn5+Avrrg/zLQ muZkBqXd5 vvPcL5 mXEhnu4b96Zy/YpyEnUcV
8coPBw1p0s1i5lwjWyMHYo7H 64HfON8prMizrEoTNyTeMt7jbxo3v
I0Ds+xP9QxY8hpv+4Hc+GN ........ encrypted data continues....

######################################
### EXPLAIN ANALYZE before VACUUM ###
######################################

# explain analyze select * from session;
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Seq Scan on session  (cost=0.00..373745.36 rows=736 width=269)
(actual time=32824.964..75997.206 rows=710 loops=1)
 Total runtime: 75998.003 ms
(2 rows)

#####################################
### EXPLAIN ANALYZE after VACUUM ###
#####################################

# explain analyze select * from session;
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Seq Scan on session  (cost=0.00..373753.47 rows=747 width=282)
(actual time=1357.697..60344.110 rows=731 loops=1)
 Total runtime: 60344.971 ms
(2 rows)

60 seconds for 747 rows!

#################
### Other tables ###
#################

Other tables in the database are performing just fine. Please see the
below query plan for reference.

# explain analyze select * from user;
                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Seq Scan on user  (cost=0.00..2650.49 rows=25849 width=254) (actual
time=22.087..3946.991 rows=25866 loops=1)
 Total runtime: 3971.146 ms
(2 rows)

~4 secs for 25849 rows.

################
### More tests ###
################

# explain analyze select * from session order by session_expires desc limit 1;
                                                            QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=373789.12..373789.12 rows=1 width=282) (actual
time=49065.062..49065.064 rows=1 loops=1)
   ->  Sort  (cost=373789.12..373790.99 rows=747 width=282) (actual
time=49065.058..49065.058 rows=1 loops=1)
         Sort Key: session_expires
         ->  Seq Scan on session  (cost=0.00..373753.47 rows=747
width=282) (actual time=5514.580..49063.397 rows=729 loops=1)
 Total runtime: 49065.152 ms
(5 rows)

# explain analyze select * from session where
session_id='1987b8db3ab36c18d0da7f9c2915194092f7bdf7';
                                                           QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------
 Index Scan using session_pkey on ce_session  (cost=0.00..12.05 rows=1
width=282) (actual time=28.113..28.115 rows=1 loops=1)
   Index Cond: (session_id = '1987b8db3ab36c18d0da7f9c2915194092f7bdf7'::bpchar)
 Total runtime: 28.144 ms
(3 rows)


##################
### Configuration ###
##################

RedHat EL3, PostgreSQL 7.4

# - Memory -

shared_buffers = 30000
sort_mem = 5120
vacuum_mem = 32768

# - Free Space Map -

max_fsm_pages = 20000
max_fsm_relations = 1000

# - Kernel Resource Usage -

max_files_per_process = 1000
preload_libraries = ''

# - Settings -

fsync = true
wal_sync_method = fsync
wal_buffers = 8

# - Checkpoints -

checkpoint_segments = 20
checkpoint_timeout = 300
checkpoint_warning = 30
commit_delay = 0
commit_siblings = 5

# - Planner Cost Constants -

effective_cache_size = 40000
random_page_cost = 4
cpu_tuple_cost = 0.01
cpu_index_tuple_cost = 0.001
cpu_operator_cost = 0.0025

Re: Query performance on session table

From
Michael Glaesemann
Date:
On Jun 28, 2007, at 15:26 , Burak Seydioglu wrote:

> I am having performance issues with a table that holds session_data.
> This table is heavily updated and daily vacuumed.

If it's heavily updated, vacuuming once a day may not be enough. You
can elect to vacuum a single table.

What's the output of VACUUM VERBOSE session ?It should tell you how
many dead tuples were removed.

You may also want to consider using autovacuum to help keep your
session table fairly clean.

Michael Glaesemann
grzm seespotcode net



Re: Query performance on session table

From
"Burak Seydioglu"
Date:
Here is the results of VACUUM VERBOSE and a query executed right after
vacuum is complete. Still 30 sec for 700 rows. Do I need to do VACUUM
FULL?

Thanks,

Burak


# VACUUM VERBOSE session;
INFO:  vacuuming "public.session"
INFO:  index "session_pkey" now contains 854 row versions in 5255 pages
DETAIL:  9212 index row versions were removed.
5028 index pages have been deleted, 5004 are currently reusable.
CPU 0.09s/0.03u sec elapsed 4.65 sec.
INFO:  "session": removed 9212 row versions in 3086 pages
DETAIL:  CPU 0.08s/0.15u sec elapsed 1.47 sec.
INFO:  "session": found 9212 removable, 793 nonremovable row versions
in 373746 pages
DETAIL:  34 dead row versions cannot be removed yet.
There were 2938041 unused item pointers.
0 pages are entirely empty.
CPU 2.28s/0.82u sec elapsed 68.88 sec.
INFO:  "session": truncated 373746 to 147910 pages
DETAIL:  CPU 2.65s/0.51u sec elapsed 37.76 sec.
INFO:  vacuuming "pg_toast.pg_toast_123221198"
INFO:  index "pg_toast_123221198_index" now contains 0 row versions in 1 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.02 sec.
INFO:  "pg_toast_123221198": found 0 removable, 0 nonremovable row
versions in 0 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.02 sec.
VACUUM

citizenre=# EXPLAIN ANALYZE SELECT * FROM session;
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Seq Scan on session  (cost=0.00..147917.93 rows=793 width=282)
(actual time=27260.391..29186.760 rows=708 loops=1)
 Total runtime: 29187.556 ms
(2 rows)




On 6/28/07, Michael Glaesemann <grzm@seespotcode.net> wrote:
>
> On Jun 28, 2007, at 15:26 , Burak Seydioglu wrote:
>
> > I am having performance issues with a table that holds session_data.
> > This table is heavily updated and daily vacuumed.
>
> If it's heavily updated, vacuuming once a day may not be enough. You
> can elect to vacuum a single table.
>
> What's the output of VACUUM VERBOSE session ?It should tell you how
> many dead tuples were removed.
>
> You may also want to consider using autovacuum to help keep your
> session table fairly clean.
>
> Michael Glaesemann
> grzm seespotcode net
>
>
>

Re: Query performance on session table

From
Michael Glaesemann
Date:
[Please don't top post as it makes the discussion more difficult to
follow.]

On Jun 28, 2007, at 18:05 , Burak Seydioglu wrote:

> Here is the results of VACUUM VERBOSE and a query executed right after
> vacuum is complete. Still 30 sec for 700 rows.

That does seem like a long time. One thing to note is SELECT * FROM
session is most likely *not* what your application is doing: it's
looking up one session at a time. The times you're probably
interested in are of the sort

SELECT *
FROM session
WHERE session_id = ?;

> Do I need to do VACUUM
> FULL?

It might help, but AIUI, all VACUUM FULL does compared to VACUUM is
reclaim space more aggressively. Perhaps VACUUM FULL will mean that
the sequential scan will have fewer pages to visit. (A sequential
scan is used because there's no advantage to using an index as SELECT
* FROM session needs to return every row anyway. However, SELECT *
FROM session WHERE session_id = ? will most likely use an index as
it's just returning a single row.) Others on the list probably have
more experience diagnosing this than I.

> # VACUUM VERBOSE session;
> INFO:  vacuuming "public.session"
> INFO:  index "session_pkey" now contains 854 row versions in 5255
> pages
> DETAIL:  9212 index row versions were removed.
> 5028 index pages have been deleted, 5004 are currently reusable.

Looks like the index had quite a few dead row versions.

> INFO:  "session": removed 9212 row versions in 3086 pages
> DETAIL:  CPU 0.08s/0.15u sec elapsed 1.47 sec.
> INFO:  "session": found 9212 removable, 793 nonremovable row versions
> in 373746 pages
> DETAIL:  34 dead row versions cannot be removed yet.
> There were 2938041 unused item pointers.
> 0 pages are entirely empty.

This relatively huge number of pages (373,746) makes me think that
the rows that are still valid are spread out across a large number of
pages. Perhaps VACUUM FULL will condense them. I don't think you will
need to (or even should) VACUUM FULL frequently. If you're VACUUMing
the table frequently enough (such as by using autovacuum with
settings appropriate for your situation), you shouldn't need to
VACUUM FULL much at all.

> INFO:  "session": truncated 373746 to 147910 pages

And it looks like VACUUM was able to significantly reduce the number
of pages used for session (by over 60%, if I'm reading this correctly).

> citizenre=# EXPLAIN ANALYZE SELECT * FROM session;
>                                                      QUERY PLAN
> ----------------------------------------------------------------------
> -------------------------------------------------
> Seq Scan on session  (cost=0.00..147917.93 rows=793 width=282)
> (actual time=27260.391..29186.760 rows=708 loops=1)
> Total runtime: 29187.556 ms
> (2 rows)

You might want to throw in an ANALYZE session as well, as the
statistics are slightly off (793 estimated rows, but only 708 actual
rows), but I don't think this is a significant part of your
performance issue.

So, my (uneducated) recommendation is try a VACUUM FULL VERBOSE
session, and VACUUM session more frequently than once a day, probably
with autovacuum. Also, you should be looking at your app to see what
queries are actually being run against session and how they perform
rather than using SELECT * FROM session as a benchmark (unless SELECT
* FROM session *is* a critical query for your app). I'm very
interested to hear from others if I've misread this, as I'm pretty
inexperienced at doing so. Comments?

Hope this helps.

Michael Glaesemann
grzm seespotcode net



Re: Query performance on session table

From
"Burak Seydioglu"
Date:
> [Please don't top post as it makes the discussion more difficult to
> follow.]

.... and I thought I was an experienced web user... My apologies.

>  The times you're probably
> interested in are of the sort
>
> SELECT *
> FROM session
> WHERE session_id = ?;
>

This query is really fast as you can see in my original post.

> Also, you should be looking at your app to see what
> queries are actually being run against session and how they perform
> rather than using SELECT * FROM session as a benchmark (unless SELECT
> * FROM session *is* a critical query for your app)


The garbage collector (DELETE FROM session WHERE session_expires <
timestamp) has to do the same sequential scan.

# EXPLAIN ANALYZE DELETE FROM session WHERE session_expires<0;
                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Seq Scan on session  (cost=0.00..147919.91 rows=1 width=6) (actual
time=1473.284..1473.284 rows=0 loops=1)
   Filter: (session_expires < 0)
 Total runtime: 1473.315 ms
(3 rows)

On a side note however, without me doing anything the results has
changed even though pg_stat_activity table shows the same amount of
activity.

# EXPLAIN ANALYZE SELECT * FROM session;
                                                     QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
 Seq Scan on session  (cost=0.00..147917.93 rows=793 width=282)
(actual time=1519.959..1612.596 rows=672 loops=1)
 Total runtime: 1613.248 ms
(2 rows)


672 rows in 1.6 secs... I guess I am looking at the wrong place.

Thank you for you input.

Burak

Re: Query performance on session table

From
Michael Glaesemann
Date:
On Jun 28, 2007, at 22:28 , Burak Seydioglu wrote:

> This query is really fast as you can see in my original post.

Ah. I should have referred to your original email.

> The garbage collector (DELETE FROM session WHERE session_expires <
> timestamp) has to do the same sequential scan.

You may find benefit from putting an index on the session_expires
column, though of course to maintain the index with a lot of updates,
you may see update performance degrade because the update requires
the index to be updated as well. The only way to find out is by
benchmarking your app and see what happens. With such a small number
of rows, the planner may determine that a sequential scan is still
faster than an index scan, so you'd end up with the same performance
on DELETE that you have now.

> Thank you for you input.

Well, we're both learning together :)

Michael Glaesemann
grzm seespotcode net