Thread: Optimizing select count query which often takes over 10 seconds
Hello,

for a PostgreSQL 8.4.13 database + pgbouncer on a 32 GB RAM machine
with CentOS 6.3 / 64 bit I use the following settings:

    max_connections = 100
    shared_buffers = 4096MB
    work_mem = 32MB
    checkpoint_segments = 32        # to shut up nightly pg_dump
    escape_string_warning = off     # to shut up Drupal 7.19 warnings
    log_min_duration_statement = 10000

And the latter setting always reports just one slow command (besides
Drupal queries, which I can't fix):

    LOG: duration: 12590.394 ms  statement:
        select count(id) from (
            select id,
                   row_number() over(partition by yw order by money desc) as ranking
            from pref_money
        ) x
        where x.ranking = 1 and id='OK471018960997'

This command comes from a PHP script of mine which displays "medals"
on a player profile page - meaning how many times she won a weekly
tournament:

    # \d pref_money
                  Table "public.pref_money"
     Column |         Type          |               Modifiers
    --------+-----------------------+-----------------------------------------
     id     | character varying(32) |
     money  | integer               | not null
     yw     | character(7)          | default to_char(now(), 'IYYY-IW'::text)
    Indexes:
        "pref_money_yw_index" btree (yw)
    Foreign-key constraints:
        "pref_money_id_fkey" FOREIGN KEY (id) REFERENCES pref_users(id) ON DELETE CASCADE

Does anybody please have an idea how I could optimize it, or should I
introduce an hourly job and a "medals" column (that would make my
player stats less "live")?
Here is the EXPLAIN ANALYZE output (which I hardly understand) for a
player with 9 weekly medals:

    # explain analyze select count(id) from (
        select id,
               row_number() over(partition by yw order by money desc) as ranking
        from pref_money
    ) x
    where x.ranking = 1 and id='OK452217781481';
                                                                 QUERY PLAN
    -------------------------------------------------------------------------------------------------------------------------------------------
     Aggregate  (cost=63694.22..63694.23 rows=1 width=82) (actual time=4520.719..4520.719 rows=1 loops=1)
       ->  Subquery Scan x  (cost=48519.10..63694.19 rows=11 width=82) (actual time=4470.620..4520.710 rows=6 loops=1)
             Filter: ((x.ranking = 1) AND ((x.id)::text = 'OK452217781481'::text))
             ->  WindowAgg  (cost=48519.10..57190.58 rows=433574 width=26) (actual time=4293.315..4491.652 rows=429803 loops=1)
                   ->  Sort  (cost=48519.10..49603.03 rows=433574 width=26) (actual time=4293.306..4352.544 rows=429803 loops=1)
                         Sort Key: pref_money.yw, pref_money.money
                         Sort Method:  external sort  Disk: 15856kB
                         ->  Seq Scan on pref_money  (cost=0.00..7923.74 rows=433574 width=26) (actual time=0.006..41.907 rows=429803 loops=1)
     Total runtime: 4525.662 ms
    (9 rows)

Thank you for any hints
Alex

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On Thu, Jan 24, 2013 at 4:57 AM, Alexander Farber <alexander.farber@gmail.com> wrote:
> select count(id) from (
>     select id,
>            row_number() over(partition by yw order by money desc) as ranking
>     from pref_money
> ) x
> where x.ranking = 1 and id='OK471018960997'
>
> Does anybody please have an idea
> how could I optimize it or should
> I introduce a hourly job and a "medals"
> column (that would make my players
> stats less "live")?
One idea is to have a new ranking column to cache every player's ranking for every weekly tournament. However, instead of updating it hourly with a cron job, you could have a trigger on the table, such that when any row is updated/inserted, you recalculate the rankings for only those rows having the same "yw" value.
Then you might want to create an index on the new ranking column, in addition to the yw index you already have.
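One way that trigger might be sketched (hypothetical and untested; the function and trigger names are made up, and the "update of money" clause needs PostgreSQL 9.0+ - on 8.4 the trigger would also fire on its own ranking writes, so some other recursion guard would be needed):

```sql
-- Cache each row's weekly ranking and refresh it for just the
-- affected week (yw) whenever money changes in that week.
alter table pref_money add column ranking integer;

create or replace function pref_money_rerank() returns trigger as $$
begin
    update pref_money p
       set ranking = r.rn
      from (select ctid,
                   row_number() over (order by money desc) as rn
              from pref_money
             where yw = new.yw) r
     where p.ctid = r.ctid
       and p.ranking is distinct from r.rn;  -- skip no-op writes
    return null;
end;
$$ language plpgsql;

create trigger pref_money_rerank_trg
    after insert or update of money on pref_money
    for each row execute procedure pref_money_rerank();
```

With the cached column in place, the medal count becomes a plain indexable lookup: select count(*) from pref_money where id = '...' and ranking = 1.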
Moshe

Moshe Jacobson
Nead Werx, Inc. | Senior Systems Engineer
2323 Cumberland Parkway, Suite 201 | Atlanta, GA 30339
moshe@neadwerx.com | www.neadwerx.com
On 24 January 2013 10:57, Alexander Farber <alexander.farber@gmail.com> wrote:
> # explain analyze select count(id) from (
>     select id,
>            row_number() over(partition by yw order by money desc) as ranking
>     from pref_money
> ) x
> where x.ranking = 1 and id='OK452217781481';
>                                QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=63694.22..63694.23 rows=1 width=82) (actual time=4520.719..4520.719 rows=1 loops=1)
>    ->  Subquery Scan x  (cost=48519.10..63694.19 rows=11 width=82) (actual time=4470.620..4520.710 rows=6 loops=1)
>          Filter: ((x.ranking = 1) AND ((x.id)::text = 'OK452217781481'::text))
>          ->  WindowAgg  (cost=48519.10..57190.58 rows=433574 width=26) (actual time=4293.315..4491.652 rows=429803 loops=1)
>                ->  Sort  (cost=48519.10..49603.03 rows=433574 width=26) (actual time=4293.306..4352.544 rows=429803 loops=1)
>                      Sort Key: pref_money.yw, pref_money.money
>                      Sort Method:  external sort  Disk: 15856kB
It's sorting on disk. That's not going to be fast. Indeed, it's taking nearly all the time the query takes (4.4s for this step out of 4.5s for the query).
>                      ->  Seq Scan on pref_money  (cost=0.00..7923.74 rows=433574 width=26) (actual time=0.006..41.907 rows=429803 loops=1)
And then it's doing a sequential scan to sort the data. I suspect that's because it's sorting on disk. Then again, this only takes 42ms, just once (loops=1), so perhaps a seqscan is indeed the fastest approach here (actually, wow, it scans 10000 records/ms - rows are 26 bytes wide, so that's 260MB/s! I'm doubting my math here...).
>  Total runtime: 4525.662 ms
> (9 rows)
>
> Thank you for any hints
> Alex
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.
Hello -

On Thu, Jan 24, 2013 at 3:39 PM, Alban Hertroys <haramrae@gmail.com> wrote:
> It's sorting on disk. That's not going to be fast. Indeed, it's taking
> nearly all the time the query takes (4.4s for this step out of 4.5s for
> the query).

I've noticed that too, but what does "sorting on disk" mean?

I have a lot of RAM (32 GB); should I increase work_mem even more?
(It is currently 32 MB.)

Regards
Alex

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
> > It's sorting on disk. That's not going to be fast. Indeed, it's taking
> > nearly all the time the query takes (4.4s for this step out of 4.5s for
> > the query).
>
> I've noticed that too, but what does "sorting on disk" mean?
>
> I have a lot of RAM (32 GB), should I increase work_mem even more?
> (it is currently 32 MB)
You can try increasing the amount of work_mem in your psql session only and see what amount helps. That way you don't need to permanently increase it for all your queries.
I'd start with 48 MB and increase in increments of 16 MB (as that's the size the sort operation claims to require on disk).
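In psql that experiment could look like this (a sketch - 48MB is just the starting guess from above, and the value that actually avoids the disk sort will need trial and error):

```sql
-- Raise work_mem for this session only, then re-check the plan.
SET work_mem = '48MB';

EXPLAIN ANALYZE
select count(id) from (
    select id,
           row_number() over(partition by yw order by money desc) as ranking
    from pref_money
) x
where x.ranking = 1 and id='OK452217781481';

-- If the plan now shows "Sort Method: quicksort  Memory: ..." instead
-- of "external sort  Disk: ...", the sort fits in memory.
RESET work_mem;
```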
--
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.
On Thu, 2013-01-24 at 15:45 +0100, Alexander Farber wrote:
> I've noticed that too, but what
> does "sorting on disk" mean?
>
> I have a lot of RAM (32 GB),
> should I increase work_mem even more?
> (it is currently 32 MB)

You'd do better to create an index on pref_money(yw, money). It could
help you get rid of the seqscan and sort operations.

--
Guillaume
http://blog.guillaume.lelarge.info
http://www.dalibo.com
Hi -

On Thu, Jan 24, 2013 at 8:12 PM, Guillaume Lelarge <guillaume@lelarge.info> wrote:
> You should better create an index on pref_money(yw, money). It could
> help you get rid of the seqscan and sort operations.

I've created an index with

    # create index pref_money_money_index on pref_money(money desc);

and posted the new EXPLAIN output here:

http://stackoverflow.com/questions/14498974/optimizing-select-count-query-which-often-takes-over-10-seconds

But it doesn't seem to change much - or am I too inexperienced to see
the change?

Thanks
Alex
On Fri, 2013-01-25 at 16:13 +0100, Alexander Farber wrote:
> I've created an index with
>
>     # create index pref_money_money_index on pref_money(money desc);
>
> But it doesn't seem to change much or
> am I too unexperienced to see the change?

There's no change because you created an index on money alone, which
didn't give PostgreSQL a chance to do anything better. What I told you
before was to create an index on yw and money together, like this:

    create index pref_money_yw_money_idx on pref_money(yw, money);

This should change the plan and, I hope, get better performance.

--
Guillaume
http://blog.guillaume.lelarge.info
http://www.dalibo.com
On Thu, Jan 24, 2013 at 1:57 AM, Alexander Farber <alexander.farber@gmail.com> wrote:
> for a PostgreSQL 8.4.13 database + pgbouncer

Using 8.4 is really going to limit your options.

> LOG: duration: 12590.394 ms statement:
> select count(id) from (
>     select id,
>            row_number() over(partition by yw order by money desc) as ranking
>     from pref_money
> ) x
> where x.ranking = 1 and id='OK471018960997'

Since you only care about ranking=1, it might be better to rewrite
that using something like:

    where money = (select max(money....)

But I doubt it. I don't think even the 9.2 planner has the smarts to
do what you want efficiently. It might be possible to make it do so
efficiently using a recursive query, once you have the index on
(yw, money).

> Does anybody please have an idea
> how could I optimize it or should
> I introduce a hourly job and a "medals"
> column (that would make my players
> stats less "live")?

This sounds like a good idea. But if the tournament is weekly, why
would the job have to be hourly? Why do the results of a weekly
tournament need to be 'live'?

Cheers,
Jeff
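Spelled out, Jeff's rewrite might look like this (a sketch, untested; note it treats ties differently from the original - every player matching the weekly maximum counts as a winner, whereas row_number() picks exactly one row per week):

```sql
-- Count the weeks in which this player's winnings equal that week's
-- maximum; with an index on (yw, money) the inner max() can use an
-- index scan instead of sorting the whole table.
select count(*)
from pref_money m
where m.id = 'OK471018960997'
  and m.money = (select max(w.money)
                 from pref_money w
                 where w.yw = m.yw);
```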
Hello -

On Fri, Jan 25, 2013 at 7:42 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> This sounds like a good idea. But if the tournament is weekly why
> would the job have to be hourly? Why do the results of a weekly
> tournament need to be 'live'?

Because the medals for the current week are displayed too.

And when a player enters a top spot, he should get +1 medal, and the
one he pushed from the top -1 medal.

So even hourly isn't really good enough for me... it should be "live"
stats.

Regards
Alex
On Sun, Jan 27, 2013 at 9:25 AM, Alexander Farber <alexander.farber@gmail.com> wrote:
> So even hourly isn't really good enough for me...
> It should be "live" stats.

Once the week is over, materialize the medals for that week. Then the
live part of the query only needs to look at the currently live week,
not the entire history. And in that case, the query should be quite
efficient if you have an index on both week and money.

Cheers,
Jeff
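A sketch of what Jeff describes (untested; the pref_medals table name is invented): archive the finished weeks once, then combine the archived count with a live check against only the current week.

```sql
-- End-of-week job: materialize medal counts for all finished weeks.
create table pref_medals as
select id, count(*) as medals
from (
    select id,
           row_number() over(partition by yw order by money desc) as ranking
    from pref_money
    where yw < to_char(now(), 'IYYY-IW')
) x
where ranking = 1
group by id;

-- Live lookup: archived medals plus (possibly) one for this week.
select coalesce((select medals from pref_medals
                 where id = 'OK471018960997'), 0)
     + (select count(*)
        from pref_money m
        where m.id = 'OK471018960997'
          and m.yw = to_char(now(), 'IYYY-IW')
          and m.money = (select max(money)
                         from pref_money w
                         where w.yw = m.yw)) as total_medals;
```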
Thanks -

On Sun, Jan 27, 2013 at 8:41 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> Once the week is over, materialize the medals for that week. Then the
> live part of the query only needs to specify the currently live week,
> not the entire history. And in that case, the query should be quite
> efficient if you have an index on both week and money.

I will try that!
Thank you, I've ended up with this cronjob (yes, unfortunately I'm
storing week numbers as strings):

    /* reset and then update medals count */
    update pref_users set medals = 0;
    UPDATE 223456

    update pref_users u
    set medals = s.medals
    from (
        select id, count(id) medals
        from (
            select id,
                   row_number() over(partition by yw order by money desc) as ranking
            from pref_money
            where yw <> to_char(CURRENT_TIMESTAMP, 'IYYY-IW')
        ) x
        where ranking = 1
        group by id
    ) s
    where u.id = s.id;
    UPDATE 65
Alexander Farber <alexander.farber@gmail.com> wrote:

> update pref_users set medals = 0;
> UPDATE 223456

You're probably going to like your performance a lot better if you
modify that to:

    update pref_users set medals = 0 where medals <> 0;

-Kevin
Hello Kevin,

On Wed, Jan 30, 2013 at 12:09 AM, Kevin Grittner <kgrittn@ymail.com> wrote:
> You're probably going to like your performance a lot better if you
> modify that to:
>
> update pref_users set medals = 0 where medals <> 0;

Is it really so? I only have 65 users (out of 223456) with medals != 0.

When programming in other languages, I never write
if (x != 0) { x = 0; } but just set x = 0 straight away.

Regards
Alex
The cronjob now occasionally gives me:

    /* reset and then update medals count */
    update pref_users set medals = 0;
    psql:/home/afarber/bin/clean-database.sql:63: ERROR:  deadlock detected
    DETAIL:  Process 31072 waits for ShareLock on transaction 124735679; blocked by process 30368.
             Process 30368 waits for ShareLock on transaction 124735675; blocked by process 31072.
    HINT:  See server log for query details.

    update pref_users u
    set medals = s.medals
    from (
        select id, count(id) medals
        from (
            select id,
                   row_number() over(partition by yw order by money desc) as ranking
            from pref_money
            where yw <> to_char(CURRENT_TIMESTAMP, 'IYYY-IW')
        ) x
        where ranking = 1
        group by id
    ) s
    where u.id = s.id;

Any ideas please how to work around this?

Thank you
Alex
Alexander Farber <alexander.farber@gmail.com> wrote:
> Kevin Grittner <kgrittn@ymail.com> wrote:
>> You're probably going to like your performance a lot better if
>> you modify that to:
>>
>> update pref_users set medals = 0 where medals <> 0;
>
> is it really so?

Yes.

> I only have 65 users (out of 223456) with medals != 0.

That's precisely the point. You don't want to update all 223456 rows
when there are only 65 which need to be changed.

> When programming other languages, I never do
> if (x != 0) { x = 0; } but just set x = 0 straight away.

Well, if updating a row were as cheap as assigning zero to x, I
wouldn't suggest a change to your code. If assigning something to x
involved an expensive function or disk access, you might well put an
"if" around it.

If you don't want to burden your query with the condition, you could
consider attaching a trigger to any table where you might assign
existing values to rows. See the suppress_redundant_updates_trigger()
function for details:

http://www.postgresql.org/docs/current/interactive/functions-trigger.html

-Kevin
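Following the documentation page Kevin links to, attaching that built-in trigger would look roughly like this (the trigger name is arbitrary; by convention it starts with "z" so it sorts, and therefore fires, after other before-triggers):

```sql
-- Silently skip UPDATEs that would not change the row, so the
-- unconditional "set medals = 0" no longer rewrites 223456 rows.
create trigger z_min_update
before update on pref_users
for each row execute procedure suppress_redundant_updates_trigger();
```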
Alexander Farber <alexander.farber@gmail.com> wrote:
> The cronjob gives me now occasionally:
>
> /* reset and then update medals count */
> update pref_users set medals = 0;
> psql:/home/afarber/bin/clean-database.sql:63: ERROR:  deadlock detected
>
> Any ideas please how to workaround?

Yeah, try this:

    update pref_users set medals = 0 where medals <> 0;

:-)

That should significantly reduce the frequency of deadlocks; however,
IMO any application using a relational database should be prepared to
retry database transactions which fail with a serialization error,
and a deadlock is one form of that. The standard SQLSTATE to look for
is '40001', and in PostgreSQL you should also check for '40P01'.

-Kevin
On Tue, Jan 29, 2013 at 11:41 PM, Alexander Farber <alexander.farber@gmail.com> wrote:
> When programming other languages, I never do
> if (x != 0) { x = 0; } but just set x = 0 straight away.

Most other languages are not transactional and durable. Databases are
different.

Cheers,
Jeff
On Wed, Jan 30, 2013 at 2:06 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
> update pref_users set medals = 0 where medals <> 0;

Thank you all for your insightful comments - this has cured my cronjob.