Thread: Re: PostgreSQL does CAST implicitely between int and a domain derived from int
Re: PostgreSQL does CAST implicitely between int and a domain derived from int
From
"Kevin Grittner"
Date:
Jean-Michel Pouré<jm@poure.com> wrote: > [no postgresql.conf changes except] > shared_buffer 24M. That's part of your problem. (Well, that's understating it; we don't have any real evidence that you have any performance problems *not* resulting from failure to do normal configuration.) If you download the configurator tool I referenced in another email you can get a start on fixing this. Or search the lists -- this stuff had been discussed many, many times. Or do a google search; I typed postgresql performance configuration and got many hits. If you spend even one hour closely reviewing any of the top few hits and testing a more optimal configuration for your hardware and workload, most of your performance problems would probably go away. Then, if you *still* have any performance problems, people could help you diagnose and fine-tune from a base which should be much closer. The truth is, with a proper configuration your "biggest problem", which you've misdiagnosed as the result of casting, would probably go away. I will take another look at it now that you have the results of EXPLAIN ANALYZE posted, but I seriously doubt that it's going to do well without tuning the configuration. -Kevin
Re: PostgreSQL does CAST implicitely between int and a domain derived from int
From
"Kevin Grittner"
Date:
I wrote: > I will take another look at it now that you have the results of > EXPLAIN ANALYZE posted Could you run this?: set work_mem = '50MB'; set effective_cache_size = '3GB'; EXPLAIN ANALYZE <your query> begin transaction; drop index node_comment_statistics_node_comment_timestamp_idx; EXPLAIN ANALYZE <your query> rollback transaction; The BEGIN TRANSACTION and ROLLBACK TRANSACTION will prevent the index from actually being dropped; it just won't be visible to your query during that second run. I'm kinda curious what plan it chooses without it. Some configuration options can be dynamically overridden for a particular connection. This is not a complete list of what you might want to use in your postgresql.conf file, but it might turn up an interesting plan for diagnostic purposes. -Kevin
Re: PostgreSQL does CAST implicitely between int and a domain derived from int
From
Jean-Michel Pouré
Date:
Le mercredi 26 août 2009 à 18:03 -0500, Kevin Grittner a écrit : > That's part of your problem. Sorry, I wrote that too quickly. My configuration is (Quad core, 8Gb): shared_buffers = 2GB (WAS 1GB) temp_buffers = 128MB (modified after reading your message) work_mem = 512MB (modified after reading your message) Still casting. > If you download > the configurator tool I referenced in another email you can get a > start on fixing this. I do that immediately. Then I get back to you. Why not include a configurator in PostgreSQL core? Kind regards, Jean-Michel
Attachment
Re: PostgreSQL does CAST implicitely between int and a domain derived from int
From
Jean-Michel Pouré
Date:
Dear Kevin, Thanks for help! Could you run this?: > > set work_mem = '50MB'; > set effective_cache_size = '3GB'; > EXPLAIN ANALYZE <your query> > begin transaction; > drop index node_comment_statistics_node_comment_timestamp_idx; > EXPLAIN ANALYZE <your query> > rollback transaction; set work_mem = '50MB'; set effective_cache_size = '1GB'; EXPLAIN ANALYSE SELECT ncs.last_comment_timestamp, IF (ncs.last_comment_uid != 0, u2.name, ncs.last_comment_name) AS last_comment_name, ncs.last_comment_uid FROM node n INNER JOIN users u1 ON n.uid = u1.uid INNER JOIN term_node tn ON n.vid = tn.vid INNER JOIN node_comment_statistics ncs ON n.nid = ncs.nid INNER JOIN users u2 ON ncs.last_comment_uid=u2.uid WHERE n.status = 1 AND tn.tid = 3 ORDER BY ncs.last_comment_timestamp DESC LIMIT 1 OFFSET 0 ; "Limit (cost=0.00..544.67 rows=1 width=17) (actual time=455.234..455.234 rows=0 loops=1)" " -> Nested Loop (cost=0.00..49565.19 rows=91 width=17) (actual time=455.232..455.232 rows=0 loops=1)" " -> Nested Loop (cost=0.00..49538.56 rows=91 width=21) (actual time=455.232..455.232 rows=0 loops=1)" " -> Nested Loop (cost=0.00..49512.17 rows=91 width=13) (actual time=455.232..455.232 rows=0 loops=1)" " -> Nested Loop (cost=0.00..27734.58 rows=67486 width=17) (actual time=0.027..264.540 rows=67486 loops=1)" " -> Index Scan Backward using node_comment_statistics_node_comment_timestamp_idx on node_comment_statistics ncs (cost=0.00..3160.99 rows=67486 width=13) (actual time=0.014..40.618 rows=67486 loops=1)" " -> Index Scan using node_pkey on node n (cost=0.00..0.35 rows=1 width=12) (actual time=0.002..0.003 rows=1 loops=67486)" " Index Cond: (n.nid = (ncs.nid)::integer)" " Filter: (n.status = 1)" " -> Index Scan using term_node_vid_idx on term_node tn (cost=0.00..0.31 rows=1 width=4) (actual time=0.002..0.002 rows=0 loops=67486)" " Index Cond: ((tn.vid)::integer = (n.vid)::integer)" " Filter: ((tn.tid)::integer = 3)" " -> Index Scan using users_pkey on users u2 (cost=0.00..0.28 rows=1 width=12) (never executed)" " Index Cond: (u2.uid = ncs.last_comment_uid)" " -> Index Scan using users_pkey on users u1 (cost=0.00..0.28 rows=1 width=4) (never executed)" " Index Cond: (u1.uid = n.uid)" "Total runtime: 455.311 ms" > begin transaction; > drop index node_comment_statistics_node_comment_timestamp_idx; > EXPLAIN ANALYZE <your query> > rollback transaction; begin transaction; drop index node_comment_statistics_node_comment_timestamp_idx; EXPLAIN ANALYSE SELECT ncs.last_comment_timestamp, IF (ncs.last_comment_uid != 0, u2.name, ncs.last_comment_name) AS last_comment_name, ncs.last_comment_uid FROM node n INNER JOIN users u1 ON n.uid = u1.uid INNER JOIN term_node tn ON n.vid = tn.vid INNER JOIN node_comment_statistics ncs ON n.nid = ncs.nid INNER JOIN users u2 ON ncs.last_comment_uid=u2.uid WHERE n.status = 1 AND tn.tid = 3 ORDER BY ncs.last_comment_timestamp DESC LIMIT 1 OFFSET 0 ; rollback transaction; Does not show any result because of ROLLBACK; The query executes in 89 ms. > Some configuration options can be dynamically overridden for a > particular connection. This is not a complete list of what you might > want to use in your postgresql.conf file, but it might turn up an > interesting plan for diagnostic purposes. I am turning to configurator, stay tuned. Again,thanks for your help. Bye, Jean-Michel
Attachment
Re: PostgreSQL does CAST implicitely between int and adomain derived from int
From
"Kevin Grittner"
Date:
Jean-Michel Pouré<jm@poure.com> wrote: > Still casting. For about the tenth time on the topic -- YOUR PROBLEM HAS NOTHING WHATSOEVER TO DO WITH CASTING! Let that go so you can look for the real problem. Just as an example, look at this closely: test=# create table t2 (c1 int not null primary key); NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "t2_pkey" for table "t2" CREATE TABLE test=# insert into t2 select * from generate_series(1,10000); INSERT 0 10000 test=# vacuum analyze; VACUUM test=# explain analyze select count(*) from t1 where c1 between 200 and 400; QUERY PLAN ---------------------------------------------------------------------- ------------------------------------------------- Aggregate (cost=12.75..12.76 rows=1 width=0) (actual time=0.764..0.765 rows=1 loops=1) -> Index Scan using t1_pkey on t1 (cost=0.00..12.25 rows=200 width=0) (actual time=0.095..0.470 rows=201 loops=1) Index Cond: (((c1)::integer >= 200) AND ((c1)::integer <= 400)) Total runtime: 0.827 ms (4 rows) The type is always put in there so that you can see what it's doing; it doesn't reflect anything which is actually taking any time. Let it go. -Kevin
Re: PostgreSQL does CAST implicitely between int and adomain derived from int
From
Jean-Michel Pouré
Date:
Le jeudi 27 août 2009 à 09:01 -0500, Kevin Grittner a écrit : > The type is always put in there so that you can see what it's doing; > it doesn't reflect anything which is actually taking any time. My query plan for the same query is: "Aggregate (cost=12.75..12.76 rows=1 width=0) (actual time=0.094..0.094 rows=1 loops=1)" " -> Index Scan using t2_pkey on t2 (cost=0.00..12.25 rows=200 width=0) (actual time=0.016..0.068 rows=201 loops=1)" " Index Cond: ((c1 >= 200) AND (c1 <= 400))" "Total runtime: 0.142 ms" So I don't see any :: in my results. In my various query plans on my database, the :: is only displayed when comparing int and int_unsigned. So I interpreted the :: as a cast. Are you sure that :: does not inform of a cast? Do we have documentation about that? Kind regards, Jean-Michel
Attachment
Re: PostgreSQL does CAST implicitely between int and adomain derived from int
From
"Kevin Grittner"
Date:
>Jean-Michel Pouré<jm@poure.com> wrote: > Does not show any result because of ROLLBACK; Then you need to use a better tool to run it. For example, in psql: test=# create table t2 (c1 int not null); CREATE TABLE test=# insert into t2 select * from generate_series(1,10000); INSERT 0 10000 test=# create unique index t2_c1 on t2 (c1); CREATE INDEX test=# explain analyze select count(*) from t2 where c1 between 200 and 400; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------- Aggregate (cost=47.77..47.78 rows=1 width=0) (actual time=0.668..0.669 rows=1 loops=1) -> Bitmap Heap Scan on t2 (cost=4.76..47.64 rows=50 width=0) (actual time=0.091..0.380 rows=201 loops=1) Recheck Cond: ((c1 >= 200) AND (c1 <= 400)) -> Bitmap Index Scan on t2_c1 (cost=0.00..4.75 rows=50 width=0) (actual time=0.080..0.080 rows=201 loops=1) Index Cond: ((c1 >= 200) AND (c1 <= 400)) Total runtime: 0.722 ms (6 rows) test=# begin transaction; BEGIN test=# drop index t2_c1; DROP INDEX test=# explain analyze select count(*) from t2 where c1 between 200 and 400; QUERY PLAN -------------------------------------------------------------------------------------------------------- Aggregate (cost=190.50..190.51 rows=1 width=0) (actual time=3.324..3.325 rows=1 loops=1) -> Seq Scan on t2 (cost=0.00..190.00 rows=200 width=0) (actual time=0.053..3.036 rows=201 loops=1) Filter: ((c1 >= 200) AND (c1 <= 400)) Total runtime: 3.366 ms (4 rows) test=# rollback transaction; ROLLBACK -Kevin
Re: PostgreSQL does CAST implicitely between int and adomain derived from int
From
Jean-Michel Pouré
Date:
Le jeudi 27 août 2009 à 09:16 -0500, Kevin Grittner a écrit : > Then you need to use a better tool to run it. Understood, thanks. cms=# set work_mem = '50MB'; SET cms=# set effective_cache_size = '1GB'; SET cms=# begin transaction; BEGIN cms=# drop index node_comment_statistics_node_comment_timestamp_idx; DROP INDEX cms=# cms=# EXPLAIN ANALYSE cms-# SELECT ncs.last_comment_timestamp, IF (ncs.last_comment_uid != 0, u2.name, ncs.last_comment_name) AS last_comment_name, ncs.last_comment_uid cms-# FROM node n cms-# INNER JOIN users u1 ON n.uid = u1.uid cms-# INNER JOIN term_node tn ON n.vid = tn.vid cms-# INNER JOIN node_comment_statistics ncs ON n.nid = ncs.nid cms-# INNER JOIN users u2 ON ncs.last_comment_uid=u2.uid cms-# WHERE n.status = 1 AND tn.tid = 3 cms-# ORDER BY ncs.last_comment_timestamp DESC LIMIT 1 OFFSET 0 ; cms=# rollback transaction; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=972.82..972.82 rows=1 width=17) (actual time=0.018..0.018 rows=0 loops=1) -> Sort (cost=972.82..973.04 rows=91 width=17) (actual time=0.018..0.018 rows=0 loops=1) Sort Key: ncs.last_comment_timestamp Sort Method: quicksort Memory: 25kB -> Nested Loop (cost=4.96..972.36 rows=91 width=17) (actual time=0.010..0.010 rows=0 loops=1) -> Nested Loop (cost=4.96..945.74 rows=91 width=21) (actual time=0.010..0.010 rows=0 loops=1) -> Nested Loop (cost=4.96..919.34 rows=91 width=13) (actual time=0.010..0.010 rows=0 loops=1) -> Nested Loop (cost=4.96..890.02 rows=91 width=8) (actual time=0.009..0.009 rows=0 loops=1) -> Bitmap Heap Scan on term_node tn (cost=4.96..215.63 rows=91 width=4) (actual time=0.009..0.009 rows=0 loops=1) Recheck Cond: ((tid)::integer = 3) -> Bitmap Index Scan on term_node_tid_idx (cost=0.00..4.94 rows=91 width=0) (actual time=0.008..0.008 rows=0 loops=1) Index Cond: ((tid)::integer = 3) -> Index Scan using node_vid_idx on node n (cost=0.00..7.40 rows=1 width=12) (never executed) Index Cond: ((n.vid)::integer = (tn.vid)::integer) Filter: (n.status = 1) -> Index Scan using node_comment_statistics_pkey on node_comment_statistics ncs (cost=0.00..0.31 rows=1 width=13) (never executed) Index Cond: ((ncs.nid)::integer = n.nid) -> Index Scan using users_pkey on users u2 (cost=0.00..0.28 rows=1 width=12) (never executed) Index Cond: (u2.uid = ncs.last_comment_uid) -> Index Scan using users_pkey on users u1 (cost=0.00..0.28 rows=1 width=4) (never executed) Index Cond: (u1.uid = n.uid) Total runtime: 0.092 ms (22 lignes) Does it mean my index is broken and should be rebuilt? Kind regards, Jean-Michel
Attachment
Re: PostgreSQL does CAST implicitely between int andadomain derived from int
From
"Kevin Grittner"
Date:
Jean-Michel Pouré<jm@poure.com> wrote: > ... Index Cond: ((tid)::integer = 3) > ... Index Cond: ((n.vid)::integer = (tn.vid)::integer) > ... Index Cond: ((ncs.nid)::integer = n.nid) > Total runtime: 0.092 ms Sorry, but I just had to point that out. I feel much better now. ;-) > Does it mean my index is broken and should be rebuilt? No, probably not. Just to get another data point, what happens if you run the same query without taking the index out of the picture, but without the LIMIT or OFFSET clauses? An EXPLAIN ANALYZE of that would help understand it more fully. -Kevin
Re: PostgreSQL does CAST implicitely between int andadomain derived from int
From
Jean-Michel Pouré
Date:
Le jeudi 27 août 2009 à 09:52 -0500, Kevin Grittner a écrit : > Just to get another data point, what happens if you run the same query > without taking the index out of the picture, but without the LIMIT or > OFFSET clauses? An EXPLAIN ANALYZE of that would help understand it > more fully. Also, just a short notice that this SELECT returns no result. You were right: adding LIMIT 1 changes speed from O.090 ms to 420 ms. This has nothing to do with casting. EXPLAIN ANALYSE SELECT ncs.last_comment_timestamp, IF (ncs.last_comment_uid != 0, u2.name, ncs.last_comment_name) AS last_comment_name, ncs.last_comment_uid FROM node n INNER JOIN users u1 ON n.uid = u1.uid INNER JOIN term_node tn ON n.vid = tn.vid INNER JOIN node_comment_statistics ncs ON n.nid = ncs.nid INNER JOIN users u2 ON ncs.last_comment_uid=u2.uid WHERE n.status = 1 AND tn.tid = 3 ORDER BY ncs.last_comment_timestamp DESC "Sort (cost=975.32..975.55 rows=91 width=17) (actual time=0.021..0.021 rows=0 loops=1)" " Sort Key: ncs.last_comment_timestamp" " Sort Method: quicksort Memory: 25kB" " -> Nested Loop (cost=4.96..972.36 rows=91 width=17) (actual time=0.016..0.016 rows=0 loops=1)" " -> Nested Loop (cost=4.96..945.74 rows=91 width=21) (actual time=0.016..0.016 rows=0 loops=1)" " -> Nested Loop (cost=4.96..919.34 rows=91 width=13) (actual time=0.016..0.016 rows=0 loops=1)" " -> Nested Loop (cost=4.96..890.02 rows=91 width=8) (actual time=0.016..0.016 rows=0 loops=1)" " -> Bitmap Heap Scan on term_node tn (cost=4.96..215.63 rows=91 width=4) (actual time=0.016..0.016 rows=0 loops=1)" " Recheck Cond: ((tid)::integer = 3)" " -> Bitmap Index Scan on term_node_tid_idx (cost=0.00..4.94 rows=91 width=0) (actual time=0.014..0.014 rows=0 loops=1)" " Index Cond: ((tid)::integer = 3)" " -> Index Scan using node_vid_idx on node n (cost=0.00..7.40 rows=1 width=12) (never executed)" " Index Cond: ((n.vid)::integer = (tn.vid)::integer)" " Filter: (n.status = 1)" " -> Index Scan using node_comment_statistics_pkey on node_comment_statistics ncs (cost=0.00..0.31 rows=1 width=13) (never executed)" " Index Cond: ((ncs.nid)::integer = n.nid)" " -> Index Scan using users_pkey on users u2 (cost=0.00..0.28 rows=1 width=12) (never executed)" " Index Cond: (u2.uid = ncs.last_comment_uid)" " -> Index Scan using users_pkey on users u1 (cost=0.00..0.28 rows=1 width=4) (never executed)" " Index Cond: (u1.uid = n.uid)" "Total runtime: 0.090 ms" EXPLAIN ANALYSE SELECT ncs.last_comment_timestamp, IF (ncs.last_comment_uid != 0, u2.name, ncs.last_comment_name) AS last_comment_name, ncs.last_comment_uid FROM node n INNER JOIN users u1 ON n.uid = u1.uid INNER JOIN term_node tn ON n.vid = tn.vid INNER JOIN node_comment_statistics ncs ON n.nid = ncs.nid INNER JOIN users u2 ON ncs.last_comment_uid=u2.uid WHERE n.status = 1 AND tn.tid = 3 ORDER BY ncs.last_comment_timestamp DESC LIMIT 1 "Limit (cost=0.00..544.67 rows=1 width=17) (actual time=435.715..435.715 rows=0 loops=1)" " -> Nested Loop (cost=0.00..49565.19 rows=91 width=17) (actual time=435.713..435.713 rows=0 loops=1)" " -> Nested Loop (cost=0.00..49538.56 rows=91 width=21) (actual time=435.713..435.713 rows=0 loops=1)" " -> Nested Loop (cost=0.00..49512.17 rows=91 width=13) (actual time=435.713..435.713 rows=0 loops=1)" " -> Nested Loop (cost=0.00..27734.58 rows=67486 width=17) (actual time=0.029..252.443 rows=67486 loops=1)" " -> Index Scan Backward using node_comment_statistics_node_comment_timestamp_idx on node_comment_statistics ncs (cost=0.00..3160.99 rows=67486 width=13) (actual time=0.014..40.583 rows=67486 loops=1)" " -> Index Scan using node_pkey on node n (cost=0.00..0.35 rows=1 width=12) (actual time=0.002..0.003 rows=1 loops=67486)" " Index Cond: (n.nid = (ncs.nid)::integer)" " Filter: (n.status = 1)" " -> Index Scan using term_node_vid_idx on term_node tn (cost=0.00..0.31 rows=1 width=4) (actual time=0.002..0.002 rows=0 loops=67486)" " Index Cond: ((tn.vid)::integer = (n.vid)::integer)" " Filter: ((tn.tid)::integer = 3)" " -> Index Scan using users_pkey on users u2 (cost=0.00..0.28 rows=1 width=12) (never executed)" " Index Cond: (u2.uid = ncs.last_comment_uid)" " -> Index Scan using users_pkey on users u1 (cost=0.00..0.28 rows=1 width=4) (never executed)" " Index Cond: (u1.uid = n.uid)" "Total runtime: 435.788 ms" EXPLAIN ANALYSE SELECT ncs.last_comment_timestamp, IF (ncs.last_comment_uid != 0, u2.name, ncs.last_comment_name) AS last_comment_name, ncs.last_comment_uid FROM node n INNER JOIN users u1 ON n.uid = u1.uid INNER JOIN term_node tn ON n.vid = tn.vid INNER JOIN node_comment_statistics ncs ON n.nid = ncs.nid INNER JOIN users u2 ON ncs.last_comment_uid=u2.uid WHERE n.status = 1 AND tn.tid = 3 ORDER BY ncs.last_comment_timestamp DESC LIMIT 1 OFFSET 0 "Limit (cost=0.00..544.67 rows=1 width=17) (actual time=541.488..541.488 rows=0 loops=1)" " -> Nested Loop (cost=0.00..49565.19 rows=91 width=17) (actual time=541.486..541.486 rows=0 loops=1)" " -> Nested Loop (cost=0.00..49538.56 rows=91 width=21) (actual time=541.485..541.485 rows=0 loops=1)" " -> Nested Loop (cost=0.00..49512.17 rows=91 width=13) (actual time=541.485..541.485 rows=0 loops=1)" " -> Nested Loop (cost=0.00..27734.58 rows=67486 width=17) (actual time=0.024..307.341 rows=67486 loops=1)" " -> Index Scan Backward using node_comment_statistics_node_comment_timestamp_idx on node_comment_statistics ncs (cost=0.00..3160.99 rows=67486 width=13) (actual time=0.012..62.504 rows=67486 loops=1)" " -> Index Scan using node_pkey on node n (cost=0.00..0.35 rows=1 width=12) (actual time=0.003..0.003 rows=1 loops=67486)" " Index Cond: (n.nid = (ncs.nid)::integer)" " Filter: (n.status = 1)" " -> Index Scan using term_node_vid_idx on term_node tn (cost=0.00..0.31 rows=1 width=4) (actual time=0.003..0.003 rows=0 loops=67486)" " Index Cond: ((tn.vid)::integer = (n.vid)::integer)" " Filter: ((tn.tid)::integer = 3)" " -> Index Scan using users_pkey on users u2 (cost=0.00..0.28 rows=1 width=12) (never executed)" " Index Cond: (u2.uid = ncs.last_comment_uid)" " -> Index Scan using users_pkey on users u1 (cost=0.00..0.28 rows=1 width=4) (never executed)" " Index Cond: (u1.uid = n.uid)" "Total runtime: 541.568 ms"
Attachment
Re: PostgreSQL does CAST implicitely between int and a domain derived from int
From
"Kevin Grittner"
Date:
Jean-Michel Pouré<jm@poure.com> wrote: > Also, just a short notice that this SELECT returns no result. Once you posted EXPLAIN ANALYZE results, that was clear because actual rows on the top line is zero. > You were right: adding LIMIT 1 changes speed from O.090 ms to 420 > ms. In summary, what's happening is that when the LIMIT 1 is there, the optimizer sees that the index will return rows in the order you requested, and thinks that it won't have to read very far to get a match, at which point it would be able to stop. There are no matches, but it has to read all the way through the index, pulling related rows to check for matches, before it can know that. Without the limit, it optimizes for the fastest plan which will scan all the rows. The first test returns nothing, so all the joins become very cheap -- they are never exercised. This is related to a topic recently discussed on the hackers list -- whether the optimizer should be modified to recognize "risky" plans, and try to avoid them. This is another example of a query which might benefit from such work. It's also possible that this is another manifestation of an issue about which there has been some dispute -- the decision to always round up any fraction on expected rows to the next whole number. I don't know without doing more research, but it wouldn't shock me if this rounding contributed to the optimizer's expectations that it would get a match soon enough to make the problem plan a good one. It is *possible* that if you boost your default_statistics_target and run ANALYZE (or VACUUM ANALYZE), it will recognize that it isn't a good idea to read backwards on that index. I would try it and see, if that's practical for you. If not, you might be able to limit the plans that the optimizer considers using various techniques, but that's a bit of a kludge; I'd save it for a last resort. > This has nothing to do with casting. Yeah, that much was pretty apparent to many people from the start. It was rather frustrating that you weren't listening on that point; I think that resulted in you wasting time focusing on the wrong things and not moving in a productive direction sooner. As has been suggested by someone else, you'll get better results presenting your problem with as much relevant detail as possible and asking for help sorting it out, rather than putting too much emphasis on your preliminary guess as to the cause. -Kevin
Le jeudi 27 août 2009 à 11:36 -0500, Kevin Grittner a écrit : > It is *possible* that if you boost your default_statistics_target and > run ANALYZE (or VACUUM ANALYZE), it will recognize that it isn't a > good idea to read backwards on that index. I would try it and see, if > that's practical for you. If not, you might be able to limit the > plans that the optimizer considers using various techniques, but > that's a bit of a kludge; I'd save it for a last resort. I will try that. > Yeah, that much was pretty apparent to many people from the start. It > was rather frustrating that you weren't listening on that point; I > think that resulted in you wasting time focusing on the wrong things > and not moving in a productive direction sooner. As has been > suggested by someone else, you'll get better results presenting your > problem with as much relevant detail as possible and asking for help > sorting it out, rather than putting too much emphasis on your > preliminary guess as to the cause. Yeah. I will keep that in mind, don't worry. This kind of slow queries on LIMIT seems to happen all the time on Drupal. Maybe it is because the site is not yet going live. Also this means that Drupal on PostgreSQL could rock completely if/when the optimizer has enough information to find the correct plan. If you are interested, I can post on performance ML strange queries with LIMIT that may be interesting after we go life and have enough statistics. Many thanks and bye, Jean-Michel
Attachment
Re: PostgreSQL does CAST implicitely between int and a domain derived from int
From
Robert Haas
Date:
2009/8/27 Kevin Grittner <Kevin.Grittner@wicourts.gov>: > It is *possible* that if you boost your default_statistics_target and > run ANALYZE (or VACUUM ANALYZE), it will recognize that it isn't a > good idea to read backwards on that index. I would try it and see, if > that's practical for you. I notice this in one of the plans: -> Bitmap Index Scan on term_node_tid_idx (cost=0.00..4.94 rows=91 width=0) (actual time=0.014..0.014 rows=0 loops=1) Index Cond: ((tid)::integer = 3) That's a pretty bad estimate for a scan of a single relation with a filter on one column. I'd like to see the output of: SELECT MIN(tid), MAX(tid), SUM(1) FROM term_node; SHOW default_statistics_target; By the way, why does EXPLAIN not display the name of the table as well as the index when it performs a bitmap index scan? It does do so for a regular index scan. What version of PG is this again? ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > By the way, why does EXPLAIN not display the name of the table as well > as the index when it performs a bitmap index scan? Because that plan node is not in fact touching the table. The table name is shown in the BitmapHeapScan node that *does* touch the table. regards, tom lane
Re: PostgreSQL does CAST implicitely between int and a domain derived from int
From
Jean-Michel Pouré
Date:
Le jeudi 27 août 2009 à 13:35 -0400, Robert Haas a écrit : > SELECT MIN(tid), MAX(tid), SUM(1) FROM term_node; > SHOW default_statistics_target; SELECT MIN(tid), MAX(tid), SUM(1) FROM term_node; 6;56;67479 SHOW default_statistics_target; 100 For information, if some hackers are interested and they belong to the community for a long time, I can provide tables with data. Kind regards, Jean-Michel
Attachment
Re: PostgreSQL does CAST implicitely between int and a domain derived from int
From
Jean-Michel Pouré
Date:
Le jeudi 27 août 2009 à 14:05 -0400, Tom Lane a écrit : > tom lane Dear Tom, Why is the query planner displaying ::integer What does it mean? Kind regards, Jean-Michel
Attachment
Re: PostgreSQL does CAST implicitely between int and a domain derived from int
From
Bruce Momjian
Date:
Jean-Michel Pour� wrote: -- Start of PGP signed section. > Le jeudi 27 ao?t 2009 ? 14:05 -0400, Tom Lane a ?crit : > > tom lane > > Dear Tom, > > Why is the query planner displaying ::integer > What does it mean? ::integer casts a data type to INTEGER. It is the same as CAST(). -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: PostgreSQL does CAST implicitely between int and a domain derived from int
From
Jean-Michel Pouré
Date:
Le samedi 29 août 2009 à 11:16 -0400, Bruce Momjian a écrit : > > Why is the query planner displaying ::integer > > What does it mean? > > ::integer casts a data type to INTEGER. It is the same as CAST(). In Drupal database, we have two types: integer int_unsigned CREATE DOMAIN int_unsigned AS integer CONSTRAINT int_unsigned_check CHECK ((VALUE >= 0)); Why do queries cast between integer and int_unsigned? Kind regards, Jean-Michel
Attachment
Jean-Michel =?ISO-8859-1?Q?Pour=E9?= <jm@poure.com> writes: > In Drupal database, we have two types: > CREATE DOMAIN int_unsigned > AS integer > CONSTRAINT int_unsigned_check CHECK ((VALUE >= 0)); > Why do queries cast between integer and int_unsigned? That domain doesn't have any operators of its own. To compare to another value, or use an index, you have to cast it to integer which does have operators. It's a no-op cast, but logically necessary. regards, tom lane
Re: PostgreSQL does CAST implicitely between int and a domain derived from int
From
Jean-Michel Pouré
Date:
Le samedi 29 août 2009 à 13:44 -0400, Tom Lane a écrit : > That domain doesn't have any operators of its own. To compare to > another value, or use an index, you have to cast it to integer which > does have operators. It's a no-op cast, but logically necessary. Dear Tom, Thanks for answering. On more question: Drupal makes use these no-op CREATE DOMAINs in the database schema : CREATE DOMAIN int_unsigned AS integer CONSTRAINT int_unsigned_check CHECK ((VALUE >= 0)); CREATE DOMAIN bigint_unsigned AS bigint CONSTRAINT bigint_unsigned_check CHECK ((VALUE >= 0)); CREATE DOMAIN smallint_unsigned AS smallint CONSTRAINT smallint_unsigned_check CHECK ((VALUE >= 0)); CREATE DOMAIN varchar_ci AS character varying(255) DEFAULT ''::character varying NOT NULL; In my slow queries, I can notice excessive no-op casts. Do you think this could lead to excessive sequential scans? What do you recommend: using normal types and moving constraints in the Drupal database? Is PostgreSQL domain broken as it forces casting or is this a no-op for performance? What do you recommend? Kind regards, Jean-Michel
Attachment
Jean-Michel =?ISO-8859-1?Q?Pour=E9?= <jm@poure.com> writes: > What do you recommend: using normal types and moving constraints in the > Drupal database? Is PostgreSQL domain broken as it forces casting or is > this a no-op for performance? In principle it should be an unnoticeable slowdown. In the past we've had some issues with the planner failing to recognize possible optimizations when there was a cast in the way, but I'm not aware of any such bugs at the moment. regards, tom lane
Re: PostgreSQL does CAST implicitely between int and a domain derived from int
From
Greg Stark
Date:
On Sat, Aug 29, 2009 at 10:45 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote: > Jean-Michel =?ISO-8859-1?Q?Pour=E9?= <jm@poure.com> writes: >> What do you recommend: using normal types and moving constraints in the >> Drupal database? Is PostgreSQL domain broken as it forces casting or is >> this a no-op for performance? > > In principle it should be an unnoticeable slowdown. In the past we've > had some issues with the planner failing to recognize possible > optimizations when there was a cast in the way, but I'm not aware of > any such bugs at the moment. In particular since your plan nodes are all index scans it's clear that the casts are not getting in the way. The symptom when there were problems was that the planner was forced to use sequential scans because it couldn't match the casted expressionto the index expression or some variant of that. -- greg http://mit.edu/~gsstark/resume.pdf