An Analyze question - Mailing list pgsql-admin
From | Nick Fankhauser |
---|---|
Subject | An Analyze question |
Date | |
Msg-id | NEBBLAAHGLEEPCGOBHDGIEDKENAA.nickf@ontko.com Whole thread Raw |
Responses |
Re: An Analyze question
|
List | pgsql-admin |
Hello- Does analyze consider the entire key when creating statistics for a table, or a substring composed of the leading chars? Some background: I've just modified *all* of the keys on my database by prefixing them with the same source_id that is 5 chars long. This is in preparation for merging together data from several different sources where I know that the keys are unique to the source, but not necessarily between sources. So on every primary & foreign key, I have executed this update: update table set key = '18105'||key; Now, a few queries that used to be swift are very slow, and on further investigation, I found that the planner is making different decisions- essentially it looks like the statistics now indicate that each key is much less selective & hence a poor candidate for an index scan. I did an analyze on the whole database, and then did an analyze specifically on the tables involved, and checked against an original copy of the database to make sure the indexes are identical in both. If the keys are taken in their entirety, nothing has changed- they are just as selective as ever. However, if only the leading chars are considered, or if the leading chars have a higher weighting, they would certainly appear much less selective. Any thoughts on what happened here? Thanks- -Nick PS: The before & after explains are pasted in below: Before: monroe=# explain select * from actor_cases where actor_id = '18105A7313 53'; NOTICE: QUERY PLAN: Merge Join (cost=27748.94..27807.92 rows=145 width=192) -> Sort (cost=27713.16..27713.16 rows=3410 width=144) -> Nested Loop (cost=0.00..27372.75 rows=3410 width=144) -> Index Scan using actor_case_assignment_both on actor_case_assignment (cost=0.00..11766.67 rows=3410 width=24) -> Index Scan using case_data_case_id on case_data (cost=0.00..4.56 rows=1 width=120) -> Sort (cost=35.78..35.78 rows=522 width=48) -> Seq Scan on local_case_type (cost=0.00..12.22 rows=522 width=48) After: develop=# explain select * from actor_cases where actor_id = '18105A7313 53'; NOTICE: QUERY PLAN: Hash Join (cost=27801.99..53031.15 rows=306 width=192) -> Hash Join (cost=27788.47..51957.43 rows=11377 width=144) -> Seq Scan on case_data (cost=0.00..6932.35 rows=226535 width=120) -> Hash (cost=27693.03..27693.03 rows=11377 width=24) -> Seq Scan on actor_case_assignment (cost=0.00..27693.03 rows=11377 width=24) -> Hash (cost=12.22..12.22 rows=522 width=48) -> Seq Scan on local_case_type (cost=0.00..12.22 rows=522 width=48) -------------------------------------------------------------------------- Nick Fankhauser nickf@ontko.com Phone 1.765.935.4283 Fax 1.765.962.9788 Ray Ontko & Co. Software Consulting Services http://www.ontko.com/
pgsql-admin by date: