Thread: An unresolved performance problem.

An unresolved performance problem.

From

Achilleus Mantzios

Date:

07 May 2003, 07:29:07

Hi, few days ago, i posted some really wierd (at least to me)
situation (maybe a potentian bug) to the performance and bugs list
and to some core hacker(s) privately as well,
and i got no response.
Moreover i asked for some feedback
in order to understand/fix the problem myself,
and again received no response.

What i asked was pretty simple:
"1. Is it possible that the absense of statistics make the planer produce
better plans
than in the case of statistcs generated with vacuum
analyze/analyze?
2. If No, i found a bug,
3. If yes then under what conditions??
4. If no person knows the answer or no hacker wants to dig into the
problem then is there a direction i must follow to understand/fix whats
going on myself??""

Pretty straight i think.

Well, i stack on step 1.

It seemed to me that either my question was too naive to deserve some real
investigation (doubtedly), or no one was in a position to comment on
it (doubtedly), or that it is not considered an interesting case (possible),
or that some people move all the mail i send to the lists to /dev/null
(unfortunately possible too).

So Since i really have stuck to postgresql for over 2 years for both
technical and emotional reasons, i would feel much more confident
if i would reach step 2 or greater.

The table i have in question is a critical one in my application
since it monitors important plan maintenance data, and i have
to move on with this problem.

Thanx

P.S. the www (64.49.215.82) server is down for while.

--
==================================================================
Achilleus Mantzios
S/W Engineer
IT dept
Dynacom Tankers Mngmt
Nikis 4, Glyfada
Athens 16610
Greece
tel:    +30-210-8981112
fax:    +30-210-8981877
email:  achill@matrix.gatewaynet.com
        mantzios@softlab.ece.ntua.gr

Re: An unresolved performance problem.

From

Hannu Krosing

Date:

07 May 2003, 09:40:56

Achilleus Mantzios kirjutas K, 07.05.2003 kell 19:33:
> Hi, few days ago, i posted some really wierd (at least to me)
> situation (maybe a potentian bug) to the performance and bugs list
> and to some core hacker(s) privately as well,
> and i got no response.
> Moreover i asked for some feedback
> in order to understand/fix the problem myself,
> and again received no response.
>
> What i asked was pretty simple:
> "1. Is it possible that the absense of statistics make the planer produce
> better plans than in the case of statistcs generated with vacuum
> analyze/analyze?

Yes, the planner is not perfect, the statistics are just statistics
(based on a random sample), etc..

This question comes up at least once a month on either [PERFORM] or
[HACKERS], search the mailing lists to get more thorough
discussion/explanation.

> 2. If No, i found a bug,

Rather a feature ;-p

> 3. If yes then under what conditions??

if

1) ANALYZE produced skewed data which was worse than default.

or.

2) some costs are way off for your system (try changing them in
postgresql.conf)

> 4. If no person knows the answer or no hacker wants to dig into the
> problem then is there a direction i must follow to understand/fix whats
> going on myself??""

You can sturt by enabling/disabling various scan methods

psqldb# set enable_seqscan to off;
SET


and see what happens, then adjust the weights in postgresql.conf or use
some combination of SETs around critical queries to force the plan you
like.


------------
Hannu

Re: An unresolved performance problem.

From

Achilleus Mantzios

Date:

07 May 2003, 10:05:02

On 7 May 2003, Hannu Krosing wrote:

> Achilleus Mantzios kirjutas K, 07.05.2003 kell 19:33:
> > Hi, few days ago, i posted some really wierd (at least to me)
> > situation (maybe a potentian bug) to the performance and bugs list
> > and to some core hacker(s) privately as well,
> > and i got no response.
> > Moreover i asked for some feedback
> > in order to understand/fix the problem myself,
> > and again received no response.
> >
> > What i asked was pretty simple:
> > "1. Is it possible that the absense of statistics make the planer produce
> > better plans than in the case of statistcs generated with vacuum
> > analyze/analyze?
>
> Yes, the planner is not perfect, the statistics are just statistics
> (based on a random sample), etc..
>
> This question comes up at least once a month on either [PERFORM] or
> [HACKERS], search the mailing lists to get more thorough
> discussion/explanation.

Ooopss i am i pgsql-performance@postgresl.org newbie
(up to now i thought -sql was where all the fun takes place :)

>
> > 2. If No, i found a bug,
>
> Rather a feature ;-p
>
> > 3. If yes then under what conditions??
>
> if
>
> 1) ANALYZE produced skewed data which was worse than default.
>
> or.
>
> 2) some costs are way off for your system (try changing them in
> postgresql.conf)
>

My systems are (rather usual) linux/freebsd and the costs defined (by
default) in postgresql.conf worked well for all queries except
a cursed query on a cursed table.
So i start to believe its an estimation selectivity
problem.

> > 4. If no person knows the answer or no hacker wants to dig into the
> > problem then is there a direction i must follow to understand/fix whats
> > going on myself??""
>
> You can sturt by enabling/disabling various scan methods
>
> psqldb# set enable_seqscan to off;
> SET
>

I have about 10 indexes on this table, and the "correct" one
is used only if i do set enable_seqscan to off; and
drop all other indexes.
Otherwise i get either a seq scan or the wrong index.

>
> and see what happens, then adjust the weights in postgresql.conf or use
> some combination of SETs around critical queries to force the plan you
> like.
>

Also i played with ALTER TABLE set statistics
but could not generate this ideal situation when
no stats where available (right after a load).

The problem is that other queries on this table
need some indexes.
I dunno whata do :(

>
> ------------
> Hannu
>

--
==================================================================
Achilleus Mantzios
S/W Engineer
IT dept
Dynacom Tankers Mngmt
Nikis 4, Glyfada
Athens 16610
Greece
tel:    +30-210-8981112
fax:    +30-210-8981877
email:  achill@matrix.gatewaynet.com
        mantzios@softlab.ece.ntua.gr

Re: An unresolved performance problem.

From

Josh Berkus

Date:

07 May 2003, 11:27:35

Achilleus,

> My systems are (rather usual) linux/freebsd and the costs defined (by
> default) in postgresql.conf worked well for all queries except
> a cursed query on a cursed table.
> So i start to believe its an estimation selectivity
> problem.

We can probably fix the problem by re-writing the query then;  see my previous
example this weekend about overdetermining criteria in order to force the use
of an index.

How about posting the query and the EXPLAIN ANALYZE results?

--
Josh Berkus
Aglio Database Solutions
San Francisco

Re: [SQL] An unresolved performance problem.

From

"Mendola Gaetano"

Date:

07 May 2003, 11:37:11

----- Original Message -----
From: "Achilleus Mantzios" <achill@matrix.gatewaynet.com>
To: <pgsql-sql@postgresql.org>; <pgsql-performance@postgresql.org>;
<pgsql-bugs@postgresql.org>
Sent: Wednesday, May 07, 2003 6:33 PM
Subject: [SQL] An unresolved performance problem.


>
> Hi, few days ago, i posted some really wierd (at least to me)
> situation (maybe a potentian bug) to the performance and bugs list
> and to some core hacker(s) privately as well,
> and i got no response.

I seen around a lot of questions are remaining without any reply,
may be in this period the guys like Tom Lane are too busy.

> Moreover i asked for some feedback
> in order to understand/fix the problem myself,
> and again received no response.
>
> What i asked was pretty simple:
> "1. Is it possible that the absense of statistics make the planer produce
> better plans
> than in the case of statistcs generated with vacuum
> analyze/analyze?
> 2. If No, i found a bug,
> 3. If yes then under what conditions??
> 4. If no person knows the answer or no hacker wants to dig into the
> problem then is there a direction i must follow to understand/fix whats
> going on myself??""

Can you give us more informations? Like the table structure, wich kind
of query are you tring to do and so on...


Gaetano

Re: An unresolved performance problem.

From

"scott.marlowe"

Date:

07 May 2003, 11:54:09

On Wed, 7 May 2003, Achilleus Mantzios wrote:

>
> Hi, few days ago, i posted some really wierd (at least to me)
> situation (maybe a potentian bug) to the performance and bugs list
> and to some core hacker(s) privately as well,
> and i got no response.
> Moreover i asked for some feedback
> in order to understand/fix the problem myself,
> and again received no response.
>
> What i asked was pretty simple:
> "1. Is it possible that the absense of statistics make the planer produce
> better plans
> than in the case of statistcs generated with vacuum
> analyze/analyze?

One of the common examples of this happening was posted a few weeks back.
someone was basically doing this:

delete from table;
analyze table;
insert into table (1,000,000 times);

the problem was that the table had fk constraints to another table, and
the query planner for the inserts (all 1,000,000 of them) assumed it was
inserting into a mostly empty table, and therefore used seq scans instead
of index scans.

It's not a bug, not quite a feature, just a corner case.

Re: Unanswered Questions WAS: An unresolved performance problem.

From

Josh Berkus

Date:

07 May 2003, 12:57:51

Gaetano,

> I seen around a lot of questions are remaining without any reply,
> may be in this period the guys like Tom Lane are too busy.

Yes, they are.   Currently the major contributors are working hard to shape up
both 7.3.3. and 7.4 (and having a long-running discussion about the due date
for 7.4), so they don't have much time for questions.

And for my part, I'm too busy with my paying job to answer all the questions
that get posted, as I suspect are Stephan and Bruno and several other people
who field newbie questions.   Given the flood of requests, I have to
prioritize ... and a question which is missing several crucial details (like
a copy of the query!!!)   is going to get answered way later than a question
which provides all the needed information -- if at all.

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco

Re: An unresolved performance problem.

From

Manfred Koizar

Date:

07 May 2003, 14:43:09

On Wed, 7 May 2003 17:09:17 -0200 (GMT+2), Achilleus Mantzios
<achill@matrix.gatewaynet.com> wrote:
>I have about 10 indexes on this table, and the "correct" one
>is used only if i do set enable_seqscan to off; and
>drop all other indexes.

What we already have is

|dynacom=# EXPLAIN ANALYZE
|SELECT count(*)
|  FROM status
| WHERE assettable='vessels' AND appname='ISM PMS' AND apptblname='items' AND status='warn' AND isvalid AND
assetidval=57;
|
|QUERY PLAN (fbsd)

|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Aggregate  (cost=6.02..6.02 rows=1 width=0) (actual time=14.16..14.16 rows=1 loops=1)
|   ->  Index Scan using status_all on status  (cost=0.00..6.02 rows=1 width=0) (actual time=13.09..13.95 rows=75
loops=1)
|         Index Cond: ((assettable = 'vessels'::character varying) AND (assetidval = 57) AND (appname = 'ISM
PMS'::charactervarying) AND (apptblname = 'items'::character varying) AND (status = 'warn'::character varying)) 
|         Filter: isvalid
| Total runtime: 14.40 msec
|(5 rows)
|
|QUERY PLAN (lnx)

|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Aggregate  (cost=1346.56..1346.56 rows=1 width=0) (actual time=244.05..244.05 rows=1 loops=1)
|   ->  Seq Scan on status  (cost=0.00..1345.81 rows=300 width=0) (actual time=0.63..243.93 rows=75 loops=1)
|         Filter: ((assettable = 'vessels'::character varying) AND (appname = 'ISM PMS'::character varying) AND
(apptblname= 'items'::character varying) AND (status = 'warn'::character varying) AND isvalid AND (assetidval = 57)) 
| Total runtime: 244.12 msec
|(4 rows)

Now set enable_seqscan to off, and show as the EXPLAIN ANALYSE output.
If the wrong index is used, remove it and rerun the query.  Repeat
until you arrive at the correct index and show us these results, too.

>Otherwise i get either a seq scan or the wrong index.
|   ->  Seq Scan on status  (cost=0.00..1345.81 rows=300 width=0) (actual time=0.63..243.93 rows=75 loops=1)
                                        ^^^^
This seems strange, given that relpages = 562.
What are your config settings?  And what hardware is this running on,
especially how much RAM?

Servus
 Manfred

Re: [SQL] Unanswered Questions WAS: An unresolved performance problem.

From

Randall Lucas

Date:

07 May 2003, 19:52:48

Folks,

I suspect that a good number of fairly simple questions aren't being
answered because they're either misdirected or because the poster
hasn't included an "answerable" question (one with sufficient
information to answer).

A suggestion to partially counter this, at least for "slow query" type
questions, has been put forth.  If we make it a social norm on the
pg-lists in general to reply off-list to inadequately descriptive "slow
query" questions with a canned message of helpful guidance, we may be
able to up the level of "answerability" of most questions.  Ideally,
this would make the questions more transparent, so that more responses
can come from folks other than the major contributors.

Thoughts?  Josh and I have placed a draft at
http://techdocs.postgresql.org/guides/SlowQueryPostingGuidelines

I'd specifically like to hear whether people would suggest more of an
emphasis on heuristics for self-help in such a message, what other info
should be included in a "good" slow query question, and people's
thoughts on the netiquette of the whole idea.

Best,

Randall

On Wednesday, May 7, 2003, at 12:57 PM, Josh Berkus wrote:

> Gaetano,
>
>> I seen around a lot of questions are remaining without any reply,
>> may be in this period the guys like Tom Lane are too busy.
>
> Yes, they are.   Currently the major contributors are working hard to
> shape up
> both 7.3.3. and 7.4 (and having a long-running discussion about the
> due date
> for 7.4), so they don't have much time for questions.
>
> And for my part, I'm too busy with my paying job to answer all the
> questions
> that get posted, as I suspect are Stephan and Bruno and several other
> people
> who field newbie questions.   Given the flood of requests, I have to
> prioritize ... and a question which is missing several crucial details
> (like
> a copy of the query!!!)   is going to get answered way later than a
> question
> which provides all the needed information -- if at all.
>
> --
> -Josh Berkus
>  Aglio Database Solutions
>  San Francisco
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to
> majordomo@postgresql.org
>

Re: [SQL] Unanswered Questions WAS: An unresolved performance problem.

From

Tom Lane

Date:

08 May 2003, 00:13:52

Randall Lucas <rlucas@tercent.net> writes:
> I suspect that a good number of fairly simple questions aren't being
> answered because they're either misdirected or because the poster
> hasn't included an "answerable" question (one with sufficient
> information to answer).

That's always been a problem, but it does seem to have been getting
worse lately.

> A suggestion to partially counter this, at least for "slow query" type
> questions, has been put forth.  If we make it a social norm on the
> pg-lists in general to reply off-list to inadequately descriptive "slow
> query" questions with a canned message of helpful guidance, we may be
> able to up the level of "answerability" of most questions.

The idea of some canned guidance doesn't seem bad, but I'm not sure if
it should be off-list or not.  If newbies are corrected off-list then
other newbies who might be lurking, or reading the archives, don't learn
any better and will make the same mistakes in their turn.

How about a standard answer of "you haven't really provided enough info
for us to be helpful, please see this-URL for some hints"?  That would
avoid bulking up the list archives with many copies, yet at the same
time the archives would provide evidence of the existence of hints...

> Thoughts?  Josh and I have placed a draft at
> http://techdocs.postgresql.org/guides/SlowQueryPostingGuidelines

Looks good, though I concur with Stephan's comment that the table
schemas aren't optional.

It might be worth including a checklist of the standard kinds of errors
(for example, datatype mismatch preventing index usage).  Come to think
of it, that starts to make it look like a FAQ list directed towards
performance issues.  Maybe we could make this a subsection of the main
FAQ?

            regards, tom lane

Re: [SQL] Unanswered Questions WAS: An unresolved performance problem.

From

Sean Chittenden

Date:

08 May 2003, 00:57:50

> > I suspect that a good number of fairly simple questions aren't
> > being answered because they're either misdirected or because the
> > poster hasn't included an "answerable" question (one with
> > sufficient information to answer).
>
> That's always been a problem, but it does seem to have been getting
> worse lately.

I hate to point this out, but "TIP 4" is getting a bit old and the 6
tips that we throw out to probably about 40K people about 1-200 times
a day have probably reached saturation.  Without looking at the
archives, I bet anyone a shot of good scotch that, it's probably
pretty infrequent that people don't kill -9 their postmasters.

Any chance we could flush out the TIPs at the bottom to include,
"VACUUM ANALYZE your database regularly," or "When reporting a
problem, include the output from EXPLAIN [query]," or "ANALYZE tables
before examining the output from an EXPLAIN [query]," or "Visit [url]
for a tutorial on (schemas|triggers|views)."

-sc

--
Sean Chittenden

Re: An unresolved performance problem.

From

Achilleus Mantzios

Date:

08 May 2003, 03:44:12

About the unanswered questions problem:

There seems to be a trade off between
describing a problem as minimalistically
as possible so that it gets the chance
of being read (on one hand) and giving
the full details, explain analyze,
pg_class,pg_statistic data (on the other hand),
in order to be more informational.
At the extreme cases: provide a "query slow" post
on one hand and provide the whole pg_dump
on the other.
The problem is that in the first case
"he hasnt given any real info"
and in the second case every one is avoiding
reading 10 pages of data.
I think i must have missed the "golden intersection".

Well now to the point.

The problem was dealt using a hint
from Mr Kenneth Marshall.
Setting random_page_cost = 1.9
resulted in a smaller cost calculation
for the index than the seq scan.

Now the question is:
With random_page_cost = 4 (default)
i get
dynacom=# EXPLAIN ANALYZE select count(*) from status where
assettable='vessels' and appname='ISM PMS' and apptblname='items' and
status='warn' and isvalid and assetidval=57;

QUERY PLAN


-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1669.01..1669.01 rows=1 width=0) (actual
time=258.45..258.46 rows=1 loops=1)
   ->  Seq Scan on status  (cost=0.00..1668.62 rows=158 width=0) (actual
time=171.26..258.38 rows=42 loops=1)
         Filter: ((assettable = 'vessels'::character varying) AND (appname
= 'ISM PMS'::character varying) AND (apptblname = 'items'::character
varying) AND (status = 'warn'::character varying) AND isvalid AND
(assetidval = 57))
 Total runtime: 258.52 msec
(4 rows)

dynacom=#

And with random_page_cost = 1.9, i get
dynacom=# EXPLAIN ANALYZE select count(*) from status where
assettable='vessels' and appname='ISM PMS' and apptblname='items' and
status='warn' and isvalid and assetidval=57;

QUERY PLAN


-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1650.39..1650.39 rows=1 width=0) (actual
time=18.86..18.86 rows=1 loops=1)
   ->  Index Scan using status_all on status  (cost=0.00..1650.04 rows=139
width=0) (actual time=18.26..18.77 rows=42 loops=1)
         Index Cond: ((assettable = 'vessels'::character varying) AND
(assetidval = 57) AND (appname = 'ISM PMS'::character varying) AND
(apptblname = 'items'::character varying) AND (status = 'warn'::character
varying))
         Filter: isvalid
 Total runtime: 18.94 msec
(5 rows)

dynacom=#

That is, we have a marginal decrease of the total cost
for the index scan when random_page_cost = 1.9,
whereas the "real cost" in the means of total runtime
ranges from 218 msecs (seq scan) to 19 msecs (index scan).
(is it sane?)
-----
(returning to the general -performance posting problem)
Altho a FAQ with "please do VACUUM ANALYZE before
posting to the lists" is something usefull in general,
it does not provide enuf info for the users,
at least for "corner cases" (as a fellow pgsql'er
wrote)

I think in order to stop this undesirable phaenomenon
of flooding the lists, the best way is to provide
the actual algorithms that govern the planer/optimiser,
in a form of lets say "advanced documentation".

(If there is such thing, i am sorry but i wasnt
told so by anyone.)

Otherwise there are gonna be unhappy core hackers
(having to examine each case individually)
and of course bad performing systems on the users side.

P.S.

Of course there are newbies in postgresql,
ofcourse there are people who think that
"support" is to be taken for granted,
ofcourse there are people with minimal
programming/hacking skills,
but i think the average "power user"
altho he didnt get the chance to
follow the "hard core" hacking path
in his life, he has a CompScience BSc or MSc,
and can deal with both complicated algoritmic
issues and source code reading,
and morever on the average he likes to
give and receive respect.

(not to mention that he is the person who can
"spread the word" based on strong arguments
and solid ground)

Thats my 20 drachmas.

==================================================================
Achilleus Mantzios
S/W Engineer
IT dept
Dynacom Tankers Mngmt
Nikis 4, Glyfada
Athens 16610
Greece
tel:    +30-210-8981112
fax:    +30-210-8981877
email:  achill@matrix.gatewaynet.com
        mantzios@softlab.ece.ntua.gr

Re: An unresolved performance problem.

From

Andrew Sullivan

Date:

08 May 2003, 07:20:18

On Thu, May 08, 2003 at 10:48:52AM -0200, Achilleus Mantzios wrote:

> That is, we have a marginal decrease of the total cost
> for the index scan when random_page_cost = 1.9,
> whereas the "real cost" in the means of total runtime
> ranges from 218 msecs (seq scan) to 19 msecs (index scan).
> (is it sane?)

You're right that the problem is the poor estimate of the cost of
that selection.  I recall you mentioning that you'd expanded the
statistics on the field, but I don't recall to what.  I know that
under some circumstances, you _really_ have to increase the stats to
get a meaningful sample.

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110

Re: An unresolved performance problem.

From

Tom Lane

Date:

08 May 2003, 10:42:55

Andrew Sullivan <andrew@libertyrms.info> writes:
> On Thu, May 08, 2003 at 10:48:52AM -0200, Achilleus Mantzios wrote:
>> That is, we have a marginal decrease of the total cost
>> for the index scan when random_page_cost = 1.9,
>> whereas the "real cost" in the means of total runtime
>> ranges from 218 msecs (seq scan) to 19 msecs (index scan).
>> (is it sane?)

> You're right that the problem is the poor estimate of the cost of
> that selection.

Are the table and index orders the same?  Oliver Elphick pointed out
awhile ago that we're doing a bad job of index order correlation
estimation for multi-column indexes --- the correlation is taken to
be much lower than it should be.  But if the correlation is near
zero anyway then this wouldn't explain Achilleus' problem...

            regards, tom lane

Re: [SQL] Unanswered Questions WAS: An unresolved performance problem.

From

johnnnnnn

Date:

08 May 2003, 10:45:21

On Wed, May 07, 2003 at 09:57:49PM -0700, Sean Chittenden wrote:
> I hate to point this out, but "TIP 4" is getting a bit old and the 6
> tips that we throw out to probably about 40K people about 1-200
> times a day have probably reached saturation.  Without looking at
> the archives, I bet anyone a shot of good scotch that, it's probably
> pretty infrequent that people don't kill -9 their postmasters.
>
> Any chance we could flush out the TIPs at the bottom to include,
> "VACUUM ANALYZE your database regularly," or "When reporting a
> problem, include the output from EXPLAIN [query]," or "ANALYZE
> tables before examining the output from an EXPLAIN [query]," or
> "Visit [url] for a tutorial on (schemas|triggers|views)."

Better yet, have TIPs that are appropriate to the subscribed
list. -performance has different posting guidelines, things to try,
etc. than does -bugs, than does -sql (than does -hackers, than does
-interfaces, ...).

I don't know how feasible it is to separate them out, but i think it's
worth looking into.

-johnnnnnnnnnnn

Re: [SQL] Unanswered Questions WAS: An unresolved performance

From

"scott.marlowe"

Date:

08 May 2003, 12:30:52

On Thu, 8 May 2003, johnnnnnn wrote:

> On Wed, May 07, 2003 at 09:57:49PM -0700, Sean Chittenden wrote:
> > I hate to point this out, but "TIP 4" is getting a bit old and the 6
> > tips that we throw out to probably about 40K people about 1-200
> > times a day have probably reached saturation.  Without looking at
> > the archives, I bet anyone a shot of good scotch that, it's probably
> > pretty infrequent that people don't kill -9 their postmasters.
> >
> > Any chance we could flush out the TIPs at the bottom to include,
> > "VACUUM ANALYZE your database regularly," or "When reporting a
> > problem, include the output from EXPLAIN [query]," or "ANALYZE
> > tables before examining the output from an EXPLAIN [query]," or
> > "Visit [url] for a tutorial on (schemas|triggers|views)."
>
> Better yet, have TIPs that are appropriate to the subscribed
> list. -performance has different posting guidelines, things to try,
> etc. than does -bugs, than does -sql (than does -hackers, than does
> -interfaces, ...).
>
> I don't know how feasible it is to separate them out, but i think it's
> worth looking into.

Agreed.

Also, some tips might well cross over, like say, vacuum and analyze
regularly.  Hmmm.  Sounds like a job for a relational database :-)

Re: An unresolved performance problem.

From

Achilleus Mantzios

Date:

09 May 2003, 03:55:34

On Thu, 8 May 2003, Tom Lane wrote:

> Andrew Sullivan <andrew@libertyrms.info> writes:
> > On Thu, May 08, 2003 at 10:48:52AM -0200, Achilleus Mantzios wrote:
> >> That is, we have a marginal decrease of the total cost
> >> for the index scan when random_page_cost = 1.9,
> >> whereas the "real cost" in the means of total runtime
> >> ranges from 218 msecs (seq scan) to 19 msecs (index scan).
> >> (is it sane?)
>
> > You're right that the problem is the poor estimate of the cost of
> > that selection.
>
> Are the table and index orders the same?  Oliver Elphick pointed out
> awhile ago that we're doing a bad job of index order correlation
> estimation for multi-column indexes --- the correlation is taken to
> be much lower than it should be.  But if the correlation is near
> zero anyway then this wouldn't explain Achilleus' problem...

Please correct me if i am wrong. (i think i probably am)
The correlation value in pg_statistc for a column refers to the
correlation between
the ordering of a table's tuples and the ordering of that column.
(So it plays some role in determining the execution plan
if an index exists on that column. Also CLUSTERing a
single-column index on the table makes reordering
of the table according to that index, that is the ordering of that
column).

Is that correct??

If so, how can one find the correlation between the ordering
of a table and a multicolumn index?

>
>             regards, tom lane
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org
>

--
==================================================================
Achilleus Mantzios
S/W Engineer
IT dept
Dynacom Tankers Mngmt
Nikis 4, Glyfada
Athens 16610
Greece
tel:    +30-210-8981112
fax:    +30-210-8981877
email:  achill@matrix.gatewaynet.com
        mantzios@softlab.ece.ntua.gr

Re: An unresolved performance problem.

From

Tom Lane

Date:

09 May 2003, 08:31:53

Achilleus Mantzios <achill@matrix.gatewaynet.com> writes:
> If so, how can one find the correlation between the ordering
> of a table and a multicolumn index?

Well, it is surely no better than the correlation of the index's
first column --- what is that?

            regards, tom lane

Re: An unresolved performance problem.

From

Achilleus Mantzios

Date:

09 May 2003, 09:06:40

On Fri, 9 May 2003, Tom Lane wrote:

> Achilleus Mantzios <achill@matrix.gatewaynet.com> writes:
> > If so, how can one find the correlation between the ordering
> > of a table and a multicolumn index?
>
> Well, it is surely no better than the correlation of the index's
> first column --- what is that?

it is 1

>
>             regards, tom lane
>

--
==================================================================
Achilleus Mantzios
S/W Engineer
IT dept
Dynacom Tankers Mngmt
Nikis 4, Glyfada
Athens 16610
Greece
tel:    +30-210-8981112
fax:    +30-210-8981877
email:  achill@matrix.gatewaynet.com
        mantzios@softlab.ece.ntua.gr

Re: An unresolved performance problem.

From

Tom Lane

Date:

09 May 2003, 09:08:01

Achilleus Mantzios <achill@matrix.gatewaynet.com> writes:
> On Fri, 9 May 2003, Tom Lane wrote:
>> Achilleus Mantzios <achill@matrix.gatewaynet.com> writes:
>>> If so, how can one find the correlation between the ordering
>>> of a table and a multicolumn index?
>>
>> Well, it is surely no better than the correlation of the index's
>> first column --- what is that?

> it is 1

Well, that's suggestive, isn't it?  What about the remaining columns?

            regards, tom lane

Re: An unresolved performance problem.

From

Achilleus Mantzios

Date:

09 May 2003, 09:17:21

On Fri, 9 May 2003, Tom Lane wrote:

> >> Well, it is surely no better than the correlation of the index's
> >> first column --- what is that?
>
> > it is 1
>
> Well, that's suggestive, isn't it?  What about the remaining columns?

The index is defined as:

status_all btree (assettable, assetidval, appname, apptblname, status,
isvalid)

And correlations are:

  attname   | correlation
-------------+-------------
 assettable  |           1
 assetidval  |    0.125902
 appname     |    0.942771
 apptblname  |    0.928761
 status      |    0.443405
 isvalid     |    0.970531

>
>             regards, tom lane
>

--
==================================================================
Achilleus Mantzios
S/W Engineer
IT dept
Dynacom Tankers Mngmt
Nikis 4, Glyfada
Athens 16610
Greece
tel:    +30-210-8981112
fax:    +30-210-8981877
email:  achill@matrix.gatewaynet.com
        mantzios@softlab.ece.ntua.gr

Re: An unresolved performance problem.

From

Tom Lane

Date:

09 May 2003, 09:30:11

Achilleus Mantzios <achill@matrix.gatewaynet.com> writes:
> On Fri, 9 May 2003, Tom Lane wrote:
>> Well, that's suggestive, isn't it?  What about the remaining columns?

> The index is defined as:

> status_all btree (assettable, assetidval, appname, apptblname, status,
> isvalid)

> And correlations are:

>   attname   | correlation
> -------------+-------------
>  assettable  |           1
>  assetidval  |    0.125902
>  appname     |    0.942771
>  apptblname  |    0.928761
>  status      |    0.443405
>  isvalid     |    0.970531

Actually, thinking twice about it, I'm not sure if the correlations of
the righthand columns mean anything.  If the table were perfectly
ordered by the index, you'd expect righthand values to cycle through
their range for each lefthand value, and so they'd show low
correlations.

The fact that most of the columns show high correlation makes me think
that they are not independent --- is that right?

But anyway, I'd say that yes this table is probably quite well ordered
by the index.  You could just visually compare the results of

select * from tab

select * from tab
  order by assettable, assetidval, appname, apptblname, status, isvalid

to confirm this.

And that tells us where the problem is: the code is estimating a low
index correlation where it should be estimating a high one.  If you
don't mind running a nonstandard version of Postgres, you could try
making btcostestimate() in src/backend/utils/adt/selfuncs.c estimate
the indexCorrelation as just varCorrelation, instead of
varCorrelation / nKeys.  This is doubtless an overcorrection in the
other direction (which is why it hasn't been done in the official
sources) but it's probably better than what's there, at least for
your purposes.

            regards, tom lane

Re: An unresolved performance problem.

From

Achilleus Mantzios

Date:

09 May 2003, 10:08:49

On Fri, 9 May 2003, Tom Lane wrote:

> Achilleus Mantzios <achill@matrix.gatewaynet.com> writes:
> > On Fri, 9 May 2003, Tom Lane wrote:
> >> Well, that's suggestive, isn't it?  What about the remaining columns?
>
> > The index is defined as:
>
> > status_all btree (assettable, assetidval, appname, apptblname, status,
> > isvalid)
>
> > And correlations are:
>
> >   attname   | correlation
> > -------------+-------------
> >  assettable  |           1
> >  assetidval  |    0.125902
> >  appname     |    0.942771
> >  apptblname  |    0.928761
> >  status      |    0.443405
> >  isvalid     |    0.970531
>
> Actually, thinking twice about it, I'm not sure if the correlations of
> the righthand columns mean anything.  If the table were perfectly
> ordered by the index, you'd expect righthand values to cycle through
> their range for each lefthand value, and so they'd show low
> correlations.

When i clustered (on onother system no to spoil the situation)
CLUSTER status_all on status;
i got identical results on the order (see below),
also i got quite high correlations.

>
> The fact that most of the columns show high correlation makes me think
> that they are not independent --- is that right?

Well, assettable,appname,apptblname
have high frequencies on one value, so they can be
regarded as constants.
assetidval, status and isvalid play the most part of the
selectivity.
(i have included the first 3 columns in the status_all index for future
usage)

>
> But anyway, I'd say that yes this table is probably quite well ordered
> by the index.  You could just visually compare the results of
>
> select * from tab
>
> select * from tab
>   order by assettable, assetidval, appname, apptblname, status, isvalid
>
> to confirm this.
>

If the table was ordered by status_all index i would show something like
   attname   | correlation
-------------+-------------
 assettable  |           1
 assetidval  |           1
 appname     |    0.927842
 apptblname  |    0.895155
 status      |    0.539183
 isvalid     |    0.722838

In the current (production system) situation, visually, i dont see any
correlation between the two.

> And that tells us where the problem is: the code is estimating a low
> index correlation where it should be estimating a high one.  If you
> don't mind running a nonstandard version of Postgres, you could try
> making btcostestimate() in src/backend/utils/adt/selfuncs.c estimate
> the indexCorrelation as just varCorrelation, instead of
> varCorrelation / nKeys.  This is doubtless an overcorrection in the
> other direction (which is why it hasn't been done in the official
> sources) but it's probably better than what's there, at least for
> your purposes.
>

On the test system,
if i cluster the table according to assetidval the optimiser
uses the index on that column which does a pretty good job.
Even better, if i revert the table to an ordering according
to its id (to spoil the previous effect of the CLUSTER command)
and i set random_page_cost = 2 i get the usage of the better
status_all index.

This way the correlations seem low, but the expected selectivity
is either way 83 rows.

Are you suggesting to try the change in src/backend/utils/adt/selfuncs.c
at this exact situation i am on my test system?? (its linux too)

Thanx a lot!

>             regards, tom lane
>

--
==================================================================
Achilleus Mantzios
S/W Engineer
IT dept
Dynacom Tankers Mngmt
Nikis 4, Glyfada
Athens 16610
Greece
tel:    +30-210-8981112
fax:    +30-210-8981877
email:  achill@matrix.gatewaynet.com
        mantzios@softlab.ece.ntua.gr

Re: An unresolved performance problem.

From

Achilleus Mantzios

Date:

09 May 2003, 10:19:58

I changed
*indexCorrelation = varCorrelation / nKeys;
to
*indexCorrelation = varCorrelation ;

and i got cost=28.88
and it beats every other index.


--
==================================================================
Achilleus Mantzios
S/W Engineer
IT dept
Dynacom Tankers Mngmt
Nikis 4, Glyfada
Athens 16610
Greece
tel:    +30-210-8981112
fax:    +30-210-8981877
email:  achill@matrix.gatewaynet.com
        mantzios@softlab.ece.ntua.gr