Home > mailing lists

Re: Huge Performance Difference on Similar Query in Pg7.2 - Mailing list pgsql-general

From	Nigel J. Andrews
Subject	Re: Huge Performance Difference on Similar Query in Pg7.2
Date	March 22, 2002 12:42:32
Msg-id	Pine.LNX.4.21.0203221731040.6141-100000@ponder.fairway2k.co.uk Whole thread
In response to	Re: Huge Performance Difference on Similar Query in Pg7.2 (Stephan Szabo <sszabo@megazone23.bigpanda.com>)
Responses	Re: Huge Performance Difference on Similar Query in Pg7.2 Re: Huge Performance Difference on Similar Query in Pg7.2
List	pgsql-general

Tree view

On Fri, 22 Mar 2002, Stephan Szabo wrote:

> > I wrote:
> >
> > I've read lots of messages on this subject, and the FAQ, recently but I'm still
> > confused. None of what I have read seems to account for the execution plans I'm
> > seeing, except for this mention of scanning a million values in an index and
> > discarding all but a small number. However, even this I can not see applies to
> > a primary key on a million row table. I upgraded from 7.0.x to 7.2 because I
> > was thinking it was an oddity that was probably fixed.
> >
> > First, my apologies for the length of this posting.
> >
> > Next some background:
> >
> > I have a database where one of it's tables records 'sessions', called
> > chat_sessions. It has an integer field, session_id, declared as primary key.
> > This table is the small table in the example with only about 2000 rows.
> >
> > There is another table called chat_post, the large table holding about 1
> > million rows. It has two integer fields, session_id and post_number, which
>
> Are they actually integers (int4), or are either of them a different type
> like int2 or int8?  There are special case workarounds for those two due
> to a problem with the types of integer literals.  This looks likely since
> even with seq_scan set off it wanted to do a sequence scan which generally
> means it doesn't believe it can use the index.
>
> > explain analyze select count(*) from chat_post cp where cp.session_id
> > = 123;
>
> Does cp.session_id='123' give something different?
>
> > Show looking up in large table, selecting on primary key, uses
> > sequential scan on large
> >
> > explain analyze select count(*) from chat_post cp where cp.session_id
> > = 123 and cp.post_number = 10;
>
> Same here for '123' and '10'.

Spectacular! Bingo! etc.

Using the numbers quoted yields use of the primary key. I am indeed using
something other than int4, int2 in fact. So this is something to do with the
using integer literals which are presumably first interpreted as int4 and then
are converted in some long winded fashion, or something, to int2 for each and
every test or row, whereas specifying them as text causes the backend to
convert to the correct int2 only at the start?

I choose the smaller int because these are unlikely to be restrictive for this
DB and I thought I may as well try and not waste space. Whether it used the
same storage as int4 didn't really matter as I'd given it the opportunity to
use less if it could. Is it worth me moving these to int4 type?

Thanks for that, I can now tell my friend who had a good laugh with me last
weekend about this.

BTW, is this sort of feature documented anywhere or does it come into the 'what
makes a person an expert' catagory?

Nigel Andrews
Logictree Systems Limited

pgsql-general by date:

From: Herb Blacker
Date: 22 March 2002, 12:30:10
Subject: Referential Integrity problem

From: mark
Date: 22 March 2002, 12:54:32
Subject: Re: ODBC problem

Re: Huge Performance Difference on Similar Query in Pg7.2 - Mailing list pgsql-general

Previous

Next