Thread: PostgreSQL performance in simple queries

PostgreSQL performance in simple queries

From
Eugeny Balakhonov
Date:
Hello for all!

I have PostgreSQL 7.4 under last version of Cygwin and have some
problems with performance :( It is very strange...  I don't remember
this problem on previous version Cygwin and PostgreSQL 7.3

I have only two simple tables:

CREATE TABLE public.files_t
(
  id int8 NOT NULL,
  parent int8,
  size int8 NOT NULL,
  dir bool NOT NULL DEFAULT false,
  ctime timestamp NOT NULL,
  ftime timestamp NOT NULL DEFAULT ('now'::text)::timestamp(6) with time zone,
  name text NOT NULL,
  access varchar(10) NOT NULL,
  host int4 NOT NULL,
  uname text NOT NULL,
  CONSTRAINT pk_files_k PRIMARY KEY (id),
  CONSTRAINT fk_files_k FOREIGN KEY (parent) REFERENCES public.files_t (id) ON UPDATE CASCADE ON DELETE CASCADE,
  CONSTRAINT fk_hosts_k FOREIGN KEY (host) REFERENCES public.hosts_t (id) ON UPDATE CASCADE ON DELETE CASCADE
) WITH OIDS;

and

CREATE TABLE public.hosts_t
(
  id int4 NOT NULL,
  ftime timestamp NOT NULL DEFAULT ('now'::text)::timestamp(6) with time zone,
  utime timestamp NOT NULL DEFAULT ('now'::text)::timestamp(6) with time zone,
  name text NOT NULL,
  address inet NOT NULL,
  CONSTRAINT pk_hosts_k PRIMARY KEY (id)
) WITH OIDS;

Table files_t has 249259 records and table hosts_t has only 59 records.

I tries to run simple query:

select * from files_t where parent = 3333

This query works 0.256 seconds! It is very big time for this small
table!
I have index for field "parent":

CREATE INDEX files_parent_idx
  ON public.files_t
  USING btree
  (parent);

But if I tries to see query plan then I see following text:

Seq Scan on files_t (cost=0.00..6103.89 rows=54 width=102)
 Filter: (parent = 3333)

PostgreSQL do not uses index files_parent_idx!

I have enabled all options of "QUERY TUNING" in postgresql.conf, I
have increased memory sizes for PostgreSQL:

shared_buffers = 2000           # min 16, at least max_connections*2, 8KB each
sort_mem = 32768                # min 64, size in KB
vacuum_mem = 65536              # min 1024, size in KB
fsync = false                   # turns forced synchronization on or off
checkpoint_segments = 3 # in logfile segments, min 1, 16MB each
enable_hashagg = true
enable_hashjoin = true
enable_indexscan = true
enable_mergejoin = true
enable_nestloop = true
enable_seqscan = true
enable_sort = true
enable_tidscan = true
geqo = true
geqo_threshold = 22
geqo_effort = 1
geqo_generations = 0
geqo_pool_size = 0              # default based on tables in statement,
                                # range 128-1024
geqo_selection_bias = 2.0       # range 1.5-2.0
stats_start_collector = true
stats_command_string = true
stats_block_level = true
stats_row_level = true
stats_reset_on_server_start = false


Please help me!
My database has a very small size (only 249259 records) but it works
very slowly :(

Best regards
Eugeny




Re: PostgreSQL performance in simple queries

From
"Gary Doades"
Date:
Try using

select * from files_t where parent = 3333::int8

You have declared parent as int8, but the query will assume int4 for "3333" and may not use the index.

Also make sure you have ANALYZEd this table.

Regards,
Gary.

On 20 May 2004 at 0:07, Eugeny Balakhonov wrote:

> Hello for all!
>
> I have PostgreSQL 7.4 under last version of Cygwin and have some
> problems with performance :( It is very strange...  I don't remember
> this problem on previous version Cygwin and PostgreSQL 7.3
>
> I have only two simple tables:
>
> CREATE TABLE public.files_t
> (
>   id int8 NOT NULL,
>   parent int8,
>   size int8 NOT NULL,
>   dir bool NOT NULL DEFAULT false,
>   ctime timestamp NOT NULL,
>   ftime timestamp NOT NULL DEFAULT ('now'::text)::timestamp(6) with time zone,
>   name text NOT NULL,
>   access varchar(10) NOT NULL,
>   host int4 NOT NULL,
>   uname text NOT NULL,
>   CONSTRAINT pk_files_k PRIMARY KEY (id),
>   CONSTRAINT fk_files_k FOREIGN KEY (parent) REFERENCES public.files_t (id) ON UPDATE CASCADE ON DELETE CASCADE,
>   CONSTRAINT fk_hosts_k FOREIGN KEY (host) REFERENCES public.hosts_t (id) ON UPDATE CASCADE ON DELETE CASCADE
> ) WITH OIDS;
>
> and
>
> CREATE TABLE public.hosts_t
> (
>   id int4 NOT NULL,
>   ftime timestamp NOT NULL DEFAULT ('now'::text)::timestamp(6) with time zone,
>   utime timestamp NOT NULL DEFAULT ('now'::text)::timestamp(6) with time zone,
>   name text NOT NULL,
>   address inet NOT NULL,
>   CONSTRAINT pk_hosts_k PRIMARY KEY (id)
> ) WITH OIDS;
>
> Table files_t has 249259 records and table hosts_t has only 59 records.
>
> I tries to run simple query:
>
> select * from files_t where parent = 3333
>
> This query works 0.256 seconds! It is very big time for this small
> table!
> I have index for field "parent":
>
> CREATE INDEX files_parent_idx
>   ON public.files_t
>   USING btree
>   (parent);
>
> But if I tries to see query plan then I see following text:
>
> Seq Scan on files_t (cost=0.00..6103.89 rows=54 width=102)
>  Filter: (parent = 3333)
>
> PostgreSQL do not uses index files_parent_idx!
>
> I have enabled all options of "QUERY TUNING" in postgresql.conf, I
> have increased memory sizes for PostgreSQL:
>
> shared_buffers = 2000           # min 16, at least max_connections*2, 8KB each
> sort_mem = 32768                # min 64, size in KB
> vacuum_mem = 65536              # min 1024, size in KB
> fsync = false                   # turns forced synchronization on or off
> checkpoint_segments = 3 # in logfile segments, min 1, 16MB each
> enable_hashagg = true
> enable_hashjoin = true
> enable_indexscan = true
> enable_mergejoin = true
> enable_nestloop = true
> enable_seqscan = true
> enable_sort = true
> enable_tidscan = true
> geqo = true
> geqo_threshold = 22
> geqo_effort = 1
> geqo_generations = 0
> geqo_pool_size = 0              # default based on tables in statement,
>                                 # range 128-1024
> geqo_selection_bias = 2.0       # range 1.5-2.0
> stats_start_collector = true
> stats_command_string = true
> stats_block_level = true
> stats_row_level = true
> stats_reset_on_server_start = false
>
>
> Please help me!
> My database has a very small size (only 249259 records) but it works
> very slowly :(
>
> Best regards
> Eugeny
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
>       subscribe-nomail command to majordomo@postgresql.org so that your
>       message can get through to the mailing list cleanly

Re: PostgreSQL performance in simple queries

From
Neil Conway
Date:
Eugeny Balakhonov wrote:
> I tries to run simple query:
>
> select * from files_t where parent = 3333

Use this instead:

select * from files_t where parent = '3333';

("parent = 3333::int8" would work as well.)

PostgreSQL (< 7.5) won't consider using an indexscan when the predicate
involves an integer literal and the column datatype is int2 or int8.

-Neil

Re: PostgreSQL performance in simple queries

From
Joseph Shraibman
Date:
Neil Conway wrote:

> PostgreSQL (< 7.5) won't consider using an indexscan when the predicate
> involves an integer literal and the column datatype is int2 or int8.

Is this fixed for 7.5?  It isn't checked off on the TODO list at
http://developer.postgresql.org/todo.php

Re: PostgreSQL performance in simple queries

From
Tom Lane
Date:
Joseph Shraibman <jks@selectacast.net> writes:
> Neil Conway wrote:
>> PostgreSQL (< 7.5) won't consider using an indexscan when the predicate
>> involves an integer literal and the column datatype is int2 or int8.

> Is this fixed for 7.5?  It isn't checked off on the TODO list at
> http://developer.postgresql.org/todo.php

It is.  I don't know why Bruce hasn't checked it off.


Some other stuff that needs work in TODO:

: Bracketed items "[]" have more detailed.

More detailed what?  Grammar please.

: * Remove unreferenced table files and temp tables during database vacuum
:   or postmaster startup (Bruce)

I'm not sure this is still needed given that we now log file deletion in
WAL.

: * Allow pg_dump to dump sequences using NO_MAXVALUE and NO_MINVALUE

Seems to be done.

: * Prevent whole-row references from leaking memory, e.g. SELECT COUNT(tab.*)

Done.

: * Make LENGTH() of CHAR() not count trailing spaces

Done.

: * Allow SELECT * FROM tab WHERE int2col = 4 to use int2col index, int8,
:   float4, numeric/decimal too

Done, per above.

: * Allow more ISOLATION LEVELS to be accepted, but issue a warning for them

Presently we accept all four with no warning ...

: * Add GUC setting to make created tables default to WITHOUT OIDS

Seems to be done, other than the argument about how pg_dump should work.

: * Allow fastpast to pass values in portable format

This was done in 7.4.

: * Move psql backslash database information into the backend, use nmumonic
:   commands? [psql]

Spelling problem...

: * JDBC

With JDBC out of the core, I'm not sure why we still have a JDBC section
in the core TODO.

: * Have pg_dump -c clear the database using dependency information

I think this works now.  Not really tested, but in principle it should
work.

: * Cache last known per-tuple offsets to speed long tuple access

This sounds exactly like attcacheoff, which has been there since
Berkeley.  Either remove this or fix the description to give some
idea what's really meant.

: * Automatically place fixed-width, NOT NULL columns first in a table

This is not ever going to happen, given that we've rejected the idea of
having separate logical and physical column positions.

: * Change representation of whole-tuple parameters to functions

Done.  (However, you might want to add something about supporting
composite types as table columns, which isn't done.)

: * Allow the regression tests to start postmaster with -i so the tests
:   can be run on systems that don't support unix-domain sockets

Done long ago.

            regards, tom lane

Re: PostgreSQL performance in simple queries

From
Bruce Momjian
Date:
Tom Lane wrote:
> Joseph Shraibman <jks@selectacast.net> writes:
> > Neil Conway wrote:
> >> PostgreSQL (< 7.5) won't consider using an indexscan when the predicate
> >> involves an integer literal and the column datatype is int2 or int8.
>
> > Is this fixed for 7.5?  It isn't checked off on the TODO list at
> > http://developer.postgresql.org/todo.php
>
> It is.  I don't know why Bruce hasn't checked it off.
>


OK, marked as done:

* -Allow SELECT * FROM tab WHERE int2col = 4 to use int2col index, int8,
  float4, numeric/decimal too
>
> Some other stuff that needs work in TODO:
>
> : Bracketed items "[]" have more detailed.
>
> More detailed what?  Grammar please.

Fixed.  "more detail".

> : * Remove unreferenced table files and temp tables during database vacuum
> :   or postmaster startup (Bruce)
>
> I'm not sure this is still needed given that we now log file deletion in
> WAL.

OK, removed.

>
> : * Allow pg_dump to dump sequences using NO_MAXVALUE and NO_MINVALUE
>
> Seems to be done.


OK.


>
> : * Prevent whole-row references from leaking memory, e.g. SELECT COUNT(tab.*)
>
> Done.

OK.

>
> : * Make LENGTH() of CHAR() not count trailing spaces
>
> Done.

OK.

>
> : * Allow SELECT * FROM tab WHERE int2col = 4 to use int2col index, int8,
> :   float4, numeric/decimal too
>
> Done, per above.

Got it.

>
> : * Allow more ISOLATION LEVELS to be accepted, but issue a warning for them
>
> Presently we accept all four with no warning ...

OK.  Warning part removed.

>
> : * Add GUC setting to make created tables default to WITHOUT OIDS
>
> Seems to be done, other than the argument about how pg_dump should work.

I did the pg_dump part using SET only where needed.  That is done.

>
> : * Allow fastpast to pass values in portable format
>
> This was done in 7.4.


Removed.

>
> : * Move psql backslash database information into the backend, use nmumonic
> :   commands? [psql]
>
> Spelling problem...

Fixed.

>
> : * JDBC
>
> With JDBC out of the core, I'm not sure why we still have a JDBC section
> in the core TODO.

Removed.  If they want it they can get it from our CVS history.

> : * Have pg_dump -c clear the database using dependency information
>
> I think this works now.  Not really tested, but in principle it should
> work.

OK.
>
> : * Cache last known per-tuple offsets to speed long tuple access
>
> This sounds exactly like attcacheoff, which has been there since
> Berkeley.  Either remove this or fix the description to give some
> idea what's really meant.

Added "adjusting for NULLs and TOAST values.  The issue is that when
NULLs or TOAST is present, those aren't useful.  I was thinking we could
remember the pattern of the previous row and use those offsets if the
TOAST/NULL pattern was the same, or something like that.  Is that a
valid idea?

> : * Automatically place fixed-width, NOT NULL columns first in a table
>
> This is not ever going to happen, given that we've rejected the idea of
> having separate logical and physical column positions.

Removed.

>
> : * Change representation of whole-tuple parameters to functions
>
> Done.  (However, you might want to add something about supporting
> composite types as table columns, which isn't done.)

OK, marked a done, and added new line:

    * Support composite types as table columns

> : * Allow the regression tests to start postmaster with -i so the tests
> :   can be run on systems that don't support unix-domain sockets
>
> Done long ago.

Removed.


Thanks for the updates!



--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: PostgreSQL performance in simple queries

From
Joseph Shraibman
Date:
Tom Lane wrote:

>
> : * JDBC
>
> With JDBC out of the core, I'm not sure why we still have a JDBC section
> in the core TODO.

Speaking of which why is the jdbc site so hard to find?  For that matter
the new foundry can only be found through the news article on the front
page.