Re: many similar indexbased selects are extremely slow - Mailing list pgsql-general

From Jeff Davis
Subject Re: many similar indexbased selects are extremely slow
Date
Msg-id 1104631910.3003.427.camel@jeff
Whole thread Raw
In response to Re: many similar indexbased selects are extremely slow  (Jeff Davis <jdavis-pgsql@empires.org>)
List pgsql-general
For what it's worth, I put 100k rows into a table in 8.0beta5, and
selected 10k at a time. When doing each SELECT seperately using the
index, it took about 2.5s to do 10k SELECTs. When using an IN query
containing all the id's that I wanted, it took less than a second.

Jeff

On Sat, 2005-01-01 at 17:34 -0800, Jeff Davis wrote:
> Well, first it would be a good idea to see what postgres is actually
> doing. Send the output of:
>
> => EXPLAIN ANALYZE SELECT OID FROM atrikelindex WHERE id_artikel=?;
>
> (where ? is replaced by some id value)
>
> It will either say index lookup or sequential scan since it's just a
> select from one table. That will tell you whether the index is being
> used or not.
>
> Next, if it is doing something that seems unreasonable, try doing:
>
> => VACUUM ANALYZE artikelindex;
>
> And we should also look at the EXPLAIN output on the IN query version
> like:
>
> => EXPLAIN ANALYZE SELECT OID FROM artikelindex WHERE id_artikel IN
> (?,?,...,?);
>
> Although I'm not sure what kind of performance to expect if you have 10k
> values in that list.
>
> Also, what version are you using? And how many rows in the table? Is it
> regularly vacuumed? Are there a lot of updates/deletes going to that
> table?
>
> I suspect that the fastest way that postgres can get you those rows
> would be the IN query on a recent version of postgresql. It may choose a
> sequential scan, which likely would be good since one seq scan will
> hopefully take less than 10 seconds.
>
> Regards,
>     Jeff Davis
>
>
> On Sun, 2005-01-02 at 00:52 +0100, peter pilsl wrote:
> > psql8:
> >
> > I use a bigger psql-table to store information and keep an id-value of
> > each row in memory of my application for faster access.
> > My applications is able to calculate a list of needed id's in very short
> > time and then wants to retrieve all rows corresponding to this id's.
> >
> > So in fact I perform a lot of operations like:
> >
> > select field1,field2,field3 from mytable where id=XX;
> >
> > There is a index on the id-field and the id-field is of type OID, so
> > everything should be quite fast. Unfortunately it is not.
> >
> > On a 2GHz-machine with 1GB RAM it takes approx. 10seconds to fetch 10000
> > rows. In this testscenario I only fetch the OID and no other col.
> > I dont understand this. Am I expecting far to much? Is 10seconds for the
> > retrieval of 10000 OIDs a fine value? I want it to be less than one
> > second and from my experience with postgres this operation is extremely
> > slow compared to the impressive speed of most other operations.
> >
> > I also tried to use the IN-operator, which is much more slower. Is there
> > any other way to speed up things? I can order the list of id's to
> > retrieve in my application if there is a way to tell psql not to search
> > the whole index every time but somehow "do better".
> >
> > If it is of any interest, here is the table. The id is stored as id_artikel.
> >
> >
> >                                         Table "public.artikelindex"
> >          Column         |            Type             |
> >      Modifiers
> > -----------------------+-----------------------------+----------------------------------------------------
> >   autor_artikel         | text                        |
> >   titel_artikel         | text                        |
> >   jahrgang_zeitschrift  | integer                     |
> >   jahr_zeitschrift      | character varying(20)       |
> >   heftnummer            | character varying(30)       |
> >   seitenzahl_artikel    | character varying(30)       |
> >   bemerkungen_artikel   | text                        |
> >   deskriptoren_alt      | text                        |
> >   deskriptoren_neu      | text                        |
> >   personennamen_artikel | text                        |
> >   orte_artikel          | text                        |
> >   id_artikel            | oid                         |
> >   id_titel              | oid                         |
> >   cdate                 | timestamp without time zone | default
> > ('now'::text)::timestamp(6) with time zone
> >   udate                 | timestamp without time zone | default
> > ('now'::text)::timestamp(6) with time zone
> >   uid                   | oid                         |
> >   gid                   | oid                         |
> >   mod                   | boolean                     |
> > Indexes:
> >      "id_artikel_idx" btree (id_artikel)
> >      "id_titel_idx" btree (id_titel)
> >      "idx_artikelindeax_autor" btree (autor_artikel)
> >      "idx_artikelindex_fingerprint" btree (id_artikel)
> >      "idx_artikelindex_jahr" btree (jahrgang_zeitschrift)
> >      "idx_artikelindex_jahrgang" btree (jahr_zeitschrift)
> >      "idx_artikelindex_zeitschrift" btree (id_titel)
> > Rules:
> >      delete_val AS
> >      ON DELETE TO artikelindex DO  UPDATE counter SET val = counter.val + 1
> >    WHERE counter.tab::text = 'artikelindex'::character varying::text
> >      insert_val AS
> >      ON INSERT TO artikelindex DO  UPDATE counter SET val = counter.val + 1
> >    WHERE counter.tab::text = 'artikelindex'::character varying::text
> >      update_val AS
> >      ON UPDATE TO artikelindex DO  UPDATE counter SET val = counter.val + 1
> >    WHERE counter.tab::text = 'artikelindex'::character varying::text
> >
> >
> > And more: here is my retrieving program. I use perl and the DBI-module
> > and the following code-snip
> >
> > --------------------------
> > my $sth=$dbh->prepare(
> >         'select OID from artikelindex where id_artikel=?');
> >
> > foreach (@id) {
> >    my $ret=$sth->execute($_);
> >    my $x=$sth->fetchrow_arrayref;
> > }
> > -------------------------
> >
> >
> >
> > thnx a lot for any idea,
> > peter
> >
> >
> >
> > --
> > mag. peter pilsl
> > goldfisch.at
> > IT-management
> > tel +43 699 1 3574035
> > fax +43 699 4 3574035
> > pilsl@goldfisch.at
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 6: Have you searched our list archives?
> >
> >                http://archives.postgresql.org
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faqs/FAQ.html


pgsql-general by date:

Previous
From: Jeff Davis
Date:
Subject: Re: many similar indexbased selects are extremely slow
Next
From: Jeff Davis
Date:
Subject: Re: disabling OIDs?