Re: many similar indexbased selects are extremely slow - Mailing list pgsql-general
From | Jeff Davis |
---|---|
Subject | Re: many similar indexbased selects are extremely slow |
Date | |
Msg-id | 1104631910.3003.427.camel@jeff Whole thread Raw |
In response to | Re: many similar indexbased selects are extremely slow (Jeff Davis <jdavis-pgsql@empires.org>) |
List | pgsql-general |
For what it's worth, I put 100k rows into a table in 8.0beta5, and selected 10k at a time. When doing each SELECT seperately using the index, it took about 2.5s to do 10k SELECTs. When using an IN query containing all the id's that I wanted, it took less than a second. Jeff On Sat, 2005-01-01 at 17:34 -0800, Jeff Davis wrote: > Well, first it would be a good idea to see what postgres is actually > doing. Send the output of: > > => EXPLAIN ANALYZE SELECT OID FROM atrikelindex WHERE id_artikel=?; > > (where ? is replaced by some id value) > > It will either say index lookup or sequential scan since it's just a > select from one table. That will tell you whether the index is being > used or not. > > Next, if it is doing something that seems unreasonable, try doing: > > => VACUUM ANALYZE artikelindex; > > And we should also look at the EXPLAIN output on the IN query version > like: > > => EXPLAIN ANALYZE SELECT OID FROM artikelindex WHERE id_artikel IN > (?,?,...,?); > > Although I'm not sure what kind of performance to expect if you have 10k > values in that list. > > Also, what version are you using? And how many rows in the table? Is it > regularly vacuumed? Are there a lot of updates/deletes going to that > table? > > I suspect that the fastest way that postgres can get you those rows > would be the IN query on a recent version of postgresql. It may choose a > sequential scan, which likely would be good since one seq scan will > hopefully take less than 10 seconds. > > Regards, > Jeff Davis > > > On Sun, 2005-01-02 at 00:52 +0100, peter pilsl wrote: > > psql8: > > > > I use a bigger psql-table to store information and keep an id-value of > > each row in memory of my application for faster access. > > My applications is able to calculate a list of needed id's in very short > > time and then wants to retrieve all rows corresponding to this id's. > > > > So in fact I perform a lot of operations like: > > > > select field1,field2,field3 from mytable where id=XX; > > > > There is a index on the id-field and the id-field is of type OID, so > > everything should be quite fast. Unfortunately it is not. > > > > On a 2GHz-machine with 1GB RAM it takes approx. 10seconds to fetch 10000 > > rows. In this testscenario I only fetch the OID and no other col. > > I dont understand this. Am I expecting far to much? Is 10seconds for the > > retrieval of 10000 OIDs a fine value? I want it to be less than one > > second and from my experience with postgres this operation is extremely > > slow compared to the impressive speed of most other operations. > > > > I also tried to use the IN-operator, which is much more slower. Is there > > any other way to speed up things? I can order the list of id's to > > retrieve in my application if there is a way to tell psql not to search > > the whole index every time but somehow "do better". > > > > If it is of any interest, here is the table. The id is stored as id_artikel. > > > > > > Table "public.artikelindex" > > Column | Type | > > Modifiers > > -----------------------+-----------------------------+---------------------------------------------------- > > autor_artikel | text | > > titel_artikel | text | > > jahrgang_zeitschrift | integer | > > jahr_zeitschrift | character varying(20) | > > heftnummer | character varying(30) | > > seitenzahl_artikel | character varying(30) | > > bemerkungen_artikel | text | > > deskriptoren_alt | text | > > deskriptoren_neu | text | > > personennamen_artikel | text | > > orte_artikel | text | > > id_artikel | oid | > > id_titel | oid | > > cdate | timestamp without time zone | default > > ('now'::text)::timestamp(6) with time zone > > udate | timestamp without time zone | default > > ('now'::text)::timestamp(6) with time zone > > uid | oid | > > gid | oid | > > mod | boolean | > > Indexes: > > "id_artikel_idx" btree (id_artikel) > > "id_titel_idx" btree (id_titel) > > "idx_artikelindeax_autor" btree (autor_artikel) > > "idx_artikelindex_fingerprint" btree (id_artikel) > > "idx_artikelindex_jahr" btree (jahrgang_zeitschrift) > > "idx_artikelindex_jahrgang" btree (jahr_zeitschrift) > > "idx_artikelindex_zeitschrift" btree (id_titel) > > Rules: > > delete_val AS > > ON DELETE TO artikelindex DO UPDATE counter SET val = counter.val + 1 > > WHERE counter.tab::text = 'artikelindex'::character varying::text > > insert_val AS > > ON INSERT TO artikelindex DO UPDATE counter SET val = counter.val + 1 > > WHERE counter.tab::text = 'artikelindex'::character varying::text > > update_val AS > > ON UPDATE TO artikelindex DO UPDATE counter SET val = counter.val + 1 > > WHERE counter.tab::text = 'artikelindex'::character varying::text > > > > > > And more: here is my retrieving program. I use perl and the DBI-module > > and the following code-snip > > > > -------------------------- > > my $sth=$dbh->prepare( > > 'select OID from artikelindex where id_artikel=?'); > > > > foreach (@id) { > > my $ret=$sth->execute($_); > > my $x=$sth->fetchrow_arrayref; > > } > > ------------------------- > > > > > > > > thnx a lot for any idea, > > peter > > > > > > > > -- > > mag. peter pilsl > > goldfisch.at > > IT-management > > tel +43 699 1 3574035 > > fax +43 699 4 3574035 > > pilsl@goldfisch.at > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 6: Have you searched our list archives? > > > > http://archives.postgresql.org > > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faqs/FAQ.html
pgsql-general by date: