Re: [GENERAL] 50 MB Table - Mailing list pgsql-general

From Bruce Bantos
Subject Re: [GENERAL] 50 MB Table
Date
Msg-id 06ce01bf8884$059d9fe0$0200a8c0@rsdevelop
Whole thread Raw
In response to 50 MB Table  (JB <jimbag@kw.igs.net>)
List pgsql-general
> I've been shuffling thru the 'LIKE' code in pg, but whilst perusing, it
> occurred to me that I could use a function for this thing. Since i'm
> only looking at the first part of the string I use...
>
> SELECT * FROM info WHERE substring(street_name from 1 to 4) = 'MAIN'
>
> ...and the query completes in under 2 secs. Stupidly obvious now but I
> thought I would share this blinding insight with the list ;)

It was not so obvious to me. I was struggling with the same thing. This is a
good workaround for the case of "searchstring%". I wonder if the LIKE code
could be enhanced to perform the query in the same way.

>
> Thanks again for all your help.
>
> cheers
> jb
>
> Paul Condon wrote:
> >
> > The example you give, LIKE 'MAIN%', should be able to use the index, but
I'm
> > not that expert on the internals of PostgreSQL. I was speaking from a
general
> > knowledge of SQL. I was supposing you were trying to do something, such
as,
> > LIKE '%BLVD%'. In this latter case, an index will bring you no benefit.
> >
> > Perhaps it is working faster than it would have, had there been no
index. Have
> > you tried dropping the index and seeing if the performance gets worse?
To make
> > this test more powerful, choose a pattern that matches very few rows in
your
> > table.
> >
> > JB wrote:
> >
> > > Thanks for taking the time to reply. I think that I wasn't as clear as
I
> > > could be. This table is normalized and as far as I understand, what
I'm
> > > doing with it is not extraordinary. The schema is basically...
> > >
> > > CREATE TABLE info (
> > >   lastname char(50),
> > >   street_name char(50),
> > >   street_number char(5),
> > >   ... (a bunch of other stuff that works fine with '=')
> > > );
> > >
> > > CREATE INDEX nx_info1 ON info (lastname);
> > > CREATE INDEX nx_info2 ON info (street_name);
> > >
> > > The select is as simple as this in most cases...
> > >
> > > SELECT * FROM info WHERE street_name LIKE 'MAIN%';
> > >
> > > .,,the table about 50MB worth, about 70,000 records. I have an index
on
> > > 'lastname' and 'street_name' and I need to search on each of these
with
> > > 'LIKE'. So I was wondering about ways to speed this up. It's very
slow.
> > > It takes about 20 seconds for the above query. I even uppercased all
the
> > > names, hoping tht would help. I wondered if I'd used the wrong index
> > > type (btree), or if there were some flags that would help. Is there a
> > > way to bust the indexes out alpha on the first letter say, or some
other
> > > such scheme. BTW the machine is RH6.1 with 128mb ram, 27 GB, P350, no
X
> > > and no users (except me ;)
> > >
> > > Paul Condon wrote:
> > > >
> > > > JB wrote:
> > > >
> > > > > I have a 50 MB +- table in postgres. The data is normalized so
there's
> > > > > not much I can do about the size. The tuples are about 512 bytes
so
> > > > > there's a pile of 'em. I need searching on of several fields, a
couple
> > > > > in particular are text fields that needs 'LIKE'. The problem is,
the
> > > > > thing is way too slow. So, I was wondering, before I go hunting
for some
> > > > > other solution, could anyone here point me to some ways to (hand)
> > > > > optimize the searching in postgres? Different indexes, hashing and
LIKE?
> > > > > I'm not sure where to go with this.
> > > > >
> > > > > The basic criteria are:
> > > > >  - sizes of indexes, etc, is not an issue. There's lot's of room
on the
> > > > > box.
> > > > >  - the data is basically static so a read-only (if such a thing)
is
> > > > > fine.
> > > > >  - it needs to be FAST
> > > > >
> > > > > cheers
> > > > > jb
> > > > >
> > > > > ************
> > > >
> > > > It sounds as if you have several different kinds of information
encoded in
> > > > a single column using special words or letter combinations. This is
a
> > > > violation of the ideal that data items should be "atomic." You
should make
> > > > a catalog of all the things that you want to be able to say about
each
> > > > tuple, and design a relational schema in which atomic assertion is
given
> > > > its own column (attribute). Then you will be able to create indices
on
> > > > each, and you won't have to use LIKE in your WHERE clauses.
> > > >
> > > > Paul
> > >
>
> --
> If everything is coming your way then you're in the wrong lane.
>
> ************
>


pgsql-general by date:

Previous
From: JB
Date:
Subject: Re: [GENERAL] 50 MB Table
Next
From: Bruce Momjian
Date:
Subject: Re: [GENERAL] database corruption?