Re: [GENERAL] 50 MB Table - Mailing list pgsql-general
From | Bruce Bantos |
---|---|
Subject | Re: [GENERAL] 50 MB Table |
Date | |
Msg-id | 06ce01bf8884$059d9fe0$0200a8c0@rsdevelop Whole thread Raw |
In response to | 50 MB Table (JB <jimbag@kw.igs.net>) |
List | pgsql-general |
> I've been shuffling thru the 'LIKE' code in pg, but whilst perusing, it > occurred to me that I could use a function for this thing. Since i'm > only looking at the first part of the string I use... > > SELECT * FROM info WHERE substring(street_name from 1 to 4) = 'MAIN' > > ...and the query completes in under 2 secs. Stupidly obvious now but I > thought I would share this blinding insight with the list ;) It was not so obvious to me. I was struggling with the same thing. This is a good workaround for the case of "searchstring%". I wonder if the LIKE code could be enhanced to perform the query in the same way. > > Thanks again for all your help. > > cheers > jb > > Paul Condon wrote: > > > > The example you give, LIKE 'MAIN%', should be able to use the index, but I'm > > not that expert on the internals of PostgreSQL. I was speaking from a general > > knowledge of SQL. I was supposing you were trying to do something, such as, > > LIKE '%BLVD%'. In this latter case, an index will bring you no benefit. > > > > Perhaps it is working faster than it would have, had there been no index. Have > > you tried dropping the index and seeing if the performance gets worse? To make > > this test more powerful, choose a pattern that matches very few rows in your > > table. > > > > JB wrote: > > > > > Thanks for taking the time to reply. I think that I wasn't as clear as I > > > could be. This table is normalized and as far as I understand, what I'm > > > doing with it is not extraordinary. The schema is basically... > > > > > > CREATE TABLE info ( > > > lastname char(50), > > > street_name char(50), > > > street_number char(5), > > > ... (a bunch of other stuff that works fine with '=') > > > ); > > > > > > CREATE INDEX nx_info1 ON info (lastname); > > > CREATE INDEX nx_info2 ON info (street_name); > > > > > > The select is as simple as this in most cases... > > > > > > SELECT * FROM info WHERE street_name LIKE 'MAIN%'; > > > > > > .,,the table about 50MB worth, about 70,000 records. I have an index on > > > 'lastname' and 'street_name' and I need to search on each of these with > > > 'LIKE'. So I was wondering about ways to speed this up. It's very slow. > > > It takes about 20 seconds for the above query. I even uppercased all the > > > names, hoping tht would help. I wondered if I'd used the wrong index > > > type (btree), or if there were some flags that would help. Is there a > > > way to bust the indexes out alpha on the first letter say, or some other > > > such scheme. BTW the machine is RH6.1 with 128mb ram, 27 GB, P350, no X > > > and no users (except me ;) > > > > > > Paul Condon wrote: > > > > > > > > JB wrote: > > > > > > > > > I have a 50 MB +- table in postgres. The data is normalized so there's > > > > > not much I can do about the size. The tuples are about 512 bytes so > > > > > there's a pile of 'em. I need searching on of several fields, a couple > > > > > in particular are text fields that needs 'LIKE'. The problem is, the > > > > > thing is way too slow. So, I was wondering, before I go hunting for some > > > > > other solution, could anyone here point me to some ways to (hand) > > > > > optimize the searching in postgres? Different indexes, hashing and LIKE? > > > > > I'm not sure where to go with this. > > > > > > > > > > The basic criteria are: > > > > > - sizes of indexes, etc, is not an issue. There's lot's of room on the > > > > > box. > > > > > - the data is basically static so a read-only (if such a thing) is > > > > > fine. > > > > > - it needs to be FAST > > > > > > > > > > cheers > > > > > jb > > > > > > > > > > ************ > > > > > > > > It sounds as if you have several different kinds of information encoded in > > > > a single column using special words or letter combinations. This is a > > > > violation of the ideal that data items should be "atomic." You should make > > > > a catalog of all the things that you want to be able to say about each > > > > tuple, and design a relational schema in which atomic assertion is given > > > > its own column (attribute). Then you will be able to create indices on > > > > each, and you won't have to use LIKE in your WHERE clauses. > > > > > > > > Paul > > > > > -- > If everything is coming your way then you're in the wrong lane. > > ************ >
pgsql-general by date: