Re: Simple query, 10 million records...MySQL ten times faster

From: "Luke Lonergan"

NUMERIC operations are very slow in pgsql.  Equality comparisons are somewhat faster, but other operations are very slow compared to other vendors' NUMERIC implementations.

We've sped it up a lot here internally, but you may want to consider using FLOAT for what you are doing.
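
Something like this should work against the table from your post (untested sketch; double precision is PostgreSQL's 8-byte FLOAT, the ALTER rewrites the whole table and rebuilds both indexes, and you should ANALYZE afterwards to refresh planner statistics):

    ALTER TABLE test_zip_assoc
        ALTER COLUMN lat_radians TYPE double precision,
        ALTER COLUMN long_radians TYPE double precision;
    ANALYZE test_zip_assoc;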

- Luke

Message is short because I'm on my Treo.

 -----Original Message-----
From:   Bill Moran [mailto:wmoran@collaborativefusion.com]
Sent:   Thursday, April 26, 2007 05:13 PM Eastern Standard Time
To:     zardozrocks
Cc:     pgsql-performance@postgresql.org
Subject:        Re: [PERFORM] Simple query, 10 million records...MySQL ten times faster

In response to zardozrocks <zardozrocks@gmail.com>:

> I have this table:
>
> CREATE TABLE test_zip_assoc (
>     id serial NOT NULL,
>     f_id integer DEFAULT 0 NOT NULL,
>     lat_radians numeric(6,5) DEFAULT 0.00000 NOT NULL,
>     long_radians numeric(6,5) DEFAULT 0.00000 NOT NULL
> );
> CREATE INDEX lat_radians ON test_zip_assoc USING btree (lat_radians);
> CREATE INDEX long_radians ON test_zip_assoc USING btree (long_radians);
>
>
>
> It's basically a table that associates some foreign key (for an event,
> for instance) with a particular location using longitude and
> latitude.  I'm doing a simple proximity search.  I have
> populated the database with *10 million* records.  I then test
> performance by picking 50 zip codes at random and finding the records
> within 50 miles with a query like this:
>
> SELECT id
>       FROM test_zip_assoc
>       WHERE
>               lat_radians > 0.69014816041
>               AND lat_radians < 0.71538026567
>               AND long_radians > -1.35446228028
>               AND long_radians < -1.32923017502
>
>
> On my development server (dual proc/dual core Opteron, 2.8 GHz, with 4GB
> RAM) this query averages 1.5 seconds in MySQL each time it runs after a
> brief warmup period.  In PostgreSQL it averages about 15 seconds.
>
> Both of those times are too slow.  I need the query to run in under a
> second with as many as a billion records.  I don't know if this is
> possible but I'm really hoping someone can help me restructure my
> indexes (multicolumn?, multiple indexes with a 'where' clause?) so
> that I can get this running as fast as possible.
>
> If I need to consider some non-database data structure in RAM I will
> do that too.  Any help or tips would be greatly appreciated.  I'm
> willing to go to great lengths to test this if someone can make a
> good suggestion that sounds like it has a reasonable chance of
> improving the speed of this search.  There's an extensive thread on my
> efforts already here:
>
> http://phpbuilder.com/board/showthread.php?t=10331619&page=10

Why didn't you investigate/respond to the last posts there?  The advice
to bump shared_buffers is good.  work_mem may also need to be bumped.
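
If you're not sure what the server is currently using, you can check
from any psql session with:

    SHOW shared_buffers;
    SHOW work_mem;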

Figure out which postgresql.conf your system is using and get it dialed
in for your hardware.  You can make all the indexes you want, but if
you've told Postgres that it only has 8M of RAM to work with, performance
is going to suck.  I don't see hardware specs on that thread (but I
didn't read the whole thing).  If the system you're using is a dedicated
DB system, set shared_buffers to 1/3 - 1/2 of the physical RAM on the
machine for starters.
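
As a very rough, untested starting point for the 4GB machine described
above (syntax assumes 8.2, which accepts memory units; older releases
take shared_buffers as a number of 8kB buffers):

    shared_buffers = 1536MB   # ~1/3 of 4GB physical RAM
    work_mem = 32MB           # per sort/hash op; total use scales with concurrency

Note that changing shared_buffers requires a server restart, while
work_mem can also be set per-session with SET.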

--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023
