Pam Ozer
Thread: Using Between
I have a query that
Select Distinct VehicleId
From Vehicle
Where VehicleMileage between 0 and 15000.
I have an index on VehicleMileage. Is there another way to put an index on a between? The index is not being picked up. It does get picked up when I run
Select Distinct VehicleId
From Vehicle
Where VehicleMileage = 15000.
I just want to make sure that there is not a special index I should be using.
Thanks
On 8/27/10 5:21 PM, Ozer, Pam wrote:
> I have an index on VehicleMileage. Is there another way to put an index
> on a between? The index is not being picked up.

You need to post EXPLAIN ANALYZE of your query. It could be that an index scan is actually not a good plan (for example, a sequential scan might be faster if most of your vehicles have low mileage). Without the EXPLAIN ANALYZE, there's no way to say what's going on.

Did you ANALYZE your database after you loaded the data?

Craig
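For example, capturing the plan in psql might look like this, using the query from the original message (note that EXPLAIN ANALYZE actually executes the query):

    EXPLAIN ANALYZE
    SELECT DISTINCT VehicleId
    FROM Vehicle
    WHERE VehicleMileage BETWEEN 0 AND 15000;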
Yes. ANALYZE was run after we loaded the data. Thanks for your assistance. Here is the full query.

select distinct VehicleUsed.VehicleUsedId as VehicleUsedId ,
VehicleUsed.VehicleUsedDisplayPriority as VehicleUsedDisplayPriority ,
VehicleUsed.HasVehicleUsedThumbnail as HasVehicleUsedThumbnail ,
VehicleUsed.HasVehicleUsedPrice as HasVehicleUsedPrice ,
VehicleUsed.VehicleUsedPrice as VehicleUsedPrice ,
VehicleUsed.HasVehicleUsedMileage as HasVehicleUsedMileage ,
VehicleUsed.VehicleUsedMileage as VehicleUsedMileage ,
VehicleUsed.IsCPO as IsCPO ,
VehicleUsed.IsMTCA as IsMTCA
from VehicleUsed
inner join PostalCodeRegionCountyCity on ( lower ( VehicleUsed.PostalCode ) = lower ( PostalCodeRegionCountyCity.PostalCode ) )
where
( VehicleUsed.VehicleUsedPriceRangeFloor between 0 and 15000)
and
( PostalCodeRegionCountyCity.RegionId = 26 )
order by VehicleUsed.VehicleUsedDisplayPriority ,
VehicleUsed.HasVehicleUsedThumbnail desc ,
VehicleUsed.HasVehicleUsedPrice desc ,
VehicleUsed.VehicleUsedPrice ,
VehicleUsed.HasVehicleUsedMileage desc ,
VehicleUsed.VehicleUsedMileage ,
VehicleUsed.IsCPO desc ,
VehicleUsed.IsMTCA desc
limit 500000

Here is the EXPLAIN ANALYZE:

Limit  (cost=59732.41..60849.24 rows=44673 width=39) (actual time=1940.274..1944.312 rows=2363 loops=1)
  Output: vehicleused.vehicleusedid, vehicleused.vehicleuseddisplaypriority, vehicleused.hasvehicleusedthumbnail,
          vehicleused.hasvehicleusedprice, vehicleused.vehicleusedprice, vehicleused.hasvehicleusedmileage,
          vehicleused.vehicleusedmileage, vehicleused.iscpo, vehicleused.ismtca
  ->  Unique  (cost=59732.41..60849.24 rows=44673 width=39) (actual time=1940.272..1943.011 rows=2363 loops=1)
        Output: vehicleused.vehicleusedid, vehicleused.vehicleuseddisplaypriority, vehicleused.hasvehicleusedthumbnail,
                vehicleused.hasvehicleusedprice, vehicleused.vehicleusedprice, vehicleused.hasvehicleusedmileage,
                vehicleused.vehicleusedmileage, vehicleused.iscpo, vehicleused.ismtca
        ->  Sort  (cost=59732.41..59844.10 rows=44673 width=39) (actual time=1940.270..1941.101 rows=2363 loops=1)
              Output: vehicleused.vehicleusedid, vehicleused.vehicleuseddisplaypriority, vehicleused.hasvehicleusedthumbnail,
                      vehicleused.hasvehicleusedprice, vehicleused.vehicleusedprice, vehicleused.hasvehicleusedmileage,
                      vehicleused.vehicleusedmileage, vehicleused.iscpo, vehicleused.ismtca
              Sort Key: vehicleused.vehicleuseddisplaypriority, vehicleused.hasvehicleusedthumbnail,
                        vehicleused.hasvehicleusedprice, vehicleused.vehicleusedprice, vehicleused.hasvehicleusedmileage,
                        vehicleused.vehicleusedmileage, vehicleused.iscpo, vehicleused.ismtca, vehicleused.vehicleusedid
              Sort Method: quicksort  Memory: 231kB
              ->  Hash Join  (cost=289.85..55057.07 rows=44673 width=39) (actual time=3.799..1923.958 rows=2363 loops=1)
                    Output: vehicleused.vehicleusedid, vehicleused.vehicleuseddisplaypriority, vehicleused.hasvehicleusedthumbnail,
                            vehicleused.hasvehicleusedprice, vehicleused.vehicleusedprice, vehicleused.hasvehicleusedmileage,
                            vehicleused.vehicleusedmileage, vehicleused.iscpo, vehicleused.ismtca
                    Hash Cond: (lower((vehicleused.postalcode)::text) = lower((postalcoderegioncountycity.postalcode)::text))
                    ->  Seq Scan on vehicleused  (cost=0.00..51807.63 rows=402058 width=45) (actual time=0.016..1454.616 rows=398495 loops=1)
                          Output: vehicleused.vehicleusedid, vehicleused.datasetid, vehicleused.vehicleusedproductid,
                                  vehicleused.sellernodeid, vehicleused.vehicleyear, vehicleused.vehiclemakeid,
                                  vehicleused.vehiclemodelid, vehicleused.vehiclesubmodelid, vehicleused.vehiclebodystyleid,
                                  vehicleused.vehicledoors, vehicleused.vehicleenginetypeid, vehicleused.vehicledrivetrainid,
                                  vehicleused.vehicleexteriorcolorid, vehicleused.hasvehicleusedthumbnail, vehicleused.postalcode,
                                  vehicleused.vehicleusedprice, vehicleused.vehicleusedmileage, vehicleused.buyerguid,
                                  vehicleused.vehicletransmissiontypeid, vehicleused.vehicleusedintid,
                                  vehicleused.vehicleuseddisplaypriority, vehicleused.vehicleusedsearchradius,
                                  vehicleused.vehiclejatoimagepath, vehicleused.vehiclebodystylegroupid, vehicleused.productid,
                                  vehicleused.productgroupid, vehicleused.vehiclevin, vehicleused.vehicleclassgroupid,
                                  vehicleused.vehiclegenericexteriorcolorid, vehicleused.highlight, vehicleused.buyerid,
                                  vehicleused.dealerid, vehicleused.hasvehicleusedprice, vehicleused.dealerstockid,
                                  vehicleused.datesold, vehicleused.hasthumbnailimagepath, vehicleused.vehicleinteriorcolorid,
                                  vehicleused.vehicleconditionid, vehicleused.vehicletitletypeid, vehicleused.warranty,
                                  vehicleused.thumbnailimagepath, vehicleused.fullsizeimagepath, vehicleused.description,
                                  vehicleused.inserteddate, vehicleused.feeddealerid, vehicleused.vehicleusedpricerangefloor,
                                  vehicleused.vehicleusedmileagerangefloor, vehicleused.hasvehicleusedmileage,
                                  vehicleused.VehicleUsedIntId.distinct_count, vehicleused.VehicleUsedPrice.average,
                                  vehicleused.VehicleUsedId.distinct_count, vehicleused.iscpo, vehicleused.ismtca,
                                  vehicleused.cpoprogramoemid, vehicleused.cpoprogram3rdpartyid
                          Filter: ((vehicleusedpricerangefloor >= 0) AND (vehicleusedpricerangefloor <= 15000))
                    ->  Hash  (cost=283.32..283.32 rows=522 width=6) (actual time=1.084..1.084 rows=532 loops=1)
                          Output: postalcoderegioncountycity.postalcode
                          ->  Bitmap Heap Scan on postalcoderegioncountycity  (cost=12.30..283.32 rows=522 width=6) (actual time=0.092..0.361 rows=532 loops=1)
                                Output: postalcoderegioncountycity.postalcode
                                Recheck Cond: (regionid = 26)
                                ->  Bitmap Index Scan on postalcoderegioncountycity_i05  (cost=0.00..12.17 rows=522 width=0) (actual time=0.082..0.082 rows=532 loops=1)
                                      Index Cond: (regionid = 26)
Total runtime: 1945.244 ms

-----Original Message-----
From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Craig James
Sent: Friday, August 27, 2010 5:42 PM
To: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Using Between
On Mon, Aug 30, 2010 at 12:51 PM, Ozer, Pam <pozer@automotive.com> wrote:
> Yes. ANALYZE was run after we loaded the data. Thanks for your
> assistance. Here is the full query.

How many rows are in the vehicleused table in total? Is your database small enough to fit in memory? Do you have any non-default settings in postgresql.conf?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
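For what it's worth, all three questions can be answered with standard catalog queries from psql; a quick sketch:

    -- total rows in the table
    SELECT count(*) FROM vehicleused;

    -- on-disk size of the table (including indexes) and of the whole database
    SELECT pg_size_pretty(pg_total_relation_size('vehicleused'));
    SELECT pg_size_pretty(pg_database_size(current_database()));

    -- settings that did not come from the built-in defaults
    SELECT name, setting, source FROM pg_settings WHERE source <> 'default';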
There are 850,000 records in vehicleused. And the database is too big to be kept in memory. Here are our config settings.

listen_addresses = '*'          # what IP address(es) to listen on;
                                # comma-separated list of addresses;
                                # defaults to 'localhost', '*' = all
                                # (change requires restart)
port = 5432                     # (change requires restart)
max_connections = 100           # (change requires restart)
bonjour_name = 'colapcnt1d'     # defaults to the computer name
                                # (change requires restart)
shared_buffers = 500MB          # min 128kB
effective_cache_size = 1000MB
log_destination = 'stderr'      # Valid values are combinations of
logging_collector = on          # Enable capturing of stderr and csvlog
datestyle = 'iso, mdy'
lc_messages = 'en_US.UTF-8'     # locale for system error message strings
lc_monetary = 'en_US.UTF-8'     # locale for monetary formatting
lc_numeric = 'en_US.UTF-8'      # locale for number formatting
lc_time = 'en_US.UTF-8'         # locale for time formatting
# default configuration for text search
default_text_search_config = 'pg_catalog.english'
max_connections = 100
temp_buffers = 100MB
work_mem = 100MB
maintenance_work_mem = 500MB
max_files_per_process = 10000
seq_page_cost = 1.0
random_page_cost = 1.1
cpu_tuple_cost = 0.1
cpu_index_tuple_cost = 0.05
cpu_operator_cost = 0.01
default_statistics_target = 1000
autovacuum_max_workers = 1
#log_min_messages = DEBUG1
#log_min_duration_statement = 1000
#log_statement = all
#log_temp_files = 128
#log_lock_waits = on
#log_line_prefix = '%m %u %d %h %p %i %c %l %s'
#log_duration = on
#debug_print_plan = on

-----Original Message-----
From: Robert Haas [mailto:robertmhaas@gmail.com]
Sent: Tuesday, September 21, 2010 12:35 PM
To: Ozer, Pam
Cc: Craig James; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Using Between
On Tue, Sep 21, 2010 at 4:04 PM, Ozer, Pam <pozer@automotive.com> wrote:
> There are 850,000 records in vehicleused. And the database is too big to be kept in memory.

Ah. So in other words, you are retrieving about half the rows in that table. For those kinds of queries, using the index tends to actually be slower, because (1) you read the index in addition to reading the table, which has CPU and I/O cost, and (2) instead of reading the table sequentially, you end up jumping around and reading it out of order, which tends to result in more disk seeks and defeats the OS prefetch logic.

The query planner is usually pretty smart about making good decisions about this kind of thing. As a debugging aid (but never in production), you can try disabling enable_seqscan and see what plan you get that way. If it's slower, well then the query planner did the right thing. If it's faster, then probably you need to adjust seq_page_cost and random_page_cost a bit. But my guess is that it will be somewhere between a lot slower and only very slightly faster.

A whole different line of inquiry is to ask the more general question "how can I make this query faster?", but I'm not sure whether you're unhappy with how the query is running or just curious about why the index isn't being used.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
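Concretely, that experiment might look like this in psql (a sketch using the simpler query from the top of the thread; the cost value at the end is a hypothetical illustration, not a recommendation):

    SET enable_seqscan = off;   -- session-local; strongly discourages seq scans, debugging only
    EXPLAIN ANALYZE
    SELECT DISTINCT VehicleId
    FROM Vehicle
    WHERE VehicleMileage BETWEEN 0 AND 15000;
    RESET enable_seqscan;       -- restore normal planner behavior

    -- If the forced index plan actually wins, adjust the cost model so the
    -- planner reaches that plan on its own, rather than leaving the scan type
    -- disabled, e.g.:
    -- SET random_page_cost = 1.05;  -- hypothetical value; lowering it favors index scans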
The question is how can we make it faster.

-----Original Message-----
From: Robert Haas [mailto:robertmhaas@gmail.com]
Sent: Wednesday, September 22, 2010 3:52 AM
To: Ozer, Pam
Cc: Craig James; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Using Between
On Wed, Sep 22, 2010 at 11:18 AM, Ozer, Pam <pozer@automotive.com> wrote:
> The question is how can we make it faster.

If there's just one region ID for any given postal code, you might try adding a column to vehicleused and storing the region IDs there. You could possibly populate that column using a trigger; probably it doesn't change unless the postalcode changes. Then you could index that column and query against it directly, rather than joining to PostalCodeRegionCountyCity. Short of that, I don't see any obvious way to avoid reading most of the vehicleused table. There may or may not be an index that can speed that up slightly, and of course you can always throw hardware at the problem, but fundamentally reading half a million or more rows isn't going to be instantaneous.

Incidentally, it would probably simplify things to store postal codes in the same case throughout the system. If you can avoid the need to write lower(x) = lower(y) and just write x = y you may get better plans. I'm not sure that's the case in this particular example but it's something to think about.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
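A rough sketch of that denormalization, assuming each postal code maps to exactly one region (the column, index, function, and trigger names here are made up):

    ALTER TABLE vehicleused ADD COLUMN regionid integer;

    -- backfill from the lookup table
    UPDATE vehicleused v
       SET regionid = p.regionid
      FROM postalcoderegioncountycity p
     WHERE lower(v.postalcode) = lower(p.postalcode);

    CREATE INDEX vehicleused_region_price
        ON vehicleused (regionid, vehicleusedpricerangefloor);

    -- keep the new column current as rows are inserted or updated
    CREATE OR REPLACE FUNCTION set_vehicleused_regionid() RETURNS trigger AS $$
    BEGIN
        SELECT p.regionid INTO NEW.regionid
          FROM postalcoderegioncountycity p
         WHERE lower(p.postalcode) = lower(NEW.postalcode)
         LIMIT 1;
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER vehicleused_regionid_trg
        BEFORE INSERT OR UPDATE ON vehicleused
        FOR EACH ROW EXECUTE PROCEDURE set_vehicleused_regionid();

    -- the search can then skip the join entirely, e.g.:
    SELECT DISTINCT vehicleusedid
      FROM vehicleused
     WHERE regionid = 26
       AND vehicleusedpricerangefloor BETWEEN 0 AND 15000;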
Thank you. I will take a look at those suggestions.

-----Original Message-----
From: Robert Haas [mailto:robertmhaas@gmail.com]
Sent: Wednesday, September 22, 2010 9:27 AM
To: Ozer, Pam
Cc: Craig James; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Using Between
>>> The question is how can we make it faster.

Something else you might test is bumping the read-ahead value. Most Linux installs have this at 256; you might try bumping the value to ~8Meg and tune from there. This may help you slightly for seq scan performance.

As always: YMMV. It's not going to magically fix low-performing I/O subsystems, and it won't help many applications of PG, but there are a few outlying instances where this change can help a little bit.

I am sure someone will step in and tell you it is a bad idea - AND they will probably have perfectly valid reasons for why it is, so you will need to consider the ramifications. If at all possible, test and tune to see.

..: Mark