Pam Ozer
Thread: Using Between
I have a query that
Select Distinct VehicleId
From Vehicle
Where VehicleMileage between 0 and 15000.
I have an index on VehicleMileage. Is there another way to put an index on a between? The index is not being picked up. It does get picked up when I run
Select Distinct VehicleId
From Vehicle
Where VehicleMileage = 15000.
I just want to make sure that there is not a special index I should be using.
Thanks
On 8/27/10 5:21 PM, Ozer, Pam wrote:
> I have an index on VehicleMileage. Is there another way to put an index
> on a between? The index is not being picked up.

You need to post EXPLAIN ANALYZE of your query. It could be that an index scan is actually not a good plan (for example, a sequential scan might be faster if most of your vehicles have low mileage). Without the EXPLAIN ANALYZE, there's no way to say what's going on.

Did you ANALYZE your database after you loaded the data?

Craig
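For example, capturing the plan in psql might look like this, using the query from the original message (note that EXPLAIN ANALYZE actually executes the query):

    EXPLAIN ANALYZE
    SELECT DISTINCT VehicleId
    FROM Vehicle
    WHERE VehicleMileage BETWEEN 0 AND 15000;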
Yes. ANALYZE was run after we loaded the data. Thanks for your assistance. Here is the full query.

select distinct VehicleUsed.VehicleUsedId as VehicleUsedId ,
VehicleUsed.VehicleUsedDisplayPriority as VehicleUsedDisplayPriority ,
VehicleUsed.HasVehicleUsedThumbnail as HasVehicleUsedThumbnail ,
VehicleUsed.HasVehicleUsedPrice as HasVehicleUsedPrice ,
VehicleUsed.VehicleUsedPrice as VehicleUsedPrice ,
VehicleUsed.HasVehicleUsedMileage as HasVehicleUsedMileage ,
VehicleUsed.VehicleUsedMileage as VehicleUsedMileage ,
VehicleUsed.IsCPO as IsCPO ,
VehicleUsed.IsMTCA as IsMTCA
from VehicleUsed
inner join PostalCodeRegionCountyCity on ( lower ( VehicleUsed.PostalCode ) = lower ( PostalCodeRegionCountyCity.PostalCode ) )
where
( VehicleUsed.VehicleUsedPriceRangeFloor between 0 and 15000)
and
( PostalCodeRegionCountyCity.RegionId = 26 )
order by VehicleUsed.VehicleUsedDisplayPriority ,
VehicleUsed.HasVehicleUsedThumbnail desc ,
VehicleUsed.HasVehicleUsedPrice desc ,
VehicleUsed.VehicleUsedPrice ,
VehicleUsed.HasVehicleUsedMileage desc ,
VehicleUsed.VehicleUsedMileage ,
VehicleUsed.IsCPO desc ,
VehicleUsed.IsMTCA desc
limit 500000

Here is the EXPLAIN ANALYZE:

Limit  (cost=59732.41..60849.24 rows=44673 width=39) (actual time=1940.274..1944.312 rows=2363 loops=1)
  Output: vehicleused.vehicleusedid, vehicleused.vehicleuseddisplaypriority, vehicleused.hasvehicleusedthumbnail,
          vehicleused.hasvehicleusedprice, vehicleused.vehicleusedprice, vehicleused.hasvehicleusedmileage,
          vehicleused.vehicleusedmileage, vehicleused.iscpo, vehicleused.ismtca
  ->  Unique  (cost=59732.41..60849.24 rows=44673 width=39) (actual time=1940.272..1943.011 rows=2363 loops=1)
        Output: vehicleused.vehicleusedid, vehicleused.vehicleuseddisplaypriority, vehicleused.hasvehicleusedthumbnail,
                vehicleused.hasvehicleusedprice, vehicleused.vehicleusedprice, vehicleused.hasvehicleusedmileage,
                vehicleused.vehicleusedmileage, vehicleused.iscpo, vehicleused.ismtca
        ->  Sort  (cost=59732.41..59844.10 rows=44673 width=39) (actual time=1940.270..1941.101 rows=2363 loops=1)
              Output: vehicleused.vehicleusedid, vehicleused.vehicleuseddisplaypriority, vehicleused.hasvehicleusedthumbnail,
                      vehicleused.hasvehicleusedprice, vehicleused.vehicleusedprice, vehicleused.hasvehicleusedmileage,
                      vehicleused.vehicleusedmileage, vehicleused.iscpo, vehicleused.ismtca
              Sort Key: vehicleused.vehicleuseddisplaypriority, vehicleused.hasvehicleusedthumbnail,
                        vehicleused.hasvehicleusedprice, vehicleused.vehicleusedprice, vehicleused.hasvehicleusedmileage,
                        vehicleused.vehicleusedmileage, vehicleused.iscpo, vehicleused.ismtca, vehicleused.vehicleusedid
              Sort Method: quicksort  Memory: 231kB
              ->  Hash Join  (cost=289.85..55057.07 rows=44673 width=39) (actual time=3.799..1923.958 rows=2363 loops=1)
                    Output: vehicleused.vehicleusedid, vehicleused.vehicleuseddisplaypriority, vehicleused.hasvehicleusedthumbnail,
                            vehicleused.hasvehicleusedprice, vehicleused.vehicleusedprice, vehicleused.hasvehicleusedmileage,
                            vehicleused.vehicleusedmileage, vehicleused.iscpo, vehicleused.ismtca
                    Hash Cond: (lower((vehicleused.postalcode)::text) = lower((postalcoderegioncountycity.postalcode)::text))
                    ->  Seq Scan on vehicleused  (cost=0.00..51807.63 rows=402058 width=45) (actual time=0.016..1454.616 rows=398495 loops=1)
                          Output: vehicleused.vehicleusedid, vehicleused.datasetid, vehicleused.vehicleusedproductid,
                                  vehicleused.sellernodeid, vehicleused.vehicleyear, vehicleused.vehiclemakeid,
                                  vehicleused.vehiclemodelid, vehicleused.vehiclesubmodelid, vehicleused.vehiclebodystyleid,
                                  vehicleused.vehicledoors, vehicleused.vehicleenginetypeid, vehicleused.vehicledrivetrainid,
                                  vehicleused.vehicleexteriorcolorid, vehicleused.hasvehicleusedthumbnail, vehicleused.postalcode,
                                  vehicleused.vehicleusedprice, vehicleused.vehicleusedmileage, vehicleused.buyerguid,
                                  vehicleused.vehicletransmissiontypeid, vehicleused.vehicleusedintid,
                                  vehicleused.vehicleuseddisplaypriority, vehicleused.vehicleusedsearchradius,
                                  vehicleused.vehiclejatoimagepath, vehicleused.vehiclebodystylegroupid, vehicleused.productid,
                                  vehicleused.productgroupid, vehicleused.vehiclevin, vehicleused.vehicleclassgroupid,
                                  vehicleused.vehiclegenericexteriorcolorid, vehicleused.highlight, vehicleused.buyerid,
                                  vehicleused.dealerid, vehicleused.hasvehicleusedprice, vehicleused.dealerstockid,
                                  vehicleused.datesold, vehicleused.hasthumbnailimagepath, vehicleused.vehicleinteriorcolorid,
                                  vehicleused.vehicleconditionid, vehicleused.vehicletitletypeid, vehicleused.warranty,
                                  vehicleused.thumbnailimagepath, vehicleused.fullsizeimagepath, vehicleused.description,
                                  vehicleused.inserteddate, vehicleused.feeddealerid, vehicleused.vehicleusedpricerangefloor,
                                  vehicleused.vehicleusedmileagerangefloor, vehicleused.hasvehicleusedmileage,
                                  vehicleused.VehicleUsedIntId.distinct_count, vehicleused.VehicleUsedPrice.average,
                                  vehicleused.VehicleUsedId.distinct_count, vehicleused.iscpo, vehicleused.ismtca,
                                  vehicleused.cpoprogramoemid, vehicleused.cpoprogram3rdpartyid
                          Filter: ((vehicleusedpricerangefloor >= 0) AND (vehicleusedpricerangefloor <= 15000))
                    ->  Hash  (cost=283.32..283.32 rows=522 width=6) (actual time=1.084..1.084 rows=532 loops=1)
                          Output: postalcoderegioncountycity.postalcode
                          ->  Bitmap Heap Scan on postalcoderegioncountycity  (cost=12.30..283.32 rows=522 width=6) (actual time=0.092..0.361 rows=532 loops=1)
                                Output: postalcoderegioncountycity.postalcode
                                Recheck Cond: (regionid = 26)
                                ->  Bitmap Index Scan on postalcoderegioncountycity_i05  (cost=0.00..12.17 rows=522 width=0) (actual time=0.082..0.082 rows=532 loops=1)
                                      Index Cond: (regionid = 26)
Total runtime: 1945.244 ms

-----Original Message-----
From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Craig James
Sent: Friday, August 27, 2010 5:42 PM
To: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Using Between
On Mon, Aug 30, 2010 at 12:51 PM, Ozer, Pam <pozer@automotive.com> wrote:
> Yes. ANALYZE was run after we loaded the data. Thanks for your
> assistance. Here is the full query.

How many rows are in the vehicleused table in total? Is your database small enough to fit in memory? Do you have any non-default settings in postgresql.conf?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
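For what it's worth, all three questions can be answered with standard catalog queries from psql; a quick sketch:

    -- total rows in the table
    SELECT count(*) FROM vehicleused;

    -- on-disk size of the table (including indexes) and of the whole database
    SELECT pg_size_pretty(pg_total_relation_size('vehicleused'));
    SELECT pg_size_pretty(pg_database_size(current_database()));

    -- settings that did not come from the built-in defaults
    SELECT name, setting, source FROM pg_settings WHERE source <> 'default';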
There are 850,000 records in vehicleused. And the database is too big to be kept in memory. Here are our config settings.

listen_addresses = '*'          # what IP address(es) to listen on;
                                # comma-separated list of addresses;
                                # defaults to 'localhost', '*' = all
                                # (change requires restart)
port = 5432                     # (change requires restart)
max_connections = 100           # (change requires restart)
bonjour_name = 'colapcnt1d'     # defaults to the computer name
                                # (change requires restart)
shared_buffers = 500MB          # min 128kB
effective_cache_size = 1000MB
log_destination = 'stderr'      # Valid values are combinations of
logging_collector = on          # Enable capturing of stderr and csvlog
datestyle = 'iso, mdy'
lc_messages = 'en_US.UTF-8'     # locale for system error message strings
lc_monetary = 'en_US.UTF-8'     # locale for monetary formatting
lc_numeric = 'en_US.UTF-8'      # locale for number formatting
lc_time = 'en_US.UTF-8'         # locale for time formatting
# default configuration for text search
default_text_search_config = 'pg_catalog.english'
max_connections = 100
temp_buffers = 100MB
work_mem = 100MB
maintenance_work_mem = 500MB
max_files_per_process = 10000
seq_page_cost = 1.0
random_page_cost = 1.1
cpu_tuple_cost = 0.1
cpu_index_tuple_cost = 0.05
cpu_operator_cost = 0.01
default_statistics_target = 1000
autovacuum_max_workers = 1
#log_min_messages = DEBUG1
#log_min_duration_statement = 1000
#log_statement = all
#log_temp_files = 128
#log_lock_waits = on
#log_line_prefix = '%m %u %d %h %p %i %c %l %s'
#log_duration = on
#debug_print_plan = on

-----Original Message-----
From: Robert Haas [mailto:robertmhaas@gmail.com]
Sent: Tuesday, September 21, 2010 12:35 PM
To: Ozer, Pam
Cc: Craig James; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Using Between
On Tue, Sep 21, 2010 at 4:04 PM, Ozer, Pam <pozer@automotive.com> wrote:
> There are 850,000 records in vehicleused. And the database is too big to be kept in memory.

Ah. So in other words, you are retrieving about half the rows in that table. For those kinds of queries, using the index tends to actually be slower, because (1) you read the index in addition to reading the table, which has CPU and I/O cost, and (2) instead of reading the table sequentially, you end up jumping around and reading it out of order, which tends to result in more disk seeks and defeats the OS prefetch logic.

The query planner is usually pretty smart about making good decisions about this kind of thing. As a debugging aid (but never in production), you can try disabling enable_seqscan and see what plan you get that way. If it's slower, well then the query planner did the right thing. If it's faster, then probably you need to adjust seq_page_cost and random_page_cost a bit. But my guess is that it will be somewhere between a lot slower and only very slightly faster.

A whole different line of inquiry is to ask the more general question "how can I make this query faster?", but I'm not sure whether you're unhappy with how the query is running or just curious about why the index isn't being used.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
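Concretely, that experiment might look like this in psql (a sketch using the simpler query from the top of the thread; the cost value at the end is a hypothetical illustration, not a recommendation):

    SET enable_seqscan = off;   -- session-local; strongly discourages seq scans, debugging only
    EXPLAIN ANALYZE
    SELECT DISTINCT VehicleId
    FROM Vehicle
    WHERE VehicleMileage BETWEEN 0 AND 15000;
    RESET enable_seqscan;       -- restore normal planner behavior

    -- If the forced index plan actually wins, adjust the cost model so the
    -- planner reaches that plan on its own, rather than leaving the scan type
    -- disabled, e.g.:
    -- SET random_page_cost = 1.05;  -- hypothetical value; lowering it favors index scans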
The question is how can we make it faster.

-----Original Message-----
From: Robert Haas [mailto:robertmhaas@gmail.com]
Sent: Wednesday, September 22, 2010 3:52 AM
To: Ozer, Pam
Cc: Craig James; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Using Between
On Wed, Sep 22, 2010 at 11:18 AM, Ozer, Pam <pozer@automotive.com> wrote:
> The question is how can we make it faster.

If there's just one region ID for any given postal code, you might try adding a column to vehicleused and storing the region IDs there. You could possibly populate that column using a trigger; probably it doesn't change unless the postalcode changes. Then you could index that column and query against it directly, rather than joining to PostalCodeRegionCountyCity. Short of that, I don't see any obvious way to avoid reading most of the vehicleused table. There may or may not be an index that can speed that up slightly, and of course you can always throw hardware at the problem, but fundamentally reading half a million or more rows isn't going to be instantaneous.

Incidentally, it would probably simplify things to store postal codes in the same case throughout the system. If you can avoid the need to write lower(x) = lower(y) and just write x = y you may get better plans. I'm not sure that's the case in this particular example but it's something to think about.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
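A rough sketch of that denormalization, assuming each postal code maps to exactly one region (the column, index, function, and trigger names here are made up):

    ALTER TABLE vehicleused ADD COLUMN regionid integer;

    -- backfill from the lookup table
    UPDATE vehicleused v
       SET regionid = p.regionid
      FROM postalcoderegioncountycity p
     WHERE lower(v.postalcode) = lower(p.postalcode);

    CREATE INDEX vehicleused_region_price
        ON vehicleused (regionid, vehicleusedpricerangefloor);

    -- keep the new column current as rows are inserted or updated
    CREATE OR REPLACE FUNCTION set_vehicleused_regionid() RETURNS trigger AS $$
    BEGIN
        SELECT p.regionid INTO NEW.regionid
          FROM postalcoderegioncountycity p
         WHERE lower(p.postalcode) = lower(NEW.postalcode)
         LIMIT 1;
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER vehicleused_regionid_trg
        BEFORE INSERT OR UPDATE ON vehicleused
        FOR EACH ROW EXECUTE PROCEDURE set_vehicleused_regionid();

    -- the search can then skip the join entirely, e.g.:
    SELECT DISTINCT vehicleusedid
      FROM vehicleused
     WHERE regionid = 26
       AND vehicleusedpricerangefloor BETWEEN 0 AND 15000;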
Thank you. I will take a look at those suggestions.

-----Original Message-----
From: Robert Haas [mailto:robertmhaas@gmail.com]
Sent: Wednesday, September 22, 2010 9:27 AM
To: Ozer, Pam
Cc: Craig James; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Using Between
>>> The question is how can we make it faster.

Something else you might test is bumping the read-ahead value. Most Linux installs have this at 256; you might try bumping the value to ~8Meg and tune from there. This may help you slightly for seq scan performance.

As always: YMMV. It's not going to magically fix low-performing I/O subsystems, and it won't help many applications of PG, but there are a few outlying instances where this change can help a little bit.

I am sure someone will step in and tell you it is a bad idea - AND they will probably have perfectly valid reasons for why it is, so you will need to consider the ramifications. If at all possible, test and tune to see.

..: Mark