WG: Re: Re: select ... where ='' does a seqscan [auf Viren geprueft] - Mailing list pgsql-general

From Silvio Matthes
Subject WG: Re: Re: select ... where ='' does a seqscan [auf Viren geprueft]
Date
Msg-id OF44F87C3C.9D34FDA4-ONC1256EFA.005AF6EE-C1256EFA.005AFD20@xcom.de
Whole thread Raw
Responses Re: WG: Re: Re: select ... where ='' does a seqscan [auf Viren geprueft]
List pgsql-general

>Am Dienstag, 24. August 2004 11:59 schrieb Silvio Matthes:
>> So using the index does need more time than a sequential scan?

>It's possible.  If you want to prove the opposite, please post the output of
>EXPLAIN ANALYZE in both cases.

On my system, with PostgresQL 8.0Beta1, I could prove the opposite, but that's not the point for me.
Both cases are too slow:

set enable_seqscan=true  
explain analyze select count(*) from document_params where param_name='KONTO_NR' and param_value='';

                                                           QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=241980.45..241980.45 rows=1 width=0) (actual time=47766.000..47766.000 rows=1 loops=1)
   ->  Seq Scan on document_params  (cost=0.00..241600.27 rows=152073 width=0) (actual time=0.000..47719.000 rows=6672 loops=1)
         Filter: (((param_name)::text = 'KONTO_NR'::text) AND ((param_value)::text = ''::text))
 Total runtime: 47938.000 ms
(4 rows)



set enable_seqscan=false
explain analyze select count(*) from document_params where param_name='KONTO_NR' and param_value='';

                                                                     QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=592164.03..592164.03 rows=1 width=0) (actual time=31828.000..31828.000 rows=1 loops=1)
   ->  Index Scan using idx_empty on document_params  (cost=0.00..591783.84 rows=152073 width=0) (actual time=140.000..31782.000 rows=6672 loops=1)
         Index Cond: ((param_name)::text = 'KONTO_NR'::text)
         Filter: ((param_value)::text = ''::text)
 Total runtime: 31828.000 ms
(5 rows)


In other tests the server cached the responses so I cannot compare the results, in other cases the server did need some 20s for an index-scan, probably due to too low configured memory.

This brings me to the following questions:

1.) What can I do to avoid seqscans if I want rows with empty param_value?
2.) Would it be wiser to change all the empty values to some placeholder (e.g. 'empty')?
    From the tests made up to now, this would be the fastest solution, but wouldn't be very nice in concern of readability...

a bit offtopic, but related for me:
3.) how can I prevent the server from caching or empty the cache under win32 (stopping and re-starting the service doesn't help) for comparing similar statements?
4.) how can I change the memory settings for the postmaster (v8.0beta) in win32?


As always, any help would be highly appreciated,

Ciao,

Silvio Matthes

pgsql-general by date:

Previous
From: Gaetano Mendola
Date:
Subject: Re: Postgresql 8.0 beta 1 - strange cpu usage statistics and slow
Next
From: Silvio Matthes
Date:
Subject: Re: select ... where ='' does a seqscan [auf Viren