Re: Speeding up query, Joining 55mil and 43mil records. - Mailing list pgsql-performance

From nicky
Subject Re: Speeding up query, Joining 55mil and 43mil records.
Date
Msg-id 449A88CA.3000108@valuecare.nl
Whole thread Raw
In response to Re: Speeding up query, Joining 55mil and 43mil records.  (Sven Geisler <sgeisler@aeccom.com>)
Responses Re: Speeding up query, Joining 55mil and 43mil records.
Re: Speeding up query, Joining 55mil and 43mil records.
List pgsql-performance
Hello Sven,

We have the following indexes on src_faktuur_verrsec
/
    CREATE INDEX src_faktuur_verrsec_idx0
      ON src.src_faktuur_verrsec
      USING btree
      (id);

    CREATE INDEX src_faktuur_verrsec_idx1
      ON src.src_faktuur_verrsec
      USING btree
      (substr(code::text, 1, 2));

    CREATE INDEX src_faktuur_verrsec_idx2
      ON src.src_faktuur_verrsec
      USING btree
      (substr(correctie::text, 4, 1));/

and another two on src_faktuur_verricht

/    CREATE INDEX src_faktuur_verricht_idx0
      ON src.src_faktuur_verricht
      USING btree
      (id);

    CREATE INDEX src_faktuur_verricht_idx1
      ON src.src_faktuur_verricht
      USING btree
      (date_part('year'::text, datum))
      TABLESPACE src_index;/

PostgreSQL elects not to use them. I assume, because it most likely
needs to traverse the entire table anyway.

if i change: /              substr(t0.code,1,2) not in
('14','15','16','17')/
to (removing the NOT): /    substr(t0.code,1,2) in ('14','15','16','17')/

it uses the index, but it's not the query that needs to be run anymore.

Greetings,
Nick




Sven Geisler wrote:
> Hi Nicky,
>
> Did you tried to create an index to avoid the sequential scans?
>
> Seq Scan on src_faktuur_verrsec t0...
>
> I think, you should try
>
> CREATE INDEX src.src_faktuur_verrsec_codesubstr ON
> src.src_faktuur_verrsec (substr(src.src_faktuur_verrsec.code,1,2))
>
> Cheers
> Sven.
>
> nicky schrieb:
>> Hello again,
>>
>> thanks for all the quick replies.
>>
>> It seems i wasn't entirely correct on my previous post, i've mixed up
>> some times/numbers.
>>
>> Below the correct numbers
>>
>> MSSQL:      SELECT COUNT(*) from JOIN (without insert)   17 minutes
>> PostgreSQL: SELECT COUNT(*) from JOIN (without insert)   33 minutes
>> PostgreSQL: complete query                               55 minutes
>
>  <snip snip snip>
>>
>> A lot of improvement also in the select count: 33 minutes vs 10 minutes.
>>
>>
>> To us, the speeds are good. Very happy with the performance increase
>> on that select with join, since 90% of the queries are SELECT based.
>>
>> The query results in 7551616 records, so that's about 4500 inserts
>> per second. I'm not sure if that is fast or not. Any further tips
>> would be welcome.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>       choose an index scan if your joining column's datatypes do not
>       match
>
>

pgsql-performance by date:

Previous
From: Sven Geisler
Date:
Subject: Re: Speeding up query, Joining 55mil and 43mil records.
Next
From: "Magnus Hagander"
Date:
Subject: Re: Speeding up query, Joining 55mil and 43mil records.