Re: checking for a NULL date in a partitioned table kills performance - Mailing list pgsql-admin

From Sbob
Subject Re: checking for a NULL date in a partitioned table kills performance
Date
Msg-id b094a0b6-680e-43f0-b466-20868606eea9@quadratum-braccas.com
Whole thread Raw
In response to Re: checking for a NULL date in a partitioned table kills performance  (Sbob <sbob@quadratum-braccas.com>)
Responses Re: checking for a NULL date in a partitioned table kills performance
Re: checking for a NULL date in a partitioned table kills performance
Re: checking for a NULL date in a partitioned table kills performance
List pgsql-admin
On 8/22/24 5:26 PM, Sbob wrote:
>
> On 8/22/24 5:06 PM, Rui DeSousa wrote:
>>
>>> On Aug 22, 2024, at 5:44 PM, Sbob <sbob@quadratum-braccas.com> wrote:
>>>
>>> All;
>>>
>>> I am running a select from a partitioned table. The table (and all 
>>> the partitions) have an index on contract_date like this:
>>> CREATE INDEX on part_tab (contract_date) where contract_date > 
>>> '2022-01-01'::date
>>>
>>> The table (including all partitions) has 32million rows
>>> The db server is an aurora postgresql instance with 128GB of ram and 
>>> 16 vcpu's
>>>
>>> The shared buffers is set to 90GB and effective_cache_size is also 90GB
>>> I set default_statistics_target to 1000 and ram a vacuum analyze on 
>>> the table
>>>
>>> I am selecting a number of columns and specifying this where clause:
>>>
>>> WHERE (
>>>                      (contract_date IS NULL)
>>>                      OR
>>>                      (contract_date > '2022-01-01'::date)
>>>               )
>>>
>>> This takes 15 seconds to run and an explain says it's doing a table 
>>> scan on all partitions (the query is not specifying the partition key)
>>> If I change the where clause to look like this:
>>>
>>> WHERE (
>>>                    (contract_date > '2022-01-01'::date)
>>>               )
>>>
>>> Then it performs index scans on all the partitions and runs in about 
>>> 600ms
>>>
>>> If i leave the where clause off entirely it performs table scans of 
>>> the partitions and takes approx 18 seconds to run
>>>
>>> I am trying to get the performance to less than 2sec,
>>> I have tried adding indexes on the table and all partitions like this:
>>> CREATE INDEX ON table (contract_date NULLS FIRST) ;
>>> but the performance with the full where clause is the same:
>>>
>>> WHERE (
>>>                      (contract_date IS NULL)
>>>                      OR
>>>                      (contract_date > '2022-01-01'::date)
>>>               )
>>>
>>> runs in 15 seconds and scans all partitions
>>>
>>> I also tried indexes i=on the table and all partitions like this:
>>> CREATE INDEX ON table (contract_date) WHERE contract_date IS NULL;
>>>
>>> but I get the same result, table scans on all partitions and it runs 
>>> in 15 seconds
>>>
>>> Any help or advice ?
>>>
>>> Thanks in advance
>>>
>>>
>> What is contract_date and when will it be null?
>
>
> it's a date data type and it allows NULL's not sure why, this is a 
> client's system
>
>
29 million of the 32 million rows in the table have NULL for contract_date







pgsql-admin by date:

Previous
From: Sbob
Date:
Subject: Re: checking for a NULL date in a partitioned table kills performance
Next
From: "David G. Johnston"
Date:
Subject: Re: checking for a NULL date in a partitioned table kills performance