Re: R: [GENERAL] Slow queries on very big (and partitioned) table - Mailing list pgsql-general

From Stephen Frost
Subject Re: R: [GENERAL] Slow queries on very big (and partitioned) table
Date
Msg-id 20170220144959.GO9812@tamriel.snowman.net
Whole thread Raw
In response to R: [GENERAL] Slow queries on very big (and partitioned) table  (Job <Job@colliniconsulting.it>)
Responses Re: R: [GENERAL] Slow queries on very big (and partitioned) table  (Steven Winfield <Steven.Winfield@cantabcapital.com>)
R: R: [GENERAL] Slow queries on very big (and partitioned) table  (Job <Job@colliniconsulting.it>)
List pgsql-general
Greetings,

* Job (Job@colliniconsulting.it) wrote:
> here is primary a partitioned table (for 20/2/2017 logs):
> flashstart=# \d webtraffic_archive_day_2017_02_20;
>                                      Table "public.webtraffic_archive_day_2017_02_20"
>   Column   |            Type             |                                   Modifiers
>
-----------+-----------------------------+--------------------------------------------------------------------------------
>  id        | numeric(1000,1)             | not null default
function_get_next_sequence('webtraffic_archive_id_seq'::text)
>  timestamp | timestamp without time zone |
>  domain    | character varying(255)      |
>  action    | integer                     |
>  profile   | character varying(50)       |
>  accessi   | integer                     |
>  url       | text                        |
> Indexes:
>     "webtraffic_archive_day_2017_02_20_action_wbidx" btree (action)
>     "webtraffic_archive_day_2017_02_20_domain_wbidx" btree (domain)
>     "webtraffic_archive_day_2017_02_20_profile_wbidx" btree (profile)
>     "webtraffic_archive_day_2017_02_20_timestamp_date_wbidx" brin (("timestamp"::date))
>     "webtraffic_archive_day_2017_02_20_timestamp_time_wbidx" btree (("timestamp"::time without time zone))
>     "webtraffic_archive_day_2017_02_20_timestamp_wbidx" brin ("timestamp")

You don't have a CHECK constraint on your individual partitions, so PG
has no idea which of the partitions contains data from what time-period.
To have PostgreSQL's constraint exclusion work to eliminate entire
partitions, you need to add a CHECK constraint on the partition key.  In
this case, it looks like you'd want something like:

CHECK (timestamp >= partition_date AND timestamp < partition_date+1)

Unfortunately, that'll require locking each table and scanning it to
make sure that the CHECK constraint isn't violated.

A couple of other notes- you probably don't need both a BRIN and a btree
index on the same column.  If the BRIN works well enough for you then
you can drop the btree index.  If it doesn't, then you might as well
just keep the btree index and drop the BRIN.  It seems equally
unnecessary to have a BRIN on a cast of the column.  You should also be
using timestamp w/ timezone, really.

> This is the query planner:
> explain analyze
> select * from webtraffic_archive where timestamp::date='2017-02-20' and profile='f62467'

If this is really what you're mostly doing, having constraint exclusion
and an index on 'profile' would probably be enough, if you insist on
continuing to have the table partitioned by day (which I continue to
argue is a bad idea- based on the number of total rows you mentioned and
the number of partitions, you have partitions with less than 20M rows
each and that's really small, month-based partitions with a BRIN would
probably work better).  If you get to the point of having years worth of
daily partitions, you'd going to see increases in planning time.

Thanks!

Stephen

Attachment

pgsql-general by date:

Previous
From: Job
Date:
Subject: R: [GENERAL] Slow queries on very big (and partitioned) table
Next
From: Steven Winfield
Date:
Subject: Re: R: [GENERAL] Slow queries on very big (and partitioned) table