Re: less than 2 sec for response - possible? - Mailing list pgsql-performance
From | trafdev
Subject | Re: less than 2 sec for response - possible?
Date |
Msg-id | 0461b3e8-cb38-0ded-6e8c-48e6dc545ea6@mail.ru
In response to | Re: less than 2 sec for response - possible? (Torsten Zuehlsdorff <mailinglists@toco-domains.de>)
Responses | Re: less than 2 sec for response - possible?
List | pgsql-performance
Hi, yes, I've tried that in the past; it makes no difference at all:

With TIMESTAMP cast:

QUERY PLAN
HashAggregate  (cost=1405666.90..1416585.93 rows=335970 width=86) (actual time=4794.585..4923.062 rows=126533 loops=1)
  Group Key: subid, sid
  Buffers: shared hit=1486949
  ->  Index Scan using ix_feed_sub_aid_date on feed_sub  (cost=0.44..1313275.32 rows=3359694 width=86) (actual time=0.020..1736.005 rows=3588376 loops=1)
        Index Cond: ((aid = 3) AND (date >= '2016-06-01 00:00:00'::timestamp without time zone) AND (date <= '2016-06-30 00:00:00'::timestamp without time zone))
        Buffers: shared hit=1486949
Planning time: 0.158 ms
Execution time: 4939.965 ms

Without TIMESTAMP cast:

QUERY PLAN
HashAggregate  (cost=1405666.90..1416585.93 rows=335970 width=86) (actual time=4797.272..4924.015 rows=126533 loops=1)
  Group Key: subid, sid
  Buffers: shared hit=1486949
  ->  Index Scan using ix_feed_sub_aid_date on feed_sub  (cost=0.44..1313275.32 rows=3359694 width=86) (actual time=0.019..1783.104 rows=3588376 loops=1)
        Index Cond: ((aid = 3) AND (date >= '2016-06-01'::date) AND (date <= '2016-06-30'::date))
        Buffers: shared hit=1486949
Planning time: 0.164 ms
Execution time: 4941.259 ms

I need to be sure whether this is a physical limitation of PostgreSQL (i.e. that even with all data in memory, fetching/joining 1.5 mln rows can't be done in less than 2-3 seconds) and that there is no way to improve it.
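If it really is, the fallback I would consider is pre-aggregation. Purely as a rough sketch (the rollup name, the dropped dimensions and the refresh schedule below are illustrative assumptions, nothing like this exists yet):

CREATE MATERIALIZED VIEW stats.feed_sub_daily_rollup AS
SELECT date, aid, sid, subid,
       sum(c_filt_click_cap_ip) AS clicks_from_ip,
       sum(c_filt_doubleclick)  AS clicks_on_target,
       sum(c_filt_delay_clicks) AS ip_click_period,
       sum(c_filt_fast_click)   AS fast_click,
       sum(c_filt_ip_mismatch)  AS ip_mismatch,
       sum(c_filt_lng_mismatch) AS lng_mismatch,
       sum(c_filt_ref_mismatch) AS ref_mismatch,
       sum(c_filt_ua_mismatch)  AS ua_mismatch,
       sum(c_filt_url_expired)  AS url_expired
FROM stats.feed_sub
WHERE gran = '1 day'            -- keep only the daily grain, drop pid/fid
GROUP BY date, aid, sid, subid;

CREATE INDEX ON stats.feed_sub_daily_rollup (aid, date);

-- re-run from cron after each load; report queries read the rollup instead:
REFRESH MATERIALIZED VIEW stats.feed_sub_daily_rollup;

The totals query would then only re-sum the rollup for the selected range:

SELECT
  sum(clicks_from_ip)   AS clicks_from_ip,
  sum(clicks_on_target) AS clicks_on_target,
  sum(ip_click_period)  AS ip_click_period,
  sum(fast_click)       AS fast_click,
  sum(ip_mismatch)      AS ip_mismatch,
  sum(lng_mismatch)     AS lng_mismatch,
  sum(ref_mismatch)     AS ref_mismatch,
  sum(ua_mismatch)      AS ua_mismatch,
  sum(url_expired)      AS url_expired,
  subid, sid
FROM stats.feed_sub_daily_rollup
WHERE aid = 3
  AND date >= '2016-06-01' AND date <= '2016-06-30'
GROUP BY subid, sid;

Whether that buys anything depends on how much the pid/fid fan-out inflates feed_sub; if the daily (aid, sid, subid) cardinality is close to the base table's, a rollup changes little.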
On 07/05/16 04:39, Torsten Zuehlsdorff wrote:
>
> On 02.07.2016 02:54, trafdev wrote:
> > Hi.
> >
> > I'm trying to build an OLAP-oriented DB based on PostgreSQL.
> >
> > User works with a paginated report in the web-browser. Interface allows
> > to fetch data for a custom date-range selection,
> > display individual rows (20-50 per page) and totals (for entire
> > selection, even not visible on the current page) and sorting by any
> > column.
> >
> > The main goal is to deliver results of the basic SELECT queries to the
> > end-user in less than 2 seconds.
> >
> > I was able to achieve that except for one table (the biggest one).
> >
> > It consists of multiple dimensions (date, gran, aid, pid, sid, fid,
> > subid) and metrics (see below).
> > User can filter by any dimension and sort by any metric.
> >
> > Here is a CREATE script for this table:
> >
> > CREATE TABLE stats.feed_sub
> > (
> >   date date NOT NULL,
> >   gran interval NOT NULL,
> >   aid smallint NOT NULL,
> >   pid smallint NOT NULL,
> >   sid smallint NOT NULL,
> >   fid smallint NOT NULL,
> >   subid text NOT NULL,
> >   rev_est_pub real NOT NULL,
> >   rev_est_feed real NOT NULL,
> >   rev_raw real NOT NULL,
> >   c_total bigint NOT NULL,
> >   c_passed bigint NOT NULL,
> >   q_total bigint NOT NULL,
> >   q_passed bigint NOT NULL,
> >   q_filt_geo bigint NOT NULL,
> >   q_filt_browser bigint NOT NULL,
> >   q_filt_os bigint NOT NULL,
> >   q_filt_ip bigint NOT NULL,
> >   q_filt_subid bigint NOT NULL,
> >   q_filt_pause bigint NOT NULL,
> >   q_filt_click_cap_ip bigint NOT NULL,
> >   q_filt_query_cap bigint NOT NULL,
> >   q_filt_click_cap bigint NOT NULL,
> >   q_filt_rev_cap bigint NOT NULL,
> >   q_filt_erpm_floor bigint NOT NULL,
> >   c_filt_click_cap_ip bigint NOT NULL,
> >   c_filt_doubleclick bigint NOT NULL,
> >   c_filt_url_expired bigint NOT NULL,
> >   c_filt_fast_click bigint NOT NULL,
> >   c_filt_delay_clicks bigint NOT NULL,
> >   c_filt_ip_mismatch bigint NOT NULL,
> >   c_filt_ref_mismatch bigint NOT NULL,
> >   c_filt_lng_mismatch bigint NOT NULL,
> >   c_filt_ua_mismatch bigint NOT NULL,
> >   res_impr bigint NOT NULL,
> >   rev_ver_pub real,
> >   rev_ver_feed real,
> >   c_ver bigint,
> >   q_filt_ref bigint NOT NULL
> > )
> > WITH (
> >   OIDS=FALSE
> > );
> >
> > CREATE INDEX ix_feed_sub_date
> >   ON stats.feed_sub
> >   USING brin
> >   (date);
> >
> > CREATE UNIQUE INDEX ixu_feed_sub
> >   ON stats.feed_sub
> >   USING btree
> >   (date, gran, aid, pid, sid, fid, subid COLLATE pg_catalog."default");
> >
> > Here is some sizing info (https://wiki.postgresql.org/wiki/Disk_Usage):
> >
> > relation,size
> > stats.feed_sub,5644 MB
> > stats.ixu_feed_sub,1594 MB
> >
> > row_estimate
> > 15865627
> >
> > Here is the typical query (for the totals page):
> >
> > SELECT
> >   sum(stats.feed_sub.c_filt_click_cap_ip) AS clicks_from_ip,
> >   sum(stats.feed_sub.c_filt_doubleclick)  AS clicks_on_target,
> >   sum(stats.feed_sub.c_filt_delay_clicks) AS ip_click_period,
> >   sum(stats.feed_sub.c_filt_fast_click)   AS fast_click,
> >   sum(stats.feed_sub.c_filt_ip_mismatch)  AS ip_mismatch,
> >   sum(stats.feed_sub.c_filt_lng_mismatch) AS lng_mismatch,
> >   sum(stats.feed_sub.c_filt_ref_mismatch) AS ref_mismatch,
> >   sum(stats.feed_sub.c_filt_ua_mismatch)  AS ua_mismatch,
> >   sum(stats.feed_sub.c_filt_url_expired)  AS url_expired,
> >   stats.feed_sub.subid AS stats_feed_sub_subid,
> >   stats.feed_sub.sid   AS stats_feed_sub_sid
> > FROM stats.feed_sub
> > WHERE stats.feed_sub.date >= '2016-06-01' :: TIMESTAMP AND
> >       stats.feed_sub.date <= '2016-06-30' :: TIMESTAMP AND
> >       stats.feed_sub.gran = '1 day'
> >       AND stats.feed_sub.aid = 3
> > GROUP BY
> >   stats.feed_sub.subid, stats.feed_sub.sid;
>
> You cast every date to a timestamp. Why? You can adjust the index to:
>
> CREATE UNIQUE INDEX ixu_feed_sub
>   ON stats.feed_sub
>   USING btree
>   (date::timestamp, gran, aid, pid, sid, fid, subid COLLATE
>   pg_catalog."default");
>
> But since I see no need for the cast at all (maybe I missed it), try it
> without!
>
> Greetings,
> Torsten
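One side note on the quoted index suggestion, purely for reference: to index the cast expression itself, PostgreSQL needs an extra pair of parentheses around the expression (as quoted, CREATE INDEX raises a syntax error). It would have to look something like this, where the index name is just a placeholder:

CREATE UNIQUE INDEX ixu_feed_sub_ts
  ON stats.feed_sub
  USING btree
  ((date::timestamp), gran, aid, pid, sid, fid, subid COLLATE pg_catalog."default");

But as the two plans above show, casting the literals makes no difference here anyway, since the column is already of type date.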