Re: Consecutive Query Executions with Increasing Execution Time - Mailing list pgsql-performance

From Shijia Wei
Subject Re: Consecutive Query Executions with Increasing Execution Time
Date
Msg-id CABSfb-5ZcO48K5sS0oM3njQYpTo-K2MdxMZ1pM0DpiBFGPqw9g@mail.gmail.com
Whole thread Raw
In response to Re: Consecutive Query Executions with Increasing Execution Time  (Olivier Gautherot <ogautherot@gautherot.net>)
Responses Re: Consecutive Query Executions with Increasing Execution Time  (Olivier Gautherot <ogautherot@gautherot.net>)
List pgsql-performance
Hi Olivier,

I do not think that the queries are executed concurrently. The bash for loop ensures that the next command fires only after the first returns.
Also for some 'complex' queries, even a wait-period that is longer than the total execution time does not completely avoid this effect.
For example, a wait-period of 5-second in between queries that take 2-second to run, does not help avoid the increasing runtime problem completely.

Thanks,
Shijia


On Mon, Dec 16, 2019 at 2:04 AM Olivier Gautherot <ogautherot@gautherot.net> wrote:
Hi Shijia,

It sounds like concurrency on the queries: the second starts before the first ends, and so on. With a short wait in between you ensure sequential execution. Notice that you also have the overhead of concurrent psql...

Sounds normal to me.

Best regards
Olivier


On Mon, Dec 16, 2019, 07:00 Shijia Wei <shijiawei@utexas.edu> wrote:
Hi all,

I am running TPC-H on recent postgresql (12.0 and 12.1).
On some of the queries (that may involve parallel scans) I see this interesting behavior:
When these queries are executed back-to-back (sent from psql interactive terminal), the total execution time of them increase monotonically.

I simplified query-1 to demonstrate this effect:
``` example.sql
explain (analyze, buffers) select
        max(l_shipdate) as max_data,
        count(*) as count_order
from
        lineitem
where
        l_shipdate <= date '1998-12-01' - interval '20' day;
```

When I execute (from fish) following command:
`for i in (seq 1 20); psql tpch < example.sql | grep Execution; end`
The results I get are as follows:
"
 Execution Time: 184.864 ms
 Execution Time: 192.758 ms
 Execution Time: 197.380 ms
 Execution Time: 200.384 ms
 Execution Time: 202.950 ms
 Execution Time: 205.695 ms
 Execution Time: 208.082 ms
 Execution Time: 209.108 ms
 Execution Time: 212.428 ms
 Execution Time: 214.539 ms
 Execution Time: 215.799 ms
 Execution Time: 219.057 ms
 Execution Time: 222.102 ms
 Execution Time: 223.779 ms
 Execution Time: 227.819 ms
 Execution Time: 229.710 ms
 Execution Time: 239.439 ms
 Execution Time: 237.649 ms
 Execution Time: 249.178 ms
 Execution Time: 261.268 ms
"
In addition, if the repeated more times, the total execution time can end up being 10X and more!!!

When there a wait period in-between queries, (e.g. sleep 10) in the above for loop, this increasing execution time behavior goes a way.
For more complex queries, the "wait period" needs to be longer to avoid the increase in execution time.

Some metadata about this table "lineitem":
tpch=# \d lineitem
                         Table "public.lineitem"
     Column      |         Type          | Collation | Nullable | Default
-----------------+-----------------------+-----------+----------+---------
 l_orderkey      | integer               |           | not null |
 l_partkey       | integer               |           | not null |
 l_suppkey       | integer               |           | not null |
 l_linenumber    | integer               |           | not null |
 l_quantity      | numeric(15,2)         |           | not null |
 l_extendedprice | numeric(15,2)         |           | not null |
 l_discount      | numeric(15,2)         |           | not null |
 l_tax           | numeric(15,2)         |           | not null |
 l_returnflag    | character(1)          |           | not null |
 l_linestatus    | character(1)          |           | not null |
 l_shipdate      | date                  |           | not null |
 l_commitdate    | date                  |           | not null |
 l_receiptdate   | date                  |           | not null |
 l_shipinstruct  | character(25)         |           | not null |
 l_shipmode      | character(10)         |           | not null |
 l_comment       | character varying(44) |           | not null |
Indexes:
    "i_l_commitdate" btree (l_commitdate)
    "i_l_orderkey" btree (l_orderkey)
    "i_l_orderkey_quantity" btree (l_orderkey, l_quantity)
    "i_l_partkey" btree (l_partkey)
    "i_l_receiptdate" btree (l_receiptdate)
    "i_l_shipdate" btree (l_shipdate)
    "i_l_suppkey" btree (l_suppkey)
    "i_l_suppkey_partkey" btree (l_partkey, l_suppkey)

tpch=# SELECT relname, relpages, reltuples, relallvisible, relkind, relnatts, relhassubclass, reloptions, pg_table_size(oid) FROM pg_class WHERE relname='lineitem';
 relname  | relpages |  reltuples   | relallvisible | relkind | relnatts | relhassubclass | reloptions | pg_table_size
----------+----------+--------------+---------------+---------+----------+----------------+------------+---------------
 lineitem |   112503 | 6.001167e+06 |        112503 | r       |       16 | f              |            |     921903104
(1 row)


Postgresql 12.0 and 12.1 are all manually installed from source.
Both are running on Ubuntu 16.04 kernel 4.4.0-142-generic, on Intel(R) Core(TM) i7-6700K.


Any help greatly appreciated!

Shijia


--
Shijia Wei
ECE, 
UT Austin | ACSES | 3rd Year PhD

pgsql-performance by date:

Previous
From: Olivier Gautherot
Date:
Subject: Re: Consecutive Query Executions with Increasing Execution Time
Next
From: Olivier Gautherot
Date:
Subject: Re: Consecutive Query Executions with Increasing Execution Time