Thread: Zero throughput on a query on a very large table.

Zero throughput on a query on a very large table.

From
"ldh@laurent-hasson.com"
Date:

Hello,

 

We have been stuck for the past week on a query that simply won’t “execute”. We have a table with 1.2B rows that took around 14h to load, but a simple select takes forever and after 10h, no records are coming through still.

 

Environment:

     - Table tmp_outpatient_rev with 41 VARCHAR columns (desy_sort_key, claim_no, clm_line_num, clm_thru_dt, nch_clm_type_cd, rev_cntr, rev_cntr_dt, …)

     - 1.2B rows (Billion with a ‘B’)

     - A single Unique Index on columns desy_sort_key, claim_no, clm_line_num

     - select pg_size_pretty(pg_relation_size('tmp_outpatient_rev')) --> 215GB

     - Database Server: 64GB, 8 cores/16 threads, HDDs 10K

     - Linux

     - PG 11.1

 

Query:

     select * from tmp_outpatient_rev order by desy_sort_key, claim_no

 

Plan:

    Gather Merge  (cost=61001461.16..216401602.29 rows=1242732290 width=250)

      Output: desy_sort_key, claim_no, clm_line_num, clm_thru_dt, nch_clm_type_cd, rev_cntr, rev_cntr_dt, …

      Workers Planned: 10

      ->  Sort  (cost=61000460.97..61311144.04 rows=124273229 width=250)

            Output: desy_sort_key, claim_no, clm_line_num, clm_thru_dt, nch_clm_type_cd, rev_cntr, rev_cntr_dt, …

            Sort Key: tmp_outpatient_rev.desy_sort_key, tmp_outpatient_rev.claim_no

            ->  Parallel Seq Scan on public.tmp_outpatient_rev  (cost=0.00..29425910.29 rows=124273229 width=250)

                  Output: desy_sort_key, claim_no, clm_line_num, clm_thru_dt, nch_clm_type_cd, rev_cntr, rev_cntr_dt, …

 

Method of access:

    - Using Pentaho Kettle (an ETL tool written in Java and using JDBC), we simply issue the query and expect records to start streaming in ASAP.

   - Issue was replicated with really basic JDBC code in a Java test program.

    - The database doesn't have much other data and the table was loaded from a CSV data source with LOAD over something like 14h (average throughput of about 25K rows/s)

    - Settings:

              alter database "CMS_TMP" set seq_page_cost=1;

              alter database "CMS_TMP" set random_page_cost=4;

              alter database "CMS_TMP" set enable_seqscan=true;

              JDBC connection string with no extra params.

              Database has been generally configured properly.

 

Problem:

    - The plan shows a full table scan followed by a sort, and then a gather merge. With 1.2B rows, that's crazy to try to sort that 😊

    - After 10h, the query is still "silent" and no record is streaming in. IO is very high (80-90% disk throughput utilization) on the machine (the sort…).

    - I have tried to hack the planner to force an index scan (which would avoid the sort/gather steps and should start streaming data right away), in particular, enable_seqscan=false or seq_page_cost=2. This had ZERO impact on the plan to my surprise.

   - I changed the “order by” to include all 3 columns from the index, or created a non-unique index with only the first 2 columns, all to no effect whatsoever either.

    - The table was written over almost 14h at about 25K row/s and it seems to me I should be able to read the data back at least as fast.

 

Why is a simple index scan not used? Why are all our efforts to try to force the use of the index failing?

 

Any help is very much appreciated as we are really hitting a wall here with that table.

 

Thank you so much.

 

 

Laurent Hasson

 

 

 

 

Re: Zero throughput on a query on a very large table.

From
Andreas Kretschmer
Date:

On 25.01.19 at 06:20, ldh@laurent-hasson.com wrote:
>
> Hello,
>
> We have been stuck for the past week on a query that simply won’t 
> “execute”. We have a table with 1.2B rows that took around 14h to 
> load, but a simple select takes forever and after 10h, no records are 
> coming through still.
>
> Environment:
>
>      - Table tmp_outpatient_rev with 41 VARCHAR columns 
> (desy_sort_key, claim_no, clm_line_num, clm_thru_dt, nch_clm_type_cd, 
> rev_cntr, rev_cntr_dt, …)
>
>      - 1.2B rows (Billion with a ‘B’)
>
>      - A single Unique Index on columns desy_sort_key, claim_no, 
> clm_line_num
>
>      - select pg_size_pretty(pg_relation_size('tmp_outpatient_rev')) 
> --> 215GB
>
>      - Database Server: 64GB, 8 cores/16 threads, HDDs 10K
>
>      - Linux
>
>      - PG 11.1
>
> Query:
>
>      select * from tmp_outpatient_rev order by desy_sort_key, claim_no
>
> Plan:
>
>     Gather Merge (cost=61001461.16..216401602.29 rows=1242732290 
> width=250)
>
>       Output: desy_sort_key, claim_no, clm_line_num, clm_thru_dt, 
> nch_clm_type_cd, rev_cntr, rev_cntr_dt, …
>
>       Workers Planned: 10
>
>       ->  Sort (cost=61000460.97..61311144.04 rows=124273229 width=250)
>
>             Output: desy_sort_key, claim_no, clm_line_num, 
> clm_thru_dt, nch_clm_type_cd, rev_cntr, rev_cntr_dt, …
>
>             Sort Key: tmp_outpatient_rev.desy_sort_key, 
> tmp_outpatient_rev.claim_no
>
>             ->  Parallel Seq Scan on public.tmp_outpatient_rev  
> (cost=0.00..29425910.29 rows=124273229 width=250)
>
>                   Output: desy_sort_key, claim_no, clm_line_num, 
> clm_thru_dt, nch_clm_type_cd, rev_cntr, rev_cntr_dt, …
>
> Method of access:
>
>     - Using Pentaho Kettle (an ETL tool written in Java and using 
> JDBC), we simply issue the query and expect records to start streaming 
> in ASAP.
>
>    - Issue was replicated with really basic JDBC code in a Java test 
> program.
>
>     - The database doesn't have much other data and the table was 
> loaded from a CSV data source with LOAD over something like 14h 
> (average throughput of about 25K rows/s)
>
>     - Settings:
>
>               alter database "CMS_TMP" set seq_page_cost=1;
>
>               alter database "CMS_TMP" set random_page_cost=4;
>
>               alter database "CMS_TMP" set enable_seqscan=true;
>
>               JDBC connection string with no extra params.
>
>               Database has been generally configured properly.
>
> Problem:
>
>     - The plan shows a full table scan followed by a sort, and then a 
> gather merge. With 1.2B rows, that's crazy to try to sort that 😊
>
>     - After 10h, the query is still "silent" and no record is 
> streaming in. IO is very high (80-90% disk throughput utilization) on 
> the machine (the sort…).
>
>     - I have tried to hack the planner to force an index scan (which 
> would avoid the sort/gather steps and should start streaming data 
> right away), in particular, enable_seqscan=false or seq_page_cost=2. 
> This had ZERO impact on the plan to my surprise.
>
>    - I changed the “order by” to include all 3 columns from the index, 
> or created a non-unique index with only the first 2 columns, all to no 
> effect whatsoever either.
>
>     - The table was written over almost 14h at about 25K row/s and it 
> seems to me I should be able to read the data back at least as fast.
>
> Why is a simple index scan not used? Why are all our efforts to try to 
> force the use of the index failing?
>
>

the query isn't that simple: there is no where condition, so PG has to 
read the whole table and the index is useless. Would it be enough to 
select only the columns covered by the index?
(run a vacuum on the table after loading the data; that can enable an 
index-only scan in this case)
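
A minimal sketch of that approach (untested here; the column names are
taken from the original post, and the VACUUM is what sets the visibility
map so an index-only scan becomes possible):

    -- after the bulk load, so the visibility map gets populated
    VACUUM ANALYZE tmp_outpatient_rev;

    -- select only columns covered by the unique index, in index order
    EXPLAIN
    SELECT desy_sort_key, claim_no, clm_line_num
    FROM tmp_outpatient_rev
    ORDER BY desy_sort_key, claim_no, clm_line_num;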




Regards, Andreas

-- 
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com



Re: Zero throughput on a query on a very large table.

From
Tom Lane
Date:
"ldh@laurent-hasson.com" <ldh@laurent-hasson.com> writes:
> Query:
>      select * from tmp_outpatient_rev order by desy_sort_key, claim_no

> Plan:
> [ seqscan-and-sort ... parallelized, but still seqscan-and-sort ]

>     - I have tried to hack the planner to force an index scan (which would avoid the sort/gather steps and should
> start streaming data right away), in particular, enable_seqscan=false or seq_page_cost=2. This had ZERO impact on the
> plan to my surprise.

If you can't get an indexscan plan despite setting enable_seqscan=false,
that typically means that the planner thinks the index's sort order
does not match what the query is asking for.  I wonder whether you
created the index with nondefault collation, or asc/desc ordering,
or something like that.  There's not enough detail here to diagnose
that.

It should also be noted that what enable_seqscan=false actually does
is to add a cost penalty of 1e10 to seqscan plans.  It's possible
that your table is so large and badly ordered that the estimated
cost differential between seqscan and indexscan is more than 1e10,
so that the planner goes for the seqscan anyway.  You could probably
overcome that by aggressively decreasing random_page_cost (and by
"aggressive" I don't mean 2, I mean 0.2, or maybe 0.00002, whatever
it takes).  However, if that's what's happening, I'm worried that
getting what you asked for may not really be the outcome you wanted.
Just because you start to see some data streaming to your app right
away doesn't mean the process is going to complete in less time than
it would if you waited for the sort to happen.
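
For illustration, a session-level override along those lines would look
like this (the numbers are arbitrary, just low enough to flip the plan;
a sketch, not a recommendation):

    SET enable_seqscan = off;       -- adds the 1e10 penalty described above
    SET random_page_cost = 0.2;     -- or lower still, if that isn't enough
    EXPLAIN
    SELECT * FROM tmp_outpatient_rev ORDER BY desy_sort_key, claim_no;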

You didn't mention what you have work_mem set to, but a small value
of that would handicap the sort-based plan a lot.  I wonder whether
jacking up work_mem to help the sorts run faster won't end up being
the better idea in the end.

            regards, tom lane

PS: On the third hand, you mention having created new indexes on this
table with apparently not a lot of pain, which is a tad surprising
if you don't have the patience to wait for a sort to finish.  How
long did those index builds take?


Re: Zero throughput on a query on a very large table.

From
David Rowley
Date:
On Fri, 25 Jan 2019 at 19:24, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> PS: On the third hand, you mention having created new indexes on this
> table with apparently not a lot of pain, which is a tad surprising
> if you don't have the patience to wait for a sort to finish.  How
> long did those index builds take?

It would certainly be good to look at psql's \d tmp_outpatient_rev
output to ensure that the index is not marked as INVALID.
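
For reference, a catalog query that shows the same validity flag (just a
sketch, in case psql isn't handy):

    SELECT indexrelid::regclass AS index_name, indisvalid
    FROM pg_index
    WHERE indrelid = 'public.tmp_outpatient_rev'::regclass;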


-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


RE: Zero throughput on a query on a very large table.

From
"ldh@laurent-hasson.com"
Date:
> -----Original Message-----
> From: Andreas Kretschmer <andreas@a-kretschmer.de>
> Sent: Friday, January 25, 2019 00:55
> To: pgsql-performance@lists.postgresql.org
> Subject: Re: Zero throughput on a query on a very large table.
> 
> 
> 
> On 25.01.19 at 06:20, ldh@laurent-hasson.com wrote:
> >
> > Hello,
> >
> > We have been stuck for the past week on a query that simply won’t
> > “execute”. We have a table with 1.2B rows that took around 14h to
> > load, but a simple select takes forever and after 10h, no records are
> > coming through still.
> >
> > Environment:
> >
> >      - Table tmp_outpatient_rev with 41 VARCHAR columns
> > (desy_sort_key, claim_no, clm_line_num, clm_thru_dt, nch_clm_type_cd,
> > rev_cntr, rev_cntr_dt, …)
> >
> >      - 1.2B rows (Billion with a ‘B’)
> >
> >      - A single Unique Index on columns desy_sort_key, claim_no,
> > clm_line_num
> >
> >      - select pg_size_pretty(pg_relation_size('tmp_outpatient_rev'))
> > --> 215GB
> >
> >      - Database Server: 64GB, 8 cores/16 threads, HDDs 10K
> >
> >      - Linux
> >
> >      - PG 11.1
> >
> > Query:
> >
> >      select * from tmp_outpatient_rev order by desy_sort_key, claim_no
> >
> > Plan:
> >
> >     Gather Merge (cost=61001461.16..216401602.29 rows=1242732290
> > width=250)
> >
> >       Output: desy_sort_key, claim_no, clm_line_num, clm_thru_dt,
> > nch_clm_type_cd, rev_cntr, rev_cntr_dt, …
> >
> >       Workers Planned: 10
> >
> >       ->  Sort (cost=61000460.97..61311144.04 rows=124273229
> > width=250)
> >
> >             Output: desy_sort_key, claim_no, clm_line_num,
> > clm_thru_dt, nch_clm_type_cd, rev_cntr, rev_cntr_dt, …
> >
> >             Sort Key: tmp_outpatient_rev.desy_sort_key,
> > tmp_outpatient_rev.claim_no
> >
> >             ->  Parallel Seq Scan on public.tmp_outpatient_rev
> > (cost=0.00..29425910.29 rows=124273229 width=250)
> >
> >                   Output: desy_sort_key, claim_no, clm_line_num,
> > clm_thru_dt, nch_clm_type_cd, rev_cntr, rev_cntr_dt, …
> >
> > Method of access:
> >
> >     - Using Pentaho Kettle (an ETL tool written in Java and using
> > JDBC), we simply issue the query and expect records to start streaming
> > in ASAP.
> >
> >    - Issue was replicated with really basic JDBC code in a Java test
> > program.
> >
> >     - The database doesn't have much other data and the table was
> > loaded from a CSV data source with LOAD over something like 14h
> > (average throughput of about 25K rows/s)
> >
> >     - Settings:
> >
> >               alter database "CMS_TMP" set seq_page_cost=1;
> >
> >               alter database "CMS_TMP" set random_page_cost=4;
> >
> >               alter database "CMS_TMP" set enable_seqscan=true;
> >
> >               JDBC connection string with no extra params.
> >
> >               Database has been generally configured properly.
> >
> > Problem:
> >
> >     - The plan shows a full table scan followed by a sort, and then a
> > gather merge. With 1.2B rows, that's crazy to try to sort that 😊
> >
> >     - After 10h, the query is still "silent" and no record is
> > streaming in. IO is very high (80-90% disk throughput utilization) on
> > the machine (the sort…).
> >
> >     - I have tried to hack the planner to force an index scan (which
> > would avoid the sort/gather steps and should start streaming data
> > right away), in particular, enable_seqscan=false or seq_page_cost=2.
> > This had ZERO impact on the plan to my surprise.
> >
> >    - I changed the “order by” to include all 3 columns from the index,
> > or created a non-unique index with only the first 2 columns, all to no
> > effect whatsoever either.
> >
> >     - The table was written over almost 14h at about 25K row/s and it
> > seems to me I should be able to read the data back at least as fast.
> >
> > Why is a simple index scan not used? Why are all our efforts to try to
> > force the use of the index failing?
> >
> >
> 
> the query isn't that simple: there is no where condition, so PG has to read the
> whole table and the index is useless. Would it be enough to select only the
> columns covered by the index?
> (run a vacuum on the table after loading the data; that can enable an
> index-only scan in this case)
> 
> 
> 
> 
> Regards, Andreas
> 
> --
> 2ndQuadrant - The PostgreSQL Support Company.
> www.2ndQuadrant.com
> 

Well, even without a where clause, and a straight select with an order by on an index... The index may perform slightly
more slowly, but stream data more rapidly... I guess what i am pointing out is that in ETL scenarios, enabling better
continuous throughput would be better than total overall query performance?
 

Thank you,
Laurent.


 


Re: Zero throughput on a query on a very large table.

From
"ldh@laurent-hasson.com"
Date:

Sorry, the web outlook client may be "prepending" this message instead of appending, as is the custom on this mailing list.


The indices are defined as:

CREATE INDEX i_outprev_ptclaim
    ON public.tmp_outpatient_rev USING btree
    (desy_sort_key COLLATE pg_catalog."default", claim_no COLLATE pg_catalog."default")
    TABLESPACE pg_default;

CREATE UNIQUE INDEX ui_outprev_ptclaimline
    ON public.tmp_outpatient_rev USING btree
    (desy_sort_key COLLATE pg_catalog."default", claim_no COLLATE pg_catalog."default", clm_line_num COLLATE pg_catalog."default")
    TABLESPACE pg_default;

I am using PGAdmin4 and the client times out, so i don't have the exact timing, but each one of those indices completed under 5h (started at lunch time and was done before the end of the afternoon). So when i ran the query and it didn't move for about 10h, i figured it might "never end" :).


I'll try changing the random page cost and see. The work_mem param is set to 128MB... So maybe that's something too? I'll try.
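
A minimal sketch of a per-session bump to test that (the 1GB value is only illustrative):

    SHOW work_mem;            -- currently 128MB
    SET work_mem = '1GB';     -- applies only to the current session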


Additionally, do note that we have a second table, similar in structure, with 180M rows, select pg_size_pretty(pg_relation_size('tmp_inpatient_rev')) --> 18GB (so it's 10x smaller), but we get 40K rows/s read throughput on that with a similar query and index, and the plan does choose an index scan and returns the first thousands of rows almost immediately (a few secs).



From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Friday, January 25, 2019 1:24:45 AM
To: ldh@laurent-hasson.com
Cc: pgsql-performance@postgresql.org
Subject: Re: Zero throughput on a query on a very large table.
 
"ldh@laurent-hasson.com" <ldh@laurent-hasson.com> writes:
> Query:
>      select * from tmp_outpatient_rev order by desy_sort_key, claim_no

> Plan:
> [ seqscan-and-sort ... parallelized, but still seqscan-and-sort ]

>     - I have tried to hack the planner to force an index scan (which would avoid the sort/gather steps and should start streaming data right away), in particular, enable_seqscan=false or seq_page_cost=2. This had ZERO impact on the plan to my surprise.

If you can't get an indexscan plan despite setting enable_seqscan=false,
that typically means that the planner thinks the index's sort order
does not match what the query is asking for.  I wonder whether you
created the index with nondefault collation, or asc/desc ordering,
or something like that.  There's not enough detail here to diagnose
that.

It should also be noted that what enable_seqscan=false actually does
is to add a cost penalty of 1e10 to seqscan plans.  It's possible
that your table is so large and badly ordered that the estimated
cost differential between seqscan and indexscan is more than 1e10,
so that the planner goes for the seqscan anyway.  You could probably
overcome that by aggressively decreasing random_page_cost (and by
"aggressive" I don't mean 2, I mean 0.2, or maybe 0.00002, whatever
it takes).  However, if that's what's happening, I'm worried that
getting what you asked for may not really be the outcome you wanted.
Just because you start to see some data streaming to your app right
away doesn't mean the process is going to complete in less time than
it would if you waited for the sort to happen.

You didn't mention what you have work_mem set to, but a small value
of that would handicap the sort-based plan a lot.  I wonder whether
jacking up work_mem to help the sorts run faster won't end up being
the better idea in the end.

                        regards, tom lane

PS: On the third hand, you mention having created new indexes on this
table with apparently not a lot of pain, which is a tad surprising
if you don't have the patience to wait for a sort to finish.  How
long did those index builds take?

Re: Zero throughput on a query on a very large table.

From
Tom Lane
Date:
"ldh@laurent-hasson.com" <ldh@laurent-hasson.com> writes:
> The indices are defined as:

> CREATE INDEX i_outprev_ptclaim
>     ON public.tmp_outpatient_rev USING btree
>     (desy_sort_key COLLATE pg_catalog."default", claim_no COLLATE pg_catalog."default")
>     TABLESPACE pg_default;

> CREATE UNIQUE INDEX ui_outprev_ptclaimline
>     ON public.tmp_outpatient_rev USING btree
>     (desy_sort_key COLLATE pg_catalog."default", claim_no COLLATE pg_catalog."default", clm_line_num COLLATE
> pg_catalog."default")
>     TABLESPACE pg_default;

I'm a bit suspicious of those explicit COLLATE clauses; seems like maybe
they could be accounting for not matching to the query-requested order.
Perhaps they're different from the collations specified on the underlying
table columns?
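
One quick way to check would be to compare the columns' collations against
the database default, e.g. with a catalog query along these lines (a
sketch, not something from this thread):

    SELECT a.attname, c.collname
    FROM pg_attribute a
    LEFT JOIN pg_collation c ON c.oid = a.attcollation
    WHERE a.attrelid = 'public.tmp_outpatient_rev'::regclass
      AND a.attname IN ('desy_sort_key', 'claim_no', 'clm_line_num');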

Also, it seems unlikely that it's worth the maintenance work to keep
both of these indexes, though that's not related to your immediate
problem.

            regards, tom lane


Re: Zero throughput on a query on a very large table.

From
"ldh@laurent-hasson.com"
Date:


Since the PGAdmin4 client timed out when creating the index, you piqued my interest here and i was wondering if the index creation itself had failed... but:

\d tmp_outpatient_rev

Indexes:
    "ui_outprev_ptclaimline" UNIQUE, btree (desy_sort_key, claim_no, clm_line_num)
    "i_outprev_ptclaim" btree (desy_sort_key, claim_no)

So it looks like the indices are fine. I am pursuing some of the other recommendations you suggested before.


Thank you,

Laurent.


From: David Rowley <david.rowley@2ndquadrant.com>
Sent: Friday, January 25, 2019 1:55:31 AM
To: Tom Lane
Cc: ldh@laurent-hasson.com; pgsql-performance@postgresql.org
Subject: Re: Zero throughput on a query on a very large table.
 
On Fri, 25 Jan 2019 at 19:24, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> PS: On the third hand, you mention having created new indexes on this
> table with apparently not a lot of pain, which is a tad surprising
> if you don't have the patience to wait for a sort to finish.  How
> long did those index builds take?

It would certainly be good to look at psql's \d tmp_outpatient_rev
output to ensure that the index is not marked as INVALID.


--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Zero throughput on a query on a very large table.

From
"ldh@laurent-hasson.com"
Date:

Agreed on the 2 indices. I only added the second non-unique index to test the hypothesis that i was doing an order-by col1, col2 when the original unique index was on col1, col2, col3...


Also, the original statement i implemented did not have all of that. This is the normalized SQL that Postgres now gives when looking at the indices. Collation for the DB is "en_US.UTF-8" and that's used for the defaults i suspect?


From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Friday, January 25, 2019 1:10:55 PM
To: ldh@laurent-hasson.com
Cc: pgsql-performance@postgresql.org
Subject: Re: Zero throughput on a query on a very large table.
 
"ldh@laurent-hasson.com" <ldh@laurent-hasson.com> writes:
> The indices are defined as:

> CREATE INDEX i_outprev_ptclaim
>     ON public.tmp_outpatient_rev USING btree
>     (desy_sort_key COLLATE pg_catalog."default", claim_no COLLATE pg_catalog."default")
>     TABLESPACE pg_default;

> CREATE UNIQUE INDEX ui_outprev_ptclaimline
>     ON public.tmp_outpatient_rev USING btree
>     (desy_sort_key COLLATE pg_catalog."default", claim_no COLLATE pg_catalog."default", clm_line_num COLLATE pg_catalog."default")
>     TABLESPACE pg_default;

I'm a bit suspicious of those explicit COLLATE clauses; seems like maybe
they could be accounting for not matching to the query-requested order.
Perhaps they're different from the collations specified on the underlying
table columns?

Also, it seems unlikely that it's worth the maintenance work to keep
both of these indexes, though that's not related to your immediate
problem.

                        regards, tom lane

Re: Zero throughput on a query on a very large table.

From
Tom Lane
Date:
"ldh@laurent-hasson.com" <ldh@laurent-hasson.com> writes:
> Also, the original statement i implemented did not have all of that. This is the normalized SQL that Postgres now
> gives when looking at the indices.

[ squint... ]  What do you mean exactly by "Postgres gives that"?
I don't see any redundant COLLATE clauses in e.g. psql \d.

            regards, tom lane


Re: Zero throughput on a query on a very large table.

From
"ldh@laurent-hasson.com"
Date:

Sorry :) That's what i see in the "SQL" tab in PGAdmin when i select the index in the schema browser. But you are right that \d doesn't show that.


From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Friday, January 25, 2019 1:34:01 PM
To: ldh@laurent-hasson.com
Cc: pgsql-performance@postgresql.org
Subject: Re: Zero throughput on a query on a very large table.
 
"ldh@laurent-hasson.com" <ldh@laurent-hasson.com> writes:
> Also, the original statement i implemented did not have all of that. This is the normalized SQL that Postgres now gives when looking at the indices.

[ squint... ]  What do you mean exactly by "Postgres gives that"?
I don't see any redundant COLLATE clauses in e.g. psql \d.

                        regards, tom lane

Re: Zero throughput on a query on a very large table.

From
"ldh@laurent-hasson.com"
Date:

OK... I think we may have cracked this.


First, do you think that 128MB work_mem is ok? We have a 64GB machine and expecting fewer than 100 connections. This is really an ETL workload environment at this time.


Second, here is what i found and what messed us up.

    select current_setting('random_page_cost'); --> 4

    alter database "CMS_TMP" set random_page_cost=0.00000001;
    select current_setting('random_page_cost'); --> 4 ????

I also tried:
    select current_setting('random_page_cost'); --> 4
    select set_config('random_page_cost', '0.000001', true);
    select current_setting('random_page_cost'); --> 4 ????

Is there something that is happening that is causing those settings to not stick? I then tried:


    select current_setting('random_page_cost'); --> 4
    select set_config('random_page_cost', '0.000001', false); -- false now, i.e., global
    select current_setting('random_page_cost'); --> 0.000001 !!!!

So i think we just spent 4 days on that issue. I then did

    select set_config('enable_seqscan', 'off', false);
And the plan is now using an index scan, and we are getting 12K rows/s in throughput immediately!!! 😊

So i guess my final question is that i really want to affect only that one query, and i seem to not be able to change the settings used by the planner just for that one transaction. I have to change them globally, which i would prefer not to do. Any help here?

Thanks,

Laurent.


From: ldh@laurent-hasson.com <ldh@laurent-hasson.com>
Sent: Friday, January 25, 2019 1:36:21 PM
To: Tom Lane
Cc: pgsql-performance@postgresql.org
Subject: Re: Zero throughput on a query on a very large table.
 

Sorry :) That's what i see in the "SQL" tab in PGAdmin when i select the index in the schema browser. But you are right that \d doesn't show that.


From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Friday, January 25, 2019 1:34:01 PM
To: ldh@laurent-hasson.com
Cc: pgsql-performance@postgresql.org
Subject: Re: Zero throughput on a query on a very large table.
 
"ldh@laurent-hasson.com" <ldh@laurent-hasson.com> writes:
> Also, the original statement i implemented did not have all of that. This is the normalized SQL that Postgres now gives when looking at the indices.

[ squint... ]  What do you mean exactly by "Postgres gives that"?
I don't see any redundant COLLATE clauses in e.g. psql \d.

                        regards, tom lane

Re: Zero throughput on a query on a very large table.

From
"ldh@laurent-hasson.com"
Date:

Just a correction from my previous message regarding the throughput we get.


On that one table with 1.2B rows, the plan through the index scan actually delivers 50K rows/s in read speed to the application, almost immediately. It would go through the entire table in under 7h, vs the other approach which still didn't deliver any data after 10h.


We do additional joins and logic, and our final throughput is about 12K/s (what i quoted previously), but this is a case where clearly the index_scan plan delivers vastly better performance than the table_seq_scan+sort plan.


Any insight here?


Thank you,

Laurent.


From: ldh@laurent-hasson.com <ldh@laurent-hasson.com>
Sent: Friday, January 25, 2019 2:06:54 PM
To: Tom Lane
Cc: pgsql-performance@postgresql.org
Subject: Re: Zero throughput on a query on a very large table.
 

OK... I think we may have cracked this.


First, do you think that 128MB work_mem is ok? We have a 64GB machine and expecting fewer than 100 connections. This is really an ETL workload environment at this time.


Second, here is what i found and what messed us up.

    select current_setting('random_page_cost'); --> 4

    alter database "CMS_TMP" set random_page_cost=0.00000001;
    select current_setting('random_page_cost'); --> 4 ????

I also tried:
    select current_setting('random_page_cost'); --> 4
    select set_config('random_page_cost', '0.000001', true);
    select current_setting('random_page_cost'); --> 4 ????

Is there something that is happening that is causing those settings to not stick? I then tried:


    select current_setting('random_page_cost'); --> 4
    select set_config('random_page_cost', '0.000001', false); -- false now, i.e., global
    select current_setting('random_page_cost'); --> 0.000001 !!!!

So i think we just spent 4 days on that issue. I then did

    select set_config('enable_seqscan', 'off', false);
And the plan is now using an index scan, and we are getting 12K rows/s in throughput immediately!!! 😊

So i guess my final question is that i really want to affect only that one query, and i seem to not be able to change the settings used by the planner just for that one transaction. I have to change them globally, which i would prefer not to do. Any help here?

Thanks,

Laurent.


From: ldh@laurent-hasson.com <ldh@laurent-hasson.com>
Sent: Friday, January 25, 2019 1:36:21 PM
To: Tom Lane
Cc: pgsql-performance@postgresql.org
Subject: Re: Zero throughput on a query on a very large table.
 

Sorry :) That's what i see in the "SQL" tab in PGAdmin when i select the index in the schema browser. But you are right that \d doesn't show that.


From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Friday, January 25, 2019 1:34:01 PM
To: ldh@laurent-hasson.com
Cc: pgsql-performance@postgresql.org
Subject: Re: Zero throughput on a query on a very large table.
 
"ldh@laurent-hasson.com" <ldh@laurent-hasson.com> writes:
> Also, the original statement i implemented did not have all of that. This is the normalized SQL that Postgres now gives when looking at the indices.

[ squint... ]  What do you mean exactly by "Postgres gives that"?
I don't see any redundant COLLATE clauses in e.g. psql \d.

                        regards, tom lane

Re: Zero throughput on a query on a very large table.

From
Tom Lane
Date:
"ldh@laurent-hasson.com" <ldh@laurent-hasson.com> writes:
> Second, here is what i found and what messed us up.

>     select current_setting('random_page_cost'); --> 4
>     alter database "CMS_TMP" set random_page_cost=0.00000001;
>     select current_setting('random_page_cost'); --> 4 ????

ALTER DATABASE only affects subsequently-started sessions.

> I also tried:
>     select current_setting('random_page_cost'); --> 4
>     select set_config('random_page_cost', '0.000001', true);
>     select current_setting('random_page_cost'); --> 4 ????

That "true" means "local to the current transaction", which is
just the one statement if you don't have a BEGIN.
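
As a concrete sketch of the transaction-scoped approach: SET LOCAL is the
statement form of set_config(..., true), and it holds until the enclosing
transaction commits or rolls back, so only the one query sees it
(illustrative values):

    BEGIN;
    SET LOCAL enable_seqscan = off;
    SET LOCAL random_page_cost = 0.2;
    SELECT * FROM tmp_outpatient_rev ORDER BY desy_sort_key, claim_no;
    COMMIT;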

            regards, tom lane


Re: Zero throughput on a query on a very large table.

From
"ldh@laurent-hasson.com"
Date:

Correct, but in the Java code, it's multiple statements in a single transaction, so it should stick. Not sure if something else stupid is going on.


Good to know about the ALTER DATABASE effect. I didn't realize that.


Thanks a billion.


Laurent.


From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Friday, January 25, 2019 3:04:37 PM
To: ldh@laurent-hasson.com
Cc: pgsql-performance@postgresql.org
Subject: Re: Zero throughput on a query on a very large table.
 
"ldh@laurent-hasson.com" <ldh@laurent-hasson.com> writes:
> Second, here is what i found and what messed us up.

>     select current_setting('random_page_cost'); --> 4
>     alter database "CMS_TMP" set random_page_cost=0.00000001;
>     select current_setting('random_page_cost'); --> 4 ????

ALTER DATABASE only affects subsequently-started sessions.

> I also tried:
>     select current_setting('random_page_cost'); --> 4
>     select set_config('random_page_cost', '0.000001', true);
>     select current_setting('random_page_cost'); --> 4 ????

That "true" means "local to the current transaction", which is
just the one statement if you don't have a BEGIN.

                        regards, tom lane