Re: Planner cost adjustments - Mailing list pgsql-general

From Daniel Begin
Subject Re: Planner cost adjustments
Date
Msg-id COL129-DS2787944731718F72D351E394A40@phx.gbl
In response to Planner cost adjustments  (Daniel Begin <jfd553@hotmail.com>)
List pgsql-general
Here is a final follow-up on adjusting the planner cost calculations.

1 - Francisco, you provided me with the clue I was looking for at this point!

I learned enough PL/pgSQL over the last week to reprogram all my processes (not always elegant, but it works). By
processing individual records in a loop, I will be able to move forward with my work and save individual results as
they are processed.

Furthermore, it is now easier to predict job completion times, since the DB uses similar plans for each record. I ran
a test on a stratified sample of records and I expect the jobs will run for about two weeks (as opposed to months in
previous estimates!-)
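As a rough illustration, the record-by-record approach looks something like the sketch below (the table, column, and function names are made up for the example, not the actual schema):

```sql
-- Hypothetical sketch only: my_records, my_results and process_record()
-- stand in for the real tables and processing logic.
DO $$
DECLARE
    r RECORD;
BEGIN
    FOR r IN SELECT record_id FROM my_records LOOP
        -- One record per iteration: the planner sees the same simple,
        -- single-key query every time, so run times are predictable.
        INSERT INTO my_results (record_id, result)
        VALUES (r.record_id, process_record(r.record_id));
    END LOOP;
END;
$$ LANGUAGE plpgsql;
```

Because each iteration is the same cheap single-key lookup, total completion time scales roughly linearly with the number of records, which is what makes an estimate from a stratified sample reliable.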

2 - PL/pgSQL helped me to bypass the problem. As PT suggested, I will try to set up a test case that demonstrates the
problem and post it to the developers' list so they might figure out what to do in such a situation.

Thanks to all,
Daniel

-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Francisco Olarte
Sent: June-11-15 10:05
To: Daniel Begin
Cc: Bill Moran; pgsql-general@postgresql.org; Tomas Vondra; Melvin Davidson
Subject: Re: [GENERAL] Planner cost adjustments

Hi Daniel:

On Thu, Jun 11, 2015 at 2:38 PM, Daniel Begin <jfd553@hotmail.com> wrote:
.....
> The remaining problem seems related to the statistics of some large tables.
> On one hand, I might increase the statistic target for these tables to
> 500, or even to 1000 and look at the results (but I have doubts it
> will help). On the other hand, I could simply force enable_seqscan to
> OFF for queries dealing with them.
>
> Still not sure about the best solution but the problem is definitely
> narrower :-)
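For reference, the two options mentioned above would be set roughly like this (the table and column names are placeholders, not the real schema):

```sql
-- Raise the statistics target for a column of one large table
-- (placeholder names), then re-analyze so the new target takes effect:
ALTER TABLE big_table ALTER COLUMN user_id SET STATISTICS 1000;
ANALYZE big_table;

-- Or discourage sequential scans for the affected queries only,
-- in the current session:
SET enable_seqscan = off;
```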

One somewhat unrelated point. IIRC your problems were related to queries joining a selected set of indexed IDs on a
smaller table, which then have to be looked up in some very large tables. (I'm not able to follow too well which is
which, users vs. changesets; I'm a bit lost.) Given that your runtimes are always high (in the seconds range, so it seems
wire speed / latencies are not too much of an issue) and that selectivity estimates on huge tables are always
problematic and may be thwarting your plans, you may be able to get faster results by splitting your query.

If I read your plans correctly, that would mean selecting your 600 users in one query, then preparing the changeset
query for a single user_id, which should be indexed, and looping it for every user. Given that current machines can
easily send and receive 600 queries in a second, this may lead to a simpler solution. It means you're using the DB as a
somewhat intelligent plain old indexed file, but sometimes that is the simpler approach (heck, some of my code uses
algorithms from the tape era because they were the fastest way I could do it).
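In SQL terms, the split might look something like this (table and column names are assumptions, not the real schema):

```sql
-- Step 1: one query to fetch the ~600 user ids of interest
-- ('users' is a placeholder; add whatever filter selects those users).
SELECT user_id FROM users;

-- Step 2: prepare the per-user changeset lookup once...
PREPARE changesets_for (bigint) AS
    SELECT * FROM changesets WHERE user_id = $1;

-- ...then execute it from the application, once per user id returned
-- by step 1. Each execution is a cheap indexed single-key lookup.
EXECUTE changesets_for (12345);
```

Each per-user execution uses the same simple indexed plan, so the big table is never sequentially scanned, regardless of what the selectivity estimates say.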

I needed to do this in one of my programs. The optimizer kept selecting bad plans, so I did the first query, held the
results in memory, and then prepared and executed the second query in a loop from the app. My query was selecting a
couple thousand values from a sub-million-row table and joining with a multimillion-row one, getting a couple hundred
matches per original value. Splitting it made the queries on the big table always run indexed and fast (and, as a side
bonus, avoided duplicating the parts of the first record on the wire a hundred times, which was nice since the short
table was wide and I only needed 3 short fields from the second one; that made the first query run at wire speed and
the second at disk speed).

Francisco Olarte.




