Re: MYSQL Stats - Mailing list pgsql-performance

From Joe Proietti
Subject Re: MYSQL Stats
Date
Msg-id 9D2D7330CC5F1648BF564EE72EF158F0BB38F13FA0@DFW1MBX24.mex07a.mlsrvr.com
Whole thread Raw
In response to MYSQL Stats  (Joe Proietti <joe.proietti@cleargageinc.com>)
List pgsql-performance

My Apologies ,  was in the wrong email/forum,  please disregard my email!

 

From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Joe Proietti
Sent: Friday, September 30, 2016 8:03 AM
To: Jake Nielsen <jake.k.nielsen@gmail.com>; Tom Lane <tgl@sss.pgh.pa.us>
Cc: pgsql-performance@postgresql.org
Subject: [PERFORM] MYSQL Stats

 

Hi,

I am relatively new to MYSQL and not really sure I am in the right forum for this.

 

I have a situation which I am not understanding.  I am performing a simple query :

 

Select * from tableA

Where date >= ‘2016’06-01’

And date < ‘2016-07-01’

 

Index is on date

Query returns 6271 rows

 

When doing explain on the same query

The rows column shows  11462,  nearly twice the amount  (this result is consistent on most all tables)

 

When selecting count from the table , returns  2668664

 

When selecting from information_schema.tables  table_rows column shows 2459114

 

While this is indicative of out dated statistics

 

Have done an analyze table but no changes.

 

Thanks,

Joe

 

From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Jake Nielsen
Sent: Wednesday, September 28, 2016 2:11 PM
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Unexpected expensive index scan

 

 

 

On Wed, Sep 28, 2016 at 6:04 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

[ Please don't re-quote the entire damn thread in each followup. Have
some respect for your readers' time, and assume that they have already
seen the previous traffic, or could go look it up if they haven't.
The point of quoting at all is just to quickly remind people where we
are in the discussion. ]

 

Sorry, understood.

 


If you say "well yeah, but it seems to perform fine when I force
it to use that index anyway", the answer may be that you need to
adjust random_page_cost.  The default value is OK for tables that
are mostly sitting on spinning rust, but if your database is
RAM-resident or SSD-resident you probably want a value closer to 1.

 

Ahhh, this could absolutely be the key right here. I could totally see why it would make sense for the planner to do what it's doing given that it's weighting sequential access more favorably than random access.

 

Beautiful! After changing the random_page_cost to 1.0 the original query went from ~3.5s to ~35ms. This is exactly the kind of insight I was fishing for in the original post. I'll keep in mind that the query planner is very tunable and has these sorts of hardware-related trade-offs in the future. I can't thank you enough!

 

Cheers!

 

pgsql-performance by date:

Previous
From: Joe Proietti
Date:
Subject: MYSQL Stats
Next
From: Jim Nasby
Date:
Subject: Re: Unexpected expensive index scan