vacuum analyze query performance - help me understand - Mailing list pgsql-general

From Scot Kreienkamp
Subject vacuum analyze query performance - help me understand
Date
Msg-id SJ0PR15MB524590D2BECF3283CDCE39289A30A@SJ0PR15MB5245.namprd15.prod.outlook.com
Whole thread Raw
Responses Re: vacuum analyze query performance - help me understand
Re: vacuum analyze query performance - help me understand
List pgsql-general
Hi list,

I'm struggling to understand why a query is having performance issues related to analyze.  Any help in understanding
what'sgoing on would be appreciated. 

My database servers are 48 core, 512 gigs of memory, backed by 1.2TB SAN, on 9.6 (not my choice, I've been asking for
anupgrade since 11 came out) that's having performance issues with one database.  I copied it by dump/restore to a
developmentserver (same hardware specs) where I could isolate it so nobody else can connect to it except me, so I could
guaranteeno changes are being made in the data or structure.  There are 3 other large (for my environment) databases
therethough also.  This morning at 2am this vacuum command was run via cron on the server: 

if pg_isready -q ; then vacuumdb -U postgres --quiet --all --analyze --jobs $(grep -c ^processor /proc/cpuinfo) ; fi

That command should vacuum analyze all tables in all databases, and it was proven that it worked as the last vacuum and
analyzedates on the tables in all databases are showing dates from 2am today. 

So here's the problem:  We ran a (admittedly poorly written) select query against a subset of tables which performed
poorly. Then we ran a vacuum analyze against just those tables involved in that query, then ran the same query again,
whichperformed exponentially better.    I verified by comparing before and after explains that the query plan did not
changebetween runs of the query, the only changes were stats like cost, rows, width, time, etc.  From what I know,
statsonly get outdated by database activity.  With no activity in the database and a recent vacuum analyze, why did
runninganother vacuum analyze make such a difference in performance?  Is there any way to prove the vacuum was or
wasn'tthe source of the performance increase?  I would be more apt to believe the difference in performance came from
somethingelse, like caching from the first query run, or caching from the vacuum? 

This was a contractor that gave me these results, and I'm having a hard time believing their conclusion.  Any insight
wouldbe appreciated. 




Scot Kreienkamp | Applications Infrastructure Architect | La-Z-Boy Corporate
(734) 384-6403 | 1-734-915-1444 | Scot.Kreienkamp@la-z-boy.com
One La-Z-Boy Drive | Monroe, Michigan 48162 | la-z-boy.com<http://www.la-z-boy.com/>
 facebook.com/lazboy<http://facebook.com/lazboy>  | instagram.com/lazboy<https://instagram.com/lazboy> |
youtube.com/lazboy<http://youtube.com/lazboy>


[cid:lazboy_2024_inc_navy_pms2189_rgb_rev2025_e40a94f2-e344-4a4a-a02c-fc31996e127c.png]

This message is intended only for the individual or entity to which it is addressed. It may contain privileged,
confidentialinformation which is exempt from disclosure under applicable laws. If you are not the intended recipient,
youare strictly prohibited from disseminating or distributing this information (other than to the intended recipient)
orcopying this information. If you have received this communication in error, please notify us immediately by e-mail or
bytelephone at the above number. Thank you. 

Attachment

pgsql-general by date:

Previous
From: hubert depesz lubaczewski
Date:
Subject: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug
Next
From: Tom Lane
Date:
Subject: Re: vacuum analyze query performance - help me understand