Thread: Poor performance after restoring database from snapshot on AWS RDS
We keep the staging environment of our application up to date with respect to production data by creating a new RDS instance for the staging environment and restoring the most recent production snapshot into it. We get very poor performance in the staging environment after this restore takes place - after some usage it seems to get better perhaps because of caching. The staging RDS instance is a smaller size than production (it has 32GB ram and 8 vCPU vs production's 128GB ram and 32 vCPU) but the performance seems to much worse than this decrease in resources would account for. I have seen some advice that vacuum analyze should be run after the snapshot restore but I thought this was supposed to happen automatically. If we did run it manually how would that help? Are there any other tools in postgres we can use to figure out why it might be so much slower? Best -- Sam Kidman Web Developer Melbourne
Sam Kidman schrieb am 03.06.2024 um 10:06: > We get very poor performance in the staging environment after this > restore takes place - after some usage it seems to get better perhaps > because of caching. > > The staging RDS instance is a smaller size than production (it has > 32GB ram and 8 vCPU vs production's 128GB ram and 32 vCPU) but the > performance seems to much worse than this decrease in resources would > account for. > > I have seen some advice that vacuum analyze should be run after the > snapshot restore but I thought this was supposed to happen > automatically. If we did run it manually how would that help? autovacuum will kick in eventually - but only after some time (which is what you are seeing). In general after a bulk load (e.g. restore of a backup or importing data in any other way) running vacuum to udpate statistics is highly recommended
On Wed, Jun 5, 2024 at 4:23 AM Sam Kidman <sam@fresho.com> wrote: > We get very poor performance in the staging environment after this > restore takes place - after some usage it seems to get better perhaps > because of caching. > This is due to the way that RDS restores snapshots. From the docs (https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_RestoreFromSnapshot.html): You can use the restored DB instance as soon as its status is available. The DB instance continues to load data in the background. This is known as lazy loading. If you access data that hasn't been loaded yet, the DB instance immediately downloads the requested data from Amazon S3, and then continues loading the rest of the data in the background. -Jeremy
> This is due to the way that RDS restores snapshots. Thanks, I never would have guessed. Would vacuum analyze be sufficient to defeat the lazy loading or would we need to do something more specific to our application? (for example. select(*) on some commonly used tables) I think vacuum full would certainly defeat the lazy loading since it would copy all of the table data, but that may take a very long time to run. I think vacuum analyze only scans a subset of rows but I might be wrong about that. Best, Sam On Wed, Jun 5, 2024 at 10:09 PM Jeremy Smith <jeremy@musicsmith.net> wrote: > > On Wed, Jun 5, 2024 at 4:23 AM Sam Kidman <sam@fresho.com> wrote: > > > We get very poor performance in the staging environment after this > > restore takes place - after some usage it seems to get better perhaps > > because of caching. > > > > This is due to the way that RDS restores snapshots. > > From the docs (https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_RestoreFromSnapshot.html): > > You can use the restored DB instance as soon as its status is > available. The DB instance continues to load data in the background. > This is known as lazy loading. > > If you access data that hasn't been loaded yet, the DB instance > immediately downloads the requested data from Amazon S3, and then > continues loading the rest of the data in the background. > > > > -Jeremy
On Fri, Jun 7, 2024 at 4:36 AM Sam Kidman <sam@fresho.com> wrote:
> This is due to the way that RDS restores snapshots.
Thanks, I never would have guessed. Would vacuum analyze be sufficient
to defeat the lazy loading or would we need to do something more
specific to our application? (for example. select(*) on some commonly
used tables)
pg_prewarm is probably what you want. Don't know if RDS Postgresql supports it or not, though.
I think vacuum full would certainly defeat the lazy loading since it
would copy all of the table data, but that may take a very long time
to run. I think vacuum analyze only scans a subset of rows but I might
be wrong about that.
Best, Sam
On Wed, Jun 5, 2024 at 10:09 PM Jeremy Smith <jeremy@musicsmith.net> wrote:
>
> On Wed, Jun 5, 2024 at 4:23 AM Sam Kidman <sam@fresho.com> wrote:
>
> > We get very poor performance in the staging environment after this
> > restore takes place - after some usage it seems to get better perhaps
> > because of caching.
> >
>
> This is due to the way that RDS restores snapshots.
>
> From the docs (https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_RestoreFromSnapshot.html):
>
> You can use the restored DB instance as soon as its status is
> available. The DB instance continues to load data in the background.
> This is known as lazy loading.
>
> If you access data that hasn't been loaded yet, the DB instance
> immediately downloads the requested data from Amazon S3, and then
> continues loading the rest of the data in the background.
>
>
>
> -Jeremy