Thread: RDS IO Read time

RDS IO Read time

From
Eden Aharoni
Date:

Hi all,

Hopefully, someone here could help us understand whats going on with our deployment..

We are running Postgres 17.4 on AWS RDS on an ec2 instance that has:

  • 32vCPU
  • 128GB RAM
  • gp3 with 25K IOPS and 4000MiB/s throughput
  • the instance supports up to 3125GB/s of throughput.

Whenever a query fetches data from the disk (index scan, bitmap scans, etc.) we’re reaching 23 – 30MB/s of IO Read time.

We changed our autovacuum settings to be more aggressive (0 scale factor and 10K threshold) so our index only scans will (hopefully) read data from the disk rarely. However, we can’t optimize all the queries to use index-only-scan since our users can dynamically select which columns to see and filtering.

Is this expected IO read rate? I can’t help but feel we’re missing something here..

 

Thanks a lot! 😊

 

Re: RDS IO Read time

From
Adrian Klaver
Date:
On 3/31/25 06:54, Eden Aharoni wrote:
> Hi all,
> 
> Hopefully, someone here could help us understand whats going on with our 
> deployment..
> 
> We are running Postgres 17.4 on AWS RDS on an ec2 instance that has:
> 
>   * 32vCPU
>   * 128GB RAM
>   * gp3 with 25K IOPS and 4000MiB/s throughput
>   * the instance supports up to 3125GB/s of throughput.
> 
> Whenever a query fetches data from the disk (index scan, bitmap scans, 
> etc.) we’re reaching 23 – 30MB/s of IO Read time.
> 
> We changed our autovacuum settings to be more aggressive (0 scale factor 
> and 10K threshold) so our index only scans will (hopefully) read data 
> from the disk rarely. However, we can’t optimize all the queries to use 
> index-only-scan since our users can dynamically select which columns to 
> see and filtering.
> 
> Is this expected IO read rate? I can’t help but feel we’re missing 
> something here..

RDS is a black box controlled by AWS, you are going to need to reach out 
to their tech support.

> 
> Thanks a lot! 😊
> 

-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: RDS IO Read time

From
Christophe Pettus
Date:

> On Mar 31, 2025, at 06:54, Eden Aharoni <edena@legitsecurity.com> wrote:
> Is this expected IO read rate? I can’t help but feel we’re missing something here..

Really, no particular I/O rate is "expected": if PostgreSQL needs that much data, it'll use that much I/O to get it.
Fromyour description, it's likely that it's a case of the working set for the database just not fitting into the memory
youhave, so PostgreSQL needs to go out to secondary storage a lot to fetch the data. 

The best first step is to use Performance Insights to see which queries are using I/O, and run sample ones with EXPLAIN
(ANALYZE,BUFFERS) to see where the I/O is being used within the query.  Given that you allow users to assemble
arbitraryqueries, it's likely that PostgreSQL is having to use a wide variety of indexes (or sequential scans), so it
can'tsuccessfully cache a particular set in memory. 


Re: [EXTERNAL] Re: RDS IO Read time

From
Eden Aharoni
Date:
First, thanks for the reply :)
So, I do know which part is taking a lot of IO time and it's to be honest any node that reads from the disk.. of course, we're running EXPLAIN on our queries (to be more specific we use auto_explain) but we can't seem to find what could cause an index scan that reads 34 MB to take more than a second (1.2 sec).. we do know that our dataset doesn't fit the memory and we were ok with that but with IO being so slow we just don't know what to do anymore. Any other suggestions other than contact AWS (which we did but it seems this path won't lead anywhere).

Thanks again 


From: Christophe Pettus <xof@thebuild.com>
Sent: Monday, March 31, 2025 6:15:39 PM
To: Eden Aharoni <edena@legitsecurity.com>
Cc: pgsql-general@lists.postgresql.org <pgsql-general@lists.postgresql.org>
Subject: [EXTERNAL] Re: RDS IO Read time
 
[You don't often get email from xof@thebuild.com. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ]

> On Mar 31, 2025, at 06:54, Eden Aharoni <edena@legitsecurity.com> wrote:
> Is this expected IO read rate? I can’t help but feel we’re missing something here..

Really, no particular I/O rate is "expected": if PostgreSQL needs that much data, it'll use that much I/O to get it.  From your description, it's likely that it's a case of the working set for the database just not fitting into the memory you have, so PostgreSQL needs to go out to secondary storage a lot to fetch the data.

The best first step is to use Performance Insights to see which queries are using I/O, and run sample ones with EXPLAIN (ANALYZE, BUFFERS) to see where the I/O is being used within the query.  Given that you allow users to assemble arbitrary queries, it's likely that PostgreSQL is having to use a wide variety of indexes (or sequential scans), so it can't successfully cache a particular set in memory.

Re: [EXTERNAL] RDS IO Read time

From
Christophe Pettus
Date:

> On Mar 31, 2025, at 10:32, Eden Aharoni <edena@legitsecurity.com> wrote:
>
> First, thanks for the reply :)
> So, I do know which part is taking a lot of IO time and it's to be honest any node that reads from the disk.. of
course,we're running EXPLAIN on our queries (to be more specific we use auto_explain) but we can't seem to find what
couldcause an index scan that reads 34 MB to take more than a second (1.2 sec).. we do know that our dataset doesn't
fitthe memory and we were ok with that but with IO being so slow we just don't know what to do anymore. Any other
suggestionsother than contact AWS (which we did but it seems this path won't lead anywhere). 

The performance of EBS is definitely a question for AWS.  I can say that, in our experience, EBS mounts for RDS almost
neverapproach the stated maximum throughput, although io2 tends to be closer than gp2 or gp3. 


Re: [EXTERNAL] RDS IO Read time

From
Eden Aharoni
Date:
So you believe it's strictly an EBS issue?


From: Christophe Pettus <xof@thebuild.com>
Sent: Monday, March 31, 2025 8:43:19 PM
To: Eden Aharoni <edena@legitsecurity.com>
Cc: pgsql-general@lists.postgresql.org <pgsql-general@lists.postgresql.org>
Subject: Re: [EXTERNAL] RDS IO Read time
 
[You don't often get email from xof@thebuild.com. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ]

> On Mar 31, 2025, at 10:32, Eden Aharoni <edena@legitsecurity.com> wrote:
>
> First, thanks for the reply :)
> So, I do know which part is taking a lot of IO time and it's to be honest any node that reads from the disk.. of course, we're running EXPLAIN on our queries (to be more specific we use auto_explain) but we can't seem to find what could cause an index scan that reads 34 MB to take more than a second (1.2 sec).. we do know that our dataset doesn't fit the memory and we were ok with that but with IO being so slow we just don't know what to do anymore. Any other suggestions other than contact AWS (which we did but it seems this path won't lead anywhere).

The performance of EBS is definitely a question for AWS.  I can say that, in our experience, EBS mounts for RDS almost never approach the stated maximum throughput, although io2 tends to be closer than gp2 or gp3.

Re: [EXTERNAL] RDS IO Read time

From
Christophe Pettus
Date:

> On Mar 31, 2025, at 10:54, Eden Aharoni <edena@legitsecurity.com> wrote:
>
> So you believe it's strictly an EBS issue?

Well, PostgreSQL certainly can read faster than 34MB/s off of disk.  With the data you've given, I can't really say if
it'spurely an EBS issue. 


Re: [EXTERNAL] RDS IO Read time

From
Eden Aharoni
Date:
Can you please tell me what other data might help? Would you like an EXPLAIN output?


From: Christophe Pettus <xof@thebuild.com>
Sent: Monday, March 31, 2025 9:23:45 PM
To: Eden Aharoni <edena@legitsecurity.com>
Cc: pgsql-general@lists.postgresql.org <pgsql-general@lists.postgresql.org>
Subject: Re: [EXTERNAL] RDS IO Read time
 


> On Mar 31, 2025, at 10:54, Eden Aharoni <edena@legitsecurity.com> wrote:
>
> So you believe it's strictly an EBS issue?

Well, PostgreSQL certainly can read faster than 34MB/s off of disk.  With the data you've given, I can't really say if it's purely an EBS issue.

Re: [EXTERNAL] RDS IO Read time

From
Adrian Klaver
Date:

On 3/31/25 11:23 AM, Christophe Pettus wrote:
> 
> 
>> On Mar 31, 2025, at 10:54, Eden Aharoni <edena@legitsecurity.com> wrote:
>>
>> So you believe it's strictly an EBS issue?

You are asking about disk I/O, that is the purview of the OS and 
hardware. In your case both are created and managed by AWS, only they 
know what is going on behind the scenes.

> 
> Well, PostgreSQL certainly can read faster than 34MB/s off of disk.  With the data you've given, I can't really say
ifit's purely an EBS issue.
 
> 

-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: [EXTERNAL] RDS IO Read time

From
Christophe Pettus
Date:

> On Mar 31, 2025, at 11:30, Eden Aharoni <edena@legitsecurity.com> wrote:
>
> Can you please tell me what other data might help?

You can show your work on how you got the megabytes/second number.  (Be aware that on a general open-source mailing
list,there's only so much debugging that we can do of a specific problem that is not a PostgreSQL bug.) 


Re: [EXTERNAL] RDS IO Read time

From
Eden Aharoni
Date:

Aggregate  (cost=8581.22..8581.23 rows=1 width=4) (actual time=1209.498..1210.530 rows=1 loops=1)

  Buffers: shared hit=3108 read=4314 dirtied=1

  I/O Timings: shared read=1169.977

  ->  Gather  (cost=8580.80..8581.21 rows=4 width=8) (actual time=1209.371..1210.526 rows=5 loops=1)

          Workers Planned: 4

          Workers Launched: 4

        Buffers: shared hit=3108 read=4314 dirtied=1

        I/O Timings: shared read=1169.977

        ->  Aggregate  (cost=7580.80..7580.81 rows=1 width=8) (actual time=1206.518..1206.520 rows=1 loops=5)

              Buffers: shared hit=3105 read=4314 dirtied=1

              I/O Timings: shared read=1169.977

              ->  Hash Join  (cost=2620.70..7578.45 rows=937 width=0) (actual time=148.965..1206.330 rows=1897 loops=5)

                    Buffers: shared hit=3105 read=4314 dirtied=1

                    I/O Timings: shared read=1169.977

                    ->  Index Only Scan using IX_Collaborators_TenantId_IsDeleted_RunningIndex on Collaborators c  (cost=0.55..4619.26 rows=11407 width=80) (actual time=2.815..1058.793 rows=7474 loops=5)

                            Index Cond: ((c."TenantId" = '0f8026c775a5499fdd3ccd193a15f94c3b4ae58a'::text) AND (c."IsDeleted" = false))

                            Heap Fetches: 3711

                          Buffers: shared hit=2332 read=3563

                          I/O Timings: shared read=1028.548

                    ->  Hash  (cost=2479.01..2479.01 rows=11292 width=79) (actual time=144.988..144.989 rows=1983 loops=5)

                          Buffers: shared hit=637 read=751 dirtied=1

                          I/O Timings: shared read=141.429

                          ->  Index Only Scan using IX_IntegrationCollaboration_TenantId_IsIntegrationDeleted_IsDe~ on IntegrationCollaboration i  (cost=0.55..2479.01 rows=11292 width=79) (actual time=1.804..144.122 rows=1983 loops=5)

                                  Index Cond: (i."TenantId" = '0f8026c775a5499fdd3ccd193a15f94c3b4ae58a'::text)

                                  Heap Fetches: 112

                                Buffers: shared hit=637 read=751 dirtied=1

                                I/O Timings: shared read=141.429

Planning:

  Buffers: shared hit=440

Execution time: 1210.592 ms

 

So we can see here that the query read 4314 blocks in 1.2 sec.. (ive done 4314 * 8kb / 1.2sec which is about 28 MB/s)…

 

Hope you’ll have any clue 😊

From: Christophe Pettus <xof@thebuild.com>
Date: Monday, 31 March 2025 at 21:44
To: Eden Aharoni <edena@legitsecurity.com>
Cc: pgsql-general@lists.postgresql.org <pgsql-general@lists.postgresql.org>
Subject: Re: [EXTERNAL] RDS IO Read time



> On Mar 31, 2025, at 11:30, Eden Aharoni <edena@legitsecurity.com> wrote:
>
> Can you please tell me what other data might help?

You can show your work on how you got the megabytes/second number.  (Be aware that on a general open-source mailing list, there's only so much debugging that we can do of a specific problem that is not a PostgreSQL bug.)

Re: [EXTERNAL] RDS IO Read time

From
Christophe Pettus
Date:

> On Mar 31, 2025, at 12:36, Eden Aharoni <edena@legitsecurity.com> wrote:
>  Hope you’ll have any clue 😊

Based on that, I'd take it up with AWS.  It does seem that the EBS mount is under-performing.