
Re: Distributing data over "spindles" even on AWS EBS, (followup to the work queue saga) - Mailing list pgsql-performance

From	Sam Gendler
Subject	Re: Distributing data over "spindles" even on AWS EBS, (followup to the work queue saga)
Date	March 19, 2019 15:18:10
Msg-id	CAEV0TzALNhT0UFo7BRz_XFCCd9wQJh4twW6gRpR=mrN=-Mo3oQ@mail.gmail.com
In response to	Re: Distributing data over "spindles" even on AWS EBS, (followup to the work queue saga) (Gunther <raj@gusw.net>)
List	pgsql-performance


You do have a finite amount of bandwidth per instance. On c5.xlarge, it is 3500 Mbit/sec, no matter how many IOPS you buy. Keep an eye on your overall EBS bandwidth utilization.
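
To make that ceiling concrete, here is a rough Python sketch of the arithmetic. It assumes the 3500 Mbit/sec figure above, decimal megabits (1 Mbit = 10^6 bits), and a fixed I/O size per request. The function names are made up for illustration, and the volumes' own throughput limits are ignored.

```python
# Sketch only: names and the fixed-I/O-size model are assumptions for
# illustration, not an AWS API.

def instance_ceiling_mb_per_s(instance_mbit_per_s: float) -> float:
    """Convert an instance bandwidth cap in Mbit/s to MB/s (decimal units)."""
    return instance_mbit_per_s / 8


def effective_throughput_mb_per_s(
    iops: float, io_size_bytes: int, instance_mbit_per_s: float
) -> float:
    """Throughput the instance can actually move for a given IOPS rate.

    The requested throughput is iops * io_size; the instance cap clips it.
    """
    requested = iops * io_size_bytes / 1_000_000
    return min(requested, instance_ceiling_mb_per_s(instance_mbit_per_s))


def iops_at_ceiling(io_size_bytes: int, instance_mbit_per_s: float) -> int:
    """IOPS rate beyond which the instance cap, not IOPS, is the limit."""
    bytes_per_s = instance_mbit_per_s * 1_000_000 / 8
    return int(bytes_per_s // io_size_bytes)


if __name__ == "__main__":
    cap = 3500  # Mbit/s, the c5.xlarge figure quoted above
    page = 8192  # bytes, PostgreSQL's default page size
    print(instance_ceiling_mb_per_s(cap))
    print(iops_at_ceiling(page, cap))
    print(effective_throughput_mb_per_s(100_000, page, cap))
```

Under these assumptions, buying IOPS past the value returned by `iops_at_ceiling` adds nothing, because the instance link is already full.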

On Sun, Mar 17, 2019 at 11:42 AM Gunther <raj@gusw.net> wrote:

On 3/14/2019 11:11, Jeremy Schneider wrote:
> On 3/14/19 07:53, Gunther wrote:
>> 2. build a low level "spreading" scheme which is to take the partial
>> files 4653828 and 4653828.1, .2, _fsm, etc. and move each to another
>> device and then symlink it back to that directory (I come back to this!)
> ...
>> To 2. I find that it would be a nice feature of PostgreSQL if we could
>> just use symlinks and a symlink rule, for example, when PostgreSQL finds
>> that 4653828 is in fact a symlink to /otherdisk/PG/16284/4653828, then
>> it would
>>
>> * by default also create 4653828.1 as a symlink and place the actual
>> data file on /otherdisk/PG/16284/4653828.1
> How about if we could just specify multiple tablespaces for an object,
> and then PostgreSQL would round-robin new segments across the presently
> configured tablespaces? This seems like a simple and elegant solution
> to me.

Very good idea! I agree.

Very important also would be to take out the existing patch someone had
contributed to allow toast tables to be assigned to different tablespaces.

>> 4. maybe I can configure in AWS EBS to reserve more IOPS -- but why
>> would I pay for more IOPS if my cost is by volume size? I can just
>> make another volume? or does AWS play a similar trick on us with
>> IOPS being limited on some "credit" system???
> Not credits, but if you're using gp2 volumes then pay close attention to
> how burst balance works. A single large volume is the same price as two
> striped volumes at half size -- but the striped volumes will have double
> the burst speed and take twice as long to refill the burst balance.

Yes, I learned that too. It seems a very interesting "bug" of the Amazon
GP2 IOPS allocation scheme. They say it's like 3 IOPS per GiB, so if I
have 100 GiB I get 300 IOPS. But it also says minimum 100. So that means
if I have 10 volumes of 10 GiB each, I get 1000 IOPS minimum between
them all. But if I have it all on one 100 GiB volume I only get 300 IOPS.

I wonder if Amazon is aware of this. I hope they are and think that's
just fine. Because I like it.
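
The arithmetic above can be sketched in a few lines of Python. This assumes only the two rules mentioned here (3 IOPS per GiB, with a floor of 100 per volume). It ignores burst credits and any per-volume maximum, and the function names are hypothetical.

```python
# Sketch of the gp2 baseline rule as described in this thread.
# Assumptions: 3 IOPS per GiB, minimum 100 IOPS per volume; burst balance
# and the per-volume upper limit are deliberately left out.

GP2_IOPS_PER_GIB = 3
GP2_MIN_IOPS = 100


def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS for a single gp2 volume of the given size."""
    return max(GP2_MIN_IOPS, GP2_IOPS_PER_GIB * size_gib)


def total_baseline_iops(volume_sizes_gib: list[int]) -> int:
    """Sum of baseline IOPS across a set of volumes."""
    return sum(gp2_baseline_iops(size) for size in volume_sizes_gib)


one_big = total_baseline_iops([100])        # one 100 GiB volume
ten_small = total_baseline_iops([10] * 10)  # ten 10 GiB volumes

if __name__ == "__main__":
    print("1 x 100 GiB:", one_big)
    print("10 x 10 GiB:", ten_small)
```

The floor only helps while a volume is below about 33 GiB. Above that, 3 IOPS per GiB already exceeds 100, so splitting no longer changes the baseline total.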

It also is a clear sign to me that I want to use page sizes > 4k for the
file system. I have tried on Amazon Linux to use 8k block sizes of the
XFS volume, but I cannot mount those, since Linux says it can
currently only deal with 4k blocks. This is another reason I consider
switching the database server(s) to FreeBSD. OTOH, who knows, maybe
this 4k is a limit of the AWS EBS infrastructure. After all, if I am
scraping the 300 or 1000 IOPS limit already and if I can suddenly
upgrade my block sizes per IO, I double my IO throughput.

regards,
-Gunther
