Thread: PostgreSQL and HugePage

PostgreSQL and HugePage

From
Hsien-Wen Chu
Date:
Dear All

I would like to use the hugepage feature on Linux. Does PostgreSQL support huge pages by default? If not, what code needs to be modified?

Thank you for your great support.

Hsien-Wen


Re: PostgreSQL and HugePage

From
Robert Haas
Date:
On Tue, Oct 19, 2010 at 8:39 PM, Hsien-Wen Chu <chu.hsien.wen@gmail.com> wrote:
> I want to use hugepage function on Linux platform, my question is if
> PostgreSQL supports hugepage in default, if not, what's the code need to be
> modified?

Unfortunately, I don't think this is too simple.  PostgreSQL uses sysv
shared memory, not POSIX shared memory.  I don't know of a way to use
sysv shared memory with hugetlb.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: PostgreSQL and HugePage

From
Mark Kirkwood
Date:
On 20/10/10 15:10, Robert Haas wrote:
> On Tue, Oct 19, 2010 at 8:39 PM, Hsien-Wen Chu<chu.hsien.wen@gmail.com>  wrote:
>    
>> I want to use hugepage function on Linux platform, my question is if
>> PostgreSQL supports hugepage in default, if not, what's the code need to be
>> modified?
>>      
> Unfortunately, I don't think this is too simple.  PostgreSQL uses sysv
> shared memory, not POSIX shared memory.  I don't know of a way to use
> sysv shared memory with hugetlb.
>
>    

According to:

http://devresources.linuxfoundation.org/dev/robustmutexes/src/fusyn.hg/Documentation/vm/hugetlbpage.txt

shmget and friends are hugetlbpage  aware, so it seems it should 'just 
work'. However, I have not checked...

Cheers

Mark


Re: PostgreSQL and HugePage

From
Mark Kirkwood
Date:
On 20/10/10 16:05, Mark Kirkwood wrote:
>
>
> shmget and friends are hugetlbpage  aware, so it seems it should 'just 
> work'.
>

Heh - provided you specify

SHM_HUGETLB


in the relevant call that is :-)



Re: PostgreSQL and HugePage

From
daveg
Date:
On Wed, Oct 20, 2010 at 04:08:37PM +1300, Mark Kirkwood wrote:
> On 20/10/10 16:05, Mark Kirkwood wrote:
> >
> >
> >shmget and friends are hugetlbpage  aware, so it seems it should 'just 
> >work'.
> >
> 
> Heh - provided you specify
> 
> SHM_HUGETLB
> 
> 
> in the relevant call that is :-)

I had a patch for this against 8.3 that I could update if there is any
interest. I suspect it is helpful.

-dg

-- 
David Gould       daveg@sonic.net      510 536 1443    510 282 0869
If simplicity worked, the world would be overrun with insects.


Re: PostgreSQL and HugePage

From
Robert Haas
Date:
On Tue, Oct 19, 2010 at 11:30 PM, daveg <daveg@sonic.net> wrote:
> On Wed, Oct 20, 2010 at 04:08:37PM +1300, Mark Kirkwood wrote:
>> On 20/10/10 16:05, Mark Kirkwood wrote:
>> >
>> >
>> >shmget and friends are hugetlbpage  aware, so it seems it should 'just
>> >work'.
>> >
>>
>> Heh - provided you specify
>>
>> SHM_HUGETLB
>>
>>
>> in the relevant call that is :-)
>
> I had a patch for this against 8.3 that I could update if there is any
> interest. I suspect it is helpful.

I think it would be a good feature.  Of course, we would need
appropriate documentation, and some benchmarks showing that it really
works.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: PostgreSQL and HugePage

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Oct 19, 2010 at 11:30 PM, daveg <daveg@sonic.net> wrote:
>> On Wed, Oct 20, 2010 at 04:08:37PM +1300, Mark Kirkwood wrote:
>>> Heh - provided you specify
>>> SHM_HUGETLB
>>> in the relevant call that is :-)

>> I had a patch for this against 8.3 that I could update if there is any
>> interest. I suspect it is helpful.

> I think it would be a good feature.  Of course, we would need
> appropriate documentation, and some benchmarks showing that it really
> works.

I believe that for the equivalent Solaris option, we just automatically
enable it when available.  So there'd be no need for user documentation.
However, I definitely *would* like to see some benchmarks proving that
the change actually does something useful.  I've always harbored the
suspicion that this is just a knob to satisfy people who need knobs to
frob.
        regards, tom lane


Re: PostgreSQL and HugePage

From
Kenneth Marshall
Date:
On Wed, Oct 20, 2010 at 10:10:00AM -0400, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > On Tue, Oct 19, 2010 at 11:30 PM, daveg <daveg@sonic.net> wrote:
> >> On Wed, Oct 20, 2010 at 04:08:37PM +1300, Mark Kirkwood wrote:
> >>> Heh - provided you specify
> >>> SHM_HUGETLB
> >>> in the relevant call that is :-)
> 
> >> I had a patch for this against 8.3 that I could update if there is any
> >> interest. I suspect it is helpful.
> 
> > I think it would be a good feature.  Of course, we would need
> > appropriate documentation, and some benchmarks showing that it really
> > works.
> 
> I believe that for the equivalent Solaris option, we just automatically
> enable it when available.  So there'd be no need for user documentation.
> However, I definitely *would* like to see some benchmarks proving that
> the change actually does something useful.  I've always harbored the
> suspicion that this is just a knob to satisfy people who need knobs to
> frob.
> 
>             regards, tom lane
> 

Oracle apparently uses hugepages if they are available by first trying
with the SHM_HUGETLB option. If it fails, they reissue the command
without that option. This article does mention some of the benefits
of the larger pagesizes with large shared memory regions:

http://appcrawler.com/wordpress/?p=686
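
A minimal sketch of that try-then-fall-back pattern, written against the Linux shmget(2) interface. It is neither Oracle's nor PostgreSQL's code: the helper name create_segment_prefer_huge is made up for illustration, and the caller is assumed to pass a size that is already a multiple of the huge page size.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* SHM_HUGETLB is Linux-specific; treat it as "not available" elsewhere. */
#ifndef SHM_HUGETLB
#define SHM_HUGETLB 0
#endif

/*
 * Hypothetical helper: create a private SysV segment, preferring huge pages.
 * Returns the segment id, or -1 if both attempts fail.
 * *used_huge is set to 1 only when the huge-page request succeeded.
 */
static int
create_segment_prefer_huge(size_t size, int *used_huge)
{
    int perms = IPC_CREAT | 0600;
    int shmid;

    assert(used_huge != NULL);
    *used_huge = 0;

    if (SHM_HUGETLB != 0) {
        /* First attempt: ask the kernel for huge pages. */
        shmid = shmget(IPC_PRIVATE, size, perms | SHM_HUGETLB);
        if (shmid >= 0) {
            *used_huge = 1;
            return shmid;
        }
    }

    /* Fallback: reissue the same request without the huge-page flag. */
    return shmget(IPC_PRIVATE, size, perms);
}
```

The first call typically fails when no huge pages have been reserved or the process lacks permission to use them, in which case the second call behaves like an ordinary shmget.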

Regards,
Ken


Re: PostgreSQL and HugePage

From
Greg Stark
Date:
On Wed, Oct 20, 2010 at 7:10 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I believe that for the equivalent Solaris option, we just automatically
> enable it when available.  So there'd be no need for user documentation.
> However, I definitely *would* like to see some benchmarks proving that
> the change actually does something useful.  I've always harbored the
> suspicion that this is just a knob to satisfy people who need knobs to
> frob.

Well, saving a few megabytes of kernel-space memory isn't a bad thing.
But I think the major effect is on forking new processes. Having to
copy that page map is a major cost when you're talking about very
large memory footprints. While machine memory has gotten larger, the 4k
page size hasn't. Once all the processes have been forked and are being
reused, I don't think it's a big cost, beyond perhaps slightly more
efficient cache usage.


--
greg


Re: PostgreSQL and HugePage

From
Greg Stark
Date:
On Wed, Oct 20, 2010 at 12:17 PM, Greg Stark <gsstark@mit.edu> wrote:
> I don't think it's a big cost once all the processes
> have been forked if you're reusing them beyond perhaps slightly more
> efficient cache usage.

Hm, this site claims to get a 13% win just from the reduced TLB misses
using a preload hack with Pg 8.2. That would be pretty substantial.

http://oss.linbit.com/hugetlb/


-- 
greg


Re: PostgreSQL and HugePage

From
Alvaro Herrera
Date:
Excerpts from Greg Stark's message of mié oct 20 16:28:25 -0300 2010:
> On Wed, Oct 20, 2010 at 12:17 PM, Greg Stark <gsstark@mit.edu> wrote:
> > I don't think it's a big cost once all the processes
> > have been forked if you're reusing them beyond perhaps slightly more
> > efficient cache usage.
> 
> Hm, this site claims to get a 13% win just from the reduced tlb misses
> using a preload hack with Pg 8.2. That would be pretty substantial.
> 
> http://oss.linbit.com/hugetlb/

Wow, is there no way to get the huge page size other than opening
and reading /proc/meminfo?
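
For reference, a minimal sketch of the /proc/meminfo approach in question. The function names are hypothetical and this is not the code from the linked site; it only assumes the usual "Hugepagesize: N kB" line that Linux prints in that file.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: scan a /proc/meminfo-style stream for the
 * "Hugepagesize:" line and return its value in bytes, or 0 if absent.
 */
static unsigned long
parse_hugepagesize(FILE *fp)
{
    char line[256];
    unsigned long kb;

    assert(fp != NULL);
    while (fgets(line, sizeof line, fp) != NULL) {
        /* Whitespace in the format matches any run of blanks in the line. */
        if (sscanf(line, "Hugepagesize: %lu kB", &kb) == 1)
            return kb * 1024UL;
    }
    return 0;
}

/* Convenience wrapper for the real file; returns 0 on any failure. */
static unsigned long
get_hugepagesize(void)
{
    FILE *fp = fopen("/proc/meminfo", "r");
    unsigned long size;

    if (fp == NULL)
        return 0;
    size = parse_hugepagesize(fp);
    fclose(fp);
    return size;
}
```

Keeping the parser separate from the file open makes it possible to exercise it against canned input, which matters because the file does not exist on non-Linux platforms.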

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: PostgreSQL and HugePage

From
Robert Haas
Date:
On Wed, Oct 20, 2010 at 3:47 PM, daveg <daveg@sonic.net> wrote:
> On Wed, Oct 20, 2010 at 12:28:25PM -0700, Greg Stark wrote:
>> On Wed, Oct 20, 2010 at 12:17 PM, Greg Stark <gsstark@mit.edu> wrote:
>> > I don't think it's a big cost once all the processes
>> > have been forked if you're reusing them beyond perhaps slightly more
>> > efficient cache usage.
>>
>> Hm, this site claims to get a 13% win just from the reduced tlb misses
>> using a preload hack with Pg 8.2. That would be pretty substantial.
>>
>> http://oss.linbit.com/hugetlb/
>
> That was my motivation in trying a patch. TLB misses can be a substantial
> overhead. I'm not current on the state of play, but working at Sun's
> benchmark lab on a DB TPC-B benchmark for the first generation
> of MP systems, something like 30% of all bus traffic was TLB misses. The
> next iteration of the hardware had a much larger TLB.
>
> I have a client with 512GB memory systems, currently with 128GB configured
> as postgresql buffer cache. Which is 32M TLB entries trying to fit in the
> few dozen cpu TLB slots. I suspect there may be some contention.
>
> I'll benchmark of course.

Do you mean 128GB shared buffers, or shared buffers + OS cache?  I
think that the general wisdom is that performance tails off beyond
8-10GB of shared buffers anyway, so a performance improvement on 128GB
shared buffers might not mean much unless you can also show that 128GB
shared buffers actually performs better than some smaller amount.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: PostgreSQL and HugePage

From
daveg
Date:
On Wed, Oct 20, 2010 at 12:28:25PM -0700, Greg Stark wrote:
> On Wed, Oct 20, 2010 at 12:17 PM, Greg Stark <gsstark@mit.edu> wrote:
> > I don't think it's a big cost once all the processes
> > have been forked if you're reusing them beyond perhaps slightly more
> > efficient cache usage.
> 
> Hm, this site claims to get a 13% win just from the reduced tlb misses
> using a preload hack with Pg 8.2. That would be pretty substantial.
> 
> http://oss.linbit.com/hugetlb/

That was my motivation in trying a patch. TLB misses can be a substantial
overhead. I'm not current on the state of play, but when I worked at Sun's
benchmark lab on a DB TPC-B benchmark for the first generation of MP
systems, something like 30% of all bus traffic was TLB misses. The next
iteration of the hardware had a much larger TLB.

I have a client with 512GB memory systems, currently with 128GB configured
as the PostgreSQL buffer cache. With 4kB pages, that is 32M TLB entries
trying to fit in the few dozen CPU TLB slots. I suspect there may be some
contention.

I'll benchmark of course.
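
The arithmetic behind the 32M figure can be sketched as below. The helper name pages_needed is hypothetical, and the 2MB huge page size used for comparison is an assumption (a common x86-64 value), not something stated in this thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of pages, and so of distinct TLB entries,
 * needed to map a region of region_bytes using pages of page_bytes.
 */
static uint64_t
pages_needed(uint64_t region_bytes, uint64_t page_bytes)
{
    assert(page_bytes > 0);
    /* Round up so a partial trailing page still counts as one entry. */
    return (region_bytes + page_bytes - 1) / page_bytes;
}
```

For a 128GB region, pages_needed((uint64_t) 128 << 30, 4096) yields 33,554,432 entries (32M), while the assumed 2MB page size yields 65,536.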

-dg

-- 
David Gould       daveg@sonic.net      510 536 1443    510 282 0869
If simplicity worked, the world would be overrun with insects.


Re: PostgreSQL and HugePage

From
Mark Wong
Date:
On Wed, Oct 20, 2010 at 1:13 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Oct 20, 2010 at 3:47 PM, daveg <daveg@sonic.net> wrote:
>> On Wed, Oct 20, 2010 at 12:28:25PM -0700, Greg Stark wrote:
>>> On Wed, Oct 20, 2010 at 12:17 PM, Greg Stark <gsstark@mit.edu> wrote:
>>> > I don't think it's a big cost once all the processes
>>> > have been forked if you're reusing them beyond perhaps slightly more
>>> > efficient cache usage.
>>>
>>> Hm, this site claims to get a 13% win just from the reduced tlb misses
>>> using a preload hack with Pg 8.2. That would be pretty substantial.
>>>
>>> http://oss.linbit.com/hugetlb/
>>
>> That was my motivation in trying a patch. TLB misses can be a substantial
>> overhead. I'm not current on the state of play, but working at Sun's
>> benchmark lab on a DB TPC-B benchmark for the first generation
>> of MP systems, something like 30% of all bus traffic was TLB misses. The
>> next iteration of the hardware had a much larger TLB.
>>
>> I have a client with 512GB memory systems, currently with 128GB configured
>> as postgresql buffer cache. Which is 32M TLB entries trying to fit in the
>> few dozen cpu TLB slots. I suspect there may be some contention.
>>
>> I'll benchmark of course.
>
> Do you mean 128GB shared buffers, or shared buffers + OS cache?  I
> think that the general wisdom is that performance tails off beyond
> 8-10GB of shared buffers anyway, so a performance improvement on 128GB
> shared buffers might not mean much unless you can also show that 128GB
> shared buffers actually performs better than some smaller amount.

I'm sure someone will correct me if I'm wrong, but when I looked at
this a couple years ago I believe a side effect of using hugetlbs is
that these segments are never swapped out.

I made a weak attempt to patch postgres to use hugetlbs when
allocating shared memory.  If I can find that patch I'll send it out..

Mark


Re: PostgreSQL and HugePage

From
Mark Wong
Date:
On Tue, Oct 19, 2010 at 8:30 PM, daveg <daveg@sonic.net> wrote:
> On Wed, Oct 20, 2010 at 04:08:37PM +1300, Mark Kirkwood wrote:
>> On 20/10/10 16:05, Mark Kirkwood wrote:
>> >
>> >
>> >shmget and friends are hugetlbpage  aware, so it seems it should 'just
>> >work'.
>> >
>>
>> Heh - provided you specify
>>
>> SHM_HUGETLB
>>
>>
>> in the relevant call that is :-)
>
> I had a patch for this against 8.3 that I could update if there is any
> interest. I suspect it is helpful.

Oh, probably better than me digging up my broken one.  Send it out as
is if you don't want to update it. :)

Regards,
Mark


Re: PostgreSQL and HugePage

From
daveg
Date:
On Thu, Oct 21, 2010 at 08:16:27AM -0700, Mark Wong wrote:
> On Tue, Oct 19, 2010 at 8:30 PM, daveg <daveg@sonic.net> wrote:
> > On Wed, Oct 20, 2010 at 04:08:37PM +1300, Mark Kirkwood wrote:
> >> On 20/10/10 16:05, Mark Kirkwood wrote:
> >> >
> >> >
> >> >shmget and friends are hugetlbpage  aware, so it seems it should 'just
> >> >work'.
> >> >
> >>
> >> Heh - provided you specify
> >>
> >> SHM_HUGETLB
> >>
> >>
> >> in the relevant call that is :-)
> >
> > I had a patch for this against 8.3 that I could update if there is any
> > interest. I suspect it is helpful.
> 
> Oh, probably better than me digging up my broken one.  Send it out as
> is if you don't want to update it. :)

I'll update it and see if I can get a largish machine to test on, at least
with pgbench. But not today, alas.

-dg
-- 
David Gould       daveg@sonic.net      510 536 1443    510 282 0869
If simplicity worked, the world would be overrun with insects.


Re: PostgreSQL and HugePage

From
David Fetter
Date:
On Thu, Oct 21, 2010 at 12:10:22PM -0700, David Gould wrote:
> On Thu, Oct 21, 2010 at 08:16:27AM -0700, Mark Wong wrote:
> > On Tue, Oct 19, 2010 at 8:30 PM, daveg <daveg@sonic.net> wrote:
> > > On Wed, Oct 20, 2010 at 04:08:37PM +1300, Mark Kirkwood wrote:
> > >> On 20/10/10 16:05, Mark Kirkwood wrote:
> > >> >
> > >> >
> > >> >shmget and friends are hugetlbpage  aware, so it seems it should 'just
> > >> >work'.
> > >> >
> > >>
> > >> Heh - provided you specify
> > >>
> > >> SHM_HUGETLB
> > >>
> > >>
> > >> in the relevant call that is :-)
> > >
> > > I had a patch for this against 8.3 that I could update if there is any
> > > interest. I suspect it is helpful.
> > 
> > Oh, probably better than me digging up my broken one.  Send it out as
> > is if you don't want to update it. :)
> 
> I'll update it and see if I can get a largish machine to test, at least with
> pgbench on. But not today alas.

If you'd be so kind as to update it, others can probably find the
aforementioned largish machine to test it on :)

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: PostgreSQL and HugePage

From
Hsien-Wen Chu
Date:
As far as I know, FreeBSD implements this feature under the name superpages, similar to Solaris. Is it enabled by default, or does some parameter need to be set?

Many thanks

Hsien-Wen


On Wed, Oct 20, 2010 at 10:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Oct 19, 2010 at 11:30 PM, daveg <daveg@sonic.net> wrote:
>> On Wed, Oct 20, 2010 at 04:08:37PM +1300, Mark Kirkwood wrote:
>>> Heh - provided you specify
>>> SHM_HUGETLB
>>> in the relevant call that is :-)

>> I had a patch for this against 8.3 that I could update if there is any
>> interest. I suspect it is helpful.

> I think it would be a good feature.  Of course, we would need
> appropriate documentation, and some benchmarks showing that it really
> works.

I believe that for the equivalent Solaris option, we just automatically
enable it when available.  So there'd be no need for user documentation.
However, I definitely *would* like to see some benchmarks proving that
the change actually does something useful.  I've always harbored the
suspicion that this is just a knob to satisfy people who need knobs to
frob.

                       regards, tom lane


Re: PostgreSQL and HugePage

From
Tom Lane
Date:
Hsien-Wen Chu <chu.hsien.wen@gmail.com> writes:
> as my known, FreeBSD implements this feature called superpage, it's similar
> with Solaris, so is it enabled in default? or any default parameter need to
> be set?

The Solaris-specific code is just that if SHM_SHARE_MMU is defined (by
<sys/ipc.h>, I think) we include it in the flags parameter for shmat(2).
If FreeBSD does it the same way as Solaris then that should work.
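
A minimal sketch of that conditional-flag approach, not the actual PostgreSQL source. The macro name SKETCH_SHMAT_FLAGS and the function attach_segment are hypothetical; the only assumption is that the platform headers define SHM_SHARE_MMU when the feature exists.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Pass the flag only where the platform headers define it. */
#ifdef SHM_SHARE_MMU
#define SKETCH_SHMAT_FLAGS SHM_SHARE_MMU
#else
#define SKETCH_SHMAT_FLAGS 0
#endif

/*
 * Hypothetical helper: attach an existing SysV segment, letting the kernel
 * choose the address. Returns (void *) -1 on failure, as shmat(2) does.
 */
static void *
attach_segment(int shmid)
{
    assert(shmid >= 0);
    return shmat(shmid, NULL, SKETCH_SHMAT_FLAGS);
}
```

On a platform that does not define the macro, the flags argument is simply 0 and the call is an ordinary shmat.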
        regards, tom lane