Thread: Re: [PATCHES] How can I use 2GB of shared buffers on Windows?

Re: [PATCHES] How can I use 2GB of shared buffers on Windows?

From: "Takayuki Tsunakawa"

From: "Magnus Hagander" <magnus@hagander.net>
> hnetcfg.dll is a part of Windows. "Home Networking Configuration
> Manager". LPK.DLL is also a part of Windows - it's the language
pack.

Thank you for the information.


> On Thu, Feb 08, 2007 at 09:50:26PM +0900, Takayuki Tsunakawa wrote:
>> When I try to start PostgreSQL 8.2.1 on Windows 2003 Server with
>> shared_buffers=1024MB, I get the following error messages in the Event
>> Log (with log_min_messages=debug5) and can't start PostgreSQL:
>
> Is this for testing, or for production? From what I've heard, you would
> normally never want that much shared memory - I've seen more reports
> that you should keep it as low as possible, really. For performance
> reasons.

For testing.  I wanted to place all data in shared buffers to
eliminate reads from disk while I run pgbench repeatedly (actually
most reads should come from the kernel cache, though).

Does PostgreSQL for Windows have a problem with large database caches
that the UNIX versions don't?  I'm excited about your current great
work to enable building all of PostgreSQL with MSVC.  I thought you
were aiming at making PostgreSQL 64-bit on Windows in the near future
(though you may not have announced it on the mailing list).  I'm
afraid MinGW will not extend to 64-bit (for x64 and Itanium) reliably
or soon, due to the difference in data model -- 'long' is still 32-bit
in 64-bit applications on Windows.  I thought Magnus-san was worried
about that and started the work of switching completely to MSVC.

BTW, the current PostgreSQL for Windows is very slow, isn't it?  I
compared the performance of PostgreSQL 8.2.x on Linux (RHEL4 for x86,
kernel 2.6.x) and on Windows Server 2003.  I ran 'pgbench -c32 -t500'
on the same machine with the same disk layout for data files and WAL,
i.e. they are stored on separate disks.  The settings in
postgresql.conf are the same, except for wal_sync_method -- it is set
to open_sync on Linux and open_datasync on Windows, because those are
the best for each platform.
The Linux version shows 1100 tps, but the Windows version shows only
450 tps.
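
To be concrete, the only difference between the two postgresql.conf
files was the sync method (everything else was identical), and both
runs were started the same way:

    # postgresql.conf on Linux
    wal_sync_method = open_sync

    # postgresql.conf on Windows
    wal_sync_method = open_datasync

    pgbench -c32 -t500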


Re: [PATCHES] How can I use 2GB of shared buffers on Windows?

From: "Takayuki Tsunakawa"

From: "Magnus Hagander" <magnus@hagander.net>
> Right. Which is why you're likely to see better performance if you keep
> shared buffers smaller. There is something in dealing with it that's
> slow on win32, per reports from the field. It needs to be investigated
> further...
> We've had reports that it's slow with large shared_buffers, yes.

That's shocking news.  I'm sad.
I wonder whether the systems in the field you are talking about set
Windows to use more memory for programs than for the filesystem cache,
which is selectable from the [System] applet of the Control Panel (Oh,
I wonder how my machine is set in this respect... I have to check).
If the filesystem cache is preferred, the following scenario may be
possible:

1. PostgreSQL tries to read data from disk into database cache.
2. The kernel tries to allocate filesystem buffers by paging out
PostgreSQL's memory (possibly shared buffers).
3. PostgreSQL finds data requested by its clients in database cache,
and tries to get it in memory.
4. But the shared buffers are paged out, and page-ins happen.

> Are you sure you're not running this on, for example, IDE disks with a
> write-cache that lies? Windows will write through that write-cache even
> if the disk lies, whereas most Linux versions won't. At least that used
> to be the case not too long ago, but there has also been talk of fixing
> that in Linux, so maybe that's done...

I'm using a PC server whose disks are all SCSI.  It has no IDE disk.

> Also note that when you run pg_bench on the local machine, you take a
> much higher hit from the fact that context switching between processes
> is a lot more expensive on Windows than it is on Linux. But it shouldn't
> be big enough to explain the huge difference you had in your test.

Yes, I suspect that, too.  That is why Oracle uses a single
multi-threaded server process on Windows, while it employs a
multi-process architecture on UNIX.  SQL Server is of course
multi-threaded.  SRA's original PostgreSQL for Windows (based on 7.x)
was also multi-threaded.


Re: [PATCHES] How can I use 2GB of shared buffers on Windows?

From: Magnus Hagander

On Fri, Feb 09, 2007 at 01:06:13PM +0900, Takayuki Tsunakawa wrote:
> > On Thu, Feb 08, 2007 at 09:50:26PM +0900, Takayuki Tsunakawa wrote:
> >> When I try to start PostgreSQL 8.2.1 on Windows 2003 Server with
> >> shared_buffers=1024MB, I get the following error messages in the Event
> >> Log (with log_min_messages=debug5) and can't start PostgreSQL:
> >
> > Is this for testing, or for production? From what I've heard, you would
> > normally never want that much shared memory - I've seen more reports
> > that you should keep it as low as possible, really. For performance
> > reasons.
> 
> For testing.  I wanted to place all data in shared buffers to
> eliminate reads from disk while I run pgbench repeatedly (actually
> most reads should come from the kernel cache, though).

Right. Which is why you're likely to see better performance if you keep
shared buffers smaller. There is something in dealing with it that's
slow on win32, per reports from the field. It needs to be investigated
further...


> Does PostgreSQL for Windows have a problem with large database caches
> that the UNIX versions don't?

We've had reports that it's slow with large shared_buffers, yes.

> I'm excited about your current great
> work to enable building all of PostgreSQL with MSVC.  I thought you
> were aiming at making PostgreSQL 64-bit on Windows in the near future
> (though you may not have announced it on the mailing list).  I'm
> afraid MinGW will not extend to 64-bit (for x64 and Itanium) reliably
> or soon, due to the difference in data model -- 'long' is still 32-bit
> in 64-bit applications on Windows.  I thought Magnus-san was worried
> about that and started the work of switching completely to MSVC.

Well, that's one of the reasons, yes. But being able to build 64-bit
won't automatically mean that larger shared buffers are the way to go.

> BTW, the current PostgreSQL for Windows is very slow, isn't it?  I
> compared the performance of PostgreSQL 8.2.x on Linux (RHEL4 for x86,
> kernel 2.6.x) and on Windows Server 2003.  I ran 'pgbench -c32 -t500'
> on the same machine with the same disk layout for data files and WAL,
> i.e. they are stored on separate disks.  The settings in
> postgresql.conf are the same, except for wal_sync_method -- it is set
> to open_sync on Linux and open_datasync on Windows, because those are
> the best for each platform.
> The Linux version shows 1100 tps, but the Windows version shows only
> 450 tps.

It's certainly slower than on Linux, but that's a larger difference than
I've usually seen. Are you sure you're not running this on, for example,
IDE disks with a write-cache that lies? Windows will write through that
write-cache even if the disk lies, whereas most Linux versions won't. At
least that used to be the case not too long ago, but there has also been
talk of fixing that in Linux, so maybe that's done...

Also note that when you run pg_bench on the local machine, you take a
much higher hit from the fact that context switching between processes
is a lot more expensive on Windows than it is on Linux. But it shouldn't
be big enough to explain the huge difference you had in your test.

//Magnus


Re: [PATCHES] How can I use 2GB of shared buffers on Windows?

From: Martijn van Oosterhout

On Fri, Feb 09, 2007 at 07:31:33PM +0900, Takayuki Tsunakawa wrote:
> 1. PostgreSQL tries to read data from disk into database cache.
> 2. The kernel tries to allocate filesystem buffers by paging out
> PostgreSQL's memory (possibly shared buffers).
> 3. PostgreSQL finds data requested by its clients in database cache,
> and tries to get it in memory.
> 4. But the shared buffers are paged out, and page-ins happen.

One of the biggest issues with large shared memory segments is that we
can't be sure they'll actually stay in memory. So your shared memory
block should not be too much larger than your working set.

So yes, if you make a really large segment, the problem you describe
may happen.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: [PATCHES] How can I use 2GB of shared buffers on Windows?

From: Tom Lane

"Takayuki Tsunakawa" <tsunakawa.takay@jp.fujitsu.com> writes:
> I wonder whether the systems in the field you are talking about set
> Windows to use more memory for programs than for the filesystem cache,
> which is selectable from the [System] applet of the Control Panel (Oh,
> I wonder how my machine is set in this respect... I have to check).
> If the filesystem cache is preferred, the following scenario may be
> possible:

> 1. PostgreSQL tries to read data from disk into database cache.
> 2. The kernel tries to allocate filesystem buffers by paging out
> PostgreSQL's memory (possibly shared buffers).
> 3. PostgreSQL finds data requested by its clients in database cache,
> and tries to get it in memory.
> 4. But the shared buffers are paged out, and page-ins happen.

It's certainly true that if shared_buffers is large enough to make the
kernel try to swap out parts of the shared buffer array, then you've got
a counterproductive situation resulting in net *more* I/O than if you'd
used a smaller setting.  On some Unixen shared memory is implicitly
locked in RAM, and on others it's possible to request locking it (though
I'm not sure we try to at the moment).  Perhaps it's always swappable on
Windows?  Or maybe Windows is just more eager to swap it out?
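
(For illustration only: on platforms that support it, locking an
existing SysV segment in RAM looks roughly like the sketch below.
SHM_LOCK is Linux-specific, needs appropriate privileges or a
sufficient RLIMIT_MEMLOCK, and as far as I know the backend does not
currently attempt anything like this.)

    #include <sys/ipc.h>
    #include <sys/shm.h>
    #include <stdio.h>

    /* Pin an already-created SysV shared memory segment in RAM so the
     * kernel will not swap it out.  "shmid" is whatever shmget()
     * returned when the segment was created. */
    int
    lock_shm_segment(int shmid)
    {
        if (shmctl(shmid, SHM_LOCK, NULL) < 0)
        {
            perror("shmctl(SHM_LOCK)");
            return -1;
        }
        return 0;
    }
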
        regards, tom lane


Re: [PATCHES] How can I use 2GB of shared buffers on Windows?

From: Magnus Hagander

On Fri, Feb 09, 2007 at 09:49:25AM -0500, Tom Lane wrote:
> "Takayuki Tsunakawa" <tsunakawa.takay@jp.fujitsu.com> writes:
> > I wonder whether the systems in the field you are talking about set
> > Windows to use more memory for programs than for the filesystem cache,
> > which is selectable from the [System] applet of the Control Panel (Oh,
> > I wonder how my machine is set in this respect... I have to check).
> > If the filesystem cache is preferred, the following scenario may be
> > possible:
> 
> > 1. PostgreSQL tries to read data from disk into database cache.
> > 2. The kernel tries to allocate filesystem buffers by paging out
> > PostgreSQL's memory (possibly shared buffers).
> > 3. PostgreSQL finds data requested by its clients in database cache,
> > and tries to get it in memory.
> > 4. But the shared buffers are paged out, and page-ins happen.
> 
> It's certainly true that if shared_buffers is large enough to make the
> kernel try to swap out parts of the shared buffer array, then you've got
> a counterproductive situation resulting in net *more* I/O than if you'd
> used a smaller setting.  On some Unixen shared memory is implicitly
> locked in RAM, and on others it's possible to request locking it (though
> I'm not sure we try to at the moment).  Perhaps it's always swappable on
> Windows?  Or maybe Windows is just more eager to swap it out?

The way it is now, it is definitely always swappable. I've been thinking
of digging into that, but haven't had the time. There are API calls to
mark memory as non-swappable, but I'm not sure they work on shared
memory the way we do it.

Apart from saying that, I will refrain from speculating more about *why*
it's slower with more shared memory before someone (yeah, I realise that
could be me) does some actual investigation into what happens.

//Magnus


Re: [PATCHES] How can I use 2GB of shared buffers on Windows?

From: Magnus Hagander

Takayuki Tsunakawa wrote:
> From: "Magnus Hagander" <magnus@hagander.net>
>> Right. Which is why you're likely to see better performance if you keep
>> shared buffers smaller. There is something in dealing with it that's
>> slow on win32, per reports from the field. It needs to be investigated
>> further...
>> We've had reports that it's slow with large shared_buffers, yes.
> 
> That's shocking news.  I'm sad.

It shouldn't make you sad until it has been proven that it's actually a
problem. And if we can do something about it :-)


> I wonder whether the systems in the field you are talking about set
> Windows to use more memory for programs than for the filesystem cache,
> which is selectable from the [System] applet of the Control Panel (Oh,
> I wonder how my machine is set in this respect... I have to check).
> If the filesystem cache is preferred, the following scenario may be
> possible:

Could be. I haven't run the tests myself, and they were observations on
production systems, not actual benchmark runs.


>> Are you sure you're not running this on, for example, IDE disks with a
>> write-cache that lies? Windows will write through that write-cache even
>> if the disk lies, whereas most Linux versions won't. At least that used
>> to be the case not too long ago, but there has also been talk of fixing
>> that in Linux, so maybe that's done...
> 
> I'm using a PC server whose disks are all SCSI.  It has no IDE disk.

There goes that idea :-) Then you need to dig further into why it's such
a big difference.


>> Also note that when you run pg_bench on the local machine, you take a
>> much higher hit from the fact that context switching between processes
>> is a lot more expensive on Windows than it is on Linux. But it shouldn't
>> be big enough to explain the huge difference you had in your test.
> 
> Yes, I suspect that, too.  That is why Oracle uses a single
> multi-threaded server process on Windows, while it employs a
> multi-process architecture on UNIX.  SQL Server is of course
> multi-threaded.  SRA's original PostgreSQL for Windows (based on 7.x)
> was also multi-threaded.

Right. The Windows memory manager and scheduler are definitely geared
towards multithreading. But that would make pg on win32 too different
from pg on unix to be maintained without a significantly larger effort
than today, so unless you can find some non-intrusive way to sneak it in
(which I doubt), that's just not going to happen.

That said, the context switching overhead shouldn't be *that* large. But
it'd be interesting to see what the performance difference would be for
the same server with pg_bench running on a different machine across a
fast network connection.

(BTW, another difference is that pg_bench on Unix would be using Unix
domain sockets and on Windows it would be using TCP/IP. That really
shouldn't make such a huge difference either, but I guess it adds a bit
to the gap.)

//Magnus