Re: profiling connection overhead - Mailing list pgsql-hackers

From: Robert Haas
Subject: Re: profiling connection overhead
Msg-id: AANLkTi=OBqvJtrvbZu7=aSu-GwpBRBGOYnVRLRAQhSDN@mail.gmail.com
In response to: Re: profiling connection overhead (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
On Sun, Nov 28, 2010 at 11:41 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Sat, Nov 27, 2010 at 11:18 PM, Bruce Momjian <bruce@momjian.us> wrote:
>>> Not sure that information moves us forward.  If the postmaster cleared
>>> the memory, we would have COW in the child and probably be even slower.
>
>> Well, we can determine the answers to these questions empirically.
>
> Not really.  Per Bruce's description, a page would become COW the moment
> the postmaster touched (either write or read) any variable on it.  Since
> we have no control over how the loader lays out static variables, the
> actual behavior of a particular build would be pretty random and subject
> to unexpected changes caused by seemingly unrelated edits.

Well, one big character array pretty much has to be laid out
contiguously, and it would be pretty surprising (but not entirely
impossible) to find that the linker randomly sprinkles symbols from
other files in between consecutive definitions in the same source
file.  I think the next question is how to apportion the memset/memcpy
overhead between page faults and the zeroing itself.  That seems like
something we can easily measure by writing a test program that zeroes
the same region twice and reports timing numbers for each pass (see
the sketch below).  If, as you and Andres are arguing, the actual zeroing is
minor, then we can forget this whole line of discussion and move on to
other possible optimizations.  If that turns out not to be true then
we can worry about how best to avoid the zeroing.  I have to believe
that's a solvable problem; the question is whether there's a benefit.
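Something along these lines should tell us (an untested sketch; the
region size and the mmap flags are my assumptions, not anything from
upthread).  The first memset pays for page faults plus zeroing; the
second pays for the zeroing alone, since the pages are already
faulted in:

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/time.h>

#define REGION_SIZE (64 * 1024 * 1024)  /* 64MB, arbitrary */

static double
elapsed_ms(const struct timeval *a, const struct timeval *b)
{
    return (b->tv_sec - a->tv_sec) * 1000.0 +
           (b->tv_usec - a->tv_usec) / 1000.0;
}

int
main(void)
{
    char       *region;
    struct timeval t0, t1, t2;

    /* Anonymous mapping: pages are faulted in lazily, on first touch. */
    region = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED)
    {
        perror("mmap");
        return 1;
    }

    gettimeofday(&t0, NULL);
    memset(region, 0, REGION_SIZE);     /* page faults + zeroing */
    gettimeofday(&t1, NULL);
    memset(region, 0, REGION_SIZE);     /* zeroing only */
    gettimeofday(&t2, NULL);

    printf("first pass  (faults + zeroing): %.1f ms\n",
           elapsed_ms(&t0, &t1));
    printf("second pass (zeroing only):     %.1f ms\n",
           elapsed_ms(&t1, &t2));

    munmap(region, REGION_SIZE);
    return 0;
}

If the second pass turns out to be only a small fraction of the
first, the cost is mostly faults rather than zeroing, and avoiding
the memset wouldn't buy much by itself.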

In any case, I don't think we should get bogged down in
micro-optimization here, both because micro-optimizations may not gain
much and because what works well on one platform may not do much at
all on another.  The more general issue here is what to do about our
high backend startup costs.  Beyond trying to recycle backends for new
connections, as I've previously proposed, with all the problems that
entails, the only thing that looks promising here is to try to somehow
cut down on the cost of populating the catcache and relcache, not that
I have a very clear idea how to do that.  This has to be a soluble
problem because other people have solved it.  To some degree we're a
victim of our own flexible and extensible architecture here, but I
find it pretty unsatisfying to just say, OK, well, we're slow.
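While we're profiling, the total damage is easy to see from the
client side.  A minimal sketch using libpq (the conninfo string is
just an assumption; build with -lpq), which gives a baseline to judge
any catcache/relcache work against:

#include <stdio.h>
#include <sys/time.h>
#include <libpq-fe.h>

int
main(void)
{
    PGconn     *conn;
    struct timeval t0, t1;

    gettimeofday(&t0, NULL);
    conn = PQconnectdb("dbname=postgres");  /* assumed conninfo */
    gettimeofday(&t1, NULL);

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        PQfinish(conn);
        return 1;
    }

    printf("connect time: %.1f ms\n",
           (t1.tv_sec - t0.tv_sec) * 1000.0 +
           (t1.tv_usec - t0.tv_usec) / 1000.0);

    PQfinish(conn);
    return 0;
}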

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

