Re: _mdfd_getseg can be expensive - Mailing list pgsql-hackers

From Andres Freund
Subject Re: _mdfd_getseg can be expensive
Date
Msg-id 20141031225851.GO13584@awork2.anarazel.de
In response to Re: _mdfd_getseg can be expensive  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: _mdfd_getseg can be expensive  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 2014-10-31 18:48:45 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > I wrote the attached patch that gets rid of that essentially quadratic
> > behaviour, by replacing the mdfd chain/singly linked list with an
> > array. Since we seldom grow files by a whole segment I can't see the
> > slightly bigger memory reallocations mattering significantly. In pretty
> > much every other case the array is bound to be a winner.
> 
> > Does anybody have fundamental arguments against that idea?
> 
> While the basic idea is sound, this particular implementation seems
> pretty bizarre.  What's with the "md_seg_no" stuff, and why is that
> array typed size_t?
> IOW, why didn't you *just* replace the linked list with an array?

md_seg_no stores the length of the array of _MdfdVec entries. To know
whether it's safe to access some element we first need to check whether
we've loaded that many entries. It's size_t[] because that seemed to be
the most appropriate type for the length of an array. It's an array
because md.c had already chosen to represent relation forks via an array
indexed by the fork.

So
    size_t md_seg_no[MAX_FORKNUM + 1];
contains the length of the _MdfdVec array for each fork. These arrays
are stored in:
    struct _MdfdVec *md_seg_fds[MAX_FORKNUM + 1];
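
To make the role of the length array concrete, here is a minimal standalone sketch of the lookup pattern described above. It is not the patch itself. SegVec, RelSketch, seg_lookup and seg_extend are hypothetical names invented for illustration, SegVec only stands in for _MdfdVec, and MAX_FORKNUM is assumed to be 3. Only the two member names md_seg_no and md_seg_fds come from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_FORKNUM 3           /* assumption: stands in for PostgreSQL's value */

/* Hypothetical stand-in for _MdfdVec; the real struct differs. */
typedef struct SegVec
{
    int         fd;             /* placeholder for the open segment file */
    unsigned    segno;          /* segment number within the fork */
} SegVec;

/* Hypothetical container holding the two per-fork arrays from the text. */
typedef struct RelSketch
{
    size_t      md_seg_no[MAX_FORKNUM + 1];     /* loaded entries per fork */
    SegVec     *md_seg_fds[MAX_FORKNUM + 1];    /* segment array per fork */
} RelSketch;

/*
 * Constant-time lookup: check the stored length before indexing.
 * Returns NULL if that segment has not been loaded yet.
 */
SegVec *
seg_lookup(RelSketch *rel, int forknum, size_t segno)
{
    assert(forknum >= 0 && forknum <= MAX_FORKNUM);

    if (segno >= rel->md_seg_no[forknum])
        return NULL;
    return &rel->md_seg_fds[forknum][segno];
}

/*
 * Grow one fork's array so that it holds nseg entries.
 * Returns 0 on success, -1 if the reallocation fails.
 */
int
seg_extend(RelSketch *rel, int forknum, size_t nseg)
{
    size_t      old = rel->md_seg_no[forknum];
    SegVec     *p;

    assert(forknum >= 0 && forknum <= MAX_FORKNUM);

    if (nseg <= old)
        return 0;               /* already large enough */

    p = realloc(rel->md_seg_fds[forknum], nseg * sizeof(SegVec));
    if (p == NULL)
        return -1;

    /* Initialize only the newly added entries. */
    for (size_t i = old; i < nseg; i++)
    {
        p[i].fd = -1;
        p[i].segno = (unsigned) i;
    }

    rel->md_seg_fds[forknum] = p;
    rel->md_seg_no[forknum] = nseg;
    return 0;
}
```

With a linked list, reaching segment N means walking N links on every call. Here it is one bounds check against md_seg_no plus one index, and the only extra cost is the occasional realloc when a fork gains a segment.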

> This patch seems to be making some other changes that you've failed to
> explain.

I'm not aware of any that aren't just a consequence of not iterating
through the linked list anymore. What change are you thinking of?

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


