Thread: equal() perf tweak

equal() perf tweak

From

Neil Conway

Date:

03 November 2003, 10:01:05

Currently, equal() does the following for List nodes:

case T_List:
{
    List       *la = (List *) a;
    List       *lb = (List *) b;
    List       *l;

    /*
     * Try to reject by length check before we grovel through
     * all the elements...
     */
    if (length(la) != length(lb))
        return false;
    foreach(l, la)
    {
        if (!equal(lfirst(l), lfirst(lb)))
            return false;
        lb = lnext(lb);
    }
    retval = true;
}
break;

This code is inefficient, however: length() requires iterating through
the entire list, so this code scans both lists twice. The attached patch
improves this by implementing equal() with a single scan of both lists.

-Neil

Attachment

equal-list-perf-1.patch

Re: equal() perf tweak

From

Tom Lane

Date:

03 November 2003, 11:05:40

Neil Conway <neilc@samurai.com> writes:
> This code is inefficient, however: length() requires iterating through
> the entire list, so this code scans both lists twice. The attached patch
> improves this by implementing equal() with a single scan of both lists.

You have effectively reverted the code to its previous slow state.

The problem with what you've done is that it recursively applies equal()
to the list contents, which may take a very large number of cycles, only
to eventually fail because one list is a prefix of the other.  The point
of the current coding is to detect that condition before we recurse.

            regards, tom lane

Re: equal() perf tweak

From

Neil Conway

Date:

03 November 2003, 12:48:33

On Mon, 2003-11-03 at 10:04, Tom Lane wrote:
> You have effectively reverted the code to its previous slow state.

Erm, the original version of this code in CVS (from ~7 years ago) is the
following:

    case T_List:
    {
        List *la = (List*)a;
        List *lb = (List*)b;
        List *l;

        if (a==NULL && b==NULL)
        return (true);
        if (length(a)!=length(b))
        return (false);
        foreach(l, la) {
        if (!equal(lfirst(l), lfirst(lb)))
            return (false);
        lb = lnext(lb);
        }
        retval = true;
    }
    break;

i.e. it is effectively the same as the code in CVS HEAD. To what
"previous state" does this patch revert the code?

> The problem with what you've done is that it recursively applies equal()
> to the list contents, which may take a very large number of cycles, only
> to eventually fail because one list is a prefix of the other.  The point
> of the current coding is to detect that condition before we recurse.

I don't understand: granted, this applies equal() to each element of the
list, but why would that be particularly slow?

equal() applied to the lfirst() of a list doesn't invoke equal() on a
T_List node (the tag of the lfirst() of the list is the data type of the
elements of the list), so there should only be a single level of
recursion.

I was going to benchmark this, but pulling the list code out of the rest
of the source is a bit hairy. Let me know if you think this would be
helpful.

-Neil

Re: equal() perf tweak

From

Tom Lane

Date:

03 November 2003, 17:40:04

Neil Conway <neilc@samurai.com> writes:
> On Mon, 2003-11-03 at 10:04, Tom Lane wrote:
>> You have effectively reverted the code to its previous slow state.

> Erm, the original version of this code in CVS (from ~7 years ago) is the
> following:

Hm.  I coulda sworn that at some point I changed that code from
essentially-what-you're-proposing to the current state.  Memory is
misserving me I guess.

> I don't understand: granted, this applies equal() to each element of the
> list, but why would that be particularly slow?

The point is that the elements of the lists could be large, complicated
structures that will take noticeable amounts of time to compare.  A
typical case would be that the elements are targetlist structures, with
a minimum of three contained nodes each with several fields.  equal() on
that will take much longer than just counting the presence of the list
element.

I will grant that there are situations where your proposed change would
be a win --- in particular, for long lists wherein the first few
elements differ, it'd be faster to do it as you suggest.  But Postgres
mostly deals with relatively short lists, in which situation the
compare-the-lengths step is quite cheap and can save a fair amount of
per-node comparison processing.  So I don't think it'd be a win overall
to change.

FYI, equali() (now in list.c) used to look pretty nearly like the code
in equal(), but now looks exactly as you propose for equal().  The
difference is that in equali(), we know for certain that comparing two
list members is a very cheap operation.  In equal() it's likely to be
nontrivial.

            regards, tom lane

Re: equal() perf tweak

From

Neil Conway

Date:

03 November 2003, 18:12:50

On Mon, 2003-11-03 at 16:39, Tom Lane wrote:
> The point is that the elements of the lists could be large, complicated
> structures that will take noticeable amounts of time to compare.

Good point, I agree with your analysis. At the very least, I'll add a
comment to equal() explaining why the algorithm works as it does.

> I will grant that there are situations where your proposed change would
> be a win --- in particular, for long lists wherein the first few
> elements differ, it'd be faster to do it as you suggest.

Do you think it would be worth the trouble to use both algorithms, and
then test on the node tag of the first element to decide which one to
use? (The assumption being lists are homogeneous). We wouldn't need to
enumerate all the possible node tags -- just a few common,
known-to-be-small-and-easy-to-compare types would use the new algorithm,
whereas everything else would use the existing code.

One thing I've been wondering about is whether it would be worth ripping
out the existing List code wholesale, and replacing it with something
like the following:

struct ListNode
{
    union
    {
        void   *ptr_value;
    int     int_value;
    Oid    oid_value;
    } elem;

    struct ListNode *next;
};

struct List
{
    struct ListNode *head;
    struct ListNode *tail;
    int length;
};

That would fix the O(n) behavior of both lappend() and length() in one
swell foop -- we've had performance problems from the former on a couple
of occasions before, and given the ~350 odd uses of lappend() in the
backend, I suspect there are more problems waiting to be found. For
example, try creating a 75,000 line pg_hba.conf file, and watch the
postmaster chew minutes (hours?) of CPU time trying to parse it. (That
particular example is unrealistic, of course. )I'd also expect that
keeping track of the length of most lists will end up saving cycles,
overall.

One downside would be that we wouldn't get the nice Lisp-like semantics
of the existing List, but I'm not sure how necessary those are.

Comments?

-Neil

Re: equal() perf tweak

From

Tom Lane

Date:

03 November 2003, 19:20:09

Neil Conway <neilc@samurai.com> writes:
> Do you think it would be worth the trouble to use both algorithms, and
> then test on the node tag of the first element to decide which one to
> use? (The assumption being lists are homogeneous).

Hard to tell.  Since I haven't seen any evidence that equal() on lists
is a particular hotspot, I'd lean against adding complexity and
maintenance burden here.

> One thing I've been wondering about is whether it would be worth ripping
> out the existing List code wholesale, and replacing it with something
> like the following:

I have already done something much like this in a few hotspots using the
FastList structure.  But notationally, it's a pain in the neck compared
to the existing List code.  If you can think of a way to implement this
without adding a lot of notational cruft, I'd sure be for it.  I'm not
for it if it imposes as much messiness as the FastList approach does...

            regards, tom lane

Re: equal() perf tweak

From

Neil Conway

Date:

04 November 2003, 00:49:59

On Mon, 2003-11-03 at 18:19, Tom Lane wrote:
> Hard to tell.  Since I haven't seen any evidence that equal() on lists
> is a particular hotspot, I'd lean against adding complexity and
> maintenance burden here.

Ok, I agree -- probably not worth doing, then.

> I have already done something much like this in a few hotspots using the
> FastList structure.  But notationally, it's a pain in the neck compared
> to the existing List code.  If you can think of a way to implement this
> without adding a lot of notational cruft, I'd sure be for it.

Which specific notational capabilities did you have in mind? i.e. is it
just something like 'foreach' loop syntax sugar, or something else in
particular?

-Neil

Re: equal() perf tweak

From

Tom Lane

Date:

04 November 2003, 12:41:22

Neil Conway <neilc@samurai.com> writes:
> On Mon, 2003-11-03 at 18:19, Tom Lane wrote:
>> I have already done something much like this in a few hotspots using the
>> FastList structure.  But notationally, it's a pain in the neck compared
>> to the existing List code.  If you can think of a way to implement this
>> without adding a lot of notational cruft, I'd sure be for it.

> Which specific notational capabilities did you have in mind? i.e. is it
> just something like 'foreach' loop syntax sugar, or something else in
> particular?

Well, if you look at the FastList mods I made, such as this one:
http://developer.postgresql.org/cvsweb.cgi/pgsql-server/src/backend/optimizer/plan/createplan.c.diff?r1=1.143&r2=1.144
they are just plain messy.  Part of this comes from the fact that I
didn't want to engage in a wholesale rewrite, and so the code had to
interoperate with List-based stuff instead of replacing it.

I had an idea this morning about a way to reduce the notational changes
needed.  Suppose we redefine typedef List as the list header structure,
so that references to Lists are still of type "List *":

typedef struct List
{
    NodeTag     type;   // T_List, T_IntList, or T_OidList
    struct ListCell *lhead;
    struct ListCell *ltail;
    int         llength;
} List;

typedef struct ListCell
{
    union
    {
        void       *ptr_value;
        int         int_value;
        Oid         oid_value;
    }           elem;
    struct ListCell *next;
} ListCell;

(This is the same as what you suggested except that I put a NodeTag
back into List; we really can't do without that.)

The idea that makes this work is to stipulate that "(List *) NULL"
is still a valid representation of an empty list.  Then it still works
to initialize a List* variable with

        List    *mylist = NIL;

which immediately gets rid of half of the notational problem with
FastList (messy initialization).  The first lcons() or lappend()
operation would have to palloc the List header as well as the first
ListCell, but that's no problem.

I think actually you'd want to specify that "(List *) NULL" is the
*only* valid representation of an empty list, because otherwise you
have to insert some exceedingly ugly special-case code in equal() to
get it to treat a NULL pointer as equal to a pointer to a List header
that has zero llength.  But this is not so bad, because it means that the
special-case code for "(List *) NULL" replaces the special cases that
would otherwise deal with lhead or ltail being NULL, instead of being an
additional special case.

Hmm ... taking that one step further, you could probably avoid the need
for two palloc's in the first lcons/lappend if the List header were
combined with the first ListCell.  This would make for some grottiness
in ldelete when removing the first list entry, but that's probably
a good tradeoff.

The above has a number of notational advantages over what we have now
--- for example, we can eliminate the separate routines for equali()
and equalo(), because the list header's NodeTag will tell which kind of
elements the list holds, and so the general equal() routine can be made
to handle all cases correctly.  Also, the contents of ListCell are
reduced from 3 words to 2 (we don't bother storing a NodeTag in each
individual cell) which makes for a nice space savings on big lists.
(I think to exploit this, aset.c would have to be tweaked to make the
minimum allocation chunk 8 bytes rather than 16, but that's easy.)

Most of this change could be hidden within the List-related code.  The
only serious objection I see is that the iterator variable needed for
traversing a List now needs to be of type ListCell* rather than List*.
That would require touching a lot of places, though it's a fairly simple
change.  All the other List operations could keep their existing API.

I'm starting to like this ... I'd love to revert those FastList changes.

            regards, tom lane

Re: equal() perf tweak

From

Neil Conway

Date:

05 November 2003, 05:25:32

Ok, I've attached new versions of list.c and pg_list.h -- this is just a
*rough sketch* of the new List code -- it definitely won't compile, the
intent is just to concretely specify the new List design. Also, I've
only updated the important list functions: I stopped at nth().

(I've attached the whole files, rather than diffs, because I thought
that would be easier to read...)

Any comments would be welcome -- I'm sure I've mucked a few things up.
Also, since it doesn't compile and I haven't done any testing, there are
probably some thinkos & typos in the code.

On Tue, 2003-11-04 at 11:40, Tom Lane wrote:
> Hmm ... taking that one step further, you could probably avoid the need
> for two palloc's in the first lcons/lappend if the List header were
> combined with the first ListCell.  This would make for some grottiness
> in ldelete when removing the first list entry, but that's probably
> a good tradeoff.

I haven't implemented this (yet), or the aset.c changes you suggested.

-Neil

Attachment

Re: equal() perf tweak

From

Tom Lane

Date:

05 November 2003, 13:08:47

Neil Conway <neilc@samurai.com> writes:
> #define length(l)            ((l)->length)
> #define value(cell)            ((cell)->elem.ptr_value)
> #define ivalue(cell)        ((cell)->elem.int_value)
> #define ovalue(cell)        ((cell)->elem.oid_value)

Couple of objections here: one is that defining such common words as
"length" or "value" as macros is going to break everything in sight
(starting with the List struct itself ;-)).  Also, if we are going
to use NIL to represent an empty list, length() is really going to
need to be like
    (l) ? l->length : 0
Coupling that objection with the fear of multiple evaluations of
arguments (not sure doing so would break anything, but definitely not
sure that it wouldn't), I think that length() will have to continue
to be a function not a macro.  Possibly for the gcc case we could make
it a "static inline" function.  As for value() and friends, use a less
generic name like cellvalue() ... and please respect the naming
convention stated just five lines earlier.

> #define lfirst(l)             value((l)->head)
> #define lfirsti(l)            ivalue((l)->head)
> #define lfirsto(l)            ovalue((l)->head)

No, lfirst and friends will need to apply to ListCells not to Lists,
else the coding of foreach loops will break everywhere.  I think there
are many more places that will want lfirst to apply to a ListCell than
will want it to apply to a raw List.

You will probably want to invent a function (not macro for fear of
multiple evals) like

    ListCell *firstcell(l) {
        return l ? l->head : NULL;
    }

and use this (not a direct reference to l->head) in foreach().  Then
the places that do want a true lfirst() on a list will need to be
rewritten as lfirst(firstcell(list)).

Not sure what to do about the lsecond/lthird/lfourth macros.  Those are
not used much.  Maybe we could rename them to ListSecond etc and then
define ListFirst as lfirst(firstcell(list)) for the places that do want
it to work that way.

> #define foreach(_cell_,_list_) \
>     for (_cell_ = (_list_)->head; _elt_->next != NULL; _elt_ = _elt->next)

The test condition still needs to be _elt_ != NULL.

>  *      XXX a few of the following functions are duplicated to handle
>  *          List of pointers and List of integers separately. Some day,
>  *          someone should unify them.            - ay 11/2/94

How much of this NOTE can go away now?

>     List *retval;

>     retval = palloc(sizeof(*retval));

Can't we use makeNode()?

> /*
>  * Add a new, and as yet undefined, head element to the list, which
>  * must be non-NIL. The list is mutated in place. The caller should
>  * fill in the value of the new head element.
>  */
> static List *
> new_head_elem(List *list)

I don't think you want to do things this way; it won't scale up to the
merge-the-first-element-into-the-List-header approach.  What would
probably be better is to have a combined
build-the-List-header-and-first-cell subroutine.  Also, don't put in
cases to handle initially-empty lists, because there won't be any.

check_invariant() could reasonably also test that list->tail->next is
NULL.

> nconc(List *l1, List *l2)
> {
>     /*
>      * XXX: memory allocation is ambiguous here: does the caller free
>      * the memory for l1 or l2?
>      */

At present, the caller doesn't pfree anything, since the result list
needs every cell in both inputs.  I think what we will want nconc to do
is pfree the second List header; in the event that the second List's
first element is palloc'd with the header, it will have to be
re-allocated (or maybe better, just re-use the storage in place).
You'll need to look at all the uses of nconc (fortunately there are not
many) to verify that the second List is never again used as an
independent entity.  If it is, then we can't use nconc anyway unless we
first copy the second List, as later mods to either list could destroy
the invariant for the other list.

            regards, tom lane

Re: equal() perf tweak

From

Tom Lane

Date:

05 November 2003, 14:23:09

I said:
> No, lfirst and friends will need to apply to ListCells not to Lists,
> else the coding of foreach loops will break everywhere.  I think there
> are many more places that will want lfirst to apply to a ListCell than
> will want it to apply to a raw List.

On the other hand, we will need to touch every foreach loop anyway to
update the datatype of the iteration variable.  I can see merit in the
idea of *deliberately* breaking any un-updated code by means of removing
the lfirst and lnext macros completely.  If we go that path, we could
use names like

    nextcell(cl)        replaces lnext()
    cellvalue(cl)        replaces lfirst()
    cellvaluei(cl)        replaces lfirsti()
    cellvalueo(cl)        replaces lfirsto()

plus invent names like ListFirst(), ListSecond(), etc for the cases
where you really are fetching the n'th element of a List rather than
the contents of the current ListCell.

I'm not wedded to these particular naming suggestions, just thinking
that maybe backwards-compatibility of the macro names isn't such a hot
idea after all.  What do you think?

            regards, tom lane

Re: equal() perf tweak

From

Gaetano Mendola

Date:

05 November 2003, 20:27:02

Neil Conway wrote:
> Ok, I've attached new versions of list.c and pg_list.h -- this is just a
> *rough sketch* of the new List code -- it definitely won't compile, the
> intent is just to concretely specify the new List design. Also, I've
> only updated the important list functions: I stopped at nth().
>
> (I've attached the whole files, rather than diffs, because I thought
> that would be easier to read...)
>
> Any comments would be welcome -- I'm sure I've mucked a few things up.
> Also, since it doesn't compile and I haven't done any testing, there are
> probably some thinkos & typos in the code.
>
> On Tue, 2003-11-04 at 11:40, Tom Lane wrote:
>
>>Hmm ... taking that one step further, you could probably avoid the need
>>for two palloc's in the first lcons/lappend if the List header were
>>combined with the first ListCell.  This would make for some grottiness
>>in ldelete when removing the first list entry, but that's probably
>>a good tradeoff.
>
>
> I haven't implemented this (yet), or the aset.c changes you suggested.

Why instead of reinvent the whell not use, or at least do a "C" port of
stl::list ? At my knowledge is the best/optimized list implementation.

And another my wish is too see Postgres compilable with a c++ compiler
so a day we can use ( in a absolutely portable way):  vector, list, map
already implemented, considering also the fact that all stl container
are "memory allocator" customizable, for example you can use a map using
an allocator in shared memory.

Regards
Gaeatano Mendola

PS: My 2 cents: I don't like too much have the lenght inside the list
struct.

Re: equal() perf tweak

From

Neil Conway

Date:

05 November 2003, 20:45:09

Gaetano Mendola <mendola@bigfoot.com> writes:
> Why instead of reinvent the whell not use, or at least do a "C" port of
> stl::list ?

Because (a) implementing a linked list is pretty trivial (b) the only
difficult part is getting the semantics / API right. I don't see how
std::list would help with (b), and (a) negates the benefit of
importing the code from elsewhere.

We'd also have to gut std::list, since we wouldn't be able to make use
of C++ templates.

That said, if you know of any specific techniques from std::list
implementations that would be useful, please let me know.

> PS: My 2 cents: I don't like too much have the lenght inside the list
> struct.

Why not?

-Neil

Re: equal() perf tweak

From

Gaetano Mendola

Date:

05 November 2003, 23:30:39

Neil Conway wrote:

> Gaetano Mendola <mendola@bigfoot.com> writes:
>
>>Why instead of reinvent the whell not use, or at least do a "C" port of
>>stl::list ?
>
>
> Because (a) implementing a linked list is pretty trivial (b) the only
> difficult part is getting the semantics / API right. I don't see how
> std::list would help with (b), and (a) negates the benefit of
> importing the code from elsewhere.
>
> We'd also have to gut std::list, since we wouldn't be able to make use
> of C++ templates.
>
> That said, if you know of any specific techniques from std::list
> implementations that would be useful, please let me know.

An interesting think that stl::list do is to never do
an "if" branch during an insert/remove phase, an stl::list is a struct
with a Master Node:

struct node
{
     void* value;
     node* next;
     node* prev;
}


struct list
{
    node* master_node;
};

here respect your proposal a node is saved.


an empty list is initialized with:

master_node = [ ... create a node ... ]
master_node->next = master_node;
master_node->prev = master_node;


and the insertion phase sound like:

void
AddHead(list *l, node *e)
{
    node *where = l->master_node->next;
    e->next = where;
    e->prev = where->prev;

    where->prev->next = e;
    where->prev = e;
}

void
AddTail(list *l, node *e)
{
    node *where = l->master_node;
    e->next = where;
    e->prev = where->prev;

    where->prev->next = e;
    where->prev = e;
}


now is very late here ( 04:17 AM ) so I wrote
the wrong code ( not checked ) but show the idea of
avoid "if".


>>PS: My 2 cents: I don't like too much have the lenght inside the list
>>struct.
>
>
> Why not?

What if you will never call the size() on a list doing massive
insert ? May be we need two list, depending on the way we are going
to use it?

I'm too much C++ oriented and another very weak argument is: for the
same reason that STL (at least in gcc) doesn't have it:
http://gcc.gnu.org/ml/gcc-bugs/1998-12/msg00270.html

may be the third point not apply to us.



Regards
Gaetano Mendola

Re: equal() perf tweak

From

Neil Conway

Date:

05 November 2003, 23:58:39

Gaetano Mendola <mendola@bigfoot.com> writes:
> An interesting think that stl::list do is to never do
> an "if" branch during an insert/remove phase

Why is this significant? Surely you're not claiming that avoid the
branch will affect performance in a meaningful way...

> What if you will never call the size() on a list doing massive
> insert?

Then you do a few extra integer increments. Compared to, say, a
palloc() for each new list node, I'm skeptical those increments will
significantly affect performance.

> May be we need two list, depending on the way we are going to use
> it?

You mean one in which we keep track of the length, and another in
which we don't? That seems like a lot of pain for a very little gain.

> I'm too much C++ oriented and another very weak argument is: for the
> same reason that STL (at least in gcc) doesn't have it:
> http://gcc.gnu.org/ml/gcc-bugs/1998-12/msg00270.html

(1) Not relevant to us

(2) An additional word per List is hardly a prohibitive amount of
    overhead, as we don't create an obscene number of Lists (saving a
    word per ListCell would be a different matter).

(3) STL needs to be a general-purpose library, while we only need to
    worry about what the requirements of the backend: passing around
    and manually incrementing list lengths would be horribly painful.

My intuition/judgement is that we call length() often enough in the
backend that it is worth keeping track of the list length:

$ grep -rI ' length(' src/backend/* | grep -v Assert | \
grep -v '\.sh' | grep -v '/\*' | wc -l
139

But I'm not adamant about it. That said, I'm unconvinced by the
arguments for removing the list length I've heard so far.

BTW, thanks for the feedback, Tom and Gaetano.

-Neil

Re: equal() perf tweak

From

Tom Lane

Date:

05 November 2003, 23:59:30

Gaetano Mendola <mendola@bigfoot.com> writes:
> [ use list with master node and circular linking ]

> now is very late here ( 04:17 AM ) so I wrote
> the wrong code ( not checked ) but show the idea of
> avoid "if".

This saves an "if" during addition of a list item, at the cost of
greater expense during list traversal.  (Instead of a loop termination
test like "ptr != NULL", you need something like "ptr != master_node".
This may not take an extra cycle, but only if you can afford to dedicate
a register to holding the value of master_node within the loop.  That
hurts, especially on architectures with not very many registers.)

Offhand I do not see a win here.  Node addition implies a palloc which
means you are going to be jumping control all over the place anyway;
there's no way that eliminating one "if" in that path of control is
going to be a meaningful improvement.  Saving a cycle or two during list
traversal could be a meaningful improvement, though.

By the same token, keeping a length count in the master node might be a
waste of cycles, but it's going to be a pretty tiny waste compared to
the other overhead of adding or removing a node.  The potential savings
from making length() be a constant-time operation seem like a win to me.
Anyway, we should build the code that way and then do profiling with and
without support for constant-cost length().

            regards, tom lane

Re: equal() perf tweak

From

Tom Lane

Date:

06 November 2003, 00:17:37

Neil Conway <neilc@samurai.com> writes:
> (2) An additional word per List is hardly a prohibitive amount of
>     overhead, as we don't create an obscene number of Lists (saving a
>     word per ListCell would be a different matter).

Actually, it's free, considering that the alternatives are

    int NodeTag;
    foo *head;
    foo *tail;
    int length;   -- or not.

On a 32-bit-pointer machine, you got your choice of 12 or 16 bytes ---
but palloc will round 12 to 16 anyway.  On a 64-bit-pointer machine,
you got your choice of 20 or 24 bytes ... still no win, even if palloc
weren't going to round it to 32, because the MAXALIGN quantum is almost
certainly 8 on such a machine.

This does suggest that it might be worth making the struct layout be

    int NodeTag;
    int length;
    foo *head;
    foo *tail;

since this would avoid some padding overhead on a machine where pointers
are 8 bytes and need 8-byte alignment.  It probably doesn't help given
the current implementation of palloc, but why throw away padding space?

> My intuition/judgement is that we call length() often enough in the
> backend that it is worth keeping track of the list length:

Its use in equal() and potential use for existing equali() and equalo()
calls seems like alone sufficient reason to do so.  But it should be
relatively painless to prove this one way or the other by building the
code to keep track and then ripping out that code and putting back the
O(N) length() implementation.  Benchmarking the two would settle the
matter, or at least give us real data to argue about...

            regards, tom

Re: equal() perf tweak

From

Neil Conway

Date:

06 November 2003, 01:11:53

Tom Lane <tgl@sss.pgh.pa.us> writes:
> This does suggest that it might be worth making the struct layout be
>
>     int NodeTag;
>     int length;
>     foo *head;
>     foo *tail;
>
> since this would avoid some padding overhead on a machine where pointers
> are 8 bytes and need 8-byte alignment.  It probably doesn't help given
> the current implementation of palloc, but why throw away padding
> space?

Interesting. I've heard in some shops it is standard policy to order
the fields in all structs by their descending sizes (making allowances
for platform-specific variations), so as to reduce padding. Do you
think it would be worthwhile to systematically make this kind of
reorganization throughout the backend?

(Of course, we'd need to be weary of code that depends on order of the
fields in structs, naturally -- such as the "NodeTag must be the first
field" rule.)

-Neil

Re: equal() perf tweak

From

Gaetano Mendola

Date:

06 November 2003, 07:17:56

Tom Lane wrote:

> Gaetano Mendola <mendola@bigfoot.com> writes:
>
>>[ use list with master node and circular linking ]
>
>
>>now is very late here ( 04:17 AM ) so I wrote
>>the wrong code ( not checked ) but show the idea of
>>avoid "if".
>
>
> This saves an "if" during addition of a list item, at the cost of
> greater expense during list traversal.  (Instead of a loop termination
> test like "ptr != NULL", you need something like "ptr != master_node".
> This may not take an extra cycle, but only if you can afford to dedicate
> a register to holding the value of master_node within the loop.  That
> hurts, especially on architectures with not very many registers.)


Well, if the same architecture have not so many register I guess don't
have a double prefetch instruction queue and an "if" branch
will hurt/destroy that queue, am I wrong ?

I'm going to post on comp.lang.c++.moderated and
comp.std.c++ in order to have better argumentations



> Anyway, we should build the code that way and then do profiling with and
> without support for constant-cost length().

I totally agree with you.


Regards
Gaetano Mendola

Re: equal() perf tweak

From

Gaetano Mendola

Date:

06 November 2003, 07:18:03

Tom Lane wrote:

> Gaetano Mendola <mendola@bigfoot.com> writes:
>
>>[ use list with master node and circular linking ]
>
>
>>now is very late here ( 04:17 AM ) so I wrote
>>the wrong code ( not checked ) but show the idea of
>>avoid "if".
>
>
> This saves an "if" during addition of a list item, at the cost of
> greater expense during list traversal.  (Instead of a loop termination
> test like "ptr != NULL", you need something like "ptr != master_node".
> This may not take an extra cycle, but only if you can afford to dedicate
> a register to holding the value of master_node within the loop.  That
> hurts, especially on architectures with not very many registers.)


Well, if the same architecture have not so many register I guess don't
have a double prefetch instruction queue and an "if" branch
will hurt/destroy that queue, am I wrong ?

I'm going to post on comp.lang.c++.moderated and
comp.std.c++ in order to have better argumentations



> Anyway, we should build the code that way and then do profiling with and
> without support for constant-cost length().

I totally agree with you.


Regards
Gaetano Mendola

Re: equal() perf tweak

From

Tom Lane

Date:

08 November 2003, 16:24:15

Neil Conway <neilc@samurai.com> writes:
> Interesting. I've heard in some shops it is standard policy to order
> the fields in all structs by their descending sizes (making allowances
> for platform-specific variations), so as to reduce padding. Do you
> think it would be worthwhile to systematically make this kind of
> reorganization throughout the backend?

Not really.  I'll go with ordering fields in a logical fashion (related
fields together) every time.  But when there's no particular semantic
reason to choose one ordering over another, you might as well look at
padding concerns for a tiebreaker.  In this case there isn't any
particular logical reason AFAICS to prefer whether length appears before
or after head/tail, so why not pick the more compact form?

(Note that I put a higher premium on avoiding padding in on-disk
structures.  But for transient in-memory structures, it definitely
seems like a secondary consideration.)

            regards, tom lane

Re: equal() perf tweak

From

Bruce Momjian

Date:

11 November 2003, 23:04:48

Is there a TODO here?

---------------------------------------------------------------------------

Neil Conway wrote:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
> > This does suggest that it might be worth making the struct layout be
> >
> >     int NodeTag;
> >     int length;
> >     foo *head;
> >     foo *tail;
> >
> > since this would avoid some padding overhead on a machine where pointers
> > are 8 bytes and need 8-byte alignment.  It probably doesn't help given
> > the current implementation of palloc, but why throw away padding
> > space?
>
> Interesting. I've heard in some shops it is standard policy to order
> the fields in all structs by their descending sizes (making allowances
> for platform-specific variations), so as to reduce padding. Do you
> think it would be worthwhile to systematically make this kind of
> reorganization throughout the backend?
>
> (Of course, we'd need to be weary of code that depends on order of the
> fields in structs, naturally -- such as the "NodeTag must be the first
> field" rule.)
>
> -Neil
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: don't forget to increase your free space map settings
>

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: equal() perf tweak

From

Neil Conway

Date:

11 November 2003, 23:43:33

Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Is there a TODO here?

I've been short on time recently, so I haven't had a chance to make
the suggested changes to the List patch. I'll send in an updated
version of the List changes, and if there aren't any gripes, I'll
begin making the changes to the List API usage throughout the
tree.

So there's no need for a TODO item -- I should hopefully have this
wrapped up fairly soon.

-Neil

Re: equal() perf tweak

From

Gaetano Mendola

Date:

12 November 2003, 20:14:59

Bruce Momjian wrote:

> Is there a TODO here?
>

I'm just evaluating the performances
of two version the one proposed by
Neil and the one in a "stl::list" fashion.

Tom suggested also to see if is the
case to implement the function size()
constant time or not.


Regards
Gaetano Mendola.