Re: Memory prefetching while sequentially fetching from SortTuple array, tuplestore - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Memory prefetching while sequentially fetching from SortTuple array, tuplestore
Date
Msg-id 20150902221313.GA8555@awork2.anarazel.de
Whole thread Raw
In response to Re: Memory prefetching while sequentially fetching from SortTuple array, tuplestore  (Peter Geoghegan <pg@heroku.com>)
Responses Re: Memory prefetching while sequentially fetching from SortTuple array, tuplestore  (Peter Geoghegan <pg@heroku.com>)
List pgsql-hackers
On 2015-09-02 12:24:33 -0700, Peter Geoghegan wrote:
> "Read fetch". One argument past to the intrinsic here specifies that
> the variable will be read only. I did things this way because I
> imagined that there would be very limited uses for the macro only. I
> probably cover almost all interesting cases for explicit memory
> prefetching already.

I'd be less brief in this case then, no need to be super short here.

> > I don't think you should specify the read/write and locality parameters
> > if we don't hand-pick them - right now you're saying the memory will
> > only be read and that it has no temporal locality.
> >
> > I think there should be a empty fallback definition even if the current
> > only caller ends up not needing it - not all callers will require it.
> 
> It started out that way, but Tom felt that it was better to have a
> USE_MEM_PREFETCH because of the branch below...

That doesn't mean we shouldn't still provide an empty definition.

> > Hm. Why do we need a branch here? The target of prefetches don't need to
> > be valid addresses and adding a bunch of arithmetic and a branch for the
> > corner case doesn't sound worthwhile to me.
> 
> ...which is needed, because I'm still required to not dereference wild
> pointers. > In other words, although pg_rfetch()/__builtin_prefetch()
> does not require that a valid pointer be passed, it is not okay to
> read past an array's bounds to read that pointer. The GCC docs are
> clear on this -- "Data prefetch does not generate faults if addr is
> invalid, but the address expression itself must be valid".

That's just a question of how to formulate this though?

pg_rfetch(((char *) state->memtuples ) + 3 * sizeof(SortTuple) + offsetof(SortTuple, tuple))?

For something heavily platform dependent like this that seems ok.

> Because that was the fastest value following testing on my laptop. You
> are absolutely right to point out that this isn't a good reason to go
> with the patch -- I share your concern. All I can say in defense of
> that is that other major system software does the same, without any
> regard for the underlying microarchitecture AFAICT.

I know linux stripped out most prefetches at some point, even from x86
specific code, because it showed that they aged very badly. I.e. they
removed a bunch of them and stuff got faster, whereas they were
beneficial on earlier architectures.

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Jeff Janes
Date:
Subject: Re: Allow a per-tablespace effective_io_concurrency setting
Next
From: Andres Freund
Date:
Subject: Re: Improving replay of XLOG_BTREE_VACUUM records