Thread: Storing a dynahash for an entire connection or transaction?

Storing a dynahash for an entire connection or transaction?

From

Greg Mitchell

Date:

27 November 2006, 12:22:39

I've now got two custom datatypes that map from an int2 value on disk to 
a string by way of a table for each. Currently, I load these tables into 
a dynahash per function call (fcinfo->flinfo->fn_extra). This is working
great in most situations. The problem situation is where there are many
queries (often INSERTS) that need to happen in a short amount of time.
This causes reloads of the dynahash (or hashes) for each query, making
them orders of magnitude slower than when these columns were varchar.

What I'd like to do is change the way I store this hash so that it 
doesn't need to be as frequently updated. I'm open to most any solution. 
One type rarely ever has new values (maybe once every several months) 
and the other gets new values once a night.

Thoughts?

Greg



Re: Storing a dynahash for an entire connection or transaction?

From

Volkan YAZICI

Date:

27 November 2006, 13:27:18

Hi,

On Nov 27 10:57, Greg Mitchell wrote:
> I've now got two custom datatypes that map from an int2 value on disk to 
> a string by way of a table for each. Currently, I load these tables into 
> a dynahash per function call (fcinfo->flinfo->fn_extra). This is working
> great in most situations. The problem situation is where there are many
> queries (often INSERTS) that need to happen in a short amount of time.
> This causes reloads of the dynahash (or hashes) for each query, making
> them orders of magnitude slower than when these columns were varchar.
> 
> What I'd like to do is change the way I store this hash so that it 
> doesn't need to be as frequently updated. I'm open to most any solution. 
> One type rarely ever has new values (maybe once every several months) 
> and the other gets new values once a night.

You may want to cache these values in a static variable (which stays
valid for the whole session) whose contents are allocated in, for
instance, TopTransactionContext.


Regards.

Re: Storing a dynahash for an entire connection or transaction?

From

Tom Dunstan

Date:

27 November 2006, 16:12:47

Volkan YAZICI wrote:
>> What I'd like to do is change the way I store this hash so that it 
>> doesn't need to be as frequently updated. I'm open to most any solution. 
>> One type rarely ever has new values (maybe once every several months) 
>> and the other gets new values once a night.
> 
> You may want to cache these values in a static variable (which stays
> valid for the whole session) whose contents are allocated in, for
> instance, TopTransactionContext.

That's the obvious solution (or perhaps in CurTransactionContext), but 
when the function is called in a subsequent transaction, how does it 
determine that the static pointer was allocated from a context which has 
since vanished? I suppose that you could store the memory context 
pointer for later comparison, but while it seems unlikely that you'd get 
the same pointer twice in a row, that's not exactly a guarantee.

I note (reading src/backend/utils/mmgr/README) that there are reset and 
delete methods attached to memory contexts, so perhaps the best way to 
accomplish this would be to create a separate memory context as a child 
of CurTransactionContext, and register a cleanup function which could 
clear the static var when the context is torn down. I'm assuming that 
those methods do get called in such cases, but I haven't delved much.
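
A minimal standalone sketch of the idea above, not PostgreSQL code: the
"context" here is a hypothetical toy struct (ToyContext) with a single
teardown callback, standing in for whatever hook the real memory context
machinery offers. All names are made up for illustration. The point is only
that the callback clears the static pointer, so the next call can tell that
the cache is gone without comparing stale addresses.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a memory context with one teardown callback. */
typedef struct ToyContext
{
    void (*on_delete)(void *arg);   /* runs just before the context is freed */
    void  *arg;
} ToyContext;

static ToyContext *toy_context_create(void (*on_delete)(void *), void *arg)
{
    ToyContext *cxt = malloc(sizeof(ToyContext));

    if (cxt == NULL)
        abort();
    cxt->on_delete = on_delete;
    cxt->arg = arg;
    return cxt;
}

static void toy_context_delete(ToyContext *cxt)
{
    if (cxt->on_delete != NULL)
        cxt->on_delete(cxt->arg);
    free(cxt);
}

/* The static pointer that must never outlive the context it points into. */
static ToyContext *cache_cxt = NULL;

/* Cleanup callback: forget the cache when its context is torn down. */
static void forget_cache(void *arg)
{
    ToyContext **slot = arg;

    *slot = NULL;
}

/* Returns 1 if the cache had to be (re)built, 0 if it was still valid. */
static int ensure_cache(void)
{
    if (cache_cxt != NULL)
        return 0;
    cache_cxt = toy_context_create(forget_cache, &cache_cxt);
    return 1;
}

/* Stand-in for end of transaction, where the child context is destroyed. */
static void end_transaction(void)
{
    if (cache_cxt != NULL)
        toy_context_delete(cache_cxt);
}
```

Calling ensure_cache() twice inside one "transaction" builds the cache once;
after end_transaction() the static pointer is NULL again and the next call
rebuilds it.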

Cheers

Tom

Re: Storing a dynahash for an entire connection or

From

Neil Conway

Date:

27 November 2006, 16:28:38

On Mon, 2006-11-27 at 20:11 +0000, Tom Dunstan wrote:
> That's the obvious solution (or perhaps in CurTransactionContext), but 
> when the function is called in a subsequent transaction, how does it 
> determine that the static pointer was allocated from a context which has 
> since vanished?

If you're content with your allocations never being automatically
released for the duration of the session (which sounds like the behavior
Greg would like, I'm guessing), you can just allocate the hash table in
TopMemoryContext, in which case you wouldn't need to worry about the
context of allocation vanishing beneath your feet.

A nicer technique is to create a new child context of TopMemoryContext,
and use that context for all the session-duration allocations made by
your extension. This avoids making too many allocations in
TopMemoryContext, lets you get information on the allocations made by
your UDF via MemoryContextStats(), and allows you to easily release the
UDF's allocations by deleting or resetting a single memory context. For
example, deleting your UDF's context in _PG_fini() cleanly avoids
leaking memory when your shared object is unloaded from the backend.
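
As a rough standalone model of the session-length cache (plain C, not backend
code): the static pointer is filled on first use and then reused by every
later call. Here malloc() stands in for allocating from a long-lived context
such as a child of TopMemoryContext, a linear scan stands in for the dynahash
lookup, and the sample rows, code_to_label() and load_count are all
hypothetical names invented for the sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct MapEntry
{
    int16_t     code;       /* the int2 value stored on disk */
    const char *label;      /* the string it maps to */
} MapEntry;

typedef struct SessionCache
{
    size_t    nentries;
    MapEntry *entries;
} SessionCache;

static SessionCache *session_cache = NULL;  /* lives until the process exits */
static int           load_count = 0;        /* how often the table was loaded */

/* Stand-in for reading the mapping table; the rows are made-up sample data. */
static SessionCache *load_mapping(void)
{
    static const MapEntry sample_rows[] = {
        {1, "alpha"}, {2, "beta"}, {3, "gamma"}
    };
    SessionCache *cache = malloc(sizeof(SessionCache));

    if (cache == NULL)
        abort();
    cache->nentries = sizeof(sample_rows) / sizeof(sample_rows[0]);
    cache->entries = malloc(sizeof(sample_rows));
    if (cache->entries == NULL)
        abort();
    memcpy(cache->entries, sample_rows, sizeof(sample_rows));
    load_count++;
    return cache;
}

/* Look up a code, loading the mapping only on the first call of the session. */
static const char *code_to_label(int16_t code)
{
    if (session_cache == NULL)
        session_cache = load_mapping();

    for (size_t i = 0; i < session_cache->nentries; i++)
    {
        if (session_cache->entries[i].code == code)
            return session_cache->entries[i].label;
    }
    return NULL;            /* unknown code */
}
```

However many lookups follow, load_mapping() runs once, which is the property
that removes the per-query reload cost described at the top of the thread.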

BTW, one common error when using long-lived memory contexts is assuming
that allocations made in these contexts will be released after an
elog(ERROR). This is not true when the memory context's lifetime exceeds
that of a single transaction (as is the case with TopMemoryContext).

-Neil

Re: Storing a dynahash for an entire connection or

From

Andrew Dunstan

Date:

27 November 2006, 18:05:09

Neil Conway wrote:
> On Mon, 2006-11-27 at 20:11 +0000, Tom Dunstan wrote:
>   
>> That's the obvious solution (or perhaps in CurTransactionContext), but 
>> when the function is called in a subsequent transaction, how does it 
>> determine that the static pointer was allocated from a context which has 
>> since vanished?
>>     
>
> If you're content with your allocations never being automatically
> released for the duration of the session (which sounds like the behavior
> Greg would like, I'm guessing), you can just allocate the hash table in
> TopMemoryContext, in which case you wouldn't need to worry about the
> context of allocation vanishing beneath your feet.
>
>   

Maybe I have misunderstood, but I don't see in this case how to 
determine that the cached data is still valid.

cheers

andrew

Re: Storing a dynahash for an entire connection or

From

Neil Conway

Date:

27 November 2006, 18:45:17

On Mon, 2006-11-27 at 17:04 -0500, Andrew Dunstan wrote:
> Maybe I have misunderstood, but I don't see in this case how to 
> determine that the cached data is still valid.

Well, I was saying that if you want to cache something for the duration
of the current session, checking for the validity of the context of
allocation is moot, since you can just use a long-lived context.

If you want to cache stuff for the duration of a transaction, one
technique would be to maintain the cache in a child of TopMemoryContext,
stamp the cache data with the XID that created it, and then use the XID
to decide when to invalidate cached data.
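
A small standalone sketch of the XID-stamping technique, under stated
assumptions: ToyXid is a hypothetical stand-in for a transaction ID, the
caller passes the current transaction's ID in explicitly (in a backend it
would be obtained from the transaction system), and the label table is
made-up sample data. The cache storage itself is long-lived; only its
contents are considered stale once the stamp no longer matches.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t ToyXid;        /* hypothetical stand-in for a transaction ID */
#define TOY_INVALID_XID 0
#define TOY_NCODES      4

typedef struct XidCache
{
    ToyXid      loaded_xid;             /* transaction that filled the cache */
    int         reloads;                /* how many times it was (re)filled */
    const char *labels[TOY_NCODES];     /* index = int2 code */
} XidCache;

static XidCache xid_cache = { TOY_INVALID_XID, 0, { NULL } };

/* Stand-in for re-reading the mapping table; the rows are sample data. */
static void reload_cache(XidCache *cache, ToyXid xid)
{
    cache->labels[0] = NULL;
    cache->labels[1] = "alpha";
    cache->labels[2] = "beta";
    cache->labels[3] = "gamma";
    cache->loaded_xid = xid;            /* stamp with the loading transaction */
    cache->reloads++;
}

/* Look up a code, refilling the cache when the transaction has changed. */
static const char *lookup_in_xact(ToyXid current_xid, int code)
{
    if (xid_cache.loaded_xid != current_xid)
        reload_cache(&xid_cache, current_xid);

    if (code < 0 || code >= TOY_NCODES)
        return NULL;
    return xid_cache.labels[code];
}
```

Two lookups under the same ID share one load; the first lookup under a new ID
triggers a reload. As noted in the next paragraph, this does nothing about
changes made to the mapping table within the same transaction.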

Note that regardless of the memory context that is used, implementing
correct transactional behavior is non-trivial: the int2 -> text mapping
will still be incorrect in the face of changes to the mapping table by
your own transaction (or another committed txn, in the case of read
committed). From Greg's description I guessed that a session-length
cache was what he needed anyway...

-Neil

Re: Storing a dynahash for an entire connection or

From

Greg Mitchell

Date:

05 December 2006, 13:41:38

Ok, I implemented the cache in the TopMemoryContext and then stored the 
pointer in a file-level static pointer in the type's .c file.

Now I'm wondering whether there is a way to effectively do a LISTEN on a 
given event, so that when the table that represents the map is updated, 
a trigger could call NOTIFY xyz. Upon receiving this event, each 
connection should refresh its cache.
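
How the event would actually reach each backend is left open here. The
receiving side, though, can be sketched independently of the delivery
mechanism: the handler only marks the cache stale, and the reload happens
lazily on the next lookup. This is a standalone model with hypothetical names
(on_map_changed, use_cache), not code from the extension.

```c
#include <stdbool.h>

static bool map_cache_stale = true;     /* nothing loaded yet */
static int  map_cache_loads = 0;        /* how many times the map was loaded */

/* Called when a "map table changed" event arrives, however it is delivered. */
static void on_map_changed(void)
{
    map_cache_stale = true;             /* cheap: defer the real work */
}

/* Stand-in for rebuilding the hash from the mapping table. */
static void rebuild_cache(void)
{
    map_cache_loads++;
    map_cache_stale = false;
}

/* Every lookup goes through here; returns the number of loads so far. */
static int use_cache(void)
{
    if (map_cache_stale)
        rebuild_cache();
    return map_cache_loads;
}
```

Deferring the rebuild keeps the handler trivial and means several events
arriving between two lookups cost only one reload.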

Ideas?

Thanks,
Greg
