Thread: Storing a dynahash for an entire connection or transaction?
I've now got two custom datatypes that map from an int2 value on disk to a string by way of a table for each. Currently, I load these tables into a dynahash per function call (fcinfo->flinfo->fn_extra). This works great in most situations. The problem case is when many queries (often INSERTs) need to happen in a short amount of time. Each query then reloads the dynahash (or hashes), making the queries orders of magnitude slower than when these columns were varchar.

What I'd like to do is change the way I store this hash so that it doesn't need to be reloaded as frequently. I'm open to almost any solution. One type rarely gets new values (maybe once every several months) and the other gets new values once a night.

Thoughts?

Greg
Hi,

On Nov 27 10:57, Greg Mitchell wrote:
> I've now got two custom datatypes that map from an int2 value on disk to
> a string by way of a table for each. Currently, I load these tables into
> a dynahash per function call (fcinfo->flinfo->fn_extra). This is working
> great in most situations. The problem situation is where there are many
> queries (often INSERTS) that need to happen in a short amount of time.
> This causes reloads of the dynahash (or hashes) for each query, making
> them orders of magnitude slower than when these columns were varchar.
>
> What I'd like to do is change the way I store this hash so that it
> doesn't need to be as frequently updated. I'm open to most any solution.
> One type rarely ever has new values (maybe once every several months)
> and the other gets new values once a night.

You may want to cache these values through a static variable (which stays valid for the whole session) whose contents are allocated in, for instance, TopTransactionContext.

Regards.
Volkan YAZICI wrote:
>> What I'd like to do is change the way I store this hash so that it
>> doesn't need to be as frequently updated. I'm open to most any solution.
>> One type rarely ever has new values (maybe once every several months)
>> and the other gets new values once a night.
>
> You may want to cache these values using a static variable (which will
> make itself to be valid per session) that stores its values in the
> (for instance) TopTransactionContext.

That's the obvious solution (or perhaps in CurTransactionContext), but when the function is called in a subsequent transaction, how does it determine that the static pointer was allocated from a context which has since vanished? I suppose that you could store the memory context pointer for later comparison, but while it seems unlikely that you'd get the same pointer twice in a row, that's not exactly a guarantee.

I note (reading src/backend/utils/mmgr/README) that there are reset and delete methods attached to memory contexts, so perhaps the best way to accomplish this would be to create a separate memory context as a child of CurTransactionContext, and register a cleanup function which could clear the static var when the context is torn down. I'm assuming that those methods do get called in such cases, but I haven't delved much.

Cheers

Tom
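A minimal standalone sketch of the "cleanup function clears the static var" idea from the message above. It deliberately does not use the PostgreSQL headers: `SimContext`, `sim_alloc`, `sim_reset`, `forget_cached_map` and `get_map` are hypothetical names that only model a context owning a single allocation and running a hook when it is reset. Whether and how a real backend exposes such a hook depends on the PostgreSQL version, so treat this as an illustration of the pattern rather than of the actual API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a transaction-lifetime memory context. */
typedef struct SimContext
{
    void  *block;                     /* the single allocation it owns */
    void (*on_reset)(void *arg);      /* cleanup hook, may be NULL */
    void  *on_reset_arg;
} SimContext;

void *sim_alloc(SimContext *cx, size_t size)
{
    assert(cx->block == NULL);        /* this model supports one block only */
    cx->block = calloc(1, size);
    if (cx->block == NULL)
        abort();
    return cx->block;
}

/* Free what the context owns, then run the registered cleanup hook. */
void sim_reset(SimContext *cx)
{
    free(cx->block);
    cx->block = NULL;
    if (cx->on_reset != NULL)
        cx->on_reset(cx->on_reset_arg);
}

static int *cached_map = NULL;        /* static pointer into context memory */
static int  build_count = 0;          /* how often the cache was rebuilt */

/* The hook: forget the pointer so it can never dangle. */
void forget_cached_map(void *arg)
{
    (void) arg;
    cached_map = NULL;
}

int *get_map(SimContext *txn_cx)
{
    if (cached_map == NULL)
    {
        cached_map = sim_alloc(txn_cx, 4 * sizeof(int));
        txn_cx->on_reset = forget_cached_map;
        txn_cx->on_reset_arg = NULL;
        build_count++;
    }
    return cached_map;
}
```

Because the hook runs as part of the reset, the caller never has to compare context pointers to find out whether the cache is stale: a NULL static pointer is the only signal it needs.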
On Mon, 2006-11-27 at 20:11 +0000, Tom Dunstan wrote:
> That's the obvious solution (or perhaps in CurTransactionContext), but
> when the function is called in a subsequent transaction, how does it
> determine that the static pointer was allocated from a context which has
> since vanished?

If you're content with your allocations never being automatically released for the duration of the session (which sounds like the behavior Greg would like, I'm guessing), you can just allocate the hash table in TopMemoryContext, in which case you wouldn't need to worry about the context of allocation vanishing beneath your feet.

A nicer technique is to create a new child context of TopMemoryContext, and use that context for all the session-duration allocations made by your extension. This avoids making too many allocations in TopMemoryContext, lets you get information on the allocations made by your UDF via MemoryContextStats(), and allows you to easily release the UDF's allocations by deleting or resetting a single memory context. For example, deleting your UDF's context in _PG_fini() cleanly avoids leaking memory when your shared object is unloaded from the backend.

BTW, one common error when using long-lived memory contexts is assuming that allocations made in these contexts will be released after an elog(ERROR). This is not true when the memory context's lifetime exceeds that of a single transaction (as is the case with TopMemoryContext).

-Neil
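A minimal standalone sketch of the dedicated session context described above. It does not include the PostgreSQL headers: `SimContext`, `sim_alloc`, `sim_reset`, `code_to_label` and `unload_extension` are hypothetical names, and the two-entry map stands in for the real lookup table. In an actual backend the context would presumably be a child of TopMemoryContext and the table a dynahash allocated inside it; the sketch only shows the shape of the pattern (load once per session, account for the memory in one place, release it in one call).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header that links every allocation to its owning context.
 * Payloads in this sketch need at most pointer alignment. */
typedef struct Chunk
{
    struct Chunk *next;
} Chunk;

/* Hypothetical model of a named context that tracks its allocations. */
typedef struct SimContext
{
    const char *name;
    Chunk      *chunks;        /* everything allocated in this context */
    size_t      total_bytes;   /* what a stats call would report */
} SimContext;

void *sim_alloc(SimContext *cx, size_t size)
{
    Chunk *c = malloc(sizeof(Chunk) + size);

    if (c == NULL)
        abort();
    c->next = cx->chunks;
    cx->chunks = c;
    cx->total_bytes += size;
    return c + 1;              /* payload starts right after the header */
}

/* Release every allocation made in the context with a single call. */
void sim_reset(SimContext *cx)
{
    while (cx->chunks != NULL)
    {
        Chunk *next = cx->chunks->next;

        free(cx->chunks);
        cx->chunks = next;
    }
    cx->total_bytes = 0;
}

typedef struct MapEntry
{
    short       code;          /* the int2 value stored on disk */
    const char *label;         /* the string it maps to */
} MapEntry;

static SimContext ext_context = { "extension session cache", NULL, 0 };
static MapEntry  *session_map = NULL;   /* survives across "queries" */
static int        load_count = 0;

const char *code_to_label(short code)
{
    size_t i;

    if (session_map == NULL)   /* first call in this session: load once */
    {
        session_map = sim_alloc(&ext_context, 2 * sizeof(MapEntry));
        session_map[0] = (MapEntry) { 1, "alpha" };
        session_map[1] = (MapEntry) { 2, "beta" };
        load_count++;
    }
    for (i = 0; i < 2; i++)
        if (session_map[i].code == code)
            return session_map[i].label;
    return NULL;
}

/* Models what an unload hook would do: drop the context, clear the pointer. */
void unload_extension(void)
{
    sim_reset(&ext_context);
    session_map = NULL;
}
```

Calling `code_to_label()` a thousand times loads the map once, which is the behavior the original per-query reload was missing. Note that nothing in this model frees the cache after an error, matching the caveat about long-lived contexts in the message above.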
Neil Conway wrote:
> On Mon, 2006-11-27 at 20:11 +0000, Tom Dunstan wrote:
>> That's the obvious solution (or perhaps in CurTransactionContext), but
>> when the function is called in a subsequent transaction, how does it
>> determine that the static pointer was allocated from a context which has
>> since vanished?
>
> If you're content with your allocations never being automatically
> released for the duration of the session (which sounds like the behavior
> Greg would like, I'm guessing), you can just allocate the hash table in
> TopMemoryContext, in which case you wouldn't need to worry about the
> context of allocation vanishing beneath your feet.

Maybe I have misunderstood, but I don't see in this case how to determine that the cached data is still valid.

cheers

andrew
On Mon, 2006-11-27 at 17:04 -0500, Andrew Dunstan wrote:
> Maybe I have misunderstood, but I don't see in this case how to
> determine that the cached data is still valid.

Well, I was saying that if you want to cache something for the duration of the current session, checking for the validity of the context of allocation is moot, since you can just use a long-lived context.

If you want to cache stuff for the duration of a transaction, one technique would be to maintain the cache in a child of TopMemoryContext, stamp the cache data with the XID that created it, and then use the XID to decide when to invalidate cached data.

Note that regardless of the memory context that is used, implementing correct transactional behavior is non-trivial: the int2 -> text mapping will still be incorrect in the face of changes to the mapping table by your own transaction (or another committed txn, in the case of read committed). From Greg's description I guessed that a session-length cache was what he needed anyway...

-Neil
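A minimal standalone sketch of the XID-stamping technique from the message above. The transaction id is simulated by a plain integer passed in by the caller; `SimXid`, `MapCache`, `reload_cache` and `code_to_label` are hypothetical names, and the two-entry map stands in for the real mapping table. In a backend the current XID would come from the transaction manager and the cache itself would live in a long-lived memory context.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef unsigned int SimXid;   /* stand-in for a transaction id; 0 = none */

typedef struct MapEntry
{
    short       code;          /* the int2 value stored on disk */
    const char *label;         /* the string it maps to */
} MapEntry;

typedef struct MapCache
{
    int      valid;            /* has the cache ever been loaded? */
    SimXid   built_in_xid;     /* stamp: transaction that loaded it */
    MapEntry entries[2];
} MapCache;

static MapCache cache;         /* zero-initialised, so valid == 0 */
static int      reload_count = 0;

/* Stands in for reading the mapping table; records who loaded it. */
static void reload_cache(SimXid xid)
{
    cache.entries[0].code = 1;
    cache.entries[0].label = "alpha";
    cache.entries[1].code = 2;
    cache.entries[1].label = "beta";
    cache.built_in_xid = xid;
    cache.valid = 1;
    reload_count++;
}

const char *code_to_label(short code, SimXid current_xid)
{
    size_t i;

    assert(current_xid != 0);  /* lookups only make sense inside a transaction */

    /* Invalidate when the stamp belongs to a different transaction. */
    if (!cache.valid || cache.built_in_xid != current_xid)
        reload_cache(current_xid);

    for (i = 0; i < sizeof(cache.entries) / sizeof(cache.entries[0]); i++)
        if (cache.entries[i].code == code)
            return cache.entries[i].label;
    return NULL;
}
```

Lookups within one simulated transaction share a single load, and the first lookup in the next transaction triggers a reload. As the message points out, this still does not notice changes made to the mapping table during the transaction that built the cache.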
Ok, I implemented the cache in TopMemoryContext and stored the pointer in a global static pointer in the type's .c file.

Now I'm wondering whether there is a way to effectively do a LISTEN on a given event, so that when the table that represents the map is updated, a trigger could call NOTIFY xyz. Upon receiving this event, each connection should refresh its cache.

Ideas?

Thanks,
Greg