Re: eWeek Poll: Which database is most critical to your - Mailing list pgsql-hackers

From Dann Corbit
Subject Re: eWeek Poll: Which database is most critical to your
Date
Msg-id D90A5A6C612A39408103E6ECDD77B82906F3F4@voyager.corporate.connx.com
Whole thread Raw
List pgsql-hackers
[snip]
> Very likely, it is only my limited understanding not really grasping
> what it is that you are trying to do.  Even so,  I don't think it
really
> helps even for read only queries, unless it is exactly the same query
> with the same parameter markers and everything that was issued before.
> That is very unusual.  Normally, you won't have the entire query hard-
> wired, but with allow the customer to do some sort of filtering of the
> data.

Hmmm... the more I think about it, the more unusual it would be for
_exactly_ the same query to be repeated a lot. However, the article
reported a significant performance gain when this feature was enabled.
That could mean that:
   (a) the performance measurements/benchmarks used by the article were
synthetic and don't reflect real database applications
   (b) the feature MySQL implements is different than the one I am
describing

When I get a chance I'll investigate further the technique used by MySQL
to see if (b) is the case. However, it is beginning to look like this
isn't a good idea, overall.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
I did not read the article at all, but I am familiar with query cache
and in fact, I do it a lot (I work for a database company).  Here is
how the algorithm works:

You intercept every incoming query and parse it.  Any physical data
gets replaced with parameter markers.  A 64 bit hash is formed from the
parsed query with the parameter markers removed.  The hash is used as
an index into a skiplist which also stores the original query.  After
all, if a client has a million dollar request, he won't be happy that
the unbelievably rare thing happened and the checksums agreed.

You can add a counter to the data in the skiplist so that you know how
often the query happens.  The parsed query will only be useful to a
system that can save time from having a query prepared (most systems
call it preparing the query).  I was kind of surprised to see that
PostgreSQL does not have a prepare stage in libpq.  This can be a
very large speedup in query execution (for obvious reasons).
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<


pgsql-hackers by date:

Previous
From: Neil Conway
Date:
Subject: Re: eWeek Poll: Which database is most critical to your
Next
From: "Christopher Kings-Lynne"
Date:
Subject: Re: quotes in SET grammar