Thread: In theory question

In theory question

From
Naz Gassiep
Date:
This may be a question for -hackers, but I don't like disturbing them
unnecessarily.

I've been having a look at memcached. I would like to ask, is there any
reason that, theoretically, a similar caching system could be built
right into the db serving daemon?

I.e., the hash tables and libevent could sit on top of postmaster as an
optional component caching data on a per-query basis and only hitting
the actual db in the event of a cache miss?

Such a mechanism could be a) transparent to any and all APIs accessing
the back end thus instantly providing the benefits of caching to all
apps transparently, and b) assist with replication by providing a way
for a setup to have n serving daemons (effectively db caches) on
different machines accessing <n replicated DBs. Such a setup would be
far easier to set up than n fully fledged DB servers, and would likely
scale better anyway.

Thoughts?

Re: In theory question

From
Michael Glaesemann
Date:
On May 9, 2007, at 9:13 , Naz Gassiep wrote:

> I've been having a look at memcached. I would like to ask, is there
> any
> reason that, theoretically, a similar caching system could be built
> right into the db serving daemon?

This is all a bit above my head, but have you looked at pgmemcached?

http://people.freebsd.org/~seanc/pgmemcache/

Michael Glaesemann
grzm seespotcode net



Re: In theory question

From
Hannes Dorbath
Date:
On 09.05.2007 16:13, Naz Gassiep wrote:
> This may be a question for -hackers, but I don't like disturbing them
> unnecessarily.
>
> I've been having a look at memcached. I would like to ask, is there any
> reason that, theoretically, a similar caching system could be built
> right into the db serving daemon?
>
> I.e., the hash tables and libevent could sit on top of postmaster as an
> optional component caching data on a per-query basis and only hitting
> the actual db in the event of a cache miss?

I think this is close to what MySQL's query cache does. The question is
if this should be the job of the DBMS and not another layer. At least
the pgmemcache author and I think that it's better done outside the
DBMS. See http://people.FreeBSD.org/~seanc/pgmemcache/pgmemcache.pdf for
the idea.


--
Regards,
Hannes Dorbath

Re: In theory question

From
Erik Jones
Date:
On May 9, 2007, at 10:22 AM, Hannes Dorbath wrote:

> On 09.05.2007 16:13, Naz Gassiep wrote:
>> This may be a question for -hackers, but I don't like disturbing them
>> unnecessarily.
>> I've been having a look at memcached. I would like to ask, is
>> there any
>> reason that, theoretically, a similar caching system could be built
>> right into the db serving daemon?
>> I.e., the hash tables and libevent could sit on top of postmaster
>> as an
>> optional component caching data on a per-query basis and only hitting
>> the actual db in the event of a cache miss?
>
> I think this is close to what MySQL's query cache does. The
> question is if this should be the job of the DBMS and not another
> layer. At least the pgmemcache author and I think that it's better
> done outside the DBMS. See http://people.FreeBSD.org/~seanc/
> pgmemcache/pgmemcache.pdf for the idea.

I just read through that pdf.  How does implementing a memcached
system with table triggers qualify as outside the database?

erik jones <erik@myemma.com>
software developer
615-296-0838
emma(r)




Re: In theory question

From
Tom Lane
Date:
Naz Gassiep <naz@mira.net> writes:
> I.e., the hash tables and libevent could sit on top of postmaster as an
> optional component caching data on a per-query basis and only hitting
> the actual db in the event of a cache miss?

How does the cache know when the database contents change?

            regards, tom lane

Re: In theory question

From
Naz Gassiep
Date:
Hannes Dorbath wrote:
> I think this is close to what MySQL's query cache does. The question
> is if this should be the job of the DBMS and not another layer. At
> least the pgmemcache author and I think that it's better done outside
> the DBMS. See
> http://people.FreeBSD.org/~seanc/pgmemcache/pgmemcache.pdf for the idea.
This is exactly what I was asking about. So my theoretical idea has
already been implemented. Now if only *all* my ideas were done for me by
the time I came up with them :)

Re: In theory question

From
"Joshua D. Drake"
Date:
Naz Gassiep wrote:
> Hannes Dorbath wrote:
>> I think this is close to what MySQL's query cache does. The question
>> is if this should be the job of the DBMS and not another layer. At
>> least the pgmemcache author and I think that it's better done outside
>> the DBMS. See
>> http://people.FreeBSD.org/~seanc/pgmemcache/pgmemcache.pdf for the idea.
> This is exactly what I was asking about. So my theoretical idea has
> already been implemented. Now if only *all* my ideas were done for me by
> the time I came up with them :)

Then you wouldn't be able to eventually patent them ;)

>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>        choose an index scan if your joining column's datatypes do not
>        match
>


--

       === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
              http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


Re: In theory question

From
Naz Gassiep
Date:
>> This is exactly what I was asking about. So my theoretical idea has
>> already been implemented. Now if only *all* my ideas were done for me by
>> the time I came up with them :)
>
> Then you wouldn't be able to eventually patent them ;)
What an un-BSD licensish thing to say :P

Re: In theory question

From
PFC
Date:
>>> This may be a question for -hackers, but I don't like disturbing them
>>> unnecessarily.
>>> I've been having a look at memcached. I would like to ask, is there any
>>> reason that, theoretically, a similar caching system could be built
>>> right into the db serving daemon?
>>> I.e., the hash tables and libevent could sit on top of postmaster as an
>>> optional component caching data on a per-query basis and only hitting
>>> the actual db in the event of a cache miss?
>>
>> I think this is close to what MySQL's query cache does. The question is
>> if this should be the job of the DBMS and not another layer. At least
>> the pgmemcache author and I think that it's better done outside the
>> DBMS. See http://people.FreeBSD.org/~seanc/pgmemcache/pgmemcache.pdf
>> for the idea.

    I have always found MySQL's query cache to be utterly useless.

    Think about it this way :

    It only works for tables that seldom change.
    It does not work for big tables (like the posts table of a forum) because
the cache would have to be huge.

    So, the most frequent usage of MySQL's query cache is for dumb
applications who use, for instance, PHP, store their configuration in
MySQL, and reload it on each and every page with a SELECT * FROM
configuration.

    In this case, you save the query time, but you don't save : the roundtrip
between PHP and the database, extracting query results, building objects
 from them, time spent in ORMs, etc.

    A much better solution is to do your own caching, for instance using
shared memory in the application server, and then you cache native
language objects. You not only save the query time, but also all the time
spent building those objects on every page load.

Re: In theory question

From
Karsten Hilbert
Date:
On Wed, May 09, 2007 at 10:29:02AM -0700, Joshua D. Drake wrote:

> >This is exactly what I was asking about. So my theoretical idea has
> >already been implemented. Now if only *all* my ideas were done for me by
> >the time I came up with them :)
>
> Then you wouldn't be able to eventually patent them ;)

I think you are overly optimistic  ;-)

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346

Re: In theory question

From
"Joshua D. Drake"
Date:
Karsten Hilbert wrote:
> On Wed, May 09, 2007 at 10:29:02AM -0700, Joshua D. Drake wrote:
>
>>> This is exactly what I was asking about. So my theoretical idea has
>>> already been implemented. Now if only *all* my ideas were done for me by
>>> the time I came up with them :)
>> Then you wouldn't be able to eventually patent them ;)
>
> I think you are overly optimistic  ;-)

You obviously haven't reviewed the US patent system. I just patented "A
textual method for conveying expression without using any standardized
language and only predetermined, but arbitrary symbols."

I noticed you used the ;-). You owe me 75.00.

Joshua D. Drake

>
> Karsten


--

       === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
              http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


Re: In theory question

From
Alvaro Herrera
Date:
Joshua D. Drake wrote:
> Karsten Hilbert wrote:
> >On Wed, May 09, 2007 at 10:29:02AM -0700, Joshua D. Drake wrote:
> >
> >>>This is exactly what I was asking about. So my theoretical idea has
> >>>already been implemented. Now if only *all* my ideas were done for me by
> >>>the time I came up with them :)
> >>Then you wouldn't be able to eventually patent them ;)
> >
> >I think you are overly optimistic  ;-)
>
> You obviously haven't reviewed the US patent system. I just patented "A
> textual method for conveying expression without using any standardized
> language and only predetermined, but arbitrary symbols."

That's already patented by Microsoft.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: In theory question

From
"Joshua D. Drake"
Date:
Alvaro Herrera wrote:
> Joshua D. Drake wrote:
>> Karsten Hilbert wrote:
>>> On Wed, May 09, 2007 at 10:29:02AM -0700, Joshua D. Drake wrote:
>>>
>>>>> This is exactly what I was asking about. So my theoretical idea has
>>>>> already been implemented. Now if only *all* my ideas were done for me by
>>>>> the time I came up with them :)
>>>> Then you wouldn't be able to eventually patent them ;)
>>> I think you are overly optimistic  ;-)
>> You obviously haven't reviewed the US patent system. I just patented "A
>> textual method for conveying expression without using any standardized
>> language and only predetermined, but arbitrary symbols."
>
> That's already patented by Microsoft.

No they patented:


A textual method for conveying expression without using standardized
language and only predetermined, but arbitrary symbols."



--

       === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
              http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


Re: In theory question

From
David Fetter
Date:
On Wed, May 09, 2007 at 10:29:02AM -0700, Joshua D. Drake wrote:
> Naz Gassiep wrote:
> >Hannes Dorbath wrote:
> >>I think this is close to what MySQL's query cache does. The question
> >>is if this should be the job of the DBMS and not another layer. At
> >>least the pgmemcache author and I think that it's better done outside
> >>the DBMS. See
> >>http://people.FreeBSD.org/~seanc/pgmemcache/pgmemcache.pdf for the idea.
> >This is exactly what I was asking about. So my theoretical idea has
> >already been implemented. Now if only *all* my ideas were done for me by
> >the time I came up with them :)
>
> Then you wouldn't be able to eventually patent them ;)

You have no faith in the patent system ;)

Cheers,
D (remember, Mr. Naz is from a country where somebody patented the wheel.)
--
David Fetter <david@fetter.org> http://fetter.org/
phone: +1 415 235 3778        AIM: dfetter666
                              Skype: davidfetter

Remember to vote!
Consider donating to PostgreSQL: http://www.postgresql.org/about/donate

Re: In theory question

From
Naz Gassiep
Date:
>     I have always found MySQL's query cache to be utterly useless.
>
>     Think about it this way :
>
>     It only works for tables that seldom change.
>     It does not work for big tables (like the posts table of a forum)
> because the cache would have to be huge.
>
>     So, the most frequent usage of MySQL's query cache is for dumb
> applications who use, for instance, PHP, store their configuration in
> MySQL, and reload it on each and every page with a SELECT * FROM
> configuration.
>
>     In this case, you save the query time, but you don't save : the
> roundtrip between PHP and the database, extracting query results,
> building objects from them, time spent in ORMs, etc.
>
>     A much better solution is to do your own caching, for instance
> using shared memory in the application server, and then you cache
> native language objects. You not only save the query time, but also
> all the time spent building those objects on every page load.
This was actually my original question. In my web page, I cache the
config, user preferences and other static material in session vars and
only rerun the function that fetches them if the app ever changes them

If you are clever about db fetches in this way and store as much stuff
in session vars (which is just like storing it in a memcached instance
really) is there much or even any benefit in running memcached?

Re: In theory question

From
Hannes Dorbath
Date:
On 09.05.2007 17:30, Erik Jones wrote:
>> On 09.05.2007 16:13, Naz Gassiep wrote:
>> I think this is close to what MySQL's query cache does. The question
>> is if this should be the job of the DBMS and not another layer. At
>> least the pgmemcache author and I think that it's better done outside
>> the DBMS. See
>> http://people.FreeBSD.org/~seanc/pgmemcache/pgmemcache.pdf for the idea.
>
> I just read through that pdf.  How does implementing a memcached system
> with table triggers qualify as outside the database?

The point is to have the DBMS _invalidate_ an external Cache, not to
fill or use it.

Caching in that case should not be done for single SQL statements. You
should cache things that have been produced using that query, a rendered
part of an HTML page is an example.

Think of a news selection on your website, the pages changes when the
content of 2-3 tables in your database changes. Here you have the DBMS
clear the page from the cache and your application layer re-render it
and put the new version in the cache.


--
Regards,
Hannes Dorbath