In theory question

Started by Naz Gassiepalmost 19 years ago16 messagesgeneral
Jump to latest
#1Naz Gassiep
naz@mira.net

This may be a question for -hackers, but I don't like disturbing them
unnecessarily.

I've been having a look at memcached. I would like to ask, is there any
reason that, theoretically, a similar caching system could be built
right into the db serving daemon?

I.e., the hash tables and libevent could sit on top of postmaster as an
optional component caching data on a per-query basis and only hitting
the actual db in the event of a cache miss?

Such a mechanism could be a) transparent to any and all APIs accessing
the back end thus instantly providing the benefits of caching to all
apps transparently, and b) assist with replication by providing a way
for a setup to have n serving daemons (effectively db caches) on
different machines accessing <n replicated DBs. Such a setup would be
far easier to set up than n fully fledged DB servers, and would likely
scale better anyway.

Thoughts?

#2Michael Glaesemann
grzm@seespotcode.net
In reply to: Naz Gassiep (#1)
Re: In theory question

On May 9, 2007, at 9:13 , Naz Gassiep wrote:

I've been having a look at memcached. I would like to ask, is there
any
reason that, theoretically, a similar caching system could be built
right into the db serving daemon?

This is all a bit above my head, but have you looked at pgmemcached?

http://people.freebsd.org/~seanc/pgmemcache/

Michael Glaesemann
grzm seespotcode net

#3Hannes Dorbath
light@theendofthetunnel.de
In reply to: Naz Gassiep (#1)
Re: In theory question

On 09.05.2007 16:13, Naz Gassiep wrote:

This may be a question for -hackers, but I don't like disturbing them
unnecessarily.

I've been having a look at memcached. I would like to ask, is there any
reason that, theoretically, a similar caching system could be built
right into the db serving daemon?

I.e., the hash tables and libevent could sit on top of postmaster as an
optional component caching data on a per-query basis and only hitting
the actual db in the event of a cache miss?

I think this is close to what MySQL's query cache does. The question is
if this should be the job of the DBMS and not another layer. At least
the pgmemcache author and I think that it's better done outside the
DBMS. See http://people.FreeBSD.org/~seanc/pgmemcache/pgmemcache.pdf for
the idea.

--
Regards,
Hannes Dorbath

#4Erik Jones
erik@myemma.com
In reply to: Hannes Dorbath (#3)
Re: In theory question

On May 9, 2007, at 10:22 AM, Hannes Dorbath wrote:

On 09.05.2007 16:13, Naz Gassiep wrote:

This may be a question for -hackers, but I don't like disturbing them
unnecessarily.
I've been having a look at memcached. I would like to ask, is
there any
reason that, theoretically, a similar caching system could be built
right into the db serving daemon?
I.e., the hash tables and libevent could sit on top of postmaster
as an
optional component caching data on a per-query basis and only hitting
the actual db in the event of a cache miss?

I think this is close to what MySQL's query cache does. The
question is if this should be the job of the DBMS and not another
layer. At least the pgmemcache author and I think that it's better
done outside the DBMS. See http://people.FreeBSD.org/~seanc/
pgmemcache/pgmemcache.pdf for the idea.

I just read through that pdf. How does implementing a memcached
system with table triggers qualify as outside the database?

erik jones <erik@myemma.com>
software developer
615-296-0838
emma(r)

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Naz Gassiep (#1)
Re: In theory question

Naz Gassiep <naz@mira.net> writes:

I.e., the hash tables and libevent could sit on top of postmaster as an
optional component caching data on a per-query basis and only hitting
the actual db in the event of a cache miss?

How does the cache know when the database contents change?

regards, tom lane

#6Naz Gassiep
naz@mira.net
In reply to: Hannes Dorbath (#3)
Re: In theory question

Hannes Dorbath wrote:

I think this is close to what MySQL's query cache does. The question
is if this should be the job of the DBMS and not another layer. At
least the pgmemcache author and I think that it's better done outside
the DBMS. See
http://people.FreeBSD.org/~seanc/pgmemcache/pgmemcache.pdf for the idea.

This is exactly what I was asking about. So my theoretical idea has
already been implemented. Now if only *all* my ideas were done for me by
the time I came up with them :)

#7Joshua D. Drake
jd@commandprompt.com
In reply to: Naz Gassiep (#6)
Re: In theory question

Naz Gassiep wrote:

Hannes Dorbath wrote:

I think this is close to what MySQL's query cache does. The question
is if this should be the job of the DBMS and not another layer. At
least the pgmemcache author and I think that it's better done outside
the DBMS. See
http://people.FreeBSD.org/~seanc/pgmemcache/pgmemcache.pdf for the idea.

This is exactly what I was asking about. So my theoretical idea has
already been implemented. Now if only *all* my ideas were done for me by
the time I came up with them :)

Then you wouldn't be able to eventually patent them ;)

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

#8Naz Gassiep
naz@mira.net
In reply to: Joshua D. Drake (#7)
Re: In theory question

This is exactly what I was asking about. So my theoretical idea has
already been implemented. Now if only *all* my ideas were done for me by
the time I came up with them :)

Then you wouldn't be able to eventually patent them ;)

What an un-BSD licensish thing to say :P

#9PFC
lists@peufeu.com
In reply to: Erik Jones (#4)
Re: In theory question

This may be a question for -hackers, but I don't like disturbing them
unnecessarily.
I've been having a look at memcached. I would like to ask, is there any
reason that, theoretically, a similar caching system could be built
right into the db serving daemon?
I.e., the hash tables and libevent could sit on top of postmaster as an
optional component caching data on a per-query basis and only hitting
the actual db in the event of a cache miss?

I think this is close to what MySQL's query cache does. The question is
if this should be the job of the DBMS and not another layer. At least
the pgmemcache author and I think that it's better done outside the
DBMS. See http://people.FreeBSD.org/~seanc/pgmemcache/pgmemcache.pdf
for the idea.

I have always found MySQL's query cache to be utterly useless.

Think about it this way :

It only works for tables that seldom change.
It does not work for big tables (like the posts table of a forum) because
the cache would have to be huge.

So, the most frequent usage of MySQL's query cache is for dumb
applications who use, for instance, PHP, store their configuration in
MySQL, and reload it on each and every page with a SELECT * FROM
configuration.

In this case, you save the query time, but you don't save : the roundtrip
between PHP and the database, extracting query results, building objects
from them, time spent in ORMs, etc.

A much better solution is to do your own caching, for instance using
shared memory in the application server, and then you cache native
language objects. You not only save the query time, but also all the time
spent building those objects on every page load.

#10Karsten Hilbert
Karsten.Hilbert@gmx.net
In reply to: Joshua D. Drake (#7)
Re: In theory question

On Wed, May 09, 2007 at 10:29:02AM -0700, Joshua D. Drake wrote:

This is exactly what I was asking about. So my theoretical idea has
already been implemented. Now if only *all* my ideas were done for me by
the time I came up with them :)

Then you wouldn't be able to eventually patent them ;)

I think you are overly optimistic ;-)

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346

#11Joshua D. Drake
jd@commandprompt.com
In reply to: Karsten Hilbert (#10)
Re: In theory question

Karsten Hilbert wrote:

On Wed, May 09, 2007 at 10:29:02AM -0700, Joshua D. Drake wrote:

This is exactly what I was asking about. So my theoretical idea has
already been implemented. Now if only *all* my ideas were done for me by
the time I came up with them :)

Then you wouldn't be able to eventually patent them ;)

I think you are overly optimistic ;-)

You obviously haven't reviewed the US patent system. I just patented "A
textual method for conveying expression without using any standardized
language and only predetermined, but arbitrary symbols."

I noticed you used the ;-). You owe me 75.00.

Joshua D. Drake

Karsten

--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

#12Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Joshua D. Drake (#11)
Re: In theory question

Joshua D. Drake wrote:

Karsten Hilbert wrote:

On Wed, May 09, 2007 at 10:29:02AM -0700, Joshua D. Drake wrote:

This is exactly what I was asking about. So my theoretical idea has
already been implemented. Now if only *all* my ideas were done for me by
the time I came up with them :)

Then you wouldn't be able to eventually patent them ;)

I think you are overly optimistic ;-)

You obviously haven't reviewed the US patent system. I just patented "A
textual method for conveying expression without using any standardized
language and only predetermined, but arbitrary symbols."

That's already patented by Microsoft.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#13Joshua D. Drake
jd@commandprompt.com
In reply to: Alvaro Herrera (#12)
Re: In theory question

Alvaro Herrera wrote:

Joshua D. Drake wrote:

Karsten Hilbert wrote:

On Wed, May 09, 2007 at 10:29:02AM -0700, Joshua D. Drake wrote:

This is exactly what I was asking about. So my theoretical idea has
already been implemented. Now if only *all* my ideas were done for me by
the time I came up with them :)

Then you wouldn't be able to eventually patent them ;)

I think you are overly optimistic ;-)

You obviously haven't reviewed the US patent system. I just patented "A
textual method for conveying expression without using any standardized
language and only predetermined, but arbitrary symbols."

That's already patented by Microsoft.

No they patented:

A textual method for conveying expression without using standardized
language and only predetermined, but arbitrary symbols."

--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

#14David Fetter
david@fetter.org
In reply to: Joshua D. Drake (#7)
Re: In theory question

On Wed, May 09, 2007 at 10:29:02AM -0700, Joshua D. Drake wrote:

Naz Gassiep wrote:

Hannes Dorbath wrote:

I think this is close to what MySQL's query cache does. The question
is if this should be the job of the DBMS and not another layer. At
least the pgmemcache author and I think that it's better done outside
the DBMS. See
http://people.FreeBSD.org/~seanc/pgmemcache/pgmemcache.pdf for the idea.

This is exactly what I was asking about. So my theoretical idea has
already been implemented. Now if only *all* my ideas were done for me by
the time I came up with them :)

Then you wouldn't be able to eventually patent them ;)

You have no faith in the patent system ;)

Cheers,
D (remember, Mr. Naz is from a country where somebody patented the wheel.)
--
David Fetter <david@fetter.org> http://fetter.org/
phone: +1 415 235 3778 AIM: dfetter666
Skype: davidfetter

Remember to vote!
Consider donating to PostgreSQL: http://www.postgresql.org/about/donate

#15Naz Gassiep
naz@mira.net
In reply to: PFC (#9)
Re: In theory question

I have always found MySQL's query cache to be utterly useless.

Think about it this way :

It only works for tables that seldom change.
It does not work for big tables (like the posts table of a forum)
because the cache would have to be huge.

So, the most frequent usage of MySQL's query cache is for dumb
applications who use, for instance, PHP, store their configuration in
MySQL, and reload it on each and every page with a SELECT * FROM
configuration.

In this case, you save the query time, but you don't save : the
roundtrip between PHP and the database, extracting query results,
building objects from them, time spent in ORMs, etc.

A much better solution is to do your own caching, for instance
using shared memory in the application server, and then you cache
native language objects. You not only save the query time, but also
all the time spent building those objects on every page load.

This was actually my original question. In my web page, I cache the
config, user preferences and other static material in session vars and
only rerun the function that fetches them if the app ever changes them

If you are clever about db fetches in this way and store as much stuff
in session vars (which is just like storing it in a memcached instance
really) is there much or even any benefit in running memcached?

#16Hannes Dorbath
light@theendofthetunnel.de
In reply to: Erik Jones (#4)
Re: In theory question

On 09.05.2007 17:30, Erik Jones wrote:

On 09.05.2007 16:13, Naz Gassiep wrote:
I think this is close to what MySQL's query cache does. The question
is if this should be the job of the DBMS and not another layer. At
least the pgmemcache author and I think that it's better done outside
the DBMS. See
http://people.FreeBSD.org/~seanc/pgmemcache/pgmemcache.pdf for the idea.

I just read through that pdf. How does implementing a memcached system
with table triggers qualify as outside the database?

The point is to have the DBMS _invalidate_ an external Cache, not to
fill or use it.

Caching in that case should not be done for single SQL statements. You
should cache things that have been produced using that query, a rendered
part of an HTML page is an example.

Think of a news selection on your website, the pages changes when the
content of 2-3 tables in your database changes. Here you have the DBMS
clear the page from the cache and your application layer re-render it
and put the new version in the cache.

--
Regards,
Hannes Dorbath