Temporary Tables and Web Application

Started by Tim Tassonisalmost 18 years ago9 messagesgeneral
Jump to latest
#1Tim Tassonis
timtas@cubic.ch

Hi all

I assume this is not an uncommon problem, but so far, I haven't been
able to find a good answer to it.

I've got a table that holds log entries and fills up very fast during
the day, it gets approx. 25 million rows per day. I'm now building a web
application using apache/mod_php where you can query the database and
then should be able to page through the results.

My idea was that whenever a user constructs a query, I create a
temporary table holding the results and then page through this table,
which should work very well in principle.

But from what I've been able to find out, temporary tables live only in
the Postgres Session they have been created in and are destroyed upon
session descructuion.

Now, with apache/php in a mpm environment, I have no guarantee that a
user will get the same postgresql session for a subsequent request, thus
he will not see the temporary table.

Is there a way to create temporary tables in another way, so they are
visible between sessions, or do I need to create real tables for my
purpose? And is the perfomance penalty big for real tables, as they have
been written to disk/read from disk?

Tim

#2Bill Moran
wmoran@collaborativefusion.com
In reply to: Tim Tassonis (#1)
Re: Temporary Tables and Web Application

In response to Tim Tassonis <timtas@cubic.ch>:

Hi all

I assume this is not an uncommon problem, but so far, I haven't been
able to find a good answer to it.

I've got a table that holds log entries and fills up very fast during
the day, it gets approx. 25 million rows per day. I'm now building a web
application using apache/mod_php where you can query the database and
then should be able to page through the results.

My idea was that whenever a user constructs a query, I create a
temporary table holding the results and then page through this table,
which should work very well in principle.

But from what I've been able to find out, temporary tables live only in
the Postgres Session they have been created in and are destroyed upon
session descructuion.

Now, with apache/php in a mpm environment, I have no guarantee that a
user will get the same postgresql session for a subsequent request, thus
he will not see the temporary table.

Is there a way to create temporary tables in another way, so they are
visible between sessions, or do I need to create real tables for my
purpose? And is the perfomance penalty big for real tables, as they have
been written to disk/read from disk?

Build a framework that creates the tables in a special schema, and then
can access them through any session. Use some method to generate unique
table names and store the names in the HTTP session. Create some sort
of garbage collection routines that removes tables when they're no longer
needed.

The details of exactly how you pull this off are going to depend heavily
on the rest of your application architecture.

--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023

#3Tino Wildenhain
tino@wildenhain.de
In reply to: Tim Tassonis (#1)
Re: Temporary Tables and Web Application

Hi,

Tim Tassonis wrote:

Hi all

I assume this is not an uncommon problem, but so far, I haven't been
able to find a good answer to it.

I've got a table that holds log entries and fills up very fast during
the day, it gets approx. 25 million rows per day. I'm now building a web
application using apache/mod_php where you can query the database and
then should be able to page through the results.

you should be aware that PHP isnt the only scripting language with an
apache module and not neccessary the best choice among them.

My idea was that whenever a user constructs a query, I create a
temporary table holding the results and then page through this table,
which should work very well in principle.

That means you are more or less constructing materialized views :-)
But if you hold the session anyway, then see below.

But from what I've been able to find out, temporary tables live only in
the Postgres Session they have been created in and are destroyed upon
session descructuion.

Now, with apache/php in a mpm environment, I have no guarantee that a
user will get the same postgresql session for a subsequent request, thus
he will not see the temporary table.

Thats the problem and if you have failover/loadbalancing situations,
even more so.

Is there a way to create temporary tables in another way, so they are
visible between sessions, or do I need to create real tables for my
purpose? And is the perfomance penalty big for real tables, as they have
been written to disk/read from disk?

To start with, you should avoid reconnecting to the database for every
request. Not only because of loosing the session context but also
to avoid connection overhead.

Usually this is done by connection pooling. You can then try to trac
user:connection relationship as much as possible thru the connection pool.

If you have that, there is actually no need for the temp tables. Instead
you can just use a regular cursor and scroll it as neccessary.

Almost all frameworks should give you reasonable pool implementations,
some additional memory caching on top of it and there are also a lot
of other methods to help you with that, for example pgpool and
pgbouncer.

Regards
Tino

#4Bill Moran
wmoran@collaborativefusion.com
In reply to: Tim Tassonis (#1)
Re: Temporary Tables and Web Application

In response to Tim Tassonis <timtas@cubic.ch>:

Bill Moran wrote:

In response to Tim Tassonis <timtas@cubic.ch>:

Now, with apache/php in a mpm environment, I have no guarantee that a
user will get the same postgresql session for a subsequent request, thus
he will not see the temporary table.

Is there a way to create temporary tables in another way, so they are
visible between sessions, or do I need to create real tables for my
purpose? And is the perfomance penalty big for real tables, as they have
been written to disk/read from disk?

Build a framework that creates the tables in a special schema, and then
can access them through any session. Use some method to generate unique
table names and store the names in the HTTP session. Create some sort
of garbage collection routines that removes tables when they're no longer
needed.

The details of exactly how you pull this off are going to depend heavily
on the rest of your application architecture.

What you describe is what I referred to as "create real tables". I've
done that and it works, but I wondered if there's something similar
built in postgres apart from classical temporary tables.

Not that I'm aware of.

If you keep the mailing list in the CC, others can answer as well.

--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023

#5Tim Tassonis
timtas@cubic.ch
In reply to: Tino Wildenhain (#3)
Re: Temporary Tables and Web Application

Tino Wildenhain wrote:

Hi,

Tim Tassonis wrote:

Hi all

I assume this is not an uncommon problem, but so far, I haven't been
able to find a good answer to it.

I've got a table that holds log entries and fills up very fast during
the day, it gets approx. 25 million rows per day. I'm now building a
web application using apache/mod_php where you can query the database
and then should be able to page through the results.

you should be aware that PHP isnt the only scripting language with an
apache module and not neccessary the best choice among them.

There's no need to become insulting. I am aware of the truly astonishing
fact that there are other scripting languages apart from php and that
not everybody loves php.

Apart from the sad fact that I quite like php, the problem is not the
choice of scripting language, but the nature of apache mpm processing,
making the postgres connection stuck to an apache process.

My idea was that whenever a user constructs a query, I create a
temporary table holding the results and then page through this table,
which should work very well in principle.

That means you are more or less constructing materialized views :-)

No, I want the data to remain fixed after the query is executed.

But if you hold the session anyway, then see below.

I don't hold the session, see above.

But from what I've been able to find out, temporary tables live only
in the Postgres Session they have been created in and are destroyed
upon session descructuion.

Now, with apache/php in a mpm environment, I have no guarantee that a
user will get the same postgresql session for a subsequent request,
thus he will not see the temporary table.

Thats the problem and if you have failover/loadbalancing situations,
even more so.

Is there a way to create temporary tables in another way, so they are
visible between sessions, or do I need to create real tables for my
purpose? And is the perfomance penalty big for real tables, as they
have been written to disk/read from disk?

To start with, you should avoid reconnecting to the database for every
request. Not only because of loosing the session context but also
to avoid connection overhead.

I don't reconnect after every request, but I'm not guaranteed by mpm
that I get the same session/process. I might, but that's hardly what I'd
call a stable application, even as a php programmer.

Usually this is done by connection pooling. You can then try to trac
user:connection relationship as much as possible thru the connection pool.

As far as I can see, there is no implementation of a multi client
process connection pool in mod_php.
I admit that my interprocess communication know-how is not very deep,
but that would mean the client postgres/tcpip connection part would have
to be held somewhere in shared memory between the different apache
processes. From reading the documentation, php does not do that.

If you have that, there is actually no need for the temp tables. Instead
you can just use a regular cursor and scroll it as neccessary.

My problem ist that I don't have that.

Almost all frameworks should give you reasonable pool implementations,
some additional memory caching on top of it and there are also a lot
of other methods to help you with that, for example pgpool and
pgbouncer.

I'm afraid you somehow missed the point, but thanks for your response.

Bye
Tim

#6Marco Bizzarri
marco.bizzarri@gmail.com
In reply to: Tim Tassonis (#1)
Re: Temporary Tables and Web Application

On Thu, Jun 5, 2008 at 5:36 PM, Tim Tassonis <timtas@cubic.ch> wrote:

Is there a way to create temporary tables in another way, so they are
visible between sessions, or do I need to create real tables for my purpose?
And is the perfomance penalty big for real tables, as they have been written
to disk/read from disk?

You could create a real table on disk, inserting just the primary keys
of the table; then, you could join on the main table, to get the real
results.

Regards
Marco

--
Marco Bizzarri
http://iliveinpisa.blogspot.com/

#7Brent Wood
b.wood@niwa.co.nz
In reply to: Marco Bizzarri (#6)
Re: Temporary Tables and Web Application

Hi Tim,

Off the top of my head, from somewhat left field, using filesystems to manage this sort of effect.

Would "real" tables in a tablespace defined on a ramdisk meet this need? So the functionality/accessibility of a
physical table is provided, along with the performance of a filesystem actually residing in memory. Presumeably viable if you have the memory to spare & know the size of the temp tables won't exceed this.

You could also mount a tablespace on a physical disk with a filesystem which has delayed/deferred writes to disk, so that if it is created & deleted quickly enough, it is never actually written to disk, but just generally sits in the cache.

Cheers,

Brent Wood

Bill Moran <wmoran@collaborativefusion.com> 06/06/08 8:01 AM >>>

In response to Tim Tassonis <timtas@cubic.ch>:

Bill Moran wrote:

In response to Tim Tassonis <timtas@cubic.ch>:

Now, with apache/php in a mpm environment, I have no guarantee that a
user will get the same postgresql session for a subsequent request, thus
he will not see the temporary table.

Is there a way to create temporary tables in another way, so they are
visible between sessions, or do I need to create real tables for my
purpose? And is the perfomance penalty big for real tables, as they have
been written to disk/read from disk?

Build a framework that creates the tables in a special schema, and then
can access them through any session. Use some method to generate unique
table names and store the names in the HTTP session. Create some sort
of garbage collection routines that removes tables when they're no longer
needed.

The details of exactly how you pull this off are going to depend heavily
on the rest of your application architecture.

What you describe is what I referred to as "create real tables". I've
done that and it works, but I wondered if there's something similar
built in postgres apart from classical temporary tables.

Not that I'm aware of.

If you keep the mailing list in the CC, others can answer as well.

--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#8Tomasz Ostrowski
tometzky@batory.org.pl
In reply to: Brent Wood (#7)
Re: Temporary Tables and Web Application

On 2008-06-06 07:25, Brent Wood wrote:

Would "real" tables in a tablespace defined on a ramdisk meet this
need?

Bad idea. This would mean an unusable database after a restart.

You could also mount a tablespace on a physical disk with a
filesystem which has delayed/deferred writes to disk

Not a vary good idea. This could mean an unusable database after a crash
or power failure.

Regards
Tometzky
--
...although Eating Honey was a very good thing to do, there was a
moment just before you began to eat it which was better than when you
were...
Winnie the Pooh

#9Tim Tassonis
timtas@cubic.ch
In reply to: Tomasz Ostrowski (#8)
Re: Temporary Tables and Web Application

Tomasz Ostrowski wrote:

On 2008-06-06 07:25, Brent Wood wrote:

Would "real" tables in a tablespace defined on a ramdisk meet this
need?

Bad idea. This would mean an unusable database after a restart.

Funnily, I was thinking the same this night, somehow defining a
tablespace on tmpfs or something, so I really liked Brents suggestion.

Sadly, it seems this would have some real downsides...

Show quoted text

You could also mount a tablespace on a physical disk with a
filesystem which has delayed/deferred writes to disk

Not a vary good idea. This could mean an unusable database after a crash
or power failure.

Regards
Tometzky