Statement Pooling

Started by Janningalmost 16 years ago7 messagesgeneral
Jump to latest
#1Janning
ml@planwerk6.de

Hi,

we are running java6/hibernate/c3p0/postgresql stack.
Our JDBC Driver is 8.4-701.jdbc3

I have a few questions about Prepared Statements. I have read
<http://www.theserverside.com/news/1365244/Why-Prepared-Statements-are-
important-and-how-to-use-them-properly>

At the moment we have
c3p0.maxStatements = 0
c3p0.maxStatementsPerConnection = 0

I don't know if I do understand statement pooling properly, so please help me.
In my understanding it is like this:

Our hibernate stack uses prepared statements. Postgresql is caching the
execution plan. Next time the same statement is used, postgresql reuses the
execution plan. This saves time planning statements inside DB.

Additionally c3p0 can cache java instances of "java.sql.PreparedStatement"
which means it is caching the java object. So when using
c3p0.maxStatementsPerConnection = 100 it caches at most 100 different
objects. It saves time on creating objects, but this has nothing to do with
the postgresql database and its prepared statements.

Right?

As we use about 100 different statements I would set
c3p0.maxStatementsPerConnection = 100

Is this reasonable? Is there a real benefit activating it?

I remember postgresql 8.4 is replanning prepared statements when statistics
change occur, but I didn't find it in the release notes. It is just saying
"Invalidate cached plans when referenced schemas, functions, operators, or
operator classes are modified". Does PG replans prepared statements from time
to time if underlying data statistics change?

I am glad if you help me and give me some insights to managing my connection
pool the right way.

kind regards
Janning

PS: I think its ok to post this on general and not on pgsql-jdbc as I am not
subcribed to pgsql-jdbc. If not, please let me know and I will repost there.

#2Joshua Tolley
eggyknap@gmail.com
In reply to: Janning (#1)
Re: Statement Pooling

On Tue, May 25, 2010 at 05:28:10PM +0200, Janning wrote:

Our hibernate stack uses prepared statements. Postgresql is caching the
execution plan. Next time the same statement is used, postgresql reuses the
execution plan. This saves time planning statements inside DB.

It only uses the cached plan if you prepare the statement and run that
prepared statement. Running "SELECT foo FROM bar" twice in a row without any
preparing will result in the query being parsed, planned, and executed twice.
On the other hand, doing something like this:

p = conn.prepareStatement("SELECT foo FROM bar");

...and then repeatedly executed p, parsing and planning for the query would
occur only once, at the time of the prepareStatement call.

Additionally c3p0 can cache java instances of "java.sql.PreparedStatement"
which means it is caching the java object. So when using
c3p0.maxStatementsPerConnection = 100 it caches at most 100 different
objects. It saves time on creating objects, but this has nothing to do with
the postgresql database and its prepared statements.

Right?

That's the idea.

As we use about 100 different statements I would set
c3p0.maxStatementsPerConnection = 100

Is this reasonable? Is there a real benefit activating it?

Answering that question for your situation really requires benchmarking with
and without statement caching turned on. Your best bet is probably to set it
to a value that seems decent, and revisit it if you find a performance
bottleneck you need to resolve which looks like it's related to statement
caching.

I remember postgresql 8.4 is replanning prepared statements when statistics
change occur, but I didn't find it in the release notes. It is just saying
"Invalidate cached plans when referenced schemas, functions, operators, or
operator classes are modified". Does PG replans prepared statements from time
to time if underlying data statistics change?

I don't think so, though I may be wrong. The change you refer to replans such
things when the actual objects change, such as when you remove a column or
something that would make the plan fail to execute.

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com

#3Janning
ml@planwerk6.de
In reply to: Joshua Tolley (#2)
Re: Statement Pooling

On Thursday 27 May 2010 16:56:28 you wrote:

Additionally c3p0 can cache java instances of
"java.sql.PreparedStatement" which means it is caching the java object.
So when using
c3p0.maxStatementsPerConnection = 100 it caches at most 100 different
objects. It saves time on creating objects, but this has nothing to do
with the postgresql database and its prepared statements.

Right?

That's the idea.

Really? Today I think c3p0 statement pooling it is not only about saving time
in object creation, but to let postgresql reuse an already prwpeared
statement.

As we use about 100 different statements I would set
c3p0.maxStatementsPerConnection = 100

Is this reasonable? Is there a real benefit activating it?

Answering that question for your situation really requires benchmarking
with and without statement caching turned on. Your best bet is probably to
set it to a value that seems decent, and revisit it if you find a
performance bottleneck you need to resolve which looks like it's related to
statement caching.

Ok thanks for your detailed answer. I did some benchmarking and it seems to be
faster with c3p0 statement cache.

I remember postgresql 8.4 is replanning prepared statements when
statistics change occur, but I didn't find it in the release notes. It is
just saying "Invalidate cached plans when referenced schemas, functions,
operators, or operator classes are modified". Does PG replans prepared
statements from time to time if underlying data statistics change?

I don't think so, though I may be wrong. The change you refer to replans
such things when the actual objects change, such as when you remove a
column or something that would make the plan fail to execute.

Ok, so i should be aware of this. if the execution plan postgresql is caching
gets old, i can run into trouble.

So i should close my connections in the pool from time to time to force
replanning of my prepared statements.

i have asked this question on stackoverflow too, because i thought it was to
jdbc/hibernate specific for this list after sending and not getting an answer.
So for further reference i post a link to this question, where you can find an
excellent answer, too:

http://stackoverflow.com/questions/2920740/should-i-activate-c3p0-statement-
pooling/2921340#2921340

kind regards
Janning

http://www.kicktipp.de/

Show quoted text

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com

#4Joshua Tolley
eggyknap@gmail.com
In reply to: Janning (#3)
Re: Statement Pooling

On Fri, May 28, 2010 at 10:09:22PM +0200, Janning wrote:

On Thursday 27 May 2010 16:56:28 you wrote:

Additionally c3p0 can cache java instances of
"java.sql.PreparedStatement" which means it is caching the java object.
So when using
c3p0.maxStatementsPerConnection = 100 it caches at most 100 different
objects. It saves time on creating objects, but this has nothing to do
with the postgresql database and its prepared statements.

Right?

That's the idea.

Really? Today I think c3p0 statement pooling it is not only about saving time
in object creation, but to let postgresql reuse an already prwpeared
statement.

Well, it was probably too strong to say it "has nothing to do with" your
database. Anyway, it sounds like you've got it right.

<snip>

I remember postgresql 8.4 is replanning prepared statements when
statistics change occur, but I didn't find it in the release notes. It is
just saying "Invalidate cached plans when referenced schemas, functions,
operators, or operator classes are modified". Does PG replans prepared
statements from time to time if underlying data statistics change?

I don't think so, though I may be wrong. The change you refer to replans
such things when the actual objects change, such as when you remove a
column or something that would make the plan fail to execute.

Ok, so i should be aware of this. if the execution plan postgresql is caching
gets old, i can run into trouble.

So i should close my connections in the pool from time to time to force
replanning of my prepared statements.

I'm not sure this is a great idea, but it would depend on your application.
With caching and pooling and such, a prepared statement might live a really
long time, but your database's statistics might take even longer to change
sufficiently to suggest one plan is better than another. Explicitly replanning
sounds like it's probably more work than it's worth, unless you see specific
plan problems.

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com

#5Janning
ml@planwerk6.de
In reply to: Joshua Tolley (#4)
Re: Statement Pooling

I remember postgresql 8.4 is replanning prepared statements when
statistics change occur, but I didn't find it in the release notes.
It is just saying "Invalidate cached plans when referenced schemas,
functions, operators, or operator classes are modified". Does PG
replans prepared statements from time to time if underlying data
statistics change?

I don't think so, though I may be wrong. The change you refer to
replans such things when the actual objects change, such as when you
remove a column or something that would make the plan fail to execute.

Ok, so i should be aware of this. if the execution plan postgresql is
caching gets old, i can run into trouble.

So i should close my connections in the pool from time to time to force
replanning of my prepared statements.

I'm not sure this is a great idea, but it would depend on your application.
With caching and pooling and such, a prepared statement might live a really
long time, but your database's statistics might take even longer to change
sufficiently to suggest one plan is better than another. Explicitly
replanning sounds like it's probably more work than it's worth, unless you
see specific plan problems.

I know close my connections every 5th hour, so a new connection can create new
preparedStatements with a new execution plan.

I had some situations in the past, where an old plan took so much longer. I
think planning a statement doesn't cost so much compared with executing a
statement with a wrong plan (like sequence scan instead of using an index)

But I remember a post of tom lane who said that prepared statements get
replaned since 8.4 if underlying statistics change. But I really can't find it.
Maybe I do not remember correctly.

kind regards
Janning

Show quoted text

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com

#6Joshua Tolley
eggyknap@gmail.com
In reply to: Janning (#5)
Re: Statement Pooling

On Wed, Jun 02, 2010 at 10:10:24AM +0200, Janning wrote:

But I remember a post of tom lane who said that prepared statements get
replaned since 8.4 if underlying statistics change. But I really can't find it.
Maybe I do not remember correctly.

I saw a similar post yesterday, and think I was wrong in my original
assertion that they wouldn't be replanned. So now I'm thinking don't bother
manually refreshing your prepared statements, because postgresql will do it
for you :)

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Janning (#5)
Re: Statement Pooling

Janning <ml@planwerk6.de> writes:

But I remember a post of tom lane who said that prepared statements get
replaned since 8.4 if underlying statistics change.

Since 8.3, IIRC. 8.4 extended the mechanism to apply in some more
cases, but an ANALYZE should cause a replan in 8.3.

regards, tom lane