Prepared statements performance
Hi!
My reading to date suggests that prepared statements should be faster to
execute than issuing the same statement multiple times. However, issuing
100'000 INSERTs turned out to be more than ten times faster than executing
the same prepared statement 100'000 times when executed via pgAdmin. The
table was:
CREATE TABLE test
(
one date,
two boolean,
three character varying,
four integer,
five numeric(18,5),
id serial NOT NULL --note the index here
)
The prepared statement test lasting ~160 seconds was:
TRUNCATE test;
BEGIN;
PREPARE foo(date, boolean, varchar, integer, numeric(18,5)) AS
INSERT INTO test (one, two, three, four, five) VALUES ($1, $2, $3, $4, $5);
EXECUTE foo('2011-01-01', true, 'three', 4, 5.5);
-- 99'999 more executes...
END;
The insertion test lasting ~12 seconds was:
TRUNCATE test;
BEGIN;
INSERT INTO test (one, two, three, four, five) VALUES ('2011-01-01', true,
'three', 4, 5.5);
-- 99'999 more inserts...
END;
I assume, then, that I've done something wrong.
Many thanks,
Dan.
Hi again,
I did a follow-up test using 'multi-value' inserts, which turned out to be
three times faster than multiple inserts:
TRUNCATE test;
BEGIN;
INSERT INTO test (one, two, three, four, five) VALUES ('2011-01-01', true,
'three', 4, 5.5)
,('2011-01-01', true, 'three', 4, 5.5)
-- 99'998 more , ('2011-01-01', true, 'three', 4, 5.5) ...;
END;
This is the kind of speed increase I was hoping for from prepared
statements (which makes sense, because in this multi-value insert the query
is only planned once?).
Thanks,
Dan.
P.S. Mac OS X 10.7.3 using PostgreSQL 9.1.2.
Hello
2012/5/10 Daniel McGreal <daniel.mcgreal@redbite.com>:
Hi again,
I did a follow up test using 'multi-value' inserts which is three times
faster than multiple inserts thusly:
If you need speed, use a COPY statement - it should be 10x faster than INSERTs.
Pavel
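A minimal sketch of the COPY approach (the CSV path is a hypothetical
example; the file would hold one row per line, matching the column list):

```sql
TRUNCATE test;
-- Server-side COPY reads the file on the database host:
COPY test (one, two, three, four, five) FROM '/tmp/test.csv' WITH (FORMAT csv);
-- From psql, \copy runs the equivalent command client-side, so the
-- file only needs to be readable by the client, not the server:
-- \copy test (one, two, three, four, five) FROM '/tmp/test.csv' WITH (FORMAT csv)
```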
Hi,
Unfortunately, those are the experimental conditions. The circumstances of
the intended application are such that my two options are prepared
statements or many inserts. I included the multi-value inserts because I was
curious as to why prepared statements would be slower, given that they only
plan the query once (as, I assume, does the multi-value insert).
It turns out though that the results are skewed by using pgAdmin. Executing
my scripts from the command line gives much more appropriate results.
Thanks,
Dan.
On 10 May 2012 11:30, Daniel McGreal <daniel.mcgreal@redbite.com> wrote:
I put the multi-value inserts in as I was
curious as to why prepared statements would be slower given they only plan
the query once (as also does the multi-value insert, I assume).
That's a common misconception.
The reason that prepared statements are often slower is exactly
_because_ they only plan the query once. Since the query plan is
stored when the query gets prepared, the same plan gets used for every
combination of query parameters, so it has to be a fairly generic
query plan.
OTOH, the multi-value insert knows exactly what combinations of
"parameters" will be used in the query and the query planner can
optimise the query for those parameters. It wouldn't surprise me if it
would re-evaluate plan branch choices based on which row of values is
currently being inserted.
I think it's safe to say that prepared statements are only efficient
when you're dealing with repeated complicated queries, where preparing
the query plan takes a significant amount of time. Preparing will also
shave some time off queries that are inefficient regardless of how you
execute them (for example, because the query always needs to perform a
sequential scan).
Prepared statements will also be faster on database servers with a
slower query planner than the one in Postgres.
In most (all?) other cases, executing the query directly is probably faster.
Of course there are other benefits to prepared statements, such as a
natural immunity to SQL injection.
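For what it's worth, the plan a prepared statement actually uses can be
inspected with EXPLAIN EXECUTE and compared against the plan of the direct
statement (a sketch reusing the foo statement from the original post; for
this INSERT both plans are trivial):

```sql
PREPARE foo(date, boolean, varchar, integer, numeric(18,5)) AS
INSERT INTO test (one, two, three, four, five) VALUES ($1, $2, $3, $4, $5);

-- Plan used when executing the prepared statement:
EXPLAIN EXECUTE foo('2011-01-01', true, 'three', 4, 5.5);

-- Plan for the same statement executed directly:
EXPLAIN INSERT INTO test (one, two, three, four, five)
VALUES ('2011-01-01', true, 'three', 4, 5.5);
```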
--
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.
On Thu, 10 May 2012 13:52:29 +0200, Alban Hertroys wrote:
May I ask what kind of planning may occur during an insert?
Regards,
Radek
On Thu, May 10, 2012 at 6:52 AM, Alban Hertroys <haramrae@gmail.com> wrote:
That can often be true, but for simple inserts there is no plan to get
wrong. Prepared statements can knock about 30-50% off statement
latency in such cases if you're not I/O bound.
Prepared statements are definitely a headache though, and I rarely use them.
merlin
On 10 May 2012 15:05, Radosław Smogura <rsmogura@softperience.eu> wrote:
May I ask what kind of planning may occur during insert?
Well, for example, if there's a unique constraint on the table then
the database will have to check that the newly inserted values don't
conflict with values that are already in the table. It needs to plan
an efficient strategy for that, which depends on the values being
inserted.
--
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.
Alban Hertroys <haramrae@gmail.com> writes:
On 10 May 2012 15:05, Radosław Smogura <rsmogura@softperience.eu> wrote:
May I ask what kind of planning may occur during insert?
Well, for example, if there's a unique constraint on the table then
the database will have to check that the newly inserted values don't
conflict with values that are already in the table. It needs to plan
an efficient strategy for that, which depends on the values being
inserted.
There is no planning associated with checking unique constraints; that's
just a matter for the index mechanisms.
I think the real point here is that a simple INSERT/VALUES has such a
trivial plan that there is hardly any gain to be had by avoiding the
planning stage. Then the other overhead of a prepared statement
(looking up the saved plan, checking it's not stale, etc) outweighs
that. Or at least it could. 3x slower seems a bit fishy; I wonder
whether there's some client-side inefficiency involved in that.
Doing performance measurements with pgAdmin seems pretty questionable
in the first place ...
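For reference, psql will report per-statement timings itself, which avoids
whatever overhead pgAdmin adds (the script file names here are hypothetical):

```sql
-- \timing is a psql meta-command, not SQL; it prints the elapsed
-- time for each statement that follows:
\timing on
\i prepared_test.sql   -- hypothetical file holding the PREPARE/EXECUTE run
\i insert_test.sql     -- hypothetical file holding the plain INSERT run
```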
regards, tom lane
Doing the same tests from psql gives:
1. ~2.5 seconds for INSERT/VALUES
2. ~10 seconds for prepared statement executes
3. ~15 seconds for multiple INSERTs
Dan.