pgstats_initstats() cost

Started by Gavin Sherry over 22 years ago · 38 messages
#1Gavin Sherry
swm@linuxworld.com.au

I did some basic profiling of CVS HEAD after having read Bruce's post the
other day ("When did we get so fast"). It seemed to me that the number of
inserts per second wasn't actually all that high, so I had a look at some
numbers:

  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 7.32      3.48     3.48    100004     0.00     0.00  yyparse
 4.42      5.58     2.10   1200107     0.00     0.00  SearchCatCache
 4.10      7.53     1.95    900020     0.00     0.00  base_yylex
 3.85      9.35     1.83    100150     0.00     0.00  XLogInsert
 3.51     11.02     1.67    600540     0.00     0.00  pgstat_initstats
 2.24     12.09     1.07   2208418     0.00     0.00  hash_search
 1.86     12.97     0.88   2607669     0.00     0.00  hash_any

I am still researching ways of increasing the performance of yacc parsers --
there is very little information on the Web concerning this -- but
pgstat_initstats() caught my eye. It gets called about six times per
insert (I did 100000 inserts), and the major cost appears to come from the
linear search of pgStatTabstatMessages. The comparative performance of
hash_search() suggests that pgStatTabstatMessages may benefit from the use
of a hash. However, it seems unreasonable that we're doing any work at all
in pgstat_initstats() if the user is not interested in query/block/tuple
stats.

Comments? Have I missed something?

Thanks,

Gavin

#2Andrew Dunstan
andrew@dunslane.net
In reply to: Gavin Sherry (#1)
Re: pgstats_initstats() cost

----- Original Message -----
From: "Gavin Sherry" <swm@linuxworld.com.au>

I am still researching ways of increasing performance of yacc parsers --
there is a very small amount of information on the Web concerning this --

I know some people who will tell you that the best way of improving
performance in this area is not to use yacc (or bison) parsers ...

OTOH we need to understand exactly what you were profiling - if it is 1
dynamic sql statement per insert then it might not be too close to the real
world - a high volume program is likely to require 1 parse per many many
executions, isn't it?

cheers

andrew

#3Gavin Sherry
swm@linuxworld.com.au
In reply to: Andrew Dunstan (#2)
Re: pgstats_initstats() cost

On Mon, 11 Aug 2003, Andrew Dunstan wrote:

----- Original Message -----
From: "Gavin Sherry" <swm@linuxworld.com.au>

I am still researching ways of increasing performance of yacc parsers --
there is a very small amount of information on the Web concerning this --

I know some people who will tell you that the best way of improving
performance in this area is not to use yacc (or bison) parsers ...

Yes. Cost of maintenance vs. performance cost...

OTOH we need to understand exactly what you were profiling - if it is 1
dynamic sql statement per insert then it might not be too close to the real
world - a high volume program is likely to require 1 parse per many many
executions, isn't it?

I wasn't interested in measuring the performance of yacc -- since I know
it is bad. It was a basic test which wasn't even meant to be real
world. It just seemed interesting that the numbers were three times slower
than other databases I ran it on. Here is the script which generates the
SQL:

echo "create table abc(t text);"
echo "begin;"
c=0
while [ $c -lt 100000 ]
do
echo "insert into abc values('thread1');";
c=$[$c+1]
done
echo "commit;"

Thanks,

Gavin

#4Rod Taylor
rbt@rbt.ca
In reply to: Gavin Sherry (#3)
Re: pgstats_initstats() cost

world. It just seemed interesting that the numbers were three times slower
than other databases I ran it on. Here is the script which generates the

You were comparing against databases with similar safety nets to
guarantee against data loss?

#5Gavin Sherry
swm@linuxworld.com.au
In reply to: Rod Taylor (#4)
Re: pgstats_initstats() cost

On Tue, 12 Aug 2003, Rod Taylor wrote:

world. It just seemed interesting that the numbers were three times slower
than other databases I ran it on. Here is the script which generates the

You were comparing against databases with similar safety nets to
guarantee against data loss?

I am in the process of reading through their logging/versioning code,
and the others definitely do. My main interest is in determining how to
reduce the cost of pgstat_initstats().

Thanks,

Gavin

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Gavin Sherry (#1)
Re: pgstats_initstats() cost

Gavin Sherry <swm@linuxworld.com.au> writes:

but pgstat_initstats() caught my eye. This gets called about 6 times per
insert (I did 100000 inserts) and the major cost appears to relate to the
linear pgStatTabstatMessages. The comparative performance of
hash_search() suggests that pgStatTabstatMessages may benefit from use of
a hash. However, it seems unreasonable that we're doing work at all in
pgstat_initstats() if the user is not interested in query/block/tuple
stats.

The coding in the search loop could perhaps be tightened a little, but
I'd think the last point should be addressed by dropping out via the
"no_stats" exit if stats aren't being gathered.

I doubt a hash is worth maintaining, because the active tabstat entries
should only be for tables that are being touched in the current command
(thus, there are not more than six in your example). I'm not sure why
it takes so much time to look through six entries though ...

regards, tom lane

#7Gavin Sherry
swm@linuxworld.com.au
In reply to: Tom Lane (#6)
1 attachment(s)
Re: pgstats_initstats() cost

On Tue, 12 Aug 2003, Tom Lane wrote:

Gavin Sherry <swm@linuxworld.com.au> writes:

but pgstat_initstats() caught my eye. This gets called about 6 times per
insert (I did 100000 inserts) and the major cost appears to relate to the
linear pgStatTabstatMessages. The comparative performance of
hash_search() suggests that pgStatTabstatMessages may benefit from use of
a hash. However, it seems unreasonable that we're doing work at all in
pgstat_initstats() if the user is not interested in query/block/tuple
stats.

The coding in the search loop could perhaps be tightened a little, but
I'd think the last point should be addressed by dropping out via the
"no_stats" exit if stats aren't being gathered.

I doubt a hash is worth maintaining, because the active tabstat entries
should only be for tables that are being touched in the current command
(thus, there are not more than six in your example). I'm not sure why
it takes so much time to look through six entries though ...

Neither. I might look into it further later, but here's a patch to exit
out of pgstat_initstats() if we're not collecting stats (attached).

Thanks,

Gavin

Attachments:

pgstat.diff (text/plain; charset=US-ASCII)
Index: src/backend/postmaster/pgstat.c
===================================================================
RCS file: /usr/local/cvsroot/pgsql-server/src/backend/postmaster/pgstat.c,v
retrieving revision 1.42
diff -2 -c -r1.42 pgstat.c
*** src/backend/postmaster/pgstat.c     4 Aug 2003 00:43:21 -0000       1.42
--- src/backend/postmaster/pgstat.c     11 Aug 2003 02:14:53 -0000
***************
*** 829,832 ****
--- 829,838 ----
        stats->index_scan_counted = FALSE;

+       if (!pgstat_collect_querystring && !pgstat_collect_tuplelevel &&
+           !pgstat_collect_blocklevel)
+       {
+           stats->no_stats = TRUE;
+           return;
+       }
+
        if (pgStatSock < 0)
        {
#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#6)
Re: pgstats_initstats() cost

I said:

I doubt a hash is worth maintaining, because the active tabstat entries
should only be for tables that are being touched in the current command
(thus, there are not more than six in your example). I'm not sure why
it takes so much time to look through six entries though ...

I replicated your example, and soon found that in fact there were
forty-three active tabstat slots, which makes pgstat_initstats a little
bit more credible as a time-waster.

The reason why there were forty-three, in a statement that's only
touching one relation, is that with the present coding of pgstats,
if you have stats gathering off then there will be an active entry
for every relation that's been touched since backend launch.
pgstat_report_tabstat() should have flushed the entries, but
*if reporting is disabled then it fails to do so*.

This is clearly a bug. Will fix.

regards, tom lane

#9sla-net
slambert@vediovis.net
In reply to: Gavin Sherry (#3)
concatenation = = transaction ?????

Hi everybody

It's my first post here, so be indulgent ;)

Just to confirm: if I do

$sql1 = 'insert into "Enfant" ("NomEnfant","PrenomEnfant") VALUES
(' . $NomEnfant . ',' . $PrenomEnfant . ');';

$sql2 = 'insert into "IndividuEnfant" ("IdIndividu","IdEnfant") VALUES
(' . $IdIndividu . ',currval(\'"Enfant_Seq"\'));';

$sql = $sql1 . $sql2;

$Res = pg_query($sql);

so 2 queries concatenated and sent as only one to the PostgreSQL database: is
it EXACTLY the same thing as a classic transaction with a begin and a commit
or rollback?

Thanks -- it's just that I don't want to have to rewrite everything in 6 months...

S.L.

PS: why the limitation to 8 parameters in stored procedures?

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Gavin Sherry (#7)
Re: pgstats_initstats() cost

Gavin Sherry <swm@linuxworld.com.au> writes:

Neither. I might look into it further later, but here's a patch to exit
out of pgstat_initstats() if we're not collecting stats (attached).

Applied along with other fixes.

regards, tom lane

#11Josh Berkus
josh@agliodbs.com
In reply to: sla-net (#9)
Re: concatenation = = transaction ?????

Sla,

PS: why the limitation to 8 parameters in stored procedures?

What version of PostgreSQL are you using?

The limit is 16 parameters for 7.1 and 7.2, raised to 32 parameters in 7.3 and
after. Further, you can raise the limit yourself at compile time, although
I understand it incurs a penalty in index efficiency.

--
Josh Berkus
Aglio Database Solutions
San Francisco

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Gavin Sherry (#3)
Parsing speed (was Re: pgstats_initstats() cost)

Gavin Sherry <swm@linuxworld.com.au> writes:

I wasn't interested in measuring the performance of yacc -- since I know
it is bad. It was a basic test which wasn't even meant to be real
world. It just seemed interesting that the numbers were three times slower
than other databases I ran it on. Here is the script which generates the
SQL:

echo "create table abc(t text);"
echo "begin;"
c=0
while [ $c -lt 100000 ]
do
echo "insert into abc values('thread1');";
c=$[$c+1]
done
echo "commit;"

Of course the obvious way of getting rid of the parser overhead is not
to parse every time --- viz, to use prepared statements.

I have just finished running some experiments that compared a series of
INSERTs issued via PQexec() versus preparing an INSERT command and then
issuing new-FE-protocol Bind and Execute commands against the prepared
statement. With a test case like the above (one target column and a
prepared statement like "insert into abc values($1)"), I saw about a 30%
speedup. (Or at least I did after fixing a couple of bottlenecks in the
backend's per-client-message loop.)

Of course, the amount of work needed to parse this INSERT command is
pretty trivial. With just a slightly more complex test case:
create table abc (f1 text, f2 int, f3 float8);
and a prepared statement like
PREPARE mystmt(text,int,float8) AS insert into abc values($1,$2,$3)
there was a factor of two difference in the speed.

This leaves us with a bit of a problem, though, because there isn't any
libpq API that allows access to this speedup. I put in a routine to
support Parse/Bind/Execute so that people could use out-of-line
parameters for safety reasons --- but there's no function to do
Bind/Execute against a pre-existing prepared statement. (I had to make
a hacked version of libpq to do the above testing.)

I'm beginning to think that was a serious omission. I'm tempted to fix
it, even though we're past feature freeze for 7.4. Comments?

regards, tom lane

#13Rod Taylor
rbt@rbt.ca
In reply to: Tom Lane (#12)
Re: Parsing speed (was Re: pgstats_initstats() cost)

On Tue, 2003-08-12 at 15:36, Tom Lane wrote:

Gavin Sherry <swm@linuxworld.com.au> writes:

I wasn't interested in measuring the performance of yacc -- since I know
it is bad. It was a basic test which wasn't even meant to be real
world. It just seemed interesting that the numbers were three times slower
than other databases I ran it on. Here is the script which generates the
SQL:

echo "create table abc(t text);"
echo "begin;"
c=0
while [ $c -lt 100000 ]
do
echo "insert into abc values('thread1');";
c=$[$c+1]
done
echo "commit;"

Of course the obvious way of getting rid of the parser overhead is not
to parse every time --- viz, to use prepared statements.

I have just finished running some experiments that compared a series of
INSERTs issued via PQexec() versus preparing an INSERT command and then
issuing new-FE-protocol Bind and Execute commands against the prepared
statement. With a test case like the above (one target column and a
prepared statement like "insert into abc values($1)"), I saw about a 30%
speedup. (Or at least I did after fixing a couple of bottlenecks in the
backend's per-client-message loop.)

Of course, the amount of work needed to parse this INSERT command is
pretty trivial. With just a slightly more complex test case:
create table abc (f1 text, f2 int, f3 float8);
and a prepared statement like
PREPARE mystmt(text,int,float8) AS insert into abc values($1,$2,$3)
there was a factor of two difference in the speed.

Do you happen to have any numbers comparing prepared inserts in a single
transaction against copy?

#14Jon Jensen
jon@endpoint.com
In reply to: Tom Lane (#12)
Re: Parsing speed (was Re: pgstats_initstats() cost)

On Tue, 12 Aug 2003, Tom Lane wrote:

I have just finished running some experiments that compared a series of
INSERTs issued via PQexec() versus preparing an INSERT command and then
issuing new-FE-protocol Bind and Execute commands against the prepared
statement. With a test case like the above (one target column and a
prepared statement like "insert into abc values($1)"), I saw about a 30%
speedup. (Or at least I did after fixing a couple of bottlenecks in the
backend's per-client-message loop.)

[snip]

This leaves us with a bit of a problem, though, because there isn't any
libpq API that allows access to this speedup. I put in a routine to
support Parse/Bind/Execute so that people could use out-of-line
parameters for safety reasons --- but there's no function to do
Bind/Execute against a pre-existing prepared statement. (I had to make
a hacked version of libpq to do the above testing.)

I'm beginning to think that was a serious omission. I'm tempted to fix
it, even though we're past feature freeze for 7.4. Comments?

I think it would be well worth waiting for this feature. I often bind
parameters in Perl's DBD::Pg, looking to the future when that doesn't just
handle quoting, but also uses faster cached prepared queries. It'd be
great to see that in 7.4.

Jon

#15Sean Chittenden
sean@chittenden.org
In reply to: Tom Lane (#12)
Re: Parsing speed (was Re: pgstats_initstats() cost)

This leaves us with a bit of a problem, though, because there isn't
any libpq API that allows access to this speedup. I put in a
routine to support Parse/Bind/Execute so that people could use
out-of-line parameters for safety reasons --- but there's no
function to do Bind/Execute against a pre-existing prepared
statement. (I had to make a hacked version of libpq to do the above
testing.)

I'm beginning to think that was a serious omission. I'm tempted to
fix it, even though we're past feature freeze for 7.4. Comments?

On a quasi-similar note (and unless I've missed how to do this), you
can't create a cursor from a prepared statement, which I found
frustrating. On frequently used queries, I've gotten in the habit of
preparing the queries at connect time and then executing them, but
with larger queries, it's problematic not to be able to use a
cursor in addition to the prepared statement.

-sc

--
Sean Chittenden

#16Alvaro Herrera Munoz
alvherre@dcc.uchile.cl
In reply to: Tom Lane (#12)
Re: Parsing speed (was Re: pgstats_initstats() cost)

On Tue, Aug 12, 2003 at 03:36:07PM -0400, Tom Lane wrote:

This leaves us with a bit of a problem, though, because there isn't any
libpq API that allows access to this speedup. I put in a routine to
support Parse/Bind/Execute so that people could use out-of-line
parameters for safety reasons --- but there's no function to do
Bind/Execute against a pre-existing prepared statement. (I had to make
a hacked version of libpq to do the above testing.)

I'm beginning to think that was a serious omission. I'm tempted to fix
it, even though we're past feature freeze for 7.4. Comments?

Please do. It could be argued that the feature is already there, and that
the fact that there's no way to use it through libpq is actually a bug.

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"One must remember that existence in the cosmos, and particularly the
development of civilizations within it, are unfortunately not at all
idyllic" (Ijon Tichy)

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Rod Taylor (#13)
Re: Parsing speed (was Re: pgstats_initstats() cost)

Rod Taylor <rbt@rbt.ca> writes:

Do you happen to have any numbers comparing prepared inserts in a single
transaction against copy?

COPY is about a factor of 6 faster, it appears. I got 11.8 seconds
(prepared INSERTs) versus 1.87 seconds (COPY) for loading the same amount
of data (with the 3-column test table). So COPY's not gonna go away any
time soon.

regards, tom lane

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Sean Chittenden (#15)
Re: Parsing speed (was Re: pgstats_initstats() cost)

Sean Chittenden <sean@chittenden.org> writes:

I'm beginning to think that was a serious omission. I'm tempted to
fix it, even though we're past feature freeze for 7.4. Comments?

On a quasi-similar note (and unless I've missed how to do this), you
can't create a cursor from a prepared statement, which I found
frustrating.

Hmm. I'd be willing to see that put on TODO, but it strikes me as a
rather large change for post-feature-freeze; it'd require fooling around
in substantial parts of the backend. (What I was suggesting above only
requires adding one or two routines to libpq, which is a lot less
invasive...)

regards, tom lane

#19Stephen Frost
sfrost@snowman.net
In reply to: Tom Lane (#12)
Re: Parsing speed (was Re: pgstats_initstats() cost)

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

Of course the obvious way of getting rid of the parser overhead is not
to parse every time --- viz, to use prepared statements.

I think this would be nice to have too... On a similar note (I think
anyway) I wasn't able to find any functions for bulk dumps into memory
space from a select statement. This would probably be alot more work
but I find it very useful with other databases I work on (mainly
Oracle).

You probably know but I'll quickly outline it to point out the
differences, as I see them, from the 'COPY' ability. Basically the user
defines their own C structure and then malloc's an array of them. The
user then tells the database the type, offset from start of structure
and the skip (size of structure) for each column returned by the select
statement. The user can then do 'bulk' grabs with a single command into
the memory space allocated, doing more than one and changing the offsets
in between if more is returned than was initially allocated for. The
user can realloc or allocate new segments and do their own handling of
the segments if they choose.

The same is true for 'insert' statements, in reverse, of course. This
avoids a lot of unnecessary parsing and type conversion (where possible).
This does give more work to the library since it has to be able to do
type conversions in some cases where the database type and the user
requested type differ.

Thanks,
Stephen

#20Joe Conway
mail@joeconway.com
In reply to: Tom Lane (#12)
Re: Parsing speed (was Re: pgstats_initstats() cost)

Tom Lane wrote:

I'm beginning to think that was a serious omission. I'm tempted to fix
it, even though we're past feature freeze for 7.4. Comments?

Seems pretty well isolated. If you're tallying votes, count me as a "yay".

Joe

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#17)
Re: Parsing speed (was Re: pgstats_initstats() cost)

Rod Taylor <rbt@rbt.ca> writes:

Do you happen to have any numbers comparing prepared inserts in a single
transaction against copy?

COPY is about a factor of 6 faster, it appears. I got 11.8 versus
1.87 seconds for loading the same amount of data (with the 3-column
test table). So COPY's not gonna go away any time soon.

BTW, that was on a local Unix-socket connection. On a network
connection, COPY would have a huge advantage because (a) it could
transfer multiple rows per packet, if your rows are not too long,
and (b) you would not have to wait for the server to answer back
before sending the next packet.

In theory you could pipeline INSERT commands like that too, but not
with libpq.

regards, tom lane

#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Stephen Frost (#19)
Re: Parsing speed (was Re: pgstats_initstats() cost)

Stephen Frost <sfrost@snowman.net> writes:

You probably know but I'll quickly outline it to point out the
differences, as I see them, from the 'COPY' ability. Basically the user
defines their own C structure and then malloc's an array of them. The
user then tells the database the type, offset from start of structure
and the skip (size of structure) for each column returned by the select
statement. The user can then do 'bulk' grabs with a single command into
the memory space allocated, doing more than one and changing the offsets
inbetween if more is returned than was initially allocated for. The
user can realloc or allocate new segments and do their own handling of
the segments if they choose.

[shrug] That seems like a substantial increase in API complexity for
at best marginal performance gains. What does it gain for the user to
malloc space rather than libpq?

regards, tom lane

#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joe Conway (#20)
Re: Parsing speed (was Re: pgstats_initstats() cost)

Joe Conway <mail@joeconway.com> writes:

Tom Lane wrote:

I'm beginning to think that was a serious omission. I'm tempted to fix
it, even though we're past feature freeze for 7.4. Comments?

Seems pretty well isolated. If you're tallying votes, count me as a "yay".

Well, the early voting is running heavily to the 'yea' side. Let's talk
about what to call the thing.

Pre-7.4 we had "PQexec" (synchronous) and "PQsendQuery" (asynchronous)
to send a plain text query string (Simple Query message in new-FE-protocol-
speak). I added "PQexecParams" and "PQsendQueryParams" to do
Parse/Bind/Execute, choosing the names on the basis that being able to
pass out-of-line parameters was the primary new thing they were giving
the user.

A Bind/Execute facility would need a pair of routines with signatures
very similar to PQexecParams/PQsendQueryParams --- they'd take a
prepared statement name instead of a query string, and they'd not need
an array of parameter type OIDs, but otherwise the same. I couldn't
come up with very good names for them though. Best idea so far is
PQexecPrepared/PQsendQueryPrepared, but maybe someone has a better one?
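
Following the parallel with PQexecParams/PQsendQueryParams drawn above, the
proposed pair would look roughly like this (a sketch of the suggested
signatures only: a prepared-statement name replaces the query string, and
the array of parameter type OIDs is dropped):

```c
/* Sketch of the proposed additions -- same shape as
 * PQexecParams/PQsendQueryParams, minus the paramTypes array,
 * with a prepared-statement name in place of the query string. */
extern PGresult *PQexecPrepared(PGconn *conn,
                                const char *stmtName,
                                int nParams,
                                const char *const *paramValues,
                                const int *paramLengths,
                                const int *paramFormats,
                                int resultFormat);

extern int PQsendQueryPrepared(PGconn *conn,
                               const char *stmtName,
                               int nParams,
                               const char *const *paramValues,
                               const int *paramLengths,
                               const int *paramFormats,
                               int resultFormat);
```

The test program in #28 below exercises PQexecPrepared with this argument
order.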

We could also think about providing an interface to do just Parse,
although this is inessential since you can set up a prepared statement
by PQexec'ing a PREPARE command.

regards, tom lane

#24Stephen Frost
sfrost@snowman.net
In reply to: Tom Lane (#22)
Re: Parsing speed (was Re: pgstats_initstats() cost)

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

Stephen Frost <sfrost@snowman.net> writes:

You probably know but I'll quickly outline it to point out the
differences, as I see them, from the 'COPY' ability. Basically the user
defines their own C structure and then malloc's an array of them. The
user then tells the database the type, offset from start of structure
and the skip (size of structure) for each column returned by the select
statement. The user can then do 'bulk' grabs with a single command into
the memory space allocated, doing more than one and changing the offsets
inbetween if more is returned than was initially allocated for. The
user can realloc or allocate new segments and do their own handling of
the segments if they choose.

[shrug] That seems like a substantial increase in API complexity for
at best marginal performance gains. What does it gain for the user to
malloc space rather than libpq?

Perhaps I'm just used to dealing with very large selects. When
dealing with Oracle, it's been my experience that it doesn't grab the
entire select return and store it in local memory for the user to peruse
using the appropriate calls. It grabs a portion and stores it in a
local cache and then gets more from the server periodically as the user
requests more. This avoids having the library malloc'ing a very large
amount of memory when a very large query is done. Doing the 'bulk'
transfer avoids the cache entirely and just dumps the data from the
server into the user's allocated memory area. The user can then step
through it, gain what they need from it, throw it out and get the next
batch. If libpq grabs the entire result in one go then that may
actually cause a problem for me when I start to move things from Oracle
to postgres since the clients don't always have much memory available.

Otherwise it would just avoid the function calls to libpq in the loop
which passes over the data. That probably wouldn't be that much of a
gain if libpq has all the data local though.

Stephen

#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: Stephen Frost (#24)
Re: Parsing speed (was Re: pgstats_initstats() cost)

Stephen Frost <sfrost@snowman.net> writes:

... If libpq grabs the entire result in one go then that may
actually cause a problem for me when I start to move things from Oracle
to postgres since the clients don't always have much memory available.

It does that in an ordinary SELECT. The customary answer to this has
been "use a cursor and FETCH the number of rows you want to process at
a time".

regards, tom lane

#26Joe Conway
mail@joeconway.com
In reply to: Tom Lane (#23)
Re: Parsing speed (was Re: pgstats_initstats() cost)

Tom Lane wrote:

A Bind/Execute facility would need a pair of routines with signatures
very similar to PQexecParams/PQsendQueryParams --- they'd take a
prepared statement name instead of a query string, and they'd not need
an array of parameter type OIDs, but otherwise the same. I couldn't
come up with very good names for them though. Best idea so far is
PQexecPrepared/PQsendQueryPrepared, but maybe someone has a better one?

Those sound reasonable to me.

We could also think about providing an interface to do just Parse,
although this is inessential since you can set up a prepared statement
by PQexec'ing a PREPARE command.

Agreed -- that sounds like a nice-to-have for some future release, but
not enough of an issue to warrant getting put into 7.4 at this point.

Joe

#27Gavin Sherry
swm@linuxworld.com.au
In reply to: Tom Lane (#12)
Re: Parsing speed (was Re: pgstats_initstats() cost)

On Tue, 12 Aug 2003, Tom Lane wrote:

This leaves us with a bit of a problem, though, because there isn't any
libpq API that allows access to this speedup. I put in a routine to
support Parse/Bind/Execute so that people could use out-of-line
parameters for safety reasons --- but there's no function to do
Bind/Execute against a pre-existing prepared statement. (I had to make
a hacked version of libpq to do the above testing.)

I'm beginning to think that was a serious omission. I'm tempted to fix
it, even though we're past feature freeze for 7.4. Comments?

Can you give an example of the usage of this API? I am wondering whether
releasing this specific feature would be eclipsed by a generalised
bound-variables solution in a future release... still, that's a nice
speedup :-).

Thanks,

Gavin

#28Tom Lane
tgl@sss.pgh.pa.us
In reply to: Gavin Sherry (#27)
Re: Parsing speed (was Re: pgstats_initstats() cost)

Gavin Sherry <swm@linuxworld.com.au> writes:

On Tue, 12 Aug 2003, Tom Lane wrote:

I'm beginning to think that was a serious omission. I'm tempted to fix
it, even though we're past feature freeze for 7.4. Comments?

Can you give an example of this usage of this API? I am wondering whether
releasing this specific feature would be eclipsed by a generalised bound
variables solution in a future release... still, that's a nice speed up
:-).

Attached is the test code I was using to compare speeds. It won't do
you much good without the accompanying libpq mods, but it's enough to
illustrate the usage. (This is a hacked version of example program 3 from
the 7.4 libpq docs.)

regards, tom lane

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include "libpq-fe.h"

/* for ntohl/htonl */
#include <netinet/in.h>
#include <arpa/inet.h>

static void
exit_nicely(PGconn *conn)
{
    PQfinish(conn);
    exit(1);
}

int
main(int argc, char **argv)
{
    const char *conninfo;
    PGconn     *conn;
    PGresult   *res;
    const char *paramValues[3];
    int         tries, ntries;

    /*
     * If the user supplies a parameter on the command line, use it as the
     * conninfo string; otherwise default to setting dbname=template1 and
     * using environment variables or defaults for all other connection
     * parameters.
     */
    if (argc > 1)
        conninfo = argv[1];
    else
        conninfo = "dbname = template1";

    if (argc > 2)
        ntries = atoi(argv[2]);
    else
        ntries = 10;

    /* Make a connection to the database */
    conn = PQconnectdb(conninfo);

    /* Check to see that the backend connection was successfully made */
    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "Connection to database '%s' failed.\n", PQdb(conn));
        fprintf(stderr, "%s", PQerrorMessage(conn));
        exit_nicely(conn);
    }

    res = PQexec(conn, "PREPARE mystmt(text,int,float8) AS insert into abc values($1,$2,$3)");
    if (PQresultStatus(res) != PGRES_COMMAND_OK)
    {
        fprintf(stderr, "PREPARE failed: %s", PQerrorMessage(conn));
        PQclear(res);
        exit_nicely(conn);
    }
    PQclear(res);

    res = PQexec(conn, "BEGIN");
    if (PQresultStatus(res) != PGRES_COMMAND_OK)
    {
        fprintf(stderr, "BEGIN failed: %s", PQerrorMessage(conn));
        PQclear(res);
        exit_nicely(conn);
    }
    PQclear(res);

    for (tries = 0; tries < ntries; tries++)
    {
#if 0
        res = PQexec(conn, "insert into abc values('joe''s place',22,123.4)");
#endif

#if 0
        /* Here are our out-of-line parameter values */
        paramValues[0] = "joe's place";
        paramValues[1] = "22";
        paramValues[2] = "123.4";

        res = PQexecParams(conn,
                           "insert into abc values($1,$2,$3)",
                           3,       /* 3 params */
                           NULL,    /* let the backend deduce param type */
                           paramValues,
                           NULL,    /* don't need param lengths since text */
                           NULL,    /* default to all text params */
                           1);      /* ask for binary results */
#endif

#if 1
        /* Here are our out-of-line parameter values */
        paramValues[0] = "joe's place";
        paramValues[1] = "22";
        paramValues[2] = "123.4";

        res = PQexecPrepared(conn,
                             "mystmt",
                             3,     /* 3 params */
                             paramValues,
                             NULL,  /* don't need param lengths since text */
                             NULL,  /* default to all text params */
                             1);    /* ask for binary results */
#endif

        if (PQresultStatus(res) != PGRES_COMMAND_OK)
        {
            fprintf(stderr, "INSERT failed: %s", PQerrorMessage(conn));
            PQclear(res);
            exit_nicely(conn);
        }

        PQclear(res);
    }

    res = PQexec(conn, "COMMIT");
    if (PQresultStatus(res) != PGRES_COMMAND_OK)
    {
        fprintf(stderr, "COMMIT failed: %s", PQerrorMessage(conn));
        PQclear(res);
        exit_nicely(conn);
    }
    PQclear(res);

    /* close the connection to the database and cleanup */
    PQfinish(conn);

    return 0;
}

#29Christopher Kings-Lynne
chriskl@familyhealth.com.au
In reply to: Gavin Sherry (#3)
Re: Parsing speed (was Re: pgstats_initstats() cost)

This leaves us with a bit of a problem, though, because there isn't any
libpq API that allows access to this speedup. I put in a routine to
support Parse/Bind/Execute so that people could use out-of-line
parameters for safety reasons --- but there's no function to do
Bind/Execute against a pre-existing prepared statement. (I had to make
a hacked version of libpq to do the above testing.)

I'm beginning to think that was a serious omission. I'm tempted to fix
it, even though we're past feature freeze for 7.4. Comments?

I say fix it...

Chris

#30Christopher Kings-Lynne
chriskl@familyhealth.com.au
In reply to: Gavin Sherry (#3)
Re: Parsing speed (was Re: pgstats_initstats() cost)

We could also think about providing an interface to do just Parse,
although this is inessential since you can set up a prepared statement
by PQexec'ing a PREPARE command.

Wait just a minute! phpPgAdmin would love to be able to 'parse' arbitrary
SQL entered by the user to split semicolon-separated queries, identify
various types of queries, etc. What would a Parse call allow us to do?

Chris

#31Tom Lane
tgl@sss.pgh.pa.us
In reply to: Christopher Kings-Lynne (#30)
Re: Parsing speed (was Re: pgstats_initstats() cost)

"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:

We could also think about providing an interface to do just Parse,
although this is inessential since you can set up a prepared statement
by PQexec'ing a PREPARE command.

Wait just a minute! phpPgAdmin would love to be able to 'parse' arbitrary
SQL entered by the user to split semicolon-separated queries, identify
various types of queries, etc. What would a Parse call allow us to do?

Hm. I was about to say "very little that you can't do with a PREPARE",
but if you don't want to even count semicolons then Parse would be
distinctly safer. For example if the string is
SELECT * FROM foo; UPDATE foo SET ...
then sticking a PREPARE in front would not have the desired effect ---
but sending it in a Parse message would result in a syntax error.
Not sure if that helps you get to your goal though.

regards, tom lane
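[To spell out the hazard described above: prefixing PREPARE to a multi-statement string does not neutralize the second statement. A sketch, with table and column names invented:]

```sql
-- The client meant to prepare one query, but the input contained two:
PREPARE p AS SELECT * FROM foo; UPDATE foo SET x = 1;
-- Simple-query processing splits on the semicolon: PREPARE captures only
-- the SELECT, and the UPDATE executes immediately.  Sent as a single
-- Parse message instead, the same string draws a syntax error and
-- nothing runs at all.
```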

#32Christopher Kings-Lynne
chriskl@familyhealth.com.au
In reply to: Tom Lane (#31)
Re: Parsing speed (was Re: pgstats_initstats() cost)

Wait just a minute! phpPgAdmin would love to be able to 'parse' arbitrary
SQL entered by the user to split semicolon-separated queries, identify
various types of queries, etc. What would a Parse call allow us to do?

Hm. I was about to say "very little that you can't do with a PREPARE",
but if you don't want to even count semicolons then Parse would be
distinctly safer. For example if the string is
SELECT * FROM foo; UPDATE foo SET ...
then sticking a PREPARE in front would not have the desired effect ---
but sending it in a Parse message would result in a syntax error.
Not sure if that helps you get to your goal though.

What do you actually get back from a Parse request?

Chris

#33Tom Lane
tgl@sss.pgh.pa.us
In reply to: Christopher Kings-Lynne (#32)
Re: Parsing speed (was Re: pgstats_initstats() cost)

Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:

What do you actually get back from a Parse request?

Nothing. If successful, it creates a prepared statement inside the
server.

It might possibly make sense for a libpq routine that exposes Parse
to actually do Parse followed by Describe Statement; that would allow
it to give back (a) an indication of the number and types of parameters
needed by the statement, and (b) an indication of the column set to be
returned, if it's a SELECT. However, the protocol doesn't tell anything
about the type of a non-SELECT statement. In any case, this would
require more invention and coding than I care to do at this point in
the release cycle (since there's no support in the guts of libpq for
accepting ParameterDescription messages from the backend). If that's
what we think we want, we'd better put it on the wish-list for 7.5.

regards, tom lane
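[For illustration, client usage of such a Parse-plus-Describe-Statement routine might look like the sketch below. The function names PQprepare, PQdescribePrepared, PQnparams, and PQparamtype are assumptions at the time of this thread, used only to show the flow of recovering (a) the parameter list and (b) the result columns:]

```c
/* Sketch only: parse a statement server-side, then describe it.
 * All function names here are assumed, not an existing API. */
res = PQprepare(conn, "rpt",
                "SELECT * FROM sales WHERE region = $1", 0, NULL);
/* ... check for PGRES_COMMAND_OK, as with any command ... */
PQclear(res);

res = PQdescribePrepared(conn, "rpt");
if (PQresultStatus(res) == PGRES_COMMAND_OK)
{
    int i;

    /* (a) number and types of parameters the statement needs */
    for (i = 0; i < PQnparams(res); i++)
        printf("param %d has type oid %u\n", i, PQparamtype(res, i));

    /* (b) the column set to be returned, if it's a SELECT */
    for (i = 0; i < PQnfields(res); i++)
        printf("column %d: %s\n", i, PQfname(res, i));
}
PQclear(res);
```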

#34Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#18)
Re: Parsing speed (was Re: pgstats_initstats() cost)

Added to TODO:

* Allow PREPARE of cursors

---------------------------------------------------------------------------

Tom Lane wrote:

Sean Chittenden <sean@chittenden.org> writes:

I'm beginning to think that was a serious omission. I'm tempted to
fix it, even though we're past feature freeze for 7.4. Comments?

On a quasi-similar note (and unless I've missed how to do this), you
can't create a cursor from a prepared statement, which I found
frustrating.

Hmm. I'd be willing to see that put on TODO, but it strikes me as a
rather large change for post-feature-freeze; it'd require fooling around
in substantial parts of the backend. (What I was suggesting above only
requires adding one or two routines to libpq, which is a lot less
invasive...)

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#35Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#33)
Re: Parsing speed (was Re: pgstats_initstats() cost)

Is there a TODO here? Text?

---------------------------------------------------------------------------

Tom Lane wrote:

Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:

What do you actually get back from a Parse request?

Nothing. If successful, it creates a prepared statement inside the
server.

It might possibly make sense for a libpq routine that exposes Parse
to actually do Parse followed by Describe Statement; that would allow
it to give back (a) an indication of the number and types of parameters
needed by the statement, and (b) an indication of the column set to be
returned, if it's a SELECT. However, the protocol doesn't tell anything
about the type of a non-SELECT statement. In any case, this would
require more invention and coding than I care to do at this point in
the release cycle (since there's no support in the guts of libpq for
accepting ParameterDescription messages from the backend). If that's
what we think we want, we'd better put it on the wish-list for 7.5.

regards, tom lane


#36Christopher Kings-Lynne
chriskl@familyhealth.com.au
In reply to: Bruce Momjian (#35)
Re: Parsing speed (was Re: pgstats_initstats() cost)

It might possibly make sense for a libpq routine that exposes Parse
to actually do Parse followed by Describe Statement; that would allow
it to give back (a) an indication of the number and types of parameters
needed by the statement, and (b) an indication of the column set to be
returned, if it's a SELECT. However, the protocol doesn't tell anything
about the type of a non-SELECT statement. In any case, this would
require more invention and coding than I care to do at this point in
the release cycle (since there's no support in the guts of libpq for
accepting ParameterDescription messages from the backend). If that's
what we think we want, we'd better put it on the wish-list for 7.5.

If we had a Parse function, then we at phpPgAdmin could allow Reports to
contain parameters, detect them as such, and then when users run their
report they could enter the values for that run.

Chris

#37Tom Lane
tgl@sss.pgh.pa.us
In reply to: Christopher Kings-Lynne (#36)
Re: Parsing speed (was Re: pgstats_initstats() cost)

"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:

... If that's
what we think we want, we'd better put it on the wish-list for 7.5.

If we had a Parse function, then we at phpPgAdmin could allow Reports to
contain parameters, detect them as such, and then when users run their
report they could enter the values for that run.

If you want to do that, then I think you need the whole nine yards
including DescribeStatement support. Too late for 7.4, but let's make
it a TODO for 7.5:

* Add a libpq function to support Parse/DescribeStatement capability

regards, tom lane

#38Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#37)
Re: Parsing speed (was Re: pgstats_initstats() cost)

Tom Lane wrote:

"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:

... If that's
what we think we want, we'd better put it on the wish-list for 7.5.

If we had a Parse function, then we at phpPgAdmin could allow Reports to
contain parameters, detect them as such, and then when users run their
report they could enter the values for that run.

If you want to do that, then I think you need the whole nine yards
including DescribeStatement support. Too late for 7.4, but let's make
it a TODO for 7.5:

* Add a libpq function to support Parse/DescribeStatement capability

Added.
