7.3 schedule
Is anyone feeling we have the 7.3 release nearing? I certainly am not.
I can imagine us going for several more months like this, perhaps
through August.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Is anyone feeling we have the 7.3 release nearing?
No way!
I certainly am not.
I can imagine us going for several more months like this, perhaps
through August.
Easily. I think that the critical path is Tom's schema support.
We'll need a good beta period this time, because of:
* Schemas
* Prepare/Execute maybe
* Domains
Chris
Christopher Kings-Lynne wrote:
Is anyone feeling we have the 7.3 release nearing?
No way!
Good.
I certainly am not.
I can imagine us going for several more months like this, perhaps
through August.
Easily. I think that the critical path is Tom's schema support.
We'll need a good beta period this time, because of:
* Schemas
* Prepare/Execute maybe
* Domains
I guess I am hoping for even more killer features for this release.
Christopher Kings-Lynne wrote:
Is anyone feeling we have the 7.3 release nearing?
No way!
I certainly am not.
I can imagine us going for several more months like this, perhaps
through August.
Easily. I think that the critical path is Tom's schema support.
We'll need a good beta period this time, because of:
* Schemas
* Prepare/Execute maybe
What are the chances that the BE/FE will be altered to take advantage of
prepare / execute? Or is it something that will "never happen"?
* Domains
Chris
Ashley Cambrell
For the next release it would be good to differentiate the release
candidate from the proper release. (7.2.1 had the same name, and it can
be confusing.) A suffix, as in postgresql-7.3-RCN.tar.gz, is enough to
distinguish release candidates from the final release.
----- Original Message -----
From: "Ashley Cambrell" <ash@freaky-namuh.com>
To: "PostgreSQL-development" <pgsql-hackers@postgresql.org>
Sent: Thursday, April 11, 2002 4:25 PM
Subject: Re: [HACKERS] 7.3 schedule
Christopher Kings-Lynne wrote:
Is anyone feeling we have the 7.3 release nearing?
No way!
I certainly am not.
I can imagine us going for several more months like this, perhaps
through August.
Easily. I think that the critical path is Tom's schema support.
We'll need a good beta period this time, because of:
* Schemas
* Prepare/Execute maybe
What are the chances that the BE/FE will be altered to take advantage of
prepare / execute? Or is it something that will "never happen"?
* Domains
Chris
Ashley Cambrell
We'll need a good beta period this time, because of:
I know it's a sore subject, but how about "ALTER TABLE DROP COLUMN" this
time around? I've been hearing about it for years now. :)
- brandon
----------------------------------------------------------------------------
c: 646-456-5455 h: 201-798-4983
b. palmer, bpalmer@crimelabs.net pgp:crimelabs.net/bpalmer.pgp5
Nicolas Bazin writes:
For the next release it would be good to differentiate the
release candidate from the proper release.
They do have different names.
--
Peter Eisentraut peter_e@gmx.net
On Thu, 11 Apr 2002 16:25:24 +1000
"Ashley Cambrell" <ash@freaky-namuh.com> wrote:
What are the chances that the BE/FE will be altered to take advantage of
prepare / execute? Or is it something that will "never happen"?
Is there a need for this? The current patch I'm working on just
does everything using SQL statements, which I don't think is
too bad (the typical client programmer won't actually need to
see them, their interface should wrap the PREPARE/EXECUTE stuff
for them).
On the other hand, there are already a few reasons to make some
changes to the FE/BE protocol (NOTIFY messages, transaction state,
and now possibly PREPARE/EXECUTE -- anything else?). IMHO, each of
these isn't worth changing the protocol by itself, but perhaps if
we can get all 3 in one swell foop it might be a good idea...
Cheers,
Neil
--
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC
On Thu, 11 Apr 2002, Bruce Momjian wrote:
Is anyone feeling we have the 7.3 release nearing? I certainly am not.
I can imagine us going for several more months like this, perhaps
through August.
Seeing as how we just released v7.2, I don't see a v7.3 even going beta
until the end of summer ... I personally consider July/August to be
relatively dead months, since too many people go on holidays with their
kids ... right now, I'm kind of seeing the Sept 1st/Labour Day weekend
timeframe for going beta ...
Neil Conway <nconway@klamath.dyndns.org> writes:
On the other hand, there are already a few reasons to make some
changes to the FE/BE protocol (NOTIFY messages, transaction state,
and now possibly PREPARE/EXECUTE -- anything else?).
Passing EXECUTE parameters without having them go through the parser
could possibly be done without a protocol change: use the 'fast path'
function-call code to pass binary parameters to a function that is
otherwise equivalent to EXECUTE.
On the other hand, the 'fast path' protocol itself is pretty horribly
misdesigned, and I'm not sure I want to encourage more use of it until
we can get it cleaned up (see the comments in backend/tcop/fastpath.c).
Aside from lack of robustness, I'm not sure it can work at all for
functions that don't have prespecified types and numbers of parameters.
The FE/BE COPY protocol is also horrible. So yeah, there are a bunch of
things we *could* fix if we were ready to take on a protocol change.
My own thought is this might be better held for 7.4, though. We are
already going to be causing application programmers a lot of pain with
the schema changes and ensuing system-catalog revisions. That might
be enough on their plates for this cycle.
In any case, for the moment I think it's fine to be working on
PREPARE/EXECUTE support at the SQL level. We can worry about adding
a parser bypass for EXECUTE parameters later.
regards, tom lane
On Thu, 2002-04-11 at 18:14, Tom Lane wrote:
Neil Conway <nconway@klamath.dyndns.org> writes:
On the other hand, there are already a few reasons to make some
changes to the FE/BE protocol (NOTIFY messages, transaction state,
and now possibly PREPARE/EXECUTE -- anything else?).
Passing EXECUTE parameters without having them go through the parser
could possibly be done without a protocol change: use the 'fast path'
function-call code to pass binary parameters to a function that is
otherwise equivalent to EXECUTE.
On the other hand, the 'fast path' protocol itself is pretty horribly
misdesigned, and I'm not sure I want to encourage more use of it until
we can get it cleaned up (see the comments in backend/tcop/fastpath.c).
Aside from lack of robustness, I'm not sure it can work at all for
functions that don't have prespecified types and numbers of parameters.
The FE/BE COPY protocol is also horrible. So yeah, there are a bunch of
things we *could* fix if we were ready to take on a protocol change.
Also _universal_ binary on-wire representation for types would be a good
thing. There already are slots in pg_type for functions to do that. By
doing so we could also avoid parsing text representations of field data.
My own thought is this might be better held for 7.4, though. We are
already going to be causing application programmers a lot of pain with
the schema changes and ensuing system-catalog revisions. That might
be enough on their plates for this cycle.
In any case, for the moment I think it's fine to be working on
PREPARE/EXECUTE support at the SQL level. We can worry about adding
a parser bypass for EXECUTE parameters later.
IIRC someone started work on modularising the network-related parts with
a goal of supporting DRDA (DB2 protocol) and others in future.
-----------------
Hannu
Neil Conway wrote:
On Thu, 11 Apr 2002 16:25:24 +1000
"Ashley Cambrell" <ash@freaky-namuh.com> wrote:
What are the chances that the BE/FE will be altered to take advantage of
prepare / execute? Or is it something that will "never happen"?
Is there a need for this? The current patch I'm working on just
does everything using SQL statements, which I don't think is
too bad (the typical client programmer won't actually need to
see them, their interface should wrap the PREPARE/EXECUTE stuff
for them).
Yes there is a need.
If you break up the query into roughly three stages of execution:
parse, plan, and execute, each of these can be the performance
bottleneck. The parse can be the performance bottleneck when passing
large values as data to the parser (eg. inserting one row containing a
100K value will result in a 100K+ sized statement that needs to be
parsed, parsing will take a long time, but the planning and execution
should be relatively short). The planning stage can be a bottleneck for
complex queries. And of course the execution stage can be a bottleneck
for all sorts of reasons (eg. bad plans, missing indexes, bad
statistics, poorly written sql, etc.).
So if you look at the three stages (parse, plan, execute) we have a lot
of tools, tips, and techniques for making the execute faster. We have
some tools (at least on the server side via SPI, and plpgsql) to help
minimize the planning costs by reusing plans. But there doesn't exist
much to help with the parsing cost of large values (actually the
fastpath API does help in this regard, but every time I mention it Tom
responds that the fastpath API should be avoided).
So when I look at the proposal for the prepare/execute stuff:
PREPARE <plan> AS <query>;
EXECUTE <plan> USING <parameters>;
DEALLOCATE <plan>;
Executing a sql statement today is the following:
insert into table values (<stuff>);
which does one parse, one plan, one execute
under the new functionality:
prepare <plan> as insert into table values (<stuff>);
execute <plan> using <stuff>;
which does two parses, one plan, one execute
which obviously isn't a win unless you end up reusing the plan many
times. So let's look at the case of reusing the plan multiple times:
prepare <plan> as insert into table values (<stuff>);
execute <plan> using <stuff>;
execute <plan> using <stuff>;
...
which does n+1 parses, one plan, n executes
so this is a win if the cost of the planning stage is significant
compared to the costs of the parse and execute stages. If the cost of
the plan is not significant there is little if any benefit in doing this.
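Barry's accounting above can be put into a small cost model. This is an illustrative sketch with made-up per-stage costs (the real ratios depend entirely on the query); it only shows where the break-even point falls between plain statements and PREPARE/EXECUTE.

```python
def total_cost(n, parse, plan, execute, prepared):
    """Total cost of running the same statement n times.

    prepared=False: each run is one parse + one plan + one execute.
    prepared=True:  one full parse + one plan up front, then each
                    EXECUTE costs a (cheap) parse plus an execute.
    """
    exec_parse = parse * 0.1  # assumed: the EXECUTE text is much shorter to parse
    if not prepared:
        return n * (parse + plan + execute)
    return (parse + plan) + n * (exec_parse + execute)

# Hypothetical unit costs where planning dominates (e.g. a complex join):
unprepared = total_cost(100, parse=1.0, plan=5.0, execute=2.0, prepared=False)
prepared = total_cost(100, parse=1.0, plan=5.0, execute=2.0, prepared=True)
print(unprepared, prepared)  # prepared wins comfortably over 100 repeats

# Hypothetical unit costs where planning is negligible, used only once:
unprepared1 = total_cost(1, parse=1.0, plan=0.1, execute=1.0, prepared=False)
prepared1 = total_cost(1, parse=1.0, plan=0.1, execute=1.0, prepared=True)
print(unprepared1, prepared1)  # a single use of PREPARE costs slightly more
```

With these assumed numbers the prepared path is cheaper only once the saved planning work outweighs the extra statement round; that is exactly the "win only if the plan is expensive and reused" point made above.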
I realize that there are situations where this functionality will be a
big win. But I question how the typical user of postgres will know when
they should use this functionality and when they shouldn't. Since we
don't currently provide any information to the user on the relative cost
of the parse, plan and execute phases, the end user is going to be
guessing IMHO.
What I think would be a clear win would be if we could get the above
scenario of multiple inserts down to one parse, one plan, n executes, and
n binds (where binding is simply the operation of plugging values into
the statement without having to pipe the values through the parser).
This would be a win in most if not all circumstances where the same
statement is executed many times.
I think it would also be nice if the new explain analyze showed times
for the parsing and planning stages in addition to the execution stage
which it currently shows so there is more information for the end user
on what approach they should take.
thanks,
--Barry
Barry Lind <barry@xythos.com> writes:
...
Since we
don't currently provide any information to the user on the relative cost
of the parse, plan and execute phases, the end user is going to be
guessing IMHO.
You can in fact get that information fairly easily; set
show_parser_stats, show_planner_stats, and show_executor_stats to 1
and then look in the postmaster log for the results. (Although to be
fair, this does not provide any accounting for the CPU time expended
simply to *receive* the query string, which might be non negligible
for huge queries.)
It would be interesting to see some stats for the large-BLOB scenarios
being debated here. You could get more support for the position that
something should be done if you had numbers to back it up.
regards, tom lane
Tom Lane wrote:
It would be interesting to see some stats for the large-BLOB scenarios
being debated here. You could get more support for the position that
something should be done if you had numbers to back it up.
Below are some stats you did a few months ago when I was asking a
related question. Your summary was: "Bottom line: feeding huge strings
through the lexer is slow."
--Barry
Tom Lane wrote:
Barry Lind <barry@xythos.com> writes:
In looking at some performance issues (I was trying to look at the
overhead of toast) I found that large insert statements were very slow.
...
...
I got around to reproducing this today,
and what I find is that the majority of the backend time is going into
simple scanning of the input statement:
Each sample counts as 0.01 seconds.
  %   cumulative   self               self     total
 time   seconds   seconds     calls  ms/call  ms/call  name
31.24     11.90     11.90                              _mcount
19.51     19.33      7.43     10097     0.74     1.06  base_yylex
 7.48     22.18      2.85  21953666     0.00     0.00  appendStringInfoChar
 5.88     24.42      2.24       776     2.89     2.89  pglz_compress
 4.36     26.08      1.66  21954441     0.00     0.00  pq_getbyte
 3.57     27.44      1.36   7852141     0.00     0.00  addlit
 3.26     28.68      1.24      1552     0.80     0.81  scanstr
 2.84     29.76      1.08       779     1.39     7.18  pq_getstring
 2.31     30.64      0.88     10171     0.09     0.09  _doprnt
 2.26     31.50      0.86       776     1.11     1.11  byteain
 2.07     32.29      0.79                              msquadloop
 1.60     32.90      0.61   7931430     0.00     0.00  memcpy
 1.18     33.35      0.45                              chunks
 1.08     33.76      0.41     46160     0.01     0.01  strlen
 1.08     34.17      0.41                              encore
 1.05     34.57      0.40      8541     0.05     0.05  XLogInsert
 0.89     34.91      0.34                              appendStringInfo
60% of the call graph time is accounted for by these two areas:
index  % time    self  children    called                name
                 7.43      3.32    10097/10097               yylex [14]
[13]     41.0    7.43      3.32    10097                 base_yylex [13]
                 1.36      0.61  7852141/7852141            addlit [28]
                 1.24      0.01     1552/1552               scanstr [30]
                 0.02      0.03     3108/3108               ScanKeywordLookup [99]
                 0.00      0.02     2335/2335               yy_get_next_buffer [144]
                 0.02      0.00      776/781                strtol [155]
                 0.00      0.01      777/3920               MemoryContextStrdup [108]
                 0.00      0.00        1/1                  base_yy_create_buffer [560]
                 0.00      0.00     4675/17091              isupper [617]
                 0.00      0.00     1556/1556               yy_get_previous_state [671]
                 0.00      0.00      779/779                yywrap [706]
                 0.00      0.00        1/2337               base_yy_load_buffer_state [654]
-----------------------------------------------
                 1.08      4.51      779/779                pq_getstr [17]
[18]     21.4    1.08      4.51      779                 pq_getstring [18]
                 2.85      0.00  21953662/21953666          appendStringInfoChar [20]
                 1.66      0.00  21954441/21954441          pq_getbyte [29]
-----------------------------------------------
While we could probably do a little bit to speed up pg_getstring and its
children, it's not clear that we can do anything about yylex, which is
flex output code not handmade code, and is probably well-tuned already.
Bottom line: feeding huge strings through the lexer is slow.
regards, tom lane
Neil Conway wrote:
On Thu, 11 Apr 2002 16:25:24 +1000
"Ashley Cambrell" <ash@freaky-namuh.com> wrote:
What are the chances that the BE/FE will be altered to take advantage of
prepare / execute? Or is it something that will "never happen"?
Is there a need for this? The current patch I'm working on just
does everything using SQL statements, which I don't think is
too bad (the typical client programmer won't actually need to
see them, their interface should wrap the PREPARE/EXECUTE stuff
for them).
I remember an email Hannu sent (I originally thought Tom sent it but I
found the email*) that said postgresql spends a lot of time parsing sql
(compared to oracle), so if the BE/FE and libpq were extended to support
pg_prepare / pg_bind, then it might make repetitive queries quicker.
"if we could save half of parse/optimise time by saving query plans, then
the backend performance would go up from 1097 to 100000/(91.1-16.2)=1335
updates/sec."
Hannu's email doesn't seem to be in google groups, but it's titled
"Oracle vs PostgreSQL in real life" (2002-03-01). I can attach it if
people can't find it.
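The arithmetic in Hannu's quoted estimate is easy to reproduce: 1097 updates/sec means 100000 updates took about 91.1 seconds, and shaving off the assumed 16.2 seconds of parse/optimise time gives roughly 1335 updates/sec. A quick sketch of that calculation:

```python
updates = 100_000
total_time = 91.1   # seconds for 100000 updates at the measured ~1097/sec
saved = 16.2        # seconds: the assumed saving, half of parse/optimise time

baseline = updates / total_time               # ~1097 updates/sec
improved = updates / (total_time - saved)     # ~1335 updates/sec
print(int(baseline), round(improved))  # → 1097 1335
```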
On the other hand, there are already a few reasons to make some
changes to the FE/BE protocol (NOTIFY messages, transaction state,
and now possibly PREPARE/EXECUTE -- anything else?). IMHO, each of
these isn't worth changing the protocol by itself, but perhaps if
we can get all 3 in one swell foop it might be a good idea...
Passing on a possible 1/3 speed improvement doesn't sound like a bad
thing.. :-)
Hannu: You mentioned that you already had an experimental patch that did
it? Was that the same sort of thing as Neil's patch (SPI), or did it
include a libpq patch as well?
Cheers,
Neil
Ashley Cambrell
On Thu, 11 Apr 2002 11:38:33 -0700
"Barry Lind" <barry@xythos.com> wrote:
Neil Conway wrote:
On Thu, 11 Apr 2002 16:25:24 +1000
"Ashley Cambrell" <ash@freaky-namuh.com> wrote:
What are the chances that the BE/FE will be altered to take advantage of
prepare / execute? Or is it something that will "never happen"?
Is there a need for this? The current patch I'm working on just
does everything using SQL statements, which I don't think is
too bad (the typical client programmer won't actually need to
see them, their interface should wrap the PREPARE/EXECUTE stuff
for them).
Yes there is a need.
Right -- I would agree that such functionality would be nice to have.
What I meant was "is there a need for this in order to implement
PREPARE/EXECUTE"? IMHO, no -- the two features are largely
orthogonal.
If you break up the query into roughly three stages of execution:
parse, plan, and execute, each of these can be the performance
bottleneck. The parse can be the performance bottleneck when passing
large values as data to the parser (eg. inserting one row containing a
100K value will result in a 100K+ sized statement that needs to be
parsed, parsing will take a long time, but the planning and execution
should be relatively short).
If you're inserting 100KB of data, I'd expect the time to insert
that into tables, update relevant indexes, etc. to be larger than
the time to parse the query (i.e. execution > parsing). But I
may well be wrong, I haven't done any benchmarks.
Executing a sql statement today is the following:
insert into table values (<stuff>);
which does one parse, one plan, one execute
You're assuming that the cost of the "parse" step for the EXECUTE
statement is the same as "parse" for the original query, which
will often not be the case (parsing the EXECUTE statement will
be cheaper).
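The scanning work scales with statement length, so Neil's point is easy to quantify. The statement texts below are hypothetical (the table name, plan name, and USING syntax are placeholders based on the proposal upthread); the sketch only compares how much text the lexer must chew through in each case.

```python
big_value = "x" * 100_000   # a 100K data value

# What the parser sees today: the whole literal travels through the lexer.
insert_stmt = "insert into t values ('%s');" % big_value

# Under the proposal, each repeat execution sends only a short statement
# (the large value would ideally be bound outside the parser).
execute_stmt = "execute myplan using ('small value');"

# The full INSERT is thousands of times more text to scan than the EXECUTE.
print(len(insert_stmt) > 2000 * len(execute_stmt))  # → True
```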
so this is a win if the cost of the planning stage is significant
compared to the costs of the parse and execute stages. If the cost of
the plan is not significant there is little if any benefit in doing this.
I realize that there are situations where this functionality will be a
big win. But I question how the typical user of postgres will know when
they should use this functionality and when they shouldn't.
I would suggest using it any time you're executing the same query
plan a large number of times. In my experience, this is very common.
There are already hooks for this in many client interfaces: e.g.
PreparedStatement in JDBC and $dbh->prepare() in Perl DBI.
What I think would be a clear win would be if we could get the above
scenario of multiple inserts down to one parse, one plan, n executes, and
n binds
This behavior would be better, but I think the current solution is
still a "clear win", and good enough for now. I'd prefer that we
worry about implementing PREPARE/EXECUTE for now, and deal with
query binding/BLOB parser-shortcuts later -- perhaps with an FE/BE
protocol in 7.4 as Tom suggested.
Cheers,
Neil
--
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC
On 11 Apr 2002, Hannu Krosing wrote:
IIRC someone started work on modularising the network-related parts with
a goal of supporting DRDA (DB2 protocol) and others in future.
That was me, although I've been bogged down lately, and haven't been able
to get back to it. DRDA, btw, is not just a DB2 protocol but an opengroup
spec that hopefully will someday be *the* standard on the wire database
protocol. DRDA handles prepare/execute and is completely binary in
representation, among other advantages.
Brian
Neil Conway wrote:
I would suggest using it any time you're executing the same query
plan a large number of times. In my experience, this is very common.
There are already hooks for this in many client interfaces: e.g.
PrepareableStatement in JDBC and $dbh->prepare() in Perl DBI.
I'm not sure that JDBC would use this feature directly. When a
PreparedStatement is created in JDBC there is nothing that indicates
how many times this statement is going to be used. Many (most IMHO)
will be used only once. As I stated previously, this feature is only
useful if you are going to end up using the PreparedStatement multiple
times. If it only is used once, it will actually perform worse than
without the feature (since you need to issue two sql statements to the
backend to accomplish what you were doing in one before).
Thus if someone wanted to use this functionality from jdbc they would
need to do it manually, i.e. issue the prepare and execute statements
manually instead of the jdbc driver doing it automatically for them.
thanks,
--Barry
PS. I actually do believe that the proposed functionality is good and
should be added (even though it may sound from the tone of my emails in
this thread that that isn't the case :-) I just want to make sure that
everyone understands that this doesn't solve the whole problem. And
that more work needs to be done either in 7.3 or some future release.
My fear is that everyone will view this work as being good enough such
that the rest of the issues won't be addressed anytime soon. I only
wish I was able to work on some of this myself, but I don't have the
skills to hack on the backend too much. (However if someone really
wanted a new feature in the jdbc driver in exchange, I'd be more than
happy to help)
Ashley Cambrell <ash@freaky-namuh.com> writes:
I remember an email Hannu sent (I originally thought Tom sent it but I
found the email*) that said postgresql spends a lot of time parsing sql
(compared to oracle), so if the BE/FE and libpq were extended to support
pg_prepare / pg_bind, then it might make repetitive queries quicker.
I'm not sure I believe Hannu's numbers, but in any case they're fairly
irrelevant to the argument about whether a special protocol is useful.
He wasn't testing textually-long queries, but rather the planning
overhead, which is more or less independent of the length of any literal
constants involved (especially if they're not part of the WHERE clause).
Saving query plans via PREPARE seems quite sufficient, and appropriate,
to tackle the planner-overhead issue.
We do have some numbers suggesting that the per-character loop in the
lexer is slow enough to be a problem with very long literals. That is
the overhead that might be avoided with a special protocol.
However, it should be noted that (AFAIK) no one has spent any effort at
all on trying to make the lexer go faster. There is quite a bit of
material in the flex documentation about performance considerations ---
someone should take a look at it and see if we can get any wins by being
smarter, without having to introduce protocol changes.
regards, tom lane
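The per-character overhead the profile points at (addlit and appendStringInfoChar called once per input character) is easy to see outside the lexer too. A rough Python analogy, not PostgreSQL code; it only illustrates why a function call per character over a 100K literal costs far more than one bulk operation over the same data.

```python
import timeit

data = "x" * 100_000   # stand-in for a 100K literal inside a query string

def per_char():
    # One call per input character, roughly what a per-character
    # scan-and-append loop does with a long literal.
    buf = []
    for ch in data:
        buf.append(ch)
    return "".join(buf)

def bulk():
    # The same content handled in a single operation.
    return data[:]

assert per_char() == bulk()
t_char = timeit.timeit(per_char, number=10)
t_bulk = timeit.timeit(bulk, number=10)
print(t_char > t_bulk)  # → True
```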
Tom Lane wrote:
I'm not sure I believe Hannu's numbers, but in any case they're fairly
irrelevant to the argument about whether a special protocol is useful.
He wasn't testing textually-long queries, but rather the planning
overhead, which is more or less independent of the length of any literal
constants involved (especially if they're not part of the WHERE clause).
Saving query plans via PREPARE seems quite sufficient, and appropriate,
to tackle the planner-overhead issue.
Just a confirmation: is someone working on PREPARE/EXECUTE?
What about Karel's work?
regards,
Hiroshi Inoue