estimated_count() implementation
Hi,
I'm trying to implement estimated_count() function that's mentioned in
the TODO list. First of all, I wanted to learn if this TODO item is
still valid? I looked at the related -hackers discussions, does anybody
want to say more sth related with the implementation?
Also I've some questions. I'd be appreciated if somebody would answer
any of the below questions to help me find my way.
1. I'm planning to use same method as ExplainOneQuery() does in
backend/commands/explain.c. (Using Plan->plan_rows that will be
returned from planner(query, isCursor, cursorOptions, params)
function.) Is this the way to go, or should I look for another
method to aggregate estimated row count.
2. I've been also considering getting called from a nodeAgg. In such a
case, it shouldn't be a problem for me to use same way as above to
retrieve Query, ParamListInfo and TupOutputState. Right?
3. I was looking at int8inc() and backend/executor/nodeAgg.c and
couldn't find anything special to count() aggregate. Am I looking
at the right place? For instance, for my case, I won't need any
transition function call. How should I modify nodeAgg.c to skip
transfn calls for estimated_count()?
4. Related with the problem, any question I missed.
Regards.
On Sat, Oct 21, 2006 at 11:44:19PM +0300, Volkan YAZICI wrote:
I'm trying to implement estimated_count() function that's mentioned in
the TODO list. First of all, I wanted to learn if this TODO item is
still valid? I looked at the related -hackers discussions, does anybody
want to say more sth related with the implementation?
I hadn't noticed the TODO item but about a year ago I posted a
cursor_plan_rows() function and asked for comments. The only reply
was from Tom, who said, "Given how far off it frequently is, I can't
believe that any of the people who ask for the feature would find
this a satisfactory answer :-("
http://archives.postgresql.org/pgsql-hackers/2005-11/msg00579.php
http://archives.postgresql.org/pgsql-hackers/2005-11/msg00580.php
--
Michael Fuhr
On Oct 21 05:09, Michael Fuhr wrote:
I hadn't noticed the TODO item but about a year ago I posted a
cursor_plan_rows() function and asked for comments.
Ah! I didn't see this.
The only reply was from Tom, who said, "Given how far off it
frequently is, I can't believe that any of the people who ask for the
feature would find this a satisfactory answer :-("
AFAIU, cursor_plan_rows() has some serious limitations like requiring to
be executed for a portal. I was planning to make estimated_count() work
for nodeAgg and custom calls too - as count() does.
But OTOH, Tom's complaints look like still applicable for my
estimated_count() too. Does this TODO need a little bit more
clarification or we can count is a redundant one?
Regards.
On Sun, 2006-10-22 at 12:07 +0300, Volkan YAZICI wrote:
On Oct 21 05:09, Michael Fuhr wrote:
I hadn't noticed the TODO item but about a year ago I posted a
cursor_plan_rows() function and asked for comments.Ah! I didn't see this.
The only reply was from Tom, who said, "Given how far off it
frequently is, I can't believe that any of the people who ask for the
feature would find this a satisfactory answer :-("AFAIU, cursor_plan_rows() has some serious limitations like requiring to
be executed for a portal. I was planning to make estimated_count() work
for nodeAgg and custom calls too - as count() does.But OTOH, Tom's complaints look like still applicable for my
estimated_count() too. Does this TODO need a little bit more
clarification or we can count is a redundant one?
http://archives.postgresql.org/pgsql-hackers/2005-11/msg00943.php
Is the source of the TODO item, though please read the upthread messages
as to how we got there...
I think it would be a useful function...
--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
"Simon Riggs" <simon@2ndquadrant.com> writes:
On Sun, 2006-10-22 at 12:07 +0300, Volkan YAZICI wrote:
But OTOH, Tom's complaints look like still applicable for my
estimated_count() too. Does this TODO need a little bit more
clarification or we can count is a redundant one?
http://archives.postgresql.org/pgsql-hackers/2005-11/msg00943.php
Is the source of the TODO item, though please read the upthread messages
as to how we got there...
I think there is a use-case for something like
select estimated_count('select * from ... where ...');
i.e. there are applications where a possibly-bad estimate is enough.
What I'm concerned about is the level of complaints from newbies who'll
expect it to be dead accurate all the time ...
BTW, you can build estimated_count() today in a few lines of plpgsql:
create or replace function estimated_count(text) returns float8 as $$
declare r text;
begin
for r in execute 'explain ' || $1 loop
return substring(r from 'rows=([0-9]+) ');
end loop;
end$$ language plpgsql strict;
I don't see that it really justifies any more work than that.
regards, tom lane