question about count(b) where b is a custom type

Started by Grzegorz Jaskiewiczabout 20 years ago12 messages
#1Grzegorz Jaskiewicz
gj@pointblue.com.pl

Hi folks

We're developing here gist index (which works finally, and I remember
about writing some docs about it).
I have few fprintf(stderr,""") in function that converts internal
rep. into string. I was really supprised to see them on me screen when :
select count(b) from blah where b ~ 'something';
was issued. What the hell, isn't it wrong ? I don't need strings to
count results, right ?
IMO this is a serious bug.
What do yah think ?

--
GJ

"If we knew what we were doing, it wouldn't be called Research, would
it?" - AE

#2Martijn van Oosterhout
kleptog@svana.org
In reply to: Grzegorz Jaskiewicz (#1)
Re: question about count(b) where b is a custom type

On Wed, Nov 16, 2005 at 10:05:36AM +0100, Grzegorz Jaskiewicz wrote:

Hi folks

We're developing here gist index (which works finally, and I remember
about writing some docs about it).
I have few fprintf(stderr,""") in function that converts internal
rep. into string. I was really supprised to see them on me screen when :
select count(b) from blah where b ~ 'something';
was issued. What the hell, isn't it wrong ? I don't need strings to
count results, right ?
IMO this is a serious bug.
What do yah think ?

Firstly, if you just want a count, what's wrong with count(1) or
count(*).

Secondly, if you want an aggregate to work on your new type, you should
declare it as such. This is one of the reasons why implicit casts to
text are discouraged. If it had to be explicit, the parser would have
told you that what you asked for wasn't possible directly. (There's no
count(yourtype) function defined).

See CREATE AGGREGATE.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

Show quoted text

Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
tool for doing 5% of the work and then sitting around waiting for someone
else to do the other 95% so you can sue them.

#3Hannu Krosing
hannu@skype.net
In reply to: Grzegorz Jaskiewicz (#1)
Re: question about count(b) where b is a custom type

On K, 2005-11-16 at 10:05 +0100, Grzegorz Jaskiewicz wrote:

Hi folks

We're developing here gist index (which works finally, and I remember
about writing some docs about it).
I have few fprintf(stderr,""") in function that converts internal
rep. into string. I was really supprised to see them on me screen when :
select count(b) from blah where b ~ 'something';
was issued. What the hell, isn't it wrong ? I don't need strings to
count results, right ?

I gess you need the rep in b ~ 'something'

IMO this is a serious bug.
What do yah think ?

--
Hannu Krosing <hannu@skype.net>

#4Pollard, Mike
mpollard@cincom.com
In reply to: Hannu Krosing (#3)
Re: question about count(b) where b is a custom type

Firstly, if you just want a count, what's wrong with count(1) or
count(*).

Because unless the column does not allow nulls, they will not return the
same value.

Mike Pollard
SUPRA Server SQL Engineering and Support
Cincom Systems, Inc.
--------------------------------
Better to remain silent and be thought a fool than to speak out and
remove all doubt.
Abraham Lincoln

#5Richard Huxton
dev@archonet.com
In reply to: Pollard, Mike (#4)
Re: question about count(b) where b is a custom type

Pollard, Mike wrote:

Firstly, if you just want a count, what's wrong with count(1) or
count(*).

Because unless the column does not allow nulls, they will not return the
same value.

Ah, but in the example given the column was being matched against a
value, so nulls were already excluded.

--
Richard Huxton
Archonet Ltd

#6Pollard, Mike
mpollard@cincom.com
In reply to: Richard Huxton (#5)
Re: question about count(b) where b is a custom type

Richard Huxton wrote:

Pollard, Mike wrote:

Firstly, if you just want a count, what's wrong with count(1) or
count(*).

Because unless the column does not allow nulls, they will not return

the

same value.

Ah, but in the example given the column was being matched against a
value, so nulls were already excluded.

--

Details, details. But there is a valid general question here, and
changing the semantics of the query will not address it. When doing a
count(col), why convert col into a string just so you can determine if
it is null or not? This isn't a problem on a small amount of data, but
it seems like a waste, especially if you are counting millions of
records. Is there some way to convert this to have the caller convert
nulls to zero and non-nulls to 1, and then just pass an int? So
logically the backend does:

Select count(case <col> when null then 0 else 1) from <table>

And count just adds the number to the running tally.

Mike Pollard
SUPRA Server SQL Engineering and Support
Cincom Systems, Inc.
--------------------------------
Better to remain silent and be thought a fool than to speak out and
remove all doubt.
Abraham Lincoln

#7Tino Wildenhain
tino@wildenhain.de
In reply to: Pollard, Mike (#6)
Re: question about count(b) where b is a custom type

Pollard, Mike schrieb:

Richard Huxton wrote:

Pollard, Mike wrote:

Firstly, if you just want a count, what's wrong with count(1) or
count(*).

Because unless the column does not allow nulls, they will not return

the

same value.

Ah, but in the example given the column was being matched against a
value, so nulls were already excluded.

--

Details, details. But there is a valid general question here, and
changing the semantics of the query will not address it. When doing a
count(col), why convert col into a string just so you can determine if
it is null or not? This isn't a problem on a small amount of data, but

Why convert? A null is always null no matter in which datatype.

it seems like a waste, especially if you are counting millions of
records. Is there some way to convert this to have the caller convert
nulls to zero and non-nulls to 1, and then just pass an int? So
logically the backend does:

Select count(case <col> when null then 0 else 1) from <table>

Which would be totally silly :-) no matter if its 0 or 1
it counts as 1. Do you mean sum() maybe?
Even then you dont need coalesce to convert null to 0
because sum() just ignores null.

And count just adds the number to the running tally.

Which number here?

Mike Pollard
SUPRA Server SQL Engineering and Support

strange...

Show quoted text

Cincom Systems, Inc.

#8Martijn van Oosterhout
kleptog@svana.org
In reply to: Pollard, Mike (#6)
Re: question about count(b) where b is a custom type

On Wed, Nov 16, 2005 at 08:28:28AM -0500, Pollard, Mike wrote:

Details, details. But there is a valid general question here, and
changing the semantics of the query will not address it. When doing a
count(col), why convert col into a string just so you can determine if
it is null or not? This isn't a problem on a small amount of data, but
it seems like a waste, especially if you are counting millions of
records. Is there some way to convert this to have the caller convert
nulls to zero and non-nulls to 1, and then just pass an int? So
logically the backend does:

Select count(case <col> when null then 0 else 1) from <table>

And count just adds the number to the running tally.

Actually, something is wrong with this whole thread. count(x) is
defined to take any type, hence count(b) won't convert anything to text
or anything else.

Which seems to imply that in the original query it's the '~' operator
that has the text conversion. Can you post an EXPLAIN VERBOSE for that
query so we can see where the conversion is being called.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

Show quoted text

Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
tool for doing 5% of the work and then sitting around waiting for someone
else to do the other 95% so you can sue them.

#9Pollard, Mike
mpollard@cincom.com
In reply to: Martijn van Oosterhout (#8)
Re: question about count(b) where b is a custom type

If count(<col>) convert <col> to a string (an assumption that Martijn
has cast into doubt, or perhaps shredded), then rather than convert all
non-nulls that are not a string into a string, I was proposing
converting the values into an int with the values 0 or 1 (0 means that
row was null for that column, 1 means that row was not null;, since
count(<col>) means count the non-null rows in <col>).

Anyway, to make a short story long. The idea is rather than convert the
column into a string, convert it into a value indicating whether the
column was null or not null (which is all count cares about). In any
case, it's moot idea since it appears Postgres already does that.

Mike Pollard
SUPRA Server SQL Engineering and Support
Cincom Systems, Inc.
--------------------------------
Better to remain silent and be thought a fool than to speak out and
remove all doubt.
Abraham Lincoln

-----Original Message-----
From: Tino Wildenhain [mailto:tino@wildenhain.de]
Sent: Wednesday, November 16, 2005 8:43 AM
To: Pollard, Mike
Cc: Richard Huxton; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] question about count(b) where b is a custom

type

Pollard, Mike schrieb:

Richard Huxton wrote:

Pollard, Mike wrote:

Firstly, if you just want a count, what's wrong with count(1) or
count(*).

Because unless the column does not allow nulls, they will not

return

the

same value.

Ah, but in the example given the column was being matched against a
value, so nulls were already excluded.

--

Details, details. But there is a valid general question here, and
changing the semantics of the query will not address it. When doing

a

count(col), why convert col into a string just so you can determine

if

it is null or not? This isn't a problem on a small amount of data,

but

Why convert? A null is always null no matter in which datatype.

it seems like a waste, especially if you are counting millions of
records. Is there some way to convert this to have the caller

convert

Show quoted text

nulls to zero and non-nulls to 1, and then just pass an int? So
logically the backend does:

Select count(case <col> when null then 0 else 1) from <table>

Which would be totally silly :-) no matter if its 0 or 1
it counts as 1. Do you mean sum() maybe?
Even then you dont need coalesce to convert null to 0
because sum() just ignores null.

And count just adds the number to the running tally.

Which number here?

Mike Pollard
SUPRA Server SQL Engineering and Support

strange...

Cincom Systems, Inc.

#10Tino Wildenhain
tino@wildenhain.de
In reply to: Pollard, Mike (#9)
Re: question about count(b) where b is a custom type

Pollard, Mike schrieb:

If count(<col>) convert <col> to a string (an assumption that Martijn
has cast into doubt, or perhaps shredded), then rather than convert all
non-nulls that are not a string into a string, I was proposing
converting the values into an int with the values 0 or 1 (0 means that
row was null for that column, 1 means that row was not null;, since
count(<col>) means count the non-null rows in <col>).

I'm not getting how you got this idea of count() doing any conversion?
It does not and there is nothing in the docs wich would lead to this.

Anyway, to make a short story long. The idea is rather than convert the
column into a string, convert it into a value indicating whether the
column was null or not null (which is all count cares about). In any
case, it's moot idea since it appears Postgres already does that.

No, count does not convert. It just counts all non null values.
If you want to count rows, just use count(*).

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Grzegorz Jaskiewicz (#1)
Re: question about count(b) where b is a custom type

Grzegorz Jaskiewicz <gj@pointblue.com.pl> writes:

I have few fprintf(stderr,""") in function that converts internal
rep. into string. I was really supprised to see them on me screen when :
select count(b) from blah where b ~ 'something';
was issued. What the hell, isn't it wrong ?

Depends ... what does the ~ operator do for your datatype? (If you've
defined an implicit conversion to text then it's very possibly invoking
that followed by the regex match operator.)

Have you tried getting a stack trace back from your function to see
exactly what is calling it?

regards, tom lane

#12Grzegorz Jaskiewicz
gj@pointblue.com.pl
In reply to: Tino Wildenhain (#10)
Re: question about count(b) where b is a custom type

Yes, sorry for the mess. The problem was somewhere else (not quite
well written log procedure issuing conversion used for logging, even
tho log was off).

--
GJ

"If we knew what we were doing, it wouldn't be called Research, would
it?" - AE