PG_RETURN_?

Started by Don Y, almost 20 years ago (8 messages, general)
#1 Don Y
pgsql@DakotaCom.Net

Hi,

I have a set of functions for a data type that return
small integers (i.e. [0..12]). I can, of course, represent
it as a char, short or long (CHAR, INT16 or INT32).
Are there any advantages/drawbacks to choosing one particular
PG_RETURN_ type over another (realizing that they are
effectively just casts)?

Thanks!
--don

#2 Richard Huxton
dev@archonet.com
In reply to: Don Y (#1)
Re: PG_RETURN_?

Don Y wrote:

> Hi,
>
> I have a set of functions for a data type that return
> small integers (i.e. [0..12]). I can, of course, represent
> it as a char, short or long (CHAR, INT16 or INT32).
> Are there any advantages/drawbacks to choosing one particular
> PG_RETURN_ type over another (realizing that they are
> effectively just casts)?

If they are integers then an int would be the obvious choice. If you are
going to treat them as int2 outside the function then int2, otherwise
just integer. Oh, it's int2/int4 not int16/int32.
--
Richard Huxton
Archonet Ltd

#3 Don Y
pgsql@DakotaCom.Net
In reply to: Richard Huxton (#2)
Re: PG_RETURN_?

Richard Huxton wrote:

> If they are integers then an int would be the obvious choice. If you are
> going to treat them as int2 outside the function then int2, otherwise
> just integer.

Yes, I was more interested in what might be going on "behind the
scenes" inside the server that could bias my choice of WHICH
integer type to use. E.g., if arguments are marshalled as
byte arrays vs. as Datum arrays, etc. (I would suspect the
latter). Since I could use something as small as a char to
represent the values, the choice depends more on how
OTHER things would be affected...

> Oh, it's int2/int4 not int16/int32.

The *data type* is int2/int4 but the PG_RETURN_? macro is
PG_RETURN_INT16 or PG_RETURN_INT32 -- hence the reason
I referred to them as "CHAR, INT16 or INT32" instead of
"char, int2 or int4" :>

--don

#4 Richard Huxton
dev@archonet.com
In reply to: Don Y (#3)
Re: PG_RETURN_?

Don Y wrote:

> Yes, I was more interested in what might be going on "behind the
> scenes" inside the server that could bias my choice of WHICH
> integer type to use. E.g., if arguments are marshalled as
> byte arrays vs. as Datum arrays, etc. (I would suspect the
> latter). Since I could use something as small as a char to
> represent the values, the choice depends more on how
> OTHER things would be affected...

I must admit I've never tested, but I strongly suspect any differences
will be below the level you can accurately measure. Certainly from the
point of view of 8/16/32 bit integers I'd guess they'd all time the same
(they should all end up as a Datum). With a 64-bit CPU I'd guess that
would extend to 64 bits too. Hmm, looking at the comments in
include/postgres.h, it seems int64 is a pass-by-reference type
regardless of CPU.
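
A minimal sketch of the point that 8-, 16- and 32-bit integers all end up as a Datum. It does not use the PostgreSQL headers: `Datum` is modelled here as a pointer-sized unsigned word, and the `datum_from_*` / `datum_to_*` helpers are hypothetical stand-ins for the conversions behind PG_RETURN_CHAR, PG_RETURN_INT16 and PG_RETURN_INT32.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for PostgreSQL's Datum: one pointer-sized word.
 * This is a model for illustration, not the definition in postgres.h. */
typedef uintptr_t Datum;

/* Hypothetical helpers: widen a small value into the fixed-size word. */
static inline Datum datum_from_char(char v)     { return (Datum) (unsigned char) v; }
static inline Datum datum_from_int16(int16_t v) { return (Datum) (uint16_t) v; }
static inline Datum datum_from_int32(int32_t v) { return (Datum) (uint32_t) v; }

/* Matching readers: narrow the word back to the declared type. */
static inline char    datum_to_char(Datum d)  { return (char) (unsigned char) d; }
static inline int16_t datum_to_int16(Datum d) { return (int16_t) (uint16_t) d; }
static inline int32_t datum_to_int32(Datum d) { return (int32_t) (uint32_t) d; }
```

Under these assumptions, a value in [0..12] produces the same word whichever helper builds it, so the three choices would not differ in how much is handed back to the caller.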

>> Oh, it's int2/int4 not int16/int32.
>
> The *data type* is int2/int4 but the PG_RETURN_? macro is
> PG_RETURN_INT16 or PG_RETURN_INT32 -- hence the reason
> I referred to them as "CHAR, INT16 or INT32" instead of
> "char, int2 or int4" :>

You're quite right. I was thinking from the other side.

--
Richard Huxton
Archonet Ltd

#5 Martijn van Oosterhout
kleptog@svana.org
In reply to: Don Y (#3)
Re: PG_RETURN_?

On Tue, May 02, 2006 at 08:43:03AM -0700, Don Y wrote:

> Yes, I was more interested in what might be going on "behind the
> scenes" inside the server that could bias my choice of WHICH
> integer type to use. E.g., if arguments are marshalled as
> byte arrays vs. as Datum arrays, etc. (I would suspect the
> latter). Since I could use something as small as a char to
> represent the values, the choice depends more on how
> OTHER things would be affected...

You should always *always* match the PG_RETURN_* to the declared type
you are returning. Anything else will cause problems. PG_RETURN_INT16
means "return in a format consistent with a type declared as
pass-by-value two byte width". PostgreSQL does not check that what
you're returning actually matches what you declared.
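
To illustrate why the macro should match the declared type, here is a small model (not the real headers; `Datum`, the helper names and `month_count_as_int32` are assumptions made for this sketch). A function that builds a 4-byte result while its caller reads 2 bytes goes unnoticed for small values and silently loses bits for large ones.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for PostgreSQL's Datum (an assumption of this sketch). */
typedef uintptr_t Datum;

/* Hypothetical conversions in the spirit of the int4 and int2 macros. */
static inline Datum   datum_from_int32(int32_t v) { return (Datum) (uint32_t) v; }
static inline int32_t datum_to_int32(Datum d)     { return (int32_t) (uint32_t) d; }

/* Reads only the low 16 bits, as a caller expecting a 2-byte type would. */
static inline int16_t datum_to_int16(Datum d)     { return (int16_t) (uint16_t) d; }

/* Hypothetical function body that builds its result as a 4-byte integer,
 * even though (in this scenario) it was declared to return a 2-byte type. */
static inline Datum month_count_as_int32(int32_t n)
{
    return datum_from_int32(n);
}
```

In this model 12 reads back as 12 either way, which is why the mismatch can stay hidden, while 70000 read through the 2-byte accessor comes back as 4464. Nothing in the sketch detects the mismatch, matching the warning that PostgreSQL does not check it either.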

The type as declared determines the storage required to store it. That
might be a far more useful factor to consider than what is copied
internally which, as has been pointed out, is probably below what you
can measure.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

From each according to his ability. To each according to his ability to litigate.

#6 Don Y
pgsql@DakotaCom.Net
In reply to: Martijn van Oosterhout (#5)
Re: PG_RETURN_?

Martijn van Oosterhout wrote:

> You should always *always* match the PG_RETURN_* to the declared type
> you are returning. Anything else will cause problems. PG_RETURN_INT16
> means "return in a format consistent with a type declared as
> pass-by-value two byte width". PostgreSQL does not check that what
> you're returning actually matches what you declared.

Yes, but that wasn't the question.

I can PG_RETURN_CHAR(2), PG_RETURN_INT16(2) or PG_RETURN_INT32(2)
and end up with the same result (assuming the function is defined
to return char, int2 or int4, respectively in the SQL interface).

> The type as declared determines the storage required to store it. That

Yes, but for a function returning a value that does not exceed
sizeof(Datum), there is no *space* consequence. I would assume
most modern architectures use 32-bit (and larger) registers.

OTOH, some machines incur a (tiny) penalty for casting char to long.
Returning INT32 *may* be better from that standpoint -- assuming
there is no offsetting added cost in marshalling.

> might be a far more useful factor to consider than what is copied
> internally which, as has been pointed out, is probably below what you
> can measure.

Sure. But, given that the difference ONLY amounts to whether
I type "INT32" or "INT16" or "CHAR" in the PG_RETURN_ macro,
an understanding of what is going on "inside" can contribute
epsilon for or against performance. I'd be annoyed to have
built dozens of functions ASSUMING "INT32" when a *better*
assumption might have been "CHAR"... (I'm working in an
embedded environment where "spare CPU cycles" mean you've
wasted $$$ on hardware that you don't need :-/ )

--don

#7 Martijn van Oosterhout
kleptog@svana.org
In reply to: Don Y (#6)
Re: PG_RETURN_?

On Tue, May 02, 2006 at 10:06:19AM -0700, Don Y wrote:

>> The type as declared determines the storage required to store it. That
>
> Yes, but for a function returning a value that does not exceed
> sizeof(Datum), there is no *space* consequence. I would assume
> most modern architectures use 32-bit (and larger) registers.

When you return a Datum, it's always the same size. When you're
returning a string, you're still returning a Datum, which may be 4 or 8
bytes depending on the platform.

But what I was referring to was the space to store the data in a tuple
on disk, or to send the data to a client. These are affected by the
choice of representation.
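
A rough sketch of the distinction between the in-memory Datum and the stored representation. Everything here is an assumption for illustration: `Datum` is modelled as a pointer-sized word, `pack_values` is a hypothetical helper, and real tuples also carry headers and alignment padding that are ignored. The only point is that the byte count follows the declared width, not sizeof(Datum).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for PostgreSQL's Datum (an assumption of this sketch). */
typedef uintptr_t Datum;

/* Copy n values into out using `width` bytes each (1, 2 or 4).
 * Returns the number of bytes written, or 0 for an unsupported width.
 * The caller must provide at least n * width bytes in out. */
static inline size_t pack_values(const Datum *vals, size_t n,
                                 size_t width, unsigned char *out)
{
    size_t used = 0;

    if (width != 1 && width != 2 && width != 4)
        return 0;

    for (size_t i = 0; i < n; i++) {
        if (width == 1) {
            uint8_t v = (uint8_t) vals[i];      /* char-sized column */
            memcpy(out + used, &v, sizeof v);
        } else if (width == 2) {
            uint16_t v = (uint16_t) vals[i];    /* int2-sized column */
            memcpy(out + used, &v, sizeof v);
        } else {
            uint32_t v = (uint32_t) vals[i];    /* int4-sized column */
            memcpy(out + used, &v, sizeof v);
        }
        used += width;
    }
    return used;
}
```

Packing the thirteen values 0..12 this way takes 13, 26 or 52 bytes for widths 1, 2 and 4, even though each value occupied one full Datum while it was being passed around.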

> OTOH, some machines incur a (tiny) penalty for casting char to long.
> Returning INT32 *may* be better from that standpoint -- assuming
> there is no offsetting added cost in marshalling.

Within the backend the only representations used are Datum and tuples.
I don't think either of them would have a noticeable difference between
various pass-by-value formats.

> ... I'd be annoyed to have
> built dozens of functions ASSUMING "INT32" when a *better*
> assumption might have been "CHAR"... (I'm working in an
> embedded environment where "spare CPU cycles" mean you've
> wasted $$$ on hardware that you don't need :-/ )

Hmm, postgres doesn't try to save on cycles. The philosophy is to get
it right first, then make it fast. The entire fmgr interface is slower
than the original design (old-style functions), but this design works
on all platforms whereas the old one didn't.

I'd go for INT32, it's most likely to be an "int" which should be "the
most natural size for the machine".

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/


#8 Don Y
pgsql@DakotaCom.Net
In reply to: Martijn van Oosterhout (#7)
Re: PG_RETURN_?

Martijn van Oosterhout wrote:

> When you return a Datum, it's always the same size. When you're
> returning a string, you're still returning a Datum, which may be 4 or 8
> bytes depending on the platform.

Yes.

> But what I was referring to was the space to store the data in a tuple
> on disk, or to send the data to a client. These are affected by the
> choice of representation.

So, as I had mentioned before, you marshal as a *byte* stream
and not a *Datum* stream?

> Hmm, postgres doesn't try to save on cycles.

<grin> Yes, I noticed. :> But it's hard for me to get this
"attitude" out of the way I approach a problem. :-(
(e.g., I wouldn't count people at a rally using a *float*! :>)

> The philosophy is to get
> it right first, then make it fast. The entire fmgr interface is slower
> than the original design (old-style functions), but this design works
> on all platforms whereas the old one didn't.

Exactly. I could more "efficiently" replace postgres with
dedicated structures to do what I want. But that would tie me
to an implementation that is less portable (and less maintainable).

> I'd go for INT32, it's most likely to be an "int" which should be "the
> most natural size for the machine".

(sigh) Yes, I suppose so. Though it can have a big impact
on transport delays (server to client) if things really
are marshalled as byte streams, etc.

<shrug> I suppose I should just "do it" and let technology
catch up with my inefficiencies later!

Thanks!
--don