CStringGetTextDatum and other conversions in server-side code

Started by dandlabout 10 years ago10 messagesgeneral
Jump to latest
#1dandl
david@andl.org

Writing a language handler: pl_language_handler. Need to do a variety of data conversions. One of them is char* C-string to and from Text/Varchar.

The include file postgres.h has the macro CStringGetDatum but this is of no use: it’s just a cast.

There is a builtin macro CStringGetTextDatum which calls a function and looks like it might work. Questions:

1. Is this the right function?

2. Is it OK to use, or are there restrictions?

3. Does it have friends: are there other conversion functions like this for other data types (decimal, time in particular)?

4. Is there any particular documentation I can read?

Regards

David M Bennett FACS

_____

Andl - A New Database Language - andl.org

#2Joe Conway
mail@joeconway.com
In reply to: dandl (#1)
Re: CStringGetTextDatum and other conversions in server-side code

(please do not post HTML to these lists; see:
https://wiki.postgresql.org/wiki/Mailing_Lists)

On 03/02/2016 03:06 PM, david@andl.org wrote:

Writing a language handler: pl_language_handler. Need to do a variety of
data conversions. One of them is char* C-string to and from Text/Varchar.

The include file postgres.h has the macro CStringGetDatum but this is of
no use: it’s just a cast.

There is a builtin macro CStringGetTextDatum which calls a function and
looks like it might work. Questions:

1. Is this the right function?
2. Is it OK to use, or are there restrictions?
3. Does it have friends: are there other conversion functions like
this for other data types (decimal, time in particular)?
4. Is there any particular documentation I can read?

Your best bet is to look at examples. The code in the source tree under
contrib has many examples, and more specifically src/pl/* should have
even more relevant examples. Remember, grep is your friend -- get a copy
of the source tree and try:

cd <location-of-postgres-source-code>
grep -rni CStringGetTextDatum src/pl/* --include=*.c
grep -rni CStringGetTextDatum contrib/* --include=*.c
grep -rni TextDatumGetCstring contrib/* --include=*.c

Additionally there are external projects that implement postgres
procedural language handlers, e.g.:
https://github.com/jconway/plr
Lots of good examples there too. Google for others.

Basically any symbol exported from the postgres backend can be used by
your PL. There is not a lot of documentation other than in the source
code itself. Look at examples, see what they do, emulate it.

HTH,

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development

#3dandl
david@andl.org
In reply to: Joe Conway (#2)
Re: CStringGetTextDatum and other conversions in server-side code

(please do not post HTML to these lists; see:
https://wiki.postgresql.org/wiki/Mailing_Lists)
[dmb>] I checked the list first: it looks like about 25-50% HTML. Happy to oblige, but I think you've got your work cut out.

and looks like it might work. Questions:

1. Is this the right function?
2. Is it OK to use, or are there restrictions?
3. Does it have friends: are there other conversion functions like
this for other data types (decimal, time in particular)?
4. Is there any particular documentation I can read?

Your best bet is to look at examples. The code in the source tree under contrib has many examples, and more specifically src/pl/* should have even more relevant examples. Remember, grep is your friend -- get a copy of the source tree and try:
[dmb>]
[dmb>] I have the source tree, I use grep and other tools, and I've read thousands of lines of code. There is a great deal of inconsistency, and much of the code looks like "Summer of Code", or has been written a number of years ago or talks about issues I know have been resolved, or says it's just 'legacy support'. The upshot is: I haven't yet been able to figure out the 'right' way to do things, or what might go wrong if I choose the 'wrong' way.

Basically any symbol exported from the postgres backend can be used by your PL. There is not a lot of documentation other than in the source code itself. Look at
examples, see what they do, emulate it.
[dmb>] That was my plan. But I do prefer to emulate code that is 'right'.

[dmb>] So how would I go about finding a set of useful conversion functions for basic types (real, decimal, time, etc)?

Regards
David M Bennett FACS

Andl - A New Database Language - andl.org

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#4John R Pierce
pierce@hogranch.com
In reply to: dandl (#3)
Re: CStringGetTextDatum and other conversions in server-side code

On 3/2/2016 4:20 PM, david@andl.org wrote:

[dmb>] So how would I go about finding a set of useful conversion functions for basic types (real, decimal, time, etc)?

the basic SQL to C mappings are defined by the H files listed here,
http://www.postgresql.org/docs/current/static/xfunc-c.html#XFUNC-C-TYPE-TABLE

--
john r pierce, recycling bits in santa cruz

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#5Joe Conway
mail@joeconway.com
In reply to: dandl (#3)
Re: CStringGetTextDatum and other conversions in server-side code

On 03/02/2016 04:20 PM, david@andl.org wrote:

(please do not post HTML to these lists; see:
https://wiki.postgresql.org/wiki/Mailing_Lists)
[dmb>] I checked the list first: it looks like about 25-50% HTML.
Happy to oblige, but I think you've got your work cut out.

Understood, but you should know that quite a few of the more experienced
people on these lists will totally ignore your HTML emails, so you do
yourself no favors by not following that advice.

[dmb>] I have the source tree, I use grep and other tools, and I've
read thousands of lines of code. There is a great deal of
inconsistency, and much of the code looks like "Summer of Code", or
has been written a number of years ago or talks about issues I know
have been resolved, or says it's just 'legacy support'. The upshot
is: I haven't yet been able to figure out the 'right' way to do
things, or what might go wrong if I choose the 'wrong' way.

There is no specific "right" way per se. The postgres code base has been
evolving for going on 30 years, so yes, there are inconsistencies. As I
said, best advice is to find code similar to whatever you are trying to
achieve and emulate it.

[dmb>] That was my plan. But I do prefer to emulate code that is
'right'.

If it is in the postgres source tree (i.e. contrib and src/pl) you can
consider it to be about as "right" as you can get because it is all
maintained to be that way. If it is an external project you'll have to
be more careful.

[dmb>] So how would I go about finding a set of useful conversion
functions for basic types (real, decimal, time, etc)?

Sorry I don't have a better answer than before: grep and/or read source
for other PLs.

FWIW here are examples from PL/R for incoming (argument) and outgoing
(result) conversions of scalar values:

https://github.com/jconway/plr/blob/master/pg_conversion.c#L632
https://github.com/jconway/plr/blob/master/pg_conversion.c#L1002

That same file also has routines for conversions of more complex data types.

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development

#6dandl
david@andl.org
In reply to: John R Pierce (#4)
Re: CStringGetTextDatum and other conversions in server-side code

[dmb>] So how would I go about finding a set of useful conversion functions for basic types (real, decimal, time, etc)?

the basic SQL to C mappings are defined by the H files listed here, http://www.postgresql.org/docs/current/static/xfunc-c.html#XFUNC-C-TYPE-TABLE
[dmb>]
[dmb>] Yes, thanks, I found those. Most useful too. What I was looking for was the preferred set of functions that could be used to convert between those types and native C types (that could be used externally with no Postgres includes).

[dmb>] This is fairly easy for ints and reals, but is particularly a problem for all the variable length types (eg text, time and decimal).

Regards
David M Bennett FACS

Andl - A New Database Language - andl.org

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#7John R Pierce
pierce@hogranch.com
In reply to: dandl (#6)
Re: CStringGetTextDatum and other conversions in server-side code

On 3/2/2016 9:36 PM, david@andl.org wrote:

[dmb>] This is fairly easy for ints and reals, but is particularly a problem for all the variable length types (eg text, time and decimal).

all the text types are simply a 32bit length and an array of
characters. you need to be aware of the encoding, and null is just
another character.

single and double real/floats are stored in standard x86 IEEE floating
point.

the date and time types are all fixed lengths, mostly a long int or two.

numeric in postgres is a bit tricky. its a decimal number stored in
base 10000 as a series of short integers, and can represent. does your
language even have a decimal or BCD or something numeric type?

--
john r pierce, recycling bits in santa cruz

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#8David Bennett
davidb@pfxcorp.com
In reply to: John R Pierce (#7)
Re: CStringGetTextDatum and other conversions in server-side code

all the text types are simply a 32bit length and an array of
characters. you need to be aware of the encoding, and null is just
another character.
[dmb>] Yes, I can see that. What I need is a function that will convert to and from whatever the actual encoding happens to be into Unicode, period. I can handle UTF8/16/32 as long as I know which.

single and double real/floats are stored in standard x86 IEEE floating point.
[dmb>] Apart from the x86/x64 oddity that seems easy enough (I'm using x64).

the date and time types are all fixed lengths, mostly a long int or two.
[dmb>] Again, I just need a function to convert to and from one single date/time format that covers everything. I use 2xint64 and no TZ. It covers year 9999 down to milliseconds and then some.

numeric in postgres is a bit tricky. its a decimal number stored in
base 10000 as a series of short integers, and can represent. does your
language even have a decimal or BCD or something numeric type?
[dmb>] Of course. I have decimal which is also 2xint64: it equates to about 28 digits. For business purposes I think that's enough, but I may go for something bigger in future. Most likely the simplest solution will be to use string as an intermediate.

[dmb>] The point is that none of my code is in C, so I can't get Postgres macros or data types and my conversion routines into the same scope. That's why I'm looking for basic routines that support basic C data types as the go between, so I don't have to write too much C code.

Regards
David M Bennett FACS

Andl - A New Database Language - andl.org

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#9dandl
david@andl.org
In reply to: John R Pierce (#7)
Re: CStringGetTextDatum and other conversions in server-side code

all the text types are simply a 32bit length and an array of
characters. you need to be aware of the encoding, and null is just
another character.
[dmb>] Yes, I can see that. What I need is a function that will convert to and from whatever the actual encoding happens to be into Unicode, period. I can handle UTF8/16/32 as long as I know which.

single and double real/floats are stored in standard x86 IEEE floating point.
[dmb>] Apart from the x86/x64 oddity that seems easy enough (I'm using x64).

the date and time types are all fixed lengths, mostly a long int or two.
[dmb>] Again, I just need a function to convert to and from one single date/time format that covers everything. I use 2xint64 and no TZ. It covers year 9999 down to milliseconds and then some.

numeric in postgres is a bit tricky. its a decimal number stored in
base 10000 as a series of short integers, and can represent. does your
language even have a decimal or BCD or something numeric type?
[dmb>] Of course. I have decimal which is also 2xint64: it equates to about 28 digits. For business purposes I think that's enough, but I may go for something bigger in future. Most likely the simplest solution will be to use string as an intermediate.

[dmb>] The point is that none of my code is in C, so I can't get Postgres macros or data types and my conversion routines into the same scope. That's why I'm looking for basic routines that support basic C data types as the go between, so I don't have to write too much C code.

Regards
David M Bennett FACS

Andl - A New Database Language - andl.org

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#10dandl
david@andl.org
In reply to: Joe Conway (#5)
Re: CStringGetTextDatum and other conversions in server-side code

FWIW here are examples from PL/R for incoming (argument) and outgoing
(result) conversions of scalar values:

https://github.com/jconway/plr/blob/master/pg_conversion.c#L632
https://github.com/jconway/plr/blob/master/pg_conversion.c#L1002

That same file also has routines for conversions of more complex data types.
[dmb>]
[dmb>] Thank you for the links. Obviously you've put a lot of effort into this work, and equally obviously you know far more about this than I do. But I think the problem I face is a little different.

This code shows a programming environment in which both the Postgres and R declarations are in scope, and the code references both. I can't do that. I don't have any C code or any C API that I can bring into a Postgres scope, and Postgres does not provide a 'pure' C API that can be referenced from another language.

So what I'm trying to do is to write that 'pure' interface, using a thin layer of C and a chosen set of intermediate data types. [Take a look at https://www.sqlite.org/c3ref/funclist.html to see what I mean by a pure interface.]

For the fixed types, the Datum format is simple enough, but the variable types are much harder. Text comes as char with an encoding and I need Unicode. Decimal and time are proprietary formats of some kind, which I would prefer not to have to access directly. And so on.

I'm making progress, but it's slow. I've got everything else working: SPI queries, decoding tuples, etc. It's just these conversions that have me stuck.

Regards
David M Bennett FACS

Andl - A New Database Language - andl.org

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general