printTable API (was: Show INHERIT in \du)

Started by Brendan Jurdalmost 18 years ago17 messages
#1Brendan Jurd
direvus@gmail.com

On 25/03/2008, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"Brendan Jurd" <direvus@gmail.com> writes:

This makes me wonder whether print.c could offer something a bit more
helpful to callers wishing to DIY a table; we could have a
table-building struct with methods like addHeader and addCell.

Once you have two occurrences of a pattern, it's reasonable to assume
there will be more later. +1 for building a little bit of infrastructure.

Hi hackers,

Per the above discussion, I'm looking at creating a nice API in
print.c for callers wishing to build their own psql output table.

Currently, if you want to build your own table you need to implement
your own local version of the logic in printQuery.
describeOneTableDetails is the only place this happens right now, but
due to some localisation issues it looks like my \du patch will have
to build its table manually as well.

There seem to be some differences in the way printQuery and
describeOneTableDetails go about constructing the array of cells.
Here's the version from describe:

<code>
/* Generate table cells to be printed */
/* note: initialize all cells[] to NULL in case of error exit */
cells = pg_malloc_zero((numrows * cols + 1) * sizeof(*cells));

for (i = 0; i < numrows; i++)
{
/* Name */
#ifdef WIN32
cells[i * cols + 0] = mbvalidate(PQgetvalue(res, i, 0), myopt.encoding);
#else
cells[i * cols + 0] = PQgetvalue(res, i, 0); /* don't free this
* afterwards */
#endif

/* Type */
#ifdef WIN32
cells[i * cols + 1] = mbvalidate(PQgetvalue(res, i, 1), myopt.encoding);
#else
cells[i * cols + 1] = PQgetvalue(res, i, 1); /* don't free this
* either */
#endif
</code>

And the version from printQuery:

<code>
/* set cells */
ncells = ntuples * nfields;
cells = pg_local_calloc(ncells + 1, sizeof(*cells));

i = 0;
for (r = 0; r < ntuples; r++)
{
for (c = 0; c < nfields; c++)
{
if (PQgetisnull(result, r, c))
cells[i] = opt->nullPrint ? opt->nullPrint : "";
else
{
cells[i] = (char *)
mbvalidate((unsigned char *) PQgetvalue(result, r, c),
opt->topt.encoding);
#ifdef ENABLE_NLS
if (opt->trans_columns && opt->trans_columns[c])
cells[i] = _(cells[i]);
#endif
}
i++;
}
}
</code>

So, it looks like:

1. describe malloc's the cells to zero, but print just does a local
calloc without any initialisation.

2. describe only does an mbvalidate for WIN32, but print does it in all cases.

Any opinions on which behaviour is preferable/more correct in these two cases?

Regards,
BJ

#2Heikki Linnakangas
heikki@enterprisedb.com
In reply to: Brendan Jurd (#1)
Re: printTable API (was: Show INHERIT in \du)

Brendan Jurd wrote:

1. describe malloc's the cells to zero, but print just does a local
calloc without any initialisation.

Um, calloc is the same as malloc + zero. Those two seem identical to me.

2. describe only does an mbvalidate for WIN32, but print does it in all cases.

There's this comment in describe.c:

/*
* mbvalidate() is used in function describeOneTableDetails() to make sure
* all characters of the cells will be printed to the DOS console in a
* correct way
*/

I don't know what that's about. Perhaps there's something in the archives...

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Brendan Jurd (#1)
Re: printTable API (was: Show INHERIT in \du)

"Brendan Jurd" <direvus@gmail.com> writes:

So, it looks like:

1. describe malloc's the cells to zero, but print just does a local
calloc without any initialisation.

There isn't any functional difference there. I am not sure, but I think
the reason print.c has its own malloc wrappers instead of depending on
common.c's is that we use print.c in some bin/scripts/ programs that
do not want common.c too.

2. describe only does an mbvalidate for WIN32, but print does it in all cases.

I don't know why describe only does that for WIN32; it looks
inconsistent to me too. Possibly some trolling in the CVS history would
give a clue about this.

If you're not actively working on this patch right now, I am going to go
ahead and commit the other open patches for describe.c. If you do have
a patch in progress, I'm willing to hold off to avoid any merge
conflicts. Let me know.

regards, tom lane

#4Brendan Jurd
direvus@gmail.com
In reply to: Tom Lane (#3)
Re: printTable API (was: Show INHERIT in \du)

On 31/03/2008, Tom Lane <tgl@sss.pgh.pa.us> wrote:

There isn't any functional difference there. I am not sure, but I think
the reason print.c has its own malloc wrappers instead of depending on
common.c's is that we use print.c in some bin/scripts/ programs that
do not want common.c too.

Okay, thanks (to Heikki as well) for the clarification. It's good to
know they are functionally equivalent. I'll do some snooping in
/scripts to get a better view of the situation.

2. describe only does an mbvalidate for WIN32, but print does it in all cases.

I don't know why describe only does that for WIN32; it looks
inconsistent to me too. Possibly some trolling in the CVS history would
give a clue about this.

Alright, I'll be spending some quality time with 'annotate' then =)

If you're not actively working on this patch right now, I am going to go
ahead and commit the other open patches for describe.c. If you do have
a patch in progress, I'm willing to hold off to avoid any merge
conflicts. Let me know.

I didn't get much beyond sketching out my struct. Now that I have
answers to the questions I raised above, I can push forward with the
patch, but I wouldn't expect to have anything to submit for another
couple of days at least.

Short answer: I have zero objections to you committing those patches.

Thanks for your time,
BJ

#5Bruce Momjian
bruce@momjian.us
In reply to: Brendan Jurd (#4)
Re: printTable API (was: Show INHERIT in \du)

The author has been given feedback so this has been saved for the next
commit-fest:

http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---------------------------------------------------------------------------

Brendan Jurd wrote:

On 31/03/2008, Tom Lane <tgl@sss.pgh.pa.us> wrote:

There isn't any functional difference there. I am not sure, but I think
the reason print.c has its own malloc wrappers instead of depending on
common.c's is that we use print.c in some bin/scripts/ programs that
do not want common.c too.

Okay, thanks (to Heikki as well) for the clarification. It's good to
know they are functionally equivalent. I'll do some snooping in
/scripts to get a better view of the situation.

2. describe only does an mbvalidate for WIN32, but print does it in all cases.

I don't know why describe only does that for WIN32; it looks
inconsistent to me too. Possibly some trolling in the CVS history would
give a clue about this.

Alright, I'll be spending some quality time with 'annotate' then =)

If you're not actively working on this patch right now, I am going to go
ahead and commit the other open patches for describe.c. If you do have
a patch in progress, I'm willing to hold off to avoid any merge
conflicts. Let me know.

I didn't get much beyond sketching out my struct. Now that I have
answers to the questions I raised above, I can push forward with the
patch, but I wouldn't expect to have anything to submit for another
couple of days at least.

Short answer: I have zero objections to you committing those patches.

Thanks for your time,
BJ

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#6Brendan Jurd
direvus@gmail.com
In reply to: Tom Lane (#3)
Re: printTable API (was: Show INHERIT in \du)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 31/03/2008, Tom Lane wrote:

"Brendan Jurd" writes:

1. describe malloc's the cells to zero, but print just does a local
calloc without any initialisation.

There isn't any functional difference there. I am not sure, but I think
the reason print.c has its own malloc wrappers instead of depending on
common.c's is that we use print.c in some bin/scripts/ programs that
do not want common.c too.

Yeah, it looks like createlang and droplang use print.c to emit a list
of installed languages.

2. describe only does an mbvalidate for WIN32, but print does it in all cases.

I don't know why describe only does that for WIN32; it looks
inconsistent to me too. Possibly some trolling in the CVS history would
give a clue about this.

Well, mbvalidate was originally added to print.c in 2001, as part of a
big patch to add multibyte support to psql [1]http://repo.or.cz/w/PostgreSQL.git?a=commit;h=a428cef1. However, it was only
added to describe much later (2003) in response to a bug report about
8-bit characters not displaying correctly on the Windows console [2]http://repo.or.cz/w/PostgreSQL.git?a=commit;h=e6a16c17 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (GNU/Linux) Comment: http://getfiregpg.org.
I think that because the bug was only observed in Windows, the patch
was added #ifdef WIN32, even though print.c was already using
mbvalidate for all content.

This nicely illustrates the nuisance inherent to duplication of code!

Based on this, I'm going to go ahead with using mbvalidate in all cases.

Cheers,
BJ

[1]: http://repo.or.cz/w/PostgreSQL.git?a=commit;h=a428cef1
[2]: http://repo.or.cz/w/PostgreSQL.git?a=commit;h=e6a16c17 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (GNU/Linux) Comment: http://getfiregpg.org
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: http://getfiregpg.org

iD8DBQFH9QFt5YBsbHkuyV0RAv2ZAJ4/rfyjgFOh8XZo6aJo68dz5NsovQCgmf40
fCXMlsHdg1r4oTpfZD5DH+0=
=PrN1
-----END PGP SIGNATURE-----

#7Brendan Jurd
direvus@gmail.com
In reply to: Brendan Jurd (#1)
1 attachment(s)
Re: printTable API (was: Show INHERIT in \du)

On Sun, Mar 30, 2008 at 9:39 AM, Brendan Jurd <direvus@gmail.com> wrote:

On 25/03/2008, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"Brendan Jurd" <direvus@gmail.com> writes:

This makes me wonder whether print.c could offer something a bit more
helpful to callers wishing to DIY a table; we could have a
table-building struct with methods like addHeader and addCell.

Once you have two occurrences of a pattern, it's reasonable to assume
there will be more later. +1 for building a little bit of infrastructure.

Per the above discussion, I'm looking at creating a nice API in
print.c for callers wishing to build their own psql output table.

Currently, if you want to build your own table you need to implement
your own local version of the logic in printQuery.
describeOneTableDetails is the only place this happens right now, but
due to some localisation issues it looks like my \du patch will have
to build its table manually as well.

I'd like to submit my first version of this patch for review. I have
introduced a new struct in print.h called printTableContent, which is
used to compose the contents of a psql table. The methods exposed for
this struct are as follows:

void printTableInit(printTableContent *const content, const char *title,
const int ncolumns, const int nrows);

void printTableAddHeader(printTableContent *const content, const char *header,
const int encoding, const bool translate, const char align);

void printTableAddCell(printTableContent *const content, const char *cell,
const int encoding, const bool translate);

void printTableAddFooter(printTableContent *const content, const char *footer);

void printTableSetFooter(printTableContent *const content, const char *footer);

void printTableCleanUp(printTableContent *const content);

-= Notes

The column headers and cells are implemented as simple arrays of
pointers to existing strings. It's necessary for the calling site to
ensure that these strings survive at least until the table has been
printed. That's usually not a problem since the headers and cells
tend to be inside a PGresult.

Footers are a bit different. I've implemented them as a very
primitive singly-linked list which copies its contents with strdup. A
lot of the complexity in the pre-patch describeOneTableDetails comes
from the fact that it needs to know the number of footers in advance.
The singly-linked list allows callers to incrementally add an
arbitrary number of footers, and doesn't require them to preserve the
strings, which is nice when you're working with a temporary
PQExpBuffer to produce a footer.

-= QA

Not having written much C, I'm concerned about memory management
errors. But I've run my own tests with describeOneTableDetails on
tables with every kind of footer there is, and put the patch through
valgrind, and I haven't encountered any segfaults or memory leaks.

Both the serial and parallel regression tests passed perfectly.

-= Documentation

None required as far as I can tell, aside from code comments.

-= Patch tracking

I'll add this to the May CommitFest wiki page myself -- Bruce, you
don't need to do anything at all! =)

Looking forward to your comments.

Cheers,
BJ

Attachments:

print-table.diff_0.bz2application/x-bzip2; name=print-table.diff_0.bz2Download
#8Alvaro Herrera
alvherre@commandprompt.com
In reply to: Brendan Jurd (#7)
Re: printTable API (was: Show INHERIT in \du)

Brendan Jurd escribi�:

I'd like to submit my first version of this patch for review. I have
introduced a new struct in print.h called printTableContent, which is
used to compose the contents of a psql table. The methods exposed for
this struct are as follows:

Looks cool -- on a first read, I think you should add some more code
comments at the top of each function specifying whether the texts need
to be translated by the caller or done by the function itself. Also it
would be good if it is consistent, too :-)

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#9Brendan Jurd
direvus@gmail.com
In reply to: Alvaro Herrera (#8)
1 attachment(s)
Re: printTable API (was: Show INHERIT in \du)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Mon, Apr 14, 2008 at 6:03 AM, Alvaro Herrera wrote:

Looks cool -- on a first read, I think you should add some more code
comments at the top of each function specifying whether the texts need
to be translated by the caller or done by the function itself. Also it
would be good if it is consistent, too :-)

Thanks for your feedback Alvaro. Taking a look at this with fresh
eyes, I realised it was daft to pass the encoding to each AddHeader
and AddCell call, since the encoding isn't going to change between
cells.

Instead I put a pointer to a printTableOpt inside the
printTableContent. The Opt struct contains the encoding and the Add
functions refer to the encoding internally when adding items. This
also has the benefit that you don't need to pass a printTableOpt
separately to the functions that actually do the printing.

Per your suggestion, I fleshed out the comments on translation, in
particular explaining why AddHeader and AddCell have an option for
translation but AddFooter does not.

The second version of the patch is attached.

Now my question is, how do I indicate that a second version of the
patch has been submitted on the wiki? Should I leave the primary link
pointing at the original submission, or update it to point at this
message?

Cheers,
BJ
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.7 (GNU/Linux)
Comment: http://getfiregpg.org

iD8DBQFIBGMz5YBsbHkuyV0RArf1AJwOsz2Wq5g7vg6DvyOzD7Ixsgl8EgCg4HyZ
sxtmc2A5FYiwYkkrhfpDsDM=
=jvKI
-----END PGP SIGNATURE-----

Attachments:

print-table_1.diff.bz2application/x-bzip2; name=print-table_1.diff.bz2Download
#10Martijn van Oosterhout
kleptog@svana.org
In reply to: Brendan Jurd (#9)
Re: [HACKERS] printTable API (was: Show INHERIT in \du)

On Tue, Apr 15, 2008 at 06:11:32PM +1000, Brendan Jurd wrote:

Now my question is, how do I indicate that a second version of the
patch has been submitted on the wiki? Should I leave the primary link
pointing at the original submission, or update it to point at this
message?

I'd say add a row: New version submitted (link) (brief summary of
changes)

If there are many more versions you can consider cutting, but you're
not there yet.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

Show quoted text

Please line up in a tree and maintain the heap invariant while
boarding. Thank you for flying nlogn airlines.

#11Alvaro Herrera
alvherre@commandprompt.com
In reply to: Brendan Jurd (#9)
1 attachment(s)
Re: printTable API (was: Show INHERIT in \du)

Brendan Jurd escribi�:

The second version of the patch is attached.

Thanks. I looked the patch over and did some minor changes. Modified
version attached.

One thing I'm not entirely happy about is the fact that we need a
pointer to the last footer/cell/header, so two pointers for each element
kind.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Attachments:

print-table-3.patch.bz2application/octet-streamDownload
#12Brendan Jurd
direvus@gmail.com
In reply to: Alvaro Herrera (#11)
Re: printTable API (was: Show INHERIT in \du)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Thu, Apr 17, 2008 at 7:27 AM, Alvaro Herrera wrote:

Brendan Jurd escribió:

The second version of the patch is attached.

Thanks. I looked the patch over and did some minor changes. Modified
version attached.

Cool, I had a look through your changes and they all seemed fine to
me. In particular, moving the header comments to the ... header ...
file seemed like a smart move. =)

One thing I'm not entirely happy about is the fact that we need a
pointer to the last footer/cell/header, so two pointers for each element
kind.

Well, the alternative is iterating through the array each time you
want to add something until you hit a NULL pointer, and adding the new
item at that point. Considering we're only chewing up an extra 4 *
sizeof(pointer) = 16 bytes in the struct, it seems like a reasonable
price to pay for the convenience.

What is it about the extra fields that makes you unhappy?

Cheers,
BJ
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: http://getfiregpg.org

iD8DBQFIBo2y5YBsbHkuyV0RAmLhAJ9CP/9L1Nv7WbCtrCYu6tyoGhQItQCeMRm0
HegSSBq8tXw43Kj2xeQ2RCs=
=RvzP
-----END PGP SIGNATURE-----

#13Alvaro Herrera
alvherre@commandprompt.com
In reply to: Brendan Jurd (#12)
Re: printTable API (was: Show INHERIT in \du)

Brendan Jurd escribi�:

On Thu, Apr 17, 2008 at 7:27 AM, Alvaro Herrera wrote:

One thing I'm not entirely happy about is the fact that we need a
pointer to the last footer/cell/header, so two pointers for each
element kind.

Well, the alternative is iterating through the array each time you
want to add something until you hit a NULL pointer, and adding the new
item at that point. Considering we're only chewing up an extra 4 *
sizeof(pointer) = 16 bytes in the struct, it seems like a reasonable
price to pay for the convenience.

Well, consider that there are going to be 1 or 2 entries in the arrays
in most cases anyway :-) Well, as far as footers go anyway ... I just
realized that in all other cases it will certainly be the wrong thing to
do :-) Still, perhaps a integer count is better?

What is it about the extra fields that makes you unhappy?

I don't know if "unnecessarity" is a word, but I hope you get what I
mean :-)

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#14Brendan Jurd
direvus@gmail.com
In reply to: Alvaro Herrera (#13)
Re: printTable API (was: Show INHERIT in \du)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Thu, Apr 17, 2008 at 9:53 AM, Alvaro Herrera wrote:

Well, consider that there are going to be 1 or 2 entries in the arrays
in most cases anyway :-) Well, as far as footers go anyway ... I just
realized that in all other cases it will certainly be the wrong thing to
do :-) Still, perhaps a integer count is better?

I oscillated between using an integer and a pointer for ->header and
- ->cell while I was writing the code. In the end I went with pointer
simply because it makes the iterations nicer looking, and there's a
symmetry in having the "first item" and "last item" pointers be the
same type.

If there's some technical reason that integers would be better, I'll
all ears, but from an aesthetic/code readability perspective I like
the pointer.

Brendan Jurd escribió:
What is it about the extra fields that makes you unhappy?

I don't know if "unnecessarity" is a word, but I hope you get what I
mean :-)

Since the footers list is usually pretty short, you could make an
argument for dropping the "last footer" pointer and just iterating
through the list to find the last element every time you want to do
something with footers. And to do that, you'd have to declare a
temporary pointer inside the AddFooter function anyway, so you're not
getting rid of the pointer so much as making it slightly more
transient.

All that having been said, C isn't my native language. So if the
pointer is indeed wasteful I'm happy to cede to the wisdom of more
experienced C coders.

Cheers,
BJ
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.7 (GNU/Linux)
Comment: http://getfiregpg.org

iD8DBQFIBw7m5YBsbHkuyV0RAv4IAJ0cKrziZpNWkVV7LxFhlV/V5L0pJACfUtZ+
cVbcjL1e89j21JDZJBVdBqw=
=Nzr+
-----END PGP SIGNATURE-----

#15Alvaro Herrera
alvherre@commandprompt.com
In reply to: Brendan Jurd (#12)
Re: printTable API (was: Show INHERIT in \du)

Brendan Jurd escribi�:

On Thu, Apr 17, 2008 at 7:27 AM, Alvaro Herrera wrote:

Thanks. I looked the patch over and did some minor changes. Modified
version attached.

Cool, I had a look through your changes and they all seemed fine to
me. In particular, moving the header comments to the ... header ...
file seemed like a smart move. =)

FWIW I just noticed something else. With this patch we add pg_strdup
calls into print.c. pg_strdup lives in common.c.

This is fine as psql is concerned, but we have another problem which is
that in bin/scripts there are two programs that want to use
printQuery(). The problem is that there's no pg_strdup there :-(

The easy solution is to add pg_strdup to bin/scripts/common.c, but there
we don't have a global progname, so the error report in the out of
memory case cannot carry the name of the program crashing.

I don't like that, but I don't see any other solution. Ideas welcome.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#16Brendan Jurd
direvus@gmail.com
In reply to: Alvaro Herrera (#15)
1 attachment(s)
Re: printTable API (was: Show INHERIT in \du)

Hi guys,

Here's the latest version of the printTable API. This version is
against the current HEAD and merges in the changes made by the
recently committed psql wrap patch.

This version also includes Alvaro's fix for the issue of pg_strdup not
being available to programs in scripts/ (as quoted below).

Clean compile and all regression tests passed on amd64 gentoo.

Cheers,
BJ

On Thu, May 8, 2008 at 7:55 AM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:

Show quoted text

FWIW I just noticed something else. With this patch we add pg_strdup
calls into print.c. pg_strdup lives in common.c.

This is fine as psql is concerned, but we have another problem which is
that in bin/scripts there are two programs that want to use
printQuery(). The problem is that there's no pg_strdup there :-(

The easy solution is to add pg_strdup to bin/scripts/common.c, but there
we don't have a global progname, so the error report in the out of
memory case cannot carry the name of the program crashing.

I don't like that, but I don't see any other solution. Ideas welcome.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Attachments:

print-table-5.diff.bz2application/x-bzip2; name=print-table-5.diff.bz2Download
#17Alvaro Herrera
alvherre@commandprompt.com
In reply to: Brendan Jurd (#16)
Re: printTable API (was: Show INHERIT in \du)

Brendan Jurd escribi�:

Here's the latest version of the printTable API. This version is
against the current HEAD and merges in the changes made by the
recently committed psql wrap patch.

This version also includes Alvaro's fix for the issue of pg_strdup not
being available to programs in scripts/ (as quoted below).

Thanks. I had to merge Tom's fixes to the wrap code too.

Another thing that I changed is that now "not null" and "default %s" is
translatable in a table description. Also, I removed the _() call
around things like "?%c? \"%s.%s\"", which is what we use for tables of
which we don't recognize the relkind.

Thanks for the patch.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.