Soliciting Feedback on Improving Server-Side Programming Documentation

Started by Corey Huinker almost 10 years ago, 10 messages
#1 Corey Huinker
corey.huinker@gmail.com

Over the past few months, I've been familiarizing myself with postgres
server side programming in C.

My attempts to educate myself were slow and halting. The existing server
side programming documentation has some examples, but those examples didn't
show me how to do what I wanted to do, and my research-via-google was highly
circular, almost always pointing back to the documentation I had already
found lacking, or a copy of it.

Most of what I have learned I have culled from asking people on IRC, or
bugging people I've met through user groups and PgConf. In all cases,
people have been extremely helpful. However, this method is inefficient,
because we're using two people's time, one of whom has to tolerate my
incessant questions and slow learning pace.

Furthermore, the helpful suggestions I received boiled down to:
1. The function/macro/var you're looking for is PG_FOO, git grep PG_FOO
2. Look in blah.c which does something like what you're trying to do
3. The comments in blah.h do a good job of listing and explaining this
macro or that

#1 git grep is a helpful reflex for discovering examples on my own, but it
requires that I have a term to search on in the first place, and too often
I don't know what I don't know.

#2 is the gold standard in terms of correctness (the code had to have
worked at least up to the last checkin date), and in terms of
discoverability it often gave me names of new macros to search for, coding
patterns, etc. However, I was always left with the questions: How would I
have figured this out on my own? How is the next person going to figure it
out? Why doesn't anybody document this?

#3 Often answers the last question in #2: It *is* documented, but that
documentation is not easily discoverable by conventional means.

So what I'd like to do is migrate some of the helpful information in the
header files into pages of web searchable documentation, and also to revamp
the existing documentation to be more relevant.

Along the way, I collected a list of things I wished I'd had from the start:

- A list of all the GETARG_* macros. It would have been especially great
if this were in table form: Your Parameter Is A / Use This Macro / Which
Gives This Result Type / Working example.
- A list/table of the DatumGet* macros. I'm aware that many of them
overlap/duplicate others. That'd be good to know too.
- The table at
http://www.postgresql.org/docs/9.1/static/errcodes-appendix.html has the
numeric codes and PL/PGSQL constants enumerated. It'd be nice if it had the
C #define as well
- The SPI documentation mentions most/all of the SPI functions, but I
couldn't find documentation on the SPI variables like SPI_processed and
SPI_tuptable.
- Examples and explanation of how PG_TRY()/PG_CATCH work. How to add
context callbacks.
- Direct Function Calls
- A comparison of the two modes of writing SRF functions (Materialize vs
multi-call)
- Less explanation of how to write V0-style functions. That was
called the "old style" back in version 7.1. Why is that information up
front in the documentation when so much else is sequestered in header files?

Some of these things may seem obvious/trivial to you. I would argue that
they're only obvious in retrospect, and the more obvious-to-you things we
robustly document, the quicker we accumulate programmers who are capable of
agreeing that it's obvious, and that's good for the community.

I'm aware that some of these APIs change frequently. In those cases, I
suggest that we make note of that on the same page.

Because I'm still going through the learning curve, I'm probably the least
qualified to write the actual documentation. However, I have a clear memory
of what was hard to learn and I have the motivation to make it easier on
the next person. That makes me a good focal point for gathering,
formatting, and submitting the documentation in patch form. I'm
volunteering to do so. What I need from you is:

- Citations of existing documentation in header files that could/should
be exposed in our more formal documentation.
- Explanations of any of the things above, which I can then reformat
into proposed documentation.
- A willingness to review the proposed new documentation
- Reasoned explanations for why this is a fool's errand

You supply the expertise, I'll write the patch.

Thanks in advance.

#2 Joshua D. Drake
jd@commandprompt.com
In reply to: Corey Huinker (#1)
Re: Soliciting Feedback on Improving Server-Side Programming Documentation

On 03/15/2016 10:02 AM, Corey Huinker wrote:

Some of these things may seem obvious/trivial to you. I would argue that
they're only obvious in retrospect, and the more obvious-to-you things
we robustly document, the quicker we accumulate programmers who are
capable of agreeing that it's obvious, and that's good for the community.

I'm aware that some of these APIs change frequently. In those cases, I
suggest that we make note of that on the same page.

I think this is all great. You may find some automated assistance from
doxygen.postgresql.org .

Sincerely,

JD

--
Command Prompt, Inc. http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3 Shulgin, Oleksandr
oleksandr.shulgin@zalando.de
In reply to: Corey Huinker (#1)
Re: Soliciting Feedback on Improving Server-Side Programming Documentation

On Tue, Mar 15, 2016 at 6:02 PM, Corey Huinker <corey.huinker@gmail.com>
wrote:

Over the past few months, I've been familiarizing myself with postgres
server side programming in C.

My attempts to educate myself were slow and halting. The existing server
side programming documentation has some examples, but those examples didn't
show me how to do what I wanted to do, and my research-via-google was highly
circular, almost always pointing back to the documentation I had already
found lacking, or a copy of it.

Most of what I have learned I have culled from asking people on IRC, or
bugging people I've met through user groups and PgConf. In all cases,
people have been extremely helpful. However, this method is inefficient,
because we're using two people's time, one of whom has to tolerate my
incessant questions and slow learning pace.

Furthermore, the helpful suggestions I received boiled down to:
1. The function/macro/var you're looking for is PG_FOO, git grep PG_FOO
2. Look in blah.c which does something like what you're trying to do
3. The comments in blah.h do a good job of listing and explaining this
macro or that

There's also a good deal of README files in the source tree, so I would add:

4. find src -name 'README*'

#4 Corey Huinker
corey.huinker@gmail.com
In reply to: Joshua D. Drake (#2)
Re: Soliciting Feedback on Improving Server-Side Programming Documentation

I think this is all great. You may find some automated assistance from
doxygen.postgresql.org .

Sincerely,

JD

doxygen is great as far as it goes, but it does a great job of separating
function definition from the comment explaining the function, so I have to
drill into the raw source anyway.

Also, doxygen isn't very helpful with macros, and a lot of functions in the
internals are actually macros.

#5 Corey Huinker
corey.huinker@gmail.com
In reply to: Shulgin, Oleksandr (#3)
Re: Soliciting Feedback on Improving Server-Side Programming Documentation

On Tue, Mar 15, 2016 at 1:19 PM, Shulgin, Oleksandr <
oleksandr.shulgin@zalando.de> wrote:

There's also a good deal of README files in the source tree, so I would
add:

4. find src -name 'README*'

That too. But READMEs don't show up (easily) in a Google search, so they
elude discovery. We should want to make discovery easy for the uninitiated.

#6 Joshua D. Drake
jd@commandprompt.com
In reply to: Corey Huinker (#5)
Re: Soliciting Feedback on Improving Server-Side Programming Documentation

On 03/15/2016 10:30 AM, Corey Huinker wrote:

On Tue, Mar 15, 2016 at 1:19 PM, Shulgin, Oleksandr
<oleksandr.shulgin@zalando.de> wrote:

There's also a good deal of README files in the source tree, so I
would add:

4. find src -name 'README*'

That too. But READMEs don't show up (easily) in a Google search, so
they elude discovery. We should want to make discovery easy for the
uninitiated.

I don't think anyone is arguing with you. I think we are trying to point
you to sources for your project.

JD

#7 Corey Huinker
corey.huinker@gmail.com
In reply to: Joshua D. Drake (#6)
Re: Soliciting Feedback on Improving Server-Side Programming Documentation

On Tue, Mar 15, 2016 at 1:35 PM, Joshua D. Drake <jd@commandprompt.com>
wrote:

On 03/15/2016 10:30 AM, Corey Huinker wrote:

On Tue, Mar 15, 2016 at 1:19 PM, Shulgin, Oleksandr
<oleksandr.shulgin@zalando.de> wrote:

There's also a good deal of README files in the source tree, so I
would add:

4. find src -name 'README*'

That too. But READMEs don't show up (easily) in a Google search, so
they elude discovery. We should want to make discovery easy for the
uninitiated.

I don't think anyone is arguing with you. I think we are trying to point
you to sources for your project.

I didn't mean to imply that anyone was arguing. All responses so far have
been positive.

#8 Álvaro Hernández Tortosa
aht@8kdata.com
In reply to: Corey Huinker (#1)
Re: Soliciting Feedback on Improving Server-Side Programming Documentation

I started a similar thread with probably similar concerns:
/messages/by-id/56D1A6AA.6080303@8kdata.com

I believe this effort is worth doing. I have added an item to my TODO list:
compile a list of the functions used by a selection of extensions, to
serve as the starting point of an "API".

Regards,

Álvaro

--
Álvaro Hernández Tortosa

-----------
8Kdata


#9 Corey Huinker
corey.huinker@gmail.com
In reply to: Álvaro Hernández Tortosa (#8)
Re: Soliciting Feedback on Improving Server-Side Programming Documentation

On Tue, Mar 15, 2016 at 4:38 PM, Álvaro Hernández Tortosa <aht@8kdata.com>
wrote:

I started a similar thread with probably similar concerns:
/messages/by-id/56D1A6AA.6080303@8kdata.com

I believe this effort is worth doing. I have added an item to my TODO list:
compile a list of the functions used by a selection of extensions, to
serve as the starting point of an "API".

Regards,

Álvaro

Clearly we have the same goal in mind. I don't know how I missed seeing
your thread.

#10 Craig Ringer
craig@2ndquadrant.com
In reply to: Corey Huinker (#1)
Re: Soliciting Feedback on Improving Server-Side Programming Documentation

On 16 March 2016 at 01:02, Corey Huinker <corey.huinker@gmail.com> wrote:

#1 git grep is a helpful reflex for discovering examples on my own, but it
requires that I have a term to search on in the first place, and too often
I don't know what I don't know.

Yep. This can be painful when you're trying to figure out what macro to use
to access fields in some struct, the PG_GETARG macro for a type, etc.

#2 is the gold standard in terms of correctness (the code had to have
worked at least up to the last checkin date), and in terms of
discoverability it often gave me names of new macros to search for, coding
patterns, etc. However, I was always left with the questions: How would I
have figured this out on my own? How is the next person going to figure it
out? Why doesn't anybody document this?

Indeed. In particular, it's not always obvious when you're new to the
codebase which files relate to what. The codebase is fairly well structured
but you have to get a decent understanding of how the bits of the server
fit together before you really know where to look.

A src/README doc with brief descriptions of the tree structure and key
files would be helpful, or additions to src/DEVELOPERS.

#3 Often answers the last question in #2: It *is* documented, but that
documentation is not easily discoverable by conventional means.

Particularly when it's a comment on a function that you have to know about
before you can find the comment.

So what I'd like to do is migrate some of the helpful information in the
header files into pages of web searchable documentation, and also to revamp
the existing documentation to be more relevant.

I'm not convinced by that part. Rather than moving stuff into the SGML
docs, which are frankly painful to maintain and less visible when editing
the relevant code, I'd be much happier to see README-style docs located
close to the code they're relevant to, and/or cross-referencing comments in
headers and sources to help tell you where to look.

Along the way, I collected a list of things I wished I'd had from the start:

- A list of all the GETARG_* macros. It would have been especially
great if this were in table form: Your Parameter Is A / Use This Macro /
Which Gives This Result Type / Working example.

Yes, though that's an area where "git grep" does a reasonable job, even if
it's a bit awkward.

This probably *should* be in the SGML docs, in the C extensions section,
along with the related DatumGet and PG_RETURN_ functions and macros.
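
As a rough illustration of how those three layers fit together, here is a
minimal standalone sketch in plain C. It does not include the PostgreSQL
headers: every Toy*/TOY_*/toy_* name is a hypothetical stand-in, loosely
modeled on Datum, DatumGetInt32/Int32GetDatum, PG_GETARG_INT32 and
PG_RETURN_INT32, and the real definitions in postgres.h and fmgr.h differ
in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins (hypothetical names), not the real PostgreSQL definitions. */
typedef uintptr_t ToyDatum;

/* Conversion layer, modeled on Int32GetDatum / DatumGetInt32. */
#define ToyInt32GetDatum(x)  ((ToyDatum) (int32_t) (x))
#define ToyDatumGetInt32(d)  ((int32_t) (d))

/* Call frame, modeled loosely on FunctionCallInfo. */
typedef struct ToyCallInfo
{
    int      nargs;     /* number of arguments actually passed */
    ToyDatum args[4];   /* arguments, each packed into a Datum */
    bool     isnull;    /* set by the callee when the result is NULL */
} ToyCallInfo;

/* Argument layer: fetch the raw Datum, then convert it. */
#define TOY_GETARG_DATUM(n)  (fcinfo->args[n])
#define TOY_GETARG_INT32(n)  ToyDatumGetInt32(TOY_GETARG_DATUM(n))

/* Return layer: convert back to a Datum and return it. */
#define TOY_RETURN_INT32(x)  return ToyInt32GetDatum(x)

/* A function written against the toy calling convention. */
static ToyDatum
toy_add_one(ToyCallInfo *fcinfo)
{
    int32_t arg;

    assert(fcinfo->nargs >= 1);     /* the caller must pass one argument */
    arg = TOY_GETARG_INT32(0);

    TOY_RETURN_INT32(arg + 1);
}

/* Caller side: pack the argument, call, unpack the result. */
static int32_t
toy_call_add_one(int32_t value)
{
    ToyCallInfo fcinfo = { .nargs = 1, .isnull = false };

    fcinfo.args[0] = ToyInt32GetDatum(value);
    return ToyDatumGetInt32(toy_add_one(&fcinfo));
}
```

With the real headers the function body would read much the same way, using
PG_GETARG_INT32(0) and PG_RETURN_INT32(arg + 1) inside a function declared
with PG_FUNCTION_INFO_V1.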

- The table at
http://www.postgresql.org/docs/9.1/static/errcodes-appendix.html has
the numeric codes and PL/PGSQL constants enumerated. It'd be nice if it had
the C #define as well

... at least where the C define is different to the plpgsql constant. Which
is occasionally the case.

- The SPI documentation mentions most/all of the SPI functions, but I
couldn't find documentation on the SPI variables like SPI_processed and
SPI_tuptable.

http://www.postgresql.org/docs/current/static/spi-spi-execute.html

- Examples and explanation of how PG_TRY()/PG_CATCH work. How to add
context callbacks.

... and warnings about their limitations. In particular, that PG_TRY /
PG_CATCH doesn't imply a savepoint and you can't just merrily carry on
after (say) an SPI error.
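
The limitation described above can be seen in a toy model. The sketch below
is a simplification under stated assumptions, not the real macros from
elog.h: it uses plain setjmp/longjmp, and all TOY_*/toy_* names are
hypothetical. It only shows that catching an error returns control to the
handler without undoing anything the try block had already done.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Toy stand-ins (hypothetical names), not the real PG_TRY machinery. */
static jmp_buf *toy_exception_stack = NULL;  /* innermost active handler */
static int      toy_side_effects = 0;        /* stands in for work already done */

#define TOY_TRY() \
    do { \
        jmp_buf *save_stack_ = toy_exception_stack; \
        jmp_buf  local_buf_; \
        if (setjmp(local_buf_) == 0) \
        { \
            toy_exception_stack = &local_buf_

#define TOY_CATCH() \
        } \
        else \
        { \
            toy_exception_stack = save_stack_

#define TOY_END_TRY() \
        } \
        toy_exception_stack = save_stack_; \
    } while (0)

/* Raise an "error": jump to the innermost handler. */
static void
toy_throw(void)
{
    assert(toy_exception_stack != NULL);    /* must be inside TOY_TRY */
    longjmp(*toy_exception_stack, 1);
}

/* Returns 1 if an error was caught, 0 otherwise. */
static int
toy_run(int should_fail)
{
    volatile int caught = 0;    /* volatile: written after setjmp */

    TOY_TRY();
    {
        toy_side_effects++;     /* work done before the error */
        if (should_fail)
            toy_throw();
    }
    TOY_CATCH();
    {
        /* Control resumes here; toy_side_effects is NOT rolled back. */
        caught = 1;
    }
    TOY_END_TRY();

    return caught;
}
```

In this model toy_run(1) reports that the error was caught, yet
toy_side_effects stays incremented. Undoing work is a separate concern from
catching the jump, which is why real code generally either re-throws with
PG_RE_THROW() after cleanup or wraps the risky work in a subtransaction.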

- Direct Function Calls

Yeah, with a few examples, including one showing caching of the fmgr info
for a FunctionCall by info, not OID.
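
A toy model of the by-info versus by-OID distinction, assuming nothing from
the real headers: ToyFmgrInfo, toy_fmgr_info and the two call helpers are
hypothetical stand-ins loosely modeled on FmgrInfo, fmgr_info(),
OidFunctionCall1() and FunctionCall1(). The point it sketches is that
resolving a function once and reusing the filled-in struct avoids repeating
the lookup on every call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins (hypothetical names), not the real fmgr API. */
typedef uintptr_t ToyDatum;
typedef unsigned int ToyOid;
typedef ToyDatum (*ToyFunc) (ToyDatum arg);

typedef struct ToyFmgrInfo
{
    ToyOid  fn_oid;     /* which function this entry describes */
    ToyFunc fn_addr;    /* resolved address, filled in by the lookup */
} ToyFmgrInfo;

static int toy_lookup_count = 0;    /* how many catalog lookups were made */

static ToyDatum
toy_double(ToyDatum arg)
{
    return arg * 2;
}

/* A one-row "catalog" mapping an OID to a C function. */
static const ToyFmgrInfo toy_catalog[] = {
    { 1001, toy_double }
};

/* Modeled on fmgr_info(): resolve an OID and fill in the struct. */
static void
toy_fmgr_info(ToyOid oid, ToyFmgrInfo *finfo)
{
    toy_lookup_count++;
    finfo->fn_oid = oid;
    finfo->fn_addr = NULL;
    for (size_t i = 0; i < sizeof(toy_catalog) / sizeof(toy_catalog[0]); i++)
    {
        if (toy_catalog[i].fn_oid == oid)
            finfo->fn_addr = toy_catalog[i].fn_addr;
    }
    assert(finfo->fn_addr != NULL);     /* unknown OID */
}

/* Call by OID: pays for a lookup on every call. */
static ToyDatum
toy_oid_function_call1(ToyOid oid, ToyDatum arg)
{
    ToyFmgrInfo finfo;

    toy_fmgr_info(oid, &finfo);
    return finfo.fn_addr(arg);
}

/* Call by info: reuses a lookup the caller already did. */
static ToyDatum
toy_function_call1(const ToyFmgrInfo *finfo, ToyDatum arg)
{
    return finfo->fn_addr(arg);
}

/* Sum doubled values, looking the function up once and caching it. */
static ToyDatum
toy_sum_doubled_cached(const ToyDatum *values, size_t n)
{
    ToyFmgrInfo finfo;
    ToyDatum    total = 0;

    toy_fmgr_info(1001, &finfo);
    for (size_t i = 0; i < n; i++)
        total += toy_function_call1(&finfo, values[i]);
    return total;
}
```

Here the cached variant performs one lookup no matter how many values it
processes, while each toy_oid_function_call1() call performs its own.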

- A comparison of the two modes of writing SRF functions (Materialize
vs multi-call)

Worthwhile, yeah.

- Less explanation of how to write V0-style functions. That was
called the "old style" back in version 7.1. Why is that information up
front in the documentation when so much else is sequestered in header files?

I'd just like to delete that entirely.

Some of these things may seem obvious/trivial to you. I would argue that
they're only obvious in retrospect, and the more obvious-to-you things we
robustly document, the quicker we accumulate programmers who are capable of
agreeing that it's obvious, and that's good for the community.

I still remember them being very non-obvious, so I agree.

Because I'm still going through the learning curve, I'm probably the least
qualified to write the actual documentation.

You're *extremely* qualified to make notes of what's hard, though, which is
something people who've worked on the codebase for a while tend to forget.

I've been trying to write little bits of docs as I go and as I learn.
Going to write one on how timelines work soon.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services