Wanna help PostgreSQL

Started by Zhe-Wei Jiang, over 11 years ago, 11 messages
#1 Zhe-Wei Jiang
jrreinhardt@gmail.com

Hi everyone,

I'm new to the open source community, and wanna help in some implementation
work.
However, when I looked into the TODO list, I found most of the items are
very old.
Therefore, I'd like to know if there is any entry-level implementation
needed by PostgreSQL now?
Any advice is appreciated.
Thanks!!

Regards,
Zhe-Wei Jiang

#2 Stephen Frost
sfrost@snowman.net
In reply to: Zhe-Wei Jiang (#1)
Re: Wanna help PostgreSQL

* Zhe-Wei Jiang (jrreinhardt@gmail.com) wrote:

Therefore, I'd like to know if there is any entry-level implementation
needed by PostgreSQL now?
Any advice is appreciated.

You might take a look at the GSoC ideas page instead:

https://wiki.postgresql.org/wiki/GSoC_2014

Thanks,

Stephen

#3 Jeff Janes
jeff.janes@gmail.com
In reply to: Zhe-Wei Jiang (#1)
Re: Wanna help PostgreSQL

On Tue, May 13, 2014 at 10:12 AM, Zhe-Wei Jiang <jrreinhardt@gmail.com> wrote:

Hi everyone,

I'm new to the open source community, and wanna help in some
implementation work.
However, when I looked into the TODO list, I found most of the items are
very old.

So that is one thing you can do to help: try to improve the todo page!
I've thought we should create a new badge for staleness (in addition to
the Easy and Done ones) and add it to everything currently on the page,
then remove it from ones that were recently evaluated and still seem to be
relevant.

And in the process, we should clarify why they haven't been done yet.
Often what really needs to be done is not the implementation, but rather
the evaluation of whether it is worth the trade-offs involved.

Therefore, I'd like to know if there is any entry-level implementation
needed by PostgreSQL now?

Did anything on the todo list strike your fancy, especially anything marked
Easy? Even if it is old, it still might be valid. Then discuss it here on
the hackers list, and if it is no longer valid we can remove it from the list.
No point putting the sour milk back in the fridge.

How much experience do you have using PostgreSQL (or for that matter,
coding in C)?

Cheers,

Jeff

#4 Zhe-Wei Jiang
jrreinhardt@gmail.com
In reply to: Zhe-Wei Jiang (#1)
Re: Wanna help PostgreSQL

Forgot to include pgsql-hackers.

2014-05-14 22:11 GMT+08:00 Zhe-Wei Jiang <jrreinhardt@gmail.com>:

Thanks Stephen.

I read the GSoC ideas page and think the only one I could possibly help
with for now is
"Rewrite (add) pg_dump and pg_restore utilities as libraries (.so, .dll &
.dylib)",
though I'm still not clear on the PostgreSQL project as a whole.

If my understanding is correct, taking pg_dump as an example, this would
include refactoring pg_dump.c and rewriting the Makefile to build it as a
shared library.
Am I missing anything?

Regards,
Zhe-Wei Jiang

2014-05-14 1:49 GMT+08:00 Stephen Frost <sfrost@snowman.net>:

* Zhe-Wei Jiang (jrreinhardt@gmail.com) wrote:

Therefore, I'd like to know if there is any entry-level implementation
needed by PostgreSQL now?
Any advice is appreciated.

You might take a look at the GSoC ideas page instead:

https://wiki.postgresql.org/wiki/GSoC_2014

Thanks,

Stephen

#5 Zhe-Wei Jiang
jrreinhardt@gmail.com
In reply to: Jeff Janes (#3)
Re: Wanna help PostgreSQL

Thanks Jeff.

2014-05-14 2:19 GMT+08:00 Jeff Janes <jeff.janes@gmail.com>:

On Tue, May 13, 2014 at 10:12 AM, Zhe-Wei Jiang <jrreinhardt@gmail.com> wrote:

Hi everyone,

I'm new to the open source community, and wanna help in some
implementation work.
However, when I looked into the TODO list, I found most of the items are
very old.

So that is one thing you can do to help: try to improve the todo page!
I've thought we should create a new badge for staleness (in addition to
the Easy and Done ones) and add it to everything currently on the page,
then remove it from ones that were recently evaluated and still seem to be
relevant.

And in the process, we should clarify why they haven't been done yet.
Often what really needs to be done is not the implementation, but rather
the evaluation of whether it is worth the trade-offs involved.

Am I not too unfamiliar with the PostgreSQL project to help with this?
I think this should be coordinated by a more experienced contributor.

Therefore, I'd like to know if there is any entry-level implementation
needed by PostgreSQL now?

Did anything on the todo list strike your fancy, especially anything
marked Easy? Even if it is old, it still might be valid. Then discuss it
here on the hackers list, and if it is no longer valid we can remove it from
the list. No point putting the sour milk back in the fridge.

The following two look easy to me:
[E] Add full object name to the tag field, e.g. for operators we need
'=(integer, integer)', instead of just '='.
[E] Modify pg_dump to create skeleton views for reload (which are then
updated via CREATE OR REPLACE VIEW) when views have circular dependencies.
This should eliminate the need for the CREATE RULE "_RETURN" hack currently
used to address this issue.

Are they still needed?
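
To make the second item a bit more concrete, here is a rough sketch of the kind of SQL such a dump could emit: first a placeholder view that has the right column names and types but references nothing, then the real query via CREATE OR REPLACE VIEW. This is not pg_dump code. The helper names, the view name, the function name and the column list are all made up for illustration, identifier quoting is ignored, and it assumes CREATE OR REPLACE VIEW accepts a replacement query whose output columns match the placeholder's.

```python
from typing import List, Tuple

# (column name, SQL type): a hypothetical input shape for this sketch
Column = Tuple[str, str]


def skeleton_view_sql(name: str, columns: List[Column]) -> str:
    """Placeholder view: correct column names and types, no dependencies."""
    select_list = ", ".join(
        f"NULL::{sql_type} AS {col}" for col, sql_type in columns
    )
    return f"CREATE VIEW {name} AS SELECT {select_list};"


def final_view_sql(name: str, query: str) -> str:
    """Second step: swap the real query in once its dependencies exist."""
    return f"CREATE OR REPLACE VIEW {name} AS {query};"


def dump_circular_views(views: List[Tuple[str, List[Column], str]]) -> List[str]:
    """views holds (name, columns, real query); all skeletons come first."""
    statements = [skeleton_view_sql(name, cols) for name, cols, _ in views]
    statements += [final_view_sql(name, query) for name, _, query in views]
    return statements


if __name__ == "__main__":
    # Hypothetical view whose real query depends on something that in turn
    # needs the view to exist already.
    example = [
        ("v_report",
         [("id", "integer"), ("label", "text")],
         "SELECT t.id, make_label(t.id) AS label FROM t"),
    ]
    for statement in dump_circular_views(example):
        print(statement)
```

Anything that depends on the view can be created between the two statements, since the placeholder already has the final column list.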

Besides, for the following:
[E] Remove warnings created by -Wcast-align

I tried adding -Wcast-align to a build of the latest git source but got no
warnings.
Should this be removed from the TODO page?

How much experience do you have using PostgreSQL (or for that matter,
coding in C)?

Actually, I have never used PostgreSQL before.
I used to run MySQL, but I don't have much experience with it either.
I have coded in C++ for ~10 years, so I guess C should not be too much trouble.

Cheers,

Jeff

Regards,
Zhe-Wei Jiang

#6 Zhe-Wei Jiang
jrreinhardt@gmail.com
In reply to: Zhe-Wei Jiang (#1)
Re: Wanna help PostgreSQL

I see, that's beyond what I had imagined.
Then I'll have to do more research before I can understand the other
ideas.

Regards,
Zhe-Wei Jiang

2014-05-14 22:25 GMT+08:00 Stephen Frost <sfrost@snowman.net>:

Zhe-Wei,

* Zhe-Wei Jiang (jrreinhardt@gmail.com) wrote:

I read the GSoC ideas page and think the only one I could possibly help
with for now is
"Rewrite (add) pg_dump and pg_restore utilities as libraries (.so, .dll &
.dylib)",
though I'm still not clear on the PostgreSQL project as a whole.

It's not clear to me that we actually want that one, and even if we do,
it's actually a huge amount of work...

Thanks,

Stephen

#7 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Zhe-Wei Jiang (#4)
Re: Wanna help PostgreSQL

Zhe-Wei Jiang <jrreinhardt@gmail.com> writes:

I read the GSoC ideas page and think the only one I could possibly help
with for now is
"Rewrite (add) pg_dump and pg_restore utilities as libraries (.so, .dll &
.dylib)",
though I'm still not clear on the PostgreSQL project as a whole.

If my understanding is correct, taking pg_dump as an example, this would
include refactoring pg_dump.c and rewriting the Makefile to build it as a
shared library.
Am I missing anything?

The reason that that project has gone untouched for upwards of ten years
is that it's not just a large coding project, but involves a lot of
complex API design with uncertain goals. It's not very clear what
features people would want from a "pg_dump library", though one capability
that gets mentioned often is the ability to extract the SQL definition
for a single object. So before anything else you'd need to identify a
satisfactory set of library capabilities. The next nasty problem is that
pg_dump has a large set of odd behaviors that have evolved for good and
sufficient reason, and that we'd not want to give up, but that it's not
clear whether anyone else would want --- and they'd complicate any API
definition quite a bit. One example is that pg_dump knows how to dump
objects in a safe order to avoid forward references. If there turn out
to be circular references (which arise in more cases than you might think)
it even knows how to split certain kinds of objects into multiple
commands so as to break the circularity. How would we expose all that?
The features for parallel pg_dump and pg_restore are another thing that
doesn't seem to fit all that well into a clean library API. But how much
of this should actually be in a library, rather than in the wrapper
programs?

So while this is certainly a worthwhile task, it's not one to
underestimate the scope and difficulty of.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
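
The ordering behaviour described in #7 (dump objects after the things they reference, and split an object in two when a cycle leaves no valid order) can be pictured with a small generic sketch. This is a plain Kahn-style topological sort with a naive cycle breaker, not pg_dump's actual algorithm; the function names, the "(shell)"/"(finish)" labels and the example objects are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def find_cycle_member(remaining: Dict[str, Set[str]]) -> str:
    """Follow dependencies until a name repeats; that name lies on a cycle.

    Only called when every remaining object still has unmet dependencies.
    """
    node = min(remaining)
    seen: Set[str] = set()
    while node not in seen:
        seen.add(node)
        node = min(remaining[node])
    return node


def dump_order(deps: Dict[str, Set[str]]) -> List[str]:
    """Order objects so that each comes after everything it references.

    deps maps an object name to the names it references. When only cycles
    are left, one object on a cycle is split: a '(shell)' step is emitted
    early, and a '(finish)' step keeps the object's real dependencies.
    """
    remaining = {obj: set(refs) for obj, refs in deps.items()}
    for refs in deps.values():
        for name in refs:
            remaining.setdefault(name, set())

    order: List[str] = []
    while remaining:
        ready = sorted(obj for obj, refs in remaining.items() if not refs)
        if ready:
            for obj in ready:
                order.append(obj)
                del remaining[obj]
            for refs in remaining.values():
                refs.difference_update(ready)
            continue
        # No object is ready, so a cycle exists: break it by splitting.
        victim = find_cycle_member(remaining)
        finish_refs = remaining.pop(victim) - {victim}
        order.append(f"{victim} (shell)")
        for refs in remaining.values():
            refs.discard(victim)
        remaining[f"{victim} (finish)"] = finish_refs
    return order


if __name__ == "__main__":
    example = {
        "table_t": set(),
        "view_a": {"view_b"},
        "view_b": {"view_a"},
        "view_c": {"table_t", "view_a"},
    }
    print(dump_order(example))
```

Even this toy version hints at the API question raised above: a caller asking for "the definition of view_a" would get back two statements separated by other objects, which is awkward to expose through a simple library call.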

#8 Merlin Moncure
mmoncure@gmail.com
In reply to: Tom Lane (#7)
Re: Wanna help PostgreSQL

On Wed, May 14, 2014 at 9:44 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The reason that that project has gone untouched for upwards of ten years
is that it's not just a large coding project, but involves a lot of
complex API design with uncertain goals. It's not very clear what
features people would want from a "pg_dump library", though one capability
that gets mentioned often is the ability to extract the SQL definition
for a single object.

Personally I'd prefer the creation of definitional SQL be moved out of
pg_dump and into the database proper via something like
'pg_sql_definition(oid)' or something like that. There are a lot of
reasons applications (especially administrative ones like pgadmin and
psql but also end user applications in some cases) would want to do
that and forcing everything through pg_dump et al is awkward. The
less magic in the external applications the better.

merlin
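
For context on the 'pg_sql_definition(oid)' idea: the server already ships per-object-type helpers such as pg_get_viewdef and pg_get_functiondef, but no single entry point, and nothing comparable for tables. A client-side approximation might look like the sketch below. The "kind" labels, the mapping and the function name definition_query are hypothetical; only the pg_get_* function names are real server functions, and what each one returns differs (pg_get_viewdef, for example, gives only the underlying SELECT rather than a full CREATE VIEW statement).

```python
# Existing server-side helpers, keyed by a made-up "kind" label.
DEFINITION_FUNCTIONS = {
    "view": "pg_get_viewdef",
    "function": "pg_get_functiondef",
    "index": "pg_get_indexdef",
    "constraint": "pg_get_constraintdef",
    "trigger": "pg_get_triggerdef",
    "rule": "pg_get_ruledef",
}


def definition_query(kind: str, oid: int) -> str:
    """Build the SQL a client could send to fetch one object's definition."""
    try:
        func = DEFINITION_FUNCTIONS[kind]
    except KeyError:
        raise ValueError(
            f"no server-side definition function known for {kind!r}"
        ) from None
    # int() guards against anything but a numeric OID being interpolated.
    return f"SELECT {func}({int(oid)}::oid);"


if __name__ == "__main__":
    print(definition_query("view", 16384))
    try:
        definition_query("table", 16385)
    except ValueError as exc:
        print(exc)  # tables have no such helper; pg_dump assembles them itself
```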

#9 Andres Freund
andres@2ndquadrant.com
In reply to: Merlin Moncure (#8)
Re: Wanna help PostgreSQL

On 2014-05-14 10:49:05 -0500, Merlin Moncure wrote:

On Wed, May 14, 2014 at 9:44 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The reason that that project has gone untouched for upwards of ten years
is that it's not just a large coding project, but involves a lot of
complex API design with uncertain goals. It's not very clear what
features people would want from a "pg_dump library", though one capability
that gets mentioned often is the ability to extract the SQL definition
for a single object.

Personally I'd prefer the creation of definitional SQL be moved out of
pg_dump and into the database proper via something like
'pg_sql_definition(oid)' or something like that. There are a lot of
reasons applications (especially administrative ones like pgadmin and
psql but also end user applications in some cases) would want to do
that and forcing everything through pg_dump et al is awkward. The
less magic in the external applications the better.

That'd be a separate feature from pg_dump though. pg_dump needs to be
cross-version compatible and the above prevents that...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#10 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Merlin Moncure (#8)
Re: Wanna help PostgreSQL

Merlin Moncure <mmoncure@gmail.com> writes:

On Wed, May 14, 2014 at 9:44 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The reason that that project has gone untouched for upwards of ten years
is that it's not just a large coding project, but involves a lot of
complex API design with uncertain goals. It's not very clear what
features people would want from a "pg_dump library", though one capability
that gets mentioned often is the ability to extract the SQL definition
for a single object.

Personally I'd prefer the creation of definitional SQL be moved out of
pg_dump and into the database proper via something like
'pg_sql_definition(oid)' or something like that.

Well, that's just a different way of packaging a library, no? It doesn't
make the library-API problems any less difficult. If anything, it makes
things even harder, because now you have to consider version skew between
pg_dump and the server. And if you get any API details wrong you have
no ability to change them till the next major release cycle.

While we might someday do it like that, I'd think it foolish to proceed
with such an approach until we had a proven library API design on the
client side. The costs of iterating there are a lot less.

regards, tom lane
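
One way to picture the version-skew point: a client-side tool can adapt its catalog queries to whichever server version it is talking to, while a server-side function exists only in the releases that ship it. The sketch below is a simplified illustration, not code from pg_dump. The function name is hypothetical; it relies on pg_class.relpersistence having been introduced in 9.1 (server_version_num 90100), and it substitutes a constant 'p' (permanent) for older servers.

```python
def list_relations_query(server_version_num: int) -> str:
    """Pick a catalog query that the connected server can actually run."""
    columns = ["c.oid", "c.relname", "c.relkind"]
    if server_version_num >= 90100:
        # Column exists from 9.1 on.
        columns.append("c.relpersistence")
    else:
        # Older servers: report every relation as permanent.
        columns.append("'p' AS relpersistence")
    return f"SELECT {', '.join(columns)} FROM pg_catalog.pg_class c;"


if __name__ == "__main__":
    print(list_relations_query(90300))
    print(list_relations_query(90000))
```

With the logic on the client, a newer tool can still dump an older server by choosing the older query. If the logic lived in the server, the older server would simply not have it.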

#11 Merlin Moncure
mmoncure@gmail.com
In reply to: Tom Lane (#10)
Re: Wanna help PostgreSQL

On Wed, May 14, 2014 at 10:58 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Merlin Moncure <mmoncure@gmail.com> writes:

On Wed, May 14, 2014 at 9:44 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The reason that that project has gone untouched for upwards of ten years
is that it's not just a large coding project, but involves a lot of
complex API design with uncertain goals. It's not very clear what
features people would want from a "pg_dump library", though one capability
that gets mentioned often is the ability to extract the SQL definition
for a single object.

Personally I'd prefer the creation of definitional SQL be moved out of
pg_dump and into the database proper via something like
'pg_sql_definition(oid)' or something like that.

Well, that's just a different way of packaging a library, no? It doesn't
make the library-API problems any less difficult. If anything, it makes
things even harder, because now you have to consider version skew between
pg_dump and the server. And if you get any API details wrong you have
no ability to change them till the next major release cycle.

While we might someday do it like that, I'd think it foolish to proceed
with such an approach until we had a proven library API design on the
client side. The costs of iterating there are a lot less.

Yeah -- Andres said it even more cleanly: for forward upgrades you'd
need to communicate with both versioned backends to produce a dump.
That's not a complete deal breaker but definitely a lot more complex
than I was thinking.

merlin
