What Would You Like To Do?
Hackers,
Later this week I'm giving a [brief][] for an audience of what I hope will be corporate PostgreSQL users that covers how to get a feature developed for PostgreSQL. The idea here is that there are a lot of organizations out there with very deep commitments to PostgreSQL, who really take advantage of what it has to offer, but also would love additional features PostgreSQL doesn't offer. Perhaps some of them would be willing to fund development of the features they need.
[brief]: http://postgresopen.org/2011/schedule/presentations/83/
Toward the end of the presentation, I'd like to make some suggestions and offer to do some match-making. I'm thinking primarily of listing some of the stuff the community would love to see done, along with the names of the folks and/or companies who, with funding, might make it happen. My question for you is: What do you want to work on?
Here's my preliminary list:
* Integrated partitioning support: Simon/2nd Quadrant
* High-CPU concurrency: Robert/Enterprise DB
* Multimaster replication and clustering: Simon/2nd Quadrant
* Multi-table indexes: Heikki? Oleg & Teodor?
* Column-level collation support: Peter/Enterprise DB
* Faster and more fault tolerant data loading: Andrew/PGX
* Automated postgresql.conf Configuration: Greg/2nd Quadrant
* Parallel pg_dump: Andrew/PGX
* SET GLOBAL-style configuration in SQL: Greg/2nd Quadrant
* Track table and index caching to improve optimizer decisions: Robert/Enterprise DB
Thanks to Greg Smith for adding a few bonus ideas I hadn't thought of. What else have you got? I don't think we necessarily have to limit ourselves to core features, BTW: projects like PostGIS and pgAdmin are also clearly popular, and new projects of that scope (or improvements to those!) would no doubt be welcome. Also, I'm highlighting PGXN as an example of how this sort of thing might work.
So, what do you want to work on? Let me know, I'll do as much match-making at the conference as I can.
Best,
David
On Sun, 2011-09-11 at 21:21 -0700, David E. Wheeler wrote:
* Column-level collation support: Peter/Enterprise DB
Column-level collation support already exists.
Hi,
"David E. Wheeler" <david@kineticode.com> writes:
Thanks to Greg Smith for adding a few bonus ideas I hadn't thought of. What
else have you got? I don't think we necessarily have to limit ourselves to
core features, BTW: projects like PostGIS and pgAdmin are also clearly
popular, and new projects of that scope (or improvements to those!) would no
doubt be welcome.
You could add DDL Triggers from me (2ndQuadrant) and process-based
parallel loading in pgloader (currently thread based, sucks).
Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Sep 12, 2011, at 6:01, Peter Eisentraut <peter_e@gmx.net> wrote:
Column-level collation support already exists.
Yeah, just realized that. I meant to say table- or column-level encoding.
Best,
David
* David E. Wheeler (david@kineticode.com) wrote:
Toward the end of the presentation, I'd like to make some suggestions and offer to do some match-making. I'm thinking primarily of listing some of the stuff the community would love to see done, along with the names of the folks and/or companies who, with funding, might make it happen. My question for you is: What do you want to work on?
I'm not looking for funding (probably couldn't take it if I was offered
it, heh), so I'm not sure if it should be included, but I'm still
planning to dig into revamping the logging system (if I can ever manage
to get out from under my current 'real job' workload :/). If others are
interested and have time to help, please let me know..
Thanks,
Stephen
On Sep 13, 2011 2:37 AM, "Stephen Frost" <sfrost@snowman.net> wrote:
* David E. Wheeler (david@kineticode.com) wrote:
Toward the end of the presentation, I'd like to make some suggestions
and offer to do some match-making. I'm thinking primarily of listing some of
the stuff the community would love to see done, along with the names of the
folks and/or companies who, with funding, might make it happen. My question
for you is: What do you want to work on?
I'm not looking for funding (probably couldn't take it if I was offered
it, heh), so I'm not sure if it should be included, but I'm still
planning to dig into revamping the logging system (if I can ever manage
to get out from under my current 'real job' workload :/). If others are
interested and have time to help, please let me know..
Definitely interested in that, yes. We probably have some overlap in our
thoughts and plans, as discussed at the developer meeting in Ottawa.
Not specifically looking for funding either, but it would certainly increase
the number of hours available to work on it and as such make it happen
sooner...
/Magnus
On Sep 12, 2011, at 9:41 PM, Magnus Hagander wrote:
I'm not looking for funding (probably couldn't take it if I was offered
it, heh), so I'm not sure if it should be included, but I'm still
planning to dig into revamping the logging system (if I can ever manage
to get out from under my current 'real job' workload :/). If others are
interested and have time to help, please let me know..
Definitely interested in that, yes. We probably have some overlap in our thoughts and plans, as discussed at the developer meeting in Ottawa.
Not specifically looking for funding either, but it would certainly increase the number of hours available to work on it and as such make it happen sooner…
Yeah, that's the point.
Best,
David
On Sun, 2011-09-11 at 21:21 -0700, David E. Wheeler wrote:
Hackers,
Later this week I'm giving a [brief][] for an audience of what I
hope will be corporate PostgreSQL users that covers how to get a
feature developed for PostgreSQL. The idea here is that there are
a lot of organizations out there with very deep commitments to
PostgreSQL, who really take advantage of what it has to offer,
but also would love additional features PostgreSQL doesn't offer.
Perhaps some of them would be willing to fund development of the features they need.
Hannu Krosing / 2ndQuadrant
* more enhancements to pl/python - use real function arguments,
store modules in database, direct support for postgresql types,
operators and functions, automatic startup command,
automatic ORM from table definitions, ...
* various support functionality for replication and automatic growth
of sharded databases - user defined tuple visibility functions,
triggers for DDL and ON COMMIT/ON ROLLBACK, ...
* putting time travel (which Oracle calls "flashback queries") back
into PostgreSQL
* moving tuple visibility into a separate index-like structure which
should be highly compressible in most cases, as a way of enabling
index-only scans, column oriented storage and effective table
compression, ...
On Sep 13, 2011, at 9:43 AM, Hannu Krosing wrote:
Hannu Krosing / 2ndQuadrant
* more enhancements to pl/python - use real function arguments,
store modules in database, direct support for postgresql types,
operators and functions, automatic startup command,
automatic ORM from table definitions, ...
* various support functionality for replication and automatic growth
of sharded databases - user defined tuple visibility functions,
triggers for DDL and ON COMMIT/ON ROLLBACK, ...
* putting time travel (which Oracle calls "flashback queries") back
into PostgreSQL
* moving tuple visibility into a separate index-like structure which
should be highly compressible in most cases, as a way of enabling
index-only scans, column oriented storage and effective table
compression, ...
Awesome, thanks!
David
On 12 September 2011 05:21, David E. Wheeler <david@kineticode.com> wrote:
So, what do you want to work on? Let me know, I'll do as much match-making at the conference as I can.
I have a wish-list of features, but I don't know of anyone specific
who could work on them. In addition to some you've mentioned they
are:
* Distributed queries
* Multi-threaded query operations (single queries making use of more
than 1 core in effect)
* Stored procedures
* Automatic failover re-subscription (okay, I don't know what you'd
call this, but where you have several standbys, the primary fails, one
standby is automatically promoted, and the remaining standbys
automatically subscribe to the newly-promoted one without needing a
new base backup)
* ROLLUP and CUBE
* pg_dumpall custom format (Guillaume mentioned this was on his to-do
list previously)
--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935
EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
The lists all seem to be focusing on the things that the developers would
like to add to PostgreSQL, what about some things that users or ISPs might
like to have, and thus perhaps something that companies might actually see
as worth funding?
For example:
A fully integrated ability to query across multiple databases, possibly on
multiple servers, something Oracle has had for nearly two decades.
Complete isolation at the user level, allowing an ISP to support multiple
independent customers on a server without having to fiddle with multiple
back ends each running on a separate port, a feature that MySQL has had for
as far back as I can recall, and one of the reasons ISPs are more likely to
offer MySQL than PostgreSQL.
The ability to restore a table from a backup file to a different table name
in the same database and schema.
A built-in report writer, capable of things like column totals. (SqlPlus
has this, even though it isn't very pretty.)
--
Mike Nolan
On 09/13/2011 10:13 AM, Michael Nolan wrote:
The lists all seem to be focusing on the things that the developers
would like to add to PostgreSQL, what about some things that users or
ISPs might like to have, and thus perhaps something that companies might
actually see as worth funding?
Well just my own two cents ... but it all depends on who is doing the
funding. At this point 80% of the work CMD codes for Pg (or tertiary
projects and modules) is funded by companies. So let's not assume that
companies aren't funding things. They are.
For example:
A fully integrated ability to query across multiple databases, possibly
on multiple servers, something Oracle has had for nearly two decades.
That isn't the approach to take. The fact that Oracle has it is not a
guarantee that it is useful or good. If you need to query across
databases (assuming within the same cluster) then you designed your
database wrong and should have used our SCHEMA support (what Oracle
calls Namespaces) instead.
Complete isolation at the user level, allowing an ISP to support
multiple independent customers on a server without having to fiddle with
multiple back ends each running on a separate port, a feature that MySQL
has had for as far back as I can recall, and one of the reasons ISPs are
more likely to offer MySQL than PostgreSQL.
Now this would definitely be nice. It is frustrating that we don't have
per database users.
The ability to restore a table from a backup file to a different table
name in the same database and schema.
This can be done but agreed it is not intuitive.
A built-in report writer, capable of things like column totals.
(SqlPlus has this, even though it isn't very pretty.)
There are a billion and one tools that do this without us having to
reinvent the wheel. Why would we support that?
Sincerely,
Joshua D. Drake
--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
The PostgreSQL Conference - http://www.postgresqlconference.org/
@cmdpromptinc - @postgresconf - 509-416-6579
On Tue, Sep 13, 2011 at 12:26 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
On 09/13/2011 10:13 AM, Michael Nolan wrote:
The lists all seem to be focusing on the things that the developers
would like to add to PostgreSQL, what about some things that users or
ISPs might like to have, and thus perhaps something that companies might
actually see as worth funding?
Well just my own two cents ... but it all depends on who is doing the
funding. At this point 80% of the work CMD codes for Pg (or tertiary
projects and modules) is funded by companies. So let's not assume that
companies aren't funding things. They are.
But perhaps if a few 'commercial' features were on the wish list there would
be more companies willing to fund development? The developers get a bit of
what they want to work on, the production users get a bit of what they need,
everybody's happy.
For example:
A fully integrated ability to query across multiple databases, possibly
on multiple servers, something Oracle has had for nearly two decades.
That isn't the approach to take. The fact that Oracle has it is not a
guarantee that it is useful or good. If you need to query across databases
(assuming within the same cluster) then you designed your database wrong and
should have used our SCHEMA support (what Oracle calls Namespaces) instead.
This is the difference between developers and real world users. Real world
users may not have the ability, time or resources to redesign their
databases just because that's the 'best' way to do something. Will it be
the most efficient way to do it? Almost certainly not.
I've been involved in a few corporate mergers, and there was a short term
need to do queries on the combined databases while the tiger team handling
the IT restructuring figured out how (or whether) to merge the databases
together. (One of these happened to be an Oracle/Oracle situation, it was a
piece of cake even though the two data centers were 750 miles apart and the
table structures had almost nothing in common. Another was a two week
headache, the third was even worse!)
In a perfect world, it would be nice if one could do combined queries
linking a PostgreSQL database with an Oracle one, or a MySQL one, too.
Because sometimes, that's what you gotta do. Even something that is several
hundred times slower is going to be faster than merging the databases
together. When I do this today, I have to write a program (in perl or php)
that accesses both databases and merges it by hand.
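A minimal sketch of that kind of hand-written merge, in Python rather than perl or php. Two in-memory SQLite databases stand in for the separate PostgreSQL and Oracle (or MySQL) servers; the table names, columns, and the merge_by_key helper are all hypothetical, and the helper assumes the join key is the first column of each query and is unique in the first source.

```python
import sqlite3


def merge_by_key(conn_a, query_a, conn_b, query_b):
    """Join rows from two independent connections on their first column.

    Assumption: keys are unique in the first source; later duplicates
    would silently overwrite earlier ones in the lookup table.
    """
    cur_a = conn_a.cursor()
    cur_a.execute(query_a)
    # Build an in-memory lookup from the first database: key -> other columns.
    lookup = {row[0]: tuple(row[1:]) for row in cur_a.fetchall()}

    cur_b = conn_b.cursor()
    cur_b.execute(query_b)
    merged = []
    for row in cur_b.fetchall():
        key = row[0]
        if key in lookup:  # inner-join semantics: unmatched rows are dropped
            merged.append((key,) + lookup[key] + tuple(row[1:]))
    return merged


# Stand-in for the first company's database (hypothetical schema).
db_a = sqlite3.connect(":memory:")
db_a.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
db_a.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "Acme"), (2, "Globex")])

# Stand-in for the second company's database (hypothetical schema).
db_b = sqlite3.connect(":memory:")
db_b.execute("CREATE TABLE invoices (customer_id INTEGER, total REAL)")
db_b.executemany("INSERT INTO invoices VALUES (?, ?)",
                 [(1, 250.0), (3, 75.0)])

rows = merge_by_key(db_a, "SELECT id, name FROM customers",
                    db_b, "SELECT customer_id, total FROM invoices")
print(rows)
```

Everything is pulled to the client and joined there, which is exactly why this is slow on large tables, and why a server-side cross-database query would be attractive.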
The ability to restore a table from a backup file to a different table
name in the same database and schema.
This can be done but agreed it is not intuitive.
Can you elaborate on that a bit, please? The only way I've been able to do
it is to edit the dump file to change the table name. That's not very
practical with a several gigabyte dump file, even less so with one that is
much larger. If this capability already exists, is it documented?
A built-in report writer, capable of things like column totals.
(SqlPlus has this, even though it isn't very pretty.)
There are a billion and one tools that do this without us having to
reinvent the wheel. Why would we support that?
There are other databases out there, too, why reinvent the wheel by working
on PostgreSQL? :-)
The question should be, would this be USEFUL?
--
Mike Nolan
On 09/13/2011 03:51 PM, Michael Nolan wrote:
For example:
A fully integrated ability to query across multiple databases, possibly
on multiple servers, something Oracle has had for nearly two decades.
That isn't the approach to take. The fact that Oracle has it is
not a guarantee that it is useful or good. If you need to query
across databases (assuming within the same cluster) then you
designed your database wrong and should have used our SCHEMA
support (what Oracle calls Namespaces) instead.
This is the difference between developers and real world users. Real
world users may not have the ability, time or resources to redesign
their databases just because that's the 'best' way to do something.
Will it be the most efficient way to do it? Almost certainly not.
I've been involved in a few corporate mergers, and there was a short
term need to do queries on the combined databases while the tiger team
handling the IT restructuring figured out how (or whether) to merge
the databases together. (One of these happened to be an Oracle/Oracle
situation, it was a piece of cake even though the two data centers
were 750 miles apart and the table structures had almost nothing in
common. Another was a two week headache, the third was even worse!)
In a perfect world, it would be nice if one could do combined queries
linking a PostgreSQL database with an Oracle one, or a MySQL one,
too. Because sometimes, that's what you gotta do. Even something
that is several hundred times slower is going to be faster than
merging the databases together. When I do this today, I have to write
a program (in perl or php) that accesses both databases and merges it
by hand.
Can't you do that with FDW that is present in 9.1?
Rodrigo Gonzalez <rjgonzale@estrads.com.ar> writes:
In a perfect world, it would be nice if one could do combined queries
linking a PostgreSQL database with an Oracle one, or a MySQL one,
Can't you do that with FDW that is present in 9.1?
FDW provides the structure within which that will eventually be
possible, but there's no Oracle or MySQL wrapper today ... and there are
a lot of FDW restrictions that need to be worked on, too.
regards, tom lane
On 09/13/2011 11:51 AM, Michael Nolan wrote:
The ability to restore a table from a backup file to a different table
name in the same database and schema.
This can be done but agreed it is not intuitive.
Can you elaborate on that a bit, please? The only way I've been able to
do it is to edit the dump file to change the table name. That's not
very practical with a several gigabyte dump file, even less so with one
that is much larger. If this capability already exists, is it documented?
You use the -Fc method, extract the TOC and edit just the TOC (so you
don't have to edit a multi-gig file)
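A rough sketch of the TOC-editing step, in Python. With a custom-format archive, `pg_restore -l` writes the table of contents as a text listing and `pg_restore -L` restores only the entries left active in an edited listing; lines beginning with a semicolon are treated as comments. The keep_only and restore_command helpers, the file and database names, and the sample listing are hypothetical, and the exact column layout of a real listing varies by version. As far as I can tell the listing only selects and orders entries, so the rename itself would still be a separate step that is not shown here.

```python
def keep_only(toc_text, table):
    """Comment out every TOC entry that does not mention `table`.

    Assumption: the table name appears as its own whitespace-separated
    word on the entries that belong to it.
    """
    out = []
    for line in toc_text.splitlines():
        if line.startswith(";") or not line.strip():
            out.append(line)        # existing comment or blank line: keep as-is
        elif table in line.split():
            out.append(line)        # entry for the wanted table: keep active
        else:
            out.append(";" + line)  # anything else: disable with a semicolon
    return "\n".join(out) + "\n"


def restore_command(archive, list_file, dbname):
    """Build the argv for restoring only the entries left in list_file."""
    return ["pg_restore", "-L", list_file, "-d", dbname, archive]


# Illustrative listing only; real `pg_restore -l` output differs in detail.
sample_toc = (
    "; Archive listing (illustrative)\n"
    "215; 1259 16386 TABLE public orders postgres\n"
    "216; 1259 16390 TABLE public customers postgres\n"
    "2145; 0 16386 TABLE DATA public orders postgres\n"
)
edited = keep_only(sample_toc, "orders")
print(edited)
print(restore_command("backup.dump", "toc.list", "scratchdb"))
```

The point of the approach is that only the small listing is edited; the multi-gigabyte archive itself is never opened in an editor.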
A built-in report writer, capable of things like column totals.
(SqlPlus has this, even though it isn't very pretty.)
There are a billion and one tools that do this without us having to
reinvent the wheel. Why would we support that?
There are other databases out there, too, why reinvent the wheel by
working on PostgreSQL? :-)
The question should be, would this be USEFUL?
Personally, I don't think so but others may disagree.
Joshua D. Drake
--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
The PostgreSQL Conference - http://www.postgresqlconference.org/
@cmdpromptinc - @postgresconf - 509-416-6579
On 09/13/2011 04:52 PM, Tom Lane wrote:
Rodrigo Gonzalez <rjgonzale@estrads.com.ar> writes:
In a perfect world, it would be nice if one could do combined queries
linking a PostgreSQL database with an Oracle one, or a MySQL one,
Can't you do that with FDW that is present in 9.1?
FDW provides the structure within which that will eventually be
possible, but there's no Oracle or MySQL wrapper today ... and there are
a lot of FDW restrictions that need to be worked on, too.
regards, tom lane
They are both listed at wiki
I know there are a lot of limitations....but OP message says "Even
something that is several hundred times slower is going to be faster
than merging the databases together. When I do this today, I have to
write a program (in perl or php) that accesses both databases and merges
it by hand."
Am I wrong that this is currently possible using FDW?
Thanks
Rodrigo Gonzalez
Rodrigo Gonzalez wrote:
On 09/13/2011 04:52 PM, Tom Lane wrote:
FDW provides the structure within which that will eventually be
possible, but there's no Oracle or MySQL wrapper today ...
They are both listed at wiki
And here:
http://www.pgxn.org/tag/foreign%20data%20wrapper/
-Kevin
On Tue, Sep 13, 2011 at 2:55 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
On 09/13/2011 11:51 AM, Michael Nolan wrote:
The ability to restore a table from a backup file to a different table
name in the same database and schema.
This can be done but agreed it is not intuitive.
Can you elaborate on that a bit, please? The only way I've been able to
do it is to edit the dump file to change the table name. That's not
very practical with a several gigabyte dump file, even less so with one
that is much larger. If this capability already exists, is it documented?
You use the -Fc method, extract the TOC and edit just the TOC (so you don't
have to edit a multi-gig file)
That is, at best, a bit obscure. I've wondered at times if the -F tar
option would have any benefits here, though it appears to have significant
downsides.
A downside of either method may be that I can't predict in advance when I
will want to do a restore of a single table from a backup file,
so I'd have to always use that method of generating the file.
I did propose an extension to pg_restore a couple of months ago to add an
option to re-name a table as it is restored, but that seemed to have
generated no interest.
Maybe an external tool that reads a pg_dump file looking for a specific
table and writes that portion of the dump file to a separate file, changing
the table name would be easier? It'd probably have to handle most of or all
of the different pg_dump formats, but that doesn't sound like an
unachievable goal.
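A minimal sketch of such an external tool, in Python, covering only the plain-text dump format and only the table's data. It relies on plain dumps writing each table's rows as a `COPY ... FROM stdin;` statement followed by data lines and a terminating `\.` line. The extract_table helper and the sample dump are hypothetical, and the sketch assumes the table name appears in the COPY line exactly as passed in (depending on the pg_dump version it may be schema-qualified or quoted). The CREATE TABLE statement, indexes, and constraints are not handled.

```python
import io


def extract_table(dump_lines, table, new_name):
    """Yield the COPY block for `table` from a plain-text dump, renamed.

    Streams line by line, so the whole dump never has to fit in memory.
    """
    prefix = "COPY " + table + " "
    inside = False
    for line in dump_lines:
        if not inside:
            if line.startswith(prefix):
                inside = True
                # Rewrite only the table name; keep the column list as-is.
                yield "COPY " + new_name + " " + line[len(prefix):]
        else:
            yield line
            if line.rstrip("\n") == "\\.":  # end-of-data marker
                inside = False


# Tiny stand-in for a plain-format dump file (hypothetical contents).
sample = io.StringIO(
    "CREATE TABLE orders (id integer, item text);\n"
    "COPY orders (id, item) FROM stdin;\n"
    "1\twidget\n"
    "2\tgadget\n"
    "\\.\n"
    "COPY customers (id) FROM stdin;\n"
    "7\n"
    "\\.\n"
)
result = "".join(extract_table(sample, "orders", "orders_restored"))
print(result)
```

With a real dump the same generator could be fed an open file object and its output written to a new file, which could then be loaded once a table with the new name exists.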
--
Mike Nolan
On Mon, Sep 12, 2011 at 8:21 AM, David E. Wheeler <david@kineticode.com> wrote:
So, what do you want to work on? Let me know, I'll do as much match-making
at the conference as I can.
Here is my list:
* Additional approximate string matching functions and index access for them
using gin/gist/spgist.
* Signature indexing with gist/spgist in various fields. For example,
indexing of image signatures for similar-image retrieval.
* Statistics collection and selectivity estimation for geometric datatypes.
------
With best regards,
Alexander Korotkov.