feature proposal ...

Started by Hans-Jürgen Schönigover 20 years ago32 messageshackers
Jump to latest
#1Hans-Jürgen Schönig
postgres@cybertec.at

hackers,

currently we have to hack tons of export scripts for various customers.
the problem is: if tables can be exported straight forward COPY will
give you all you need but when data has to be transformed while
exporting things start becoming a bit more complex. usually people want
to have CSV file (excel-ify data) which is supported by COPY.

the problem is: COPY can write data returned by a SELECT statement to a
file. our idea is to implement precisely that.

example:

COPY TO file_name USING some_select_statement;

the advantage would be that COPY would then be able to export data and
transform it on the fly. this would save many people a lot of work
because complex data extractors could in many cases be replaced by
simple SQL scripts.

how we plan to implement that:
currently copy simply opens a table and loops through the tuples (see
command/copy.c starting at line 1115).
to implement the desired feature we just had to add some SPI code to the
scenery (SPI will also return HeapTuples so it should fit in there).

Any comments?

Best regards,

Hans

--
Cybertec Geschwinde & Sch�nig GmbH
Sch�ngrabern 134; A-2020 Hollabrunn
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at

#2Rod Taylor
rbt@rbt.ca
In reply to: Hans-Jürgen Schönig (#1)
Re: feature proposal ...

the problem is: COPY can write data returned by a SELECT statement to a
file. our idea is to implement precisely that.

example:

COPY TO file_name USING some_select_statement;

I have run into plenty of cases where I wanted to dump part of a
structure and this could be used for that, but I've always found that
temporary tables were sufficient and equally SQL scriptable

CREATE TEMP TABLE tab AS SELECT ...; COPY tab TO file_name;
--

#3Hans-Jürgen Schönig
postgres@cybertec.at
In reply to: Rod Taylor (#2)
Re: feature proposal ...

Rod Taylor wrote:

the problem is: COPY can write data returned by a SELECT statement to a
file. our idea is to implement precisely that.

example:

COPY TO file_name USING some_select_statement;

I have run into plenty of cases where I wanted to dump part of a
structure and this could be used for that, but I've always found that
temporary tables were sufficient and equally SQL scriptable

CREATE TEMP TABLE tab AS SELECT ...; COPY tab TO file_name;

Hi Rod,

TEMP TABLE are not suitable for my case. Using a temp table would
essentially mean that we had to store the data 3 times: Original data,
temp table + dump. Temp tables are only fine for small amounts of data
but we are talking about too much data here (my smallest export will
contain 15.000.000 records).

Best regards,

Hans

--
Cybertec Geschwinde & Sch�nig GmbH
Sch�ngrabern 134; A-2020 Hollabrunn
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at

#4Paolo Magnoli
pmagnoli@systemevolution.it
In reply to: Hans-Jürgen Schönig (#1)
R: feature proposal ...

Can't you just use a view?

-----Messaggio originale-----
Da: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org]Per conto di Hans-J�rgen
Sch�nig
Inviato: mercoled� 21 settembre 2005 15.30
A: pgsql-hackers@postgresql.org; eg@cybertec.at
Oggetto: [HACKERS] feature proposal ...

hackers,

currently we have to hack tons of export scripts for various customers.
the problem is: if tables can be exported straight forward COPY will
give you all you need but when data has to be transformed while
exporting things start becoming a bit more complex. usually people want
to have CSV file (excel-ify data) which is supported by COPY.

the problem is: COPY can write data returned by a SELECT statement to a
file. our idea is to implement precisely that.

example:

COPY TO file_name USING some_select_statement;

the advantage would be that COPY would then be able to export data and
transform it on the fly. this would save many people a lot of work
because complex data extractors could in many cases be replaced by
simple SQL scripts.

how we plan to implement that:
currently copy simply opens a table and loops through the tuples (see
command/copy.c starting at line 1115).
to implement the desired feature we just had to add some SPI code to the
scenery (SPI will also return HeapTuples so it should fit in there).

Any comments?

Best regards,

Hans

--
Cybertec Geschwinde & Sch�nig GmbH
Sch�ngrabern 134; A-2020 Hollabrunn
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

#5Hans-Jürgen Schönig
postgres@cybertec.at
In reply to: Paolo Magnoli (#4)
Re: R: feature proposal ...

no because a new is not a heap ...

em=# create view x as select * from pg_class;
CREATE VIEW

em=# copy x to '/tmp/x';
ERROR: cannot copy from view "x"

best regards,

hans

Paolo Magnoli wrote:

Can't you just use a view?

-----Messaggio originale-----
Da: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org]Per conto di Hans-J�rgen
Sch�nig
Inviato: mercoled� 21 settembre 2005 15.30
A: pgsql-hackers@postgresql.org; eg@cybertec.at
Oggetto: [HACKERS] feature proposal ...

hackers,

currently we have to hack tons of export scripts for various customers.
the problem is: if tables can be exported straight forward COPY will
give you all you need but when data has to be transformed while
exporting things start becoming a bit more complex. usually people want
to have CSV file (excel-ify data) which is supported by COPY.

the problem is: COPY can write data returned by a SELECT statement to a
file. our idea is to implement precisely that.

example:

COPY TO file_name USING some_select_statement;

the advantage would be that COPY would then be able to export data and
transform it on the fly. this would save many people a lot of work
because complex data extractors could in many cases be replaced by
simple SQL scripts.

how we plan to implement that:
currently copy simply opens a table and loops through the tuples (see
command/copy.c starting at line 1115).
to implement the desired feature we just had to add some SPI code to the
scenery (SPI will also return HeapTuples so it should fit in there).

Any comments?

Best regards,

Hans

--
Cybertec Geschwinde & Sch�nig GmbH
Sch�ngrabern 134; A-2020 Hollabrunn
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

--
Cybertec Geschwinde & Sch�nig GmbH
Sch�ngrabern 134; A-2020 Hollabrunn
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at

From thhal at mailblocks.com Wed Sep 21 14:54:13 2005
From: thhal at mailblocks.com (Thomas Hallgren)
Date: Wed, 21 Sep 2005 16:54:13 +0200
Subject: [Pljava-dev] Where is pgxs?
References: <43317415.7090503@mailblocks.com>
<thhal-0mAwNBEFQ8bQWDbfABskJWLlSNUKoz+@mailblocks.com>
Message-ID: <thhal-0mQwNBENQ8bQ5KN/zhiJaVVpTCADzIC@mailblocks.com>

Hi,
I just installed the PostgreSQL 8.1-beta2 on my Windows box using the
windows installer. I then try to compile PL/Java. It fails to find pgxs.
Is it moved to somewhere else? Or is it not included?

Regards,
Thomas Hallgren

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hans-Jürgen Schönig (#1)
Re: feature proposal ...

=?ISO-8859-1?Q?Hans-J=FCrgen_Sch=F6nig?= <postgres@cybertec.at> writes:

COPY TO file_name USING some_select_statement;

I think this has been discussed before, check the archives.

to implement the desired feature we just had to add some SPI code to the
scenery (SPI will also return HeapTuples so it should fit in there).

Any implementation that depends on SPI here is wrong.

regards, tom lane

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hans-Jürgen Schönig (#5)
Re: R: feature proposal ...

=?ISO-8859-1?Q?Hans-J=FCrgen_Sch=F6nig?= <postgres@cybertec.at> writes:

Paolo Magnoli wrote:

Can't you just use a view?

no because a new is not a heap ...

I think Paolo's idea is much better than munging the syntax of COPY,
though. Fixing COPY so that you *could* copy from a view would provide
all the desired functionality without any syntactic warts.

regards, tom lane

#8Joshua D. Drake
jd@commandprompt.com
In reply to: Hans-Jürgen Schönig (#5)
Re: R: feature proposal ...

Hans-J�rgen Sch�nig wrote:

no because a new is not a heap ...

Why not use a function with a temporary table?

That way you can pass a table parameter that
is the temporary table with a select statement
that you can populate the temp table with.

Sincerely,

Joshua D. Drake

em=# create view x as select * from pg_class;
CREATE VIEW

em=# copy x to '/tmp/x';
ERROR: cannot copy from view "x"

best regards,

hans

Paolo Magnoli wrote:

Can't you just use a view?

-----Messaggio originale-----
Da: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org]Per conto di Hans-J�rgen
Sch�nig
Inviato: mercoled� 21 settembre 2005 15.30
A: pgsql-hackers@postgresql.org; eg@cybertec.at
Oggetto: [HACKERS] feature proposal ...

hackers,

currently we have to hack tons of export scripts for various customers.
the problem is: if tables can be exported straight forward COPY will
give you all you need but when data has to be transformed while
exporting things start becoming a bit more complex. usually people want
to have CSV file (excel-ify data) which is supported by COPY.

the problem is: COPY can write data returned by a SELECT statement to a
file. our idea is to implement precisely that.

example:

COPY TO file_name USING some_select_statement;

the advantage would be that COPY would then be able to export data and
transform it on the fly. this would save many people a lot of work
because complex data extractors could in many cases be replaced by
simple SQL scripts.

how we plan to implement that:
currently copy simply opens a table and loops through the tuples (see
command/copy.c starting at line 1115).
to implement the desired feature we just had to add some SPI code to the
scenery (SPI will also return HeapTuples so it should fit in there).

Any comments?

Best regards,

Hans

--
Cybertec Geschwinde & Sch�nig GmbH
Sch�ngrabern 134; A-2020 Hollabrunn
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

--
Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240
PostgreSQL Replication, Consulting, Custom Programming, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/

#9Hans-Jürgen Schönig
postgres@cybertec.at
In reply to: Joshua D. Drake (#8)
Re: R: feature proposal ...

Joshua D. Drake wrote:

Hans-J�rgen Sch�nig wrote:

no because a new is not a heap ...

Why not use a function with a temporary table?

That way you can pass a table parameter that
is the temporary table with a select statement
that you can populate the temp table with.

Sincerely,

Joshua D. Drake

hi joshua ...

temp tables are not an option - there is too much data around.
view are better here, i think ...

cheers,

hans

--
Cybertec Geschwinde & Sch�nig GmbH
Sch�ngrabern 134; A-2020 Hollabrunn
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at

#10Cristian Prieto
cristian@clickdiario.com
In reply to: Hans-Jürgen Schönig (#5)
Re: R: feature proposal ...

So, that means copy doesn't support views? If it is like that, then why not
work in the View support for the Copy statement?

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Hans-Jürgen Schönig
Sent: Miércoles, 21 de Septiembre de 2005 08:04 a.m.
To: Paolo Magnoli
Cc: pgsql-hackers@postgresql.org; eg@cybertec.at
Subject: Re: R: [HACKERS] feature proposal ...

no because a new is not a heap ...

em=# create view x as select * from pg_class;
CREATE VIEW

em=# copy x to '/tmp/x';
ERROR: cannot copy from view "x"

best regards,

hans

Paolo Magnoli wrote:

Can't you just use a view?

-----Messaggio originale-----
Da: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org]Per conto di Hans-Jürgen
Schönig
Inviato: mercoledì 21 settembre 2005 15.30
A: pgsql-hackers@postgresql.org; eg@cybertec.at
Oggetto: [HACKERS] feature proposal ...

hackers,

currently we have to hack tons of export scripts for various customers.
the problem is: if tables can be exported straight forward COPY will
give you all you need but when data has to be transformed while
exporting things start becoming a bit more complex. usually people want
to have CSV file (excel-ify data) which is supported by COPY.

the problem is: COPY can write data returned by a SELECT statement to a
file. our idea is to implement precisely that.

example:

COPY TO file_name USING some_select_statement;

the advantage would be that COPY would then be able to export data and
transform it on the fly. this would save many people a lot of work
because complex data extractors could in many cases be replaced by
simple SQL scripts.

how we plan to implement that:
currently copy simply opens a table and loops through the tuples (see
command/copy.c starting at line 1115).
to implement the desired feature we just had to add some SPI code to the
scenery (SPI will also return HeapTuples so it should fit in there).

Any comments?

Best regards,

Hans

--
Cybertec Geschwinde & Schönig GmbH
Schöngrabern 134; A-2020 Hollabrunn
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

--
Cybertec Geschwinde & Schönig GmbH
Schöngrabern 134; A-2020 Hollabrunn
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

#11Chris Browne
cbbrowne@acm.org
In reply to: Paolo Magnoli (#4)
Re: R: feature proposal ...

jd@commandprompt.com ("Joshua D. Drake") writes:

Hans-J�rgen Sch�nig wrote:

no because a new is not a heap ...

Why not use a function with a temporary table?

That way you can pass a table parameter that
is the temporary table with a select statement
that you can populate the temp table with.

That means having to instantiate the temp table on disk "twice," once
as temp table, and once as the output file.

It would sure be nice to do it just once; that should lead to there
only being data written out once, which saves a lot on I/O.
--
(reverse (concatenate 'string "gro.mca" "@" "enworbbc"))
http://www.ntlug.org/~cbbrowne/spiritual.html
Rules of the Evil Overlord #59. "I will never build a sentient
computer smarter than I am." <http://www.eviloverlord.com/&gt;

#12Trent Shipley
tshipley@deru.com
In reply to: Hans-Jürgen Schönig (#3)
Re: feature proposal ...

On Wednesday 2005-09-21 07:01, Hans-Jürgen Schönig wrote:

Rod Taylor wrote:

the problem is: COPY can write data returned by a SELECT statement to a
file. our idea is to implement precisely that.

example:

COPY TO file_name USING some_select_statement;

I have run into plenty of cases where I wanted to dump part of a
structure and this could be used for that, but I've always found that
temporary tables were sufficient and equally SQL scriptable

CREATE TEMP TABLE tab AS SELECT ...; COPY tab TO file_name;

Hi Rod,

TEMP TABLE are not suitable for my case. Using a temp table would
essentially mean that we had to store the data 3 times: Original data,
temp table + dump. Temp tables are only fine for small amounts of data
but we are talking about too much data here (my smallest export will
contain 15.000.000 records).

Wouldn't you also need a CREATE TEMP TABLE privilege but the
COPY TO file USING select_statement
would only need select. (In other words using a temp table would not seem to
be as secure nor as general as the requested feature.)

Ideally COPYing from a view would be supported. As a user I like to treat a
relation as a relation without having to worry about it's type.
Nevertheless, there remains the issue of atomic permissions. One ought to be
able to make selecting, copying, and creating views independent permissions
for groups, roles, and users. A user should be able to copy and select
without being able to create views.

One can think of a table as a prototypical relation.
Views are virtual tables.
SELECT statements are ephemeral views.
In a select statement you can use a proper table, a pre-defined view, or
another select statement in the FROM or WHERE clause. Parallel behavior for
COPY is reasonable.

#13Rod Taylor
rbt@rbt.ca
In reply to: Trent Shipley (#12)
Re: feature proposal ...

On Wed, 2005-09-21 at 15:25 -0700, Trent Shipley wrote:

On Wednesday 2005-09-21 07:01, Hans-Jürgen Schönig wrote:

Rod Taylor wrote:

the problem is: COPY can write data returned by a SELECT statement to a
file. our idea is to implement precisely that.

example:

COPY TO file_name USING some_select_statement;

I have run into plenty of cases where I wanted to dump part of a
structure and this could be used for that, but I've always found that
temporary tables were sufficient and equally SQL scriptable

CREATE TEMP TABLE tab AS SELECT ...; COPY tab TO file_name;

Hi Rod,

TEMP TABLE are not suitable for my case. Using a temp table would
essentially mean that we had to store the data 3 times: Original data,
temp table + dump. Temp tables are only fine for small amounts of data
but we are talking about too much data here (my smallest export will
contain 15.000.000 records).

Wouldn't you also need a CREATE TEMP TABLE privilege but the
COPY TO file USING select_statement
would only need select. (In other words using a temp table would not seem to
be as secure nor as general as the requested feature.)

Writing a file on the server requires significant privilege, including
access to the server itself so you can retrieve the results.
--

#14Andrew Dunstan
andrew@dunslane.net
In reply to: Rod Taylor (#13)
Re: feature proposal ...

Rod Taylor wrote:

On Wed, 2005-09-21 at 15:25 -0700, Trent Shipley wrote:

Wouldn't you also need a CREATE TEMP TABLE privilege but the
COPY TO file USING select_statement
would only need select. (In other words using a temp table would not seem to
be as secure nor as general as the requested feature.)

Writing a file on the server requires significant privilege, including
access to the server itself so you can retrieve the results.

But we also do COPY to STDOUT which requires no special privileges on
the server.

Incidentally, if we are going to allow copy out from views, it would be
nice and orthogonal to allow copy in too. Hasn't there been some talk
about making automatically writeable views?

cheers

andrew

#15Rod Taylor
rbt@rbt.ca
In reply to: Andrew Dunstan (#14)
Re: feature proposal ...

On Wed, 2005-09-21 at 19:55 -0400, Andrew Dunstan wrote:

Rod Taylor wrote:

On Wed, 2005-09-21 at 15:25 -0700, Trent Shipley wrote:

Wouldn't you also need a CREATE TEMP TABLE privilege but the
COPY TO file USING select_statement
would only need select. (In other words using a temp table would not seem to
be as secure nor as general as the requested feature.)

Writing a file on the server requires significant privilege, including
access to the server itself so you can retrieve the results.

But we also do COPY to STDOUT which requires no special privileges on
the server.

Incidentally, if we are going to allow copy out from views, it would be
nice and orthogonal to allow copy in too. Hasn't there been some talk
about making automatically writeable views?

Sure. But if you are using STDOUT then why does this need to be a server
side item at all?

You either have code issuing the commands and collecting the results
making a standard select just as fast or you are using psql which
already has multiple display types for SELECT data, including XML
output, but another could easily be added for CSV style output.

--

#16Andrew Dunstan
andrew@dunslane.net
In reply to: Rod Taylor (#15)
Re: feature proposal ...

Rod Taylor wrote:

You either have code issuing the commands and collecting the results
making a standard select just as fast or you are using psql which
already has multiple display types for SELECT data, including XML
output, but another could easily be added for CSV style output.

We have CSV output now and it's produced by the server. psql's \copy is
in fact just a very thin veneer over the server-side COPY. Besides, we
might well be using another client - your assertion that if COPY output
is going to the client it must be psql is simply wrong.

cheers

andrew

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#14)
Re: feature proposal ...

Andrew Dunstan <andrew@dunslane.net> writes:

Rod Taylor wrote:

Writing a file on the server requires significant privilege, including
access to the server itself so you can retrieve the results.

But we also do COPY to STDOUT which requires no special privileges on
the server.

Currently, we have a special privilege type about creating temporary
tables, which I think also restricts creating temporary views --- but
now that I think about it, it's not obvious why that should follow.
The only good argument I can see for restricting temp table creation
is that one might eat up large amounts of server disk space with a temp
table, and of course this argument doesn't apply to a temp view. So we
could refute this argument by just not making the permission check for
CREATE TEMP VIEW.

Incidentally, if we are going to allow copy out from views, it would be
nice and orthogonal to allow copy in too. Hasn't there been some talk
about making automatically writeable views?

Sure, but until we actually have automatically writable views, it's a
bit premature to worry about that.

regards, tom lane

#18Hannu Krosing
hannu@tm.ee
In reply to: Rod Taylor (#15)
Re: feature proposal ...

On K, 2005-09-21 at 20:34 -0400, Rod Taylor wrote:

Sure. But if you are using STDOUT then why does this need to be a server
side item at all?

You either have code issuing the commands and collecting the results
making a standard select just as fast or you are using psql which
already has multiple display types for SELECT data, including XML
output, but another could easily be added for CSV style output.

Another advantage of server-side COPY is that pgsql/libpq already does
not try to collect the whole resultset in memory before starting the
display/output process.

--
Hannu Krosing <hannu@skype.net>

#19Karel Zak
zakkr@zf.jcu.cz
In reply to: Tom Lane (#7)
Re: R: feature proposal ...

On Wed, 2005-09-21 at 11:31 -0400, Tom Lane wrote:

=?ISO-8859-1?Q?Hans-J=FCrgen_Sch=F6nig?= <postgres@cybertec.at> writes:

Paolo Magnoli wrote:

Can't you just use a view?

no because a new is not a heap ...

I think Paolo's idea is much better than munging the syntax of COPY,
though. Fixing COPY so that you *could* copy from a view would provide
all the desired functionality without any syntactic warts.

Well, I will probably help Juergen with the implementation. It seems
that fetch data from VIEW is possible by portal stuff.

Tom, do you think that there's any other (better) way how we can
implement it?

Karel

--
Karel Zak <zakkr@zf.jcu.cz>

#20Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#17)
Re: feature proposal ...

Tom Lane <tgl@sss.pgh.pa.us> writes:

So we could refute this argument by just not making the permission check for
CREATE TEMP VIEW.

This is the first time I've ever heard of CREATE TEMP VIEW. What's the point
of it since you can always directly do:

SELECT * FROM (...)

?

--
greg

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#20)
#22Bruce Momjian
bruce@momjian.us
In reply to: Andrew Dunstan (#14)
#23Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Tom Lane (#7)
#24Hans-Jürgen Schönig
postgres@cybertec.at
In reply to: Jim Nasby (#23)
#25A.M.
agentm@themactionfaction.com
In reply to: Jim Nasby (#23)
#26Hannu Krosing
hannu@tm.ee
In reply to: Hans-Jürgen Schönig (#24)
#27Andrew Dunstan
andrew@dunslane.net
In reply to: Jim Nasby (#23)
#28Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#27)
#29Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Tom Lane (#28)
#30Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#28)
#31Bruce Momjian
bruce@momjian.us
In reply to: A.M. (#25)
#32Trent Shipley
tshipley@deru.com
In reply to: Andrew Dunstan (#27)