postgres external table
All,
Is there an external table creation method (similar to Oracle external
tables)? Where can I find information on this?
Thanks,
Amy
Amy Smith wrote:
All,
Is there an external table creation method (similar to Oracle external
tables)? Where can I find information on this?
For those non-Oracle users among us, what's an external table? What are
you trying to achieve?
Random guess: you might be looking for tablespaces.
--
Craig Ringer
On Sun, Jan 17, 2010 at 07:27:34PM -0800, Amy Smith wrote:
All,
Is there an external table creation method (similar to Oracle external
tables)? Where can I find information on this?
There is a similar project on pgFoundry which has had some activity
lately. You might also try DBI-Link.
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Craig Ringer wrote:
For those non-Oracle users among us, what's an external table?
External tables let you map a text file directly to a table without
explicitly loading it. In PostgreSQL, if you have data in a CSV file,
usually you'd import it with COPY before you'd use it. If external
tables were available, you'd just say there's an external table as a CSV
file and you could start running queries against it.
So the quick answer is no, there is no built-in external table support
in PostgreSQL; normally people load the data using COPY instead.
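To make the contrast concrete, here's a rough sketch — table names, the
directory object, and the file path are all made up — of Oracle's
external-table DDL next to the usual PostgreSQL COPY workflow:

```sql
-- Oracle (sketch): the file itself backs the table; no load step.
-- Assumes a directory object "data_dir" containing "users.csv".
CREATE TABLE users_ext (
    id   NUMBER,
    name VARCHAR2(100)
)
ORGANIZATION EXTERNAL (
    TYPE ORACLE_LOADER
    DEFAULT DIRECTORY data_dir
    ACCESS PARAMETERS (
        RECORDS DELIMITED BY NEWLINE
        FIELDS TERMINATED BY ','
    )
    LOCATION ('users.csv')
)
REJECT LIMIT UNLIMITED;  -- how many bad rows to tolerate

-- PostgreSQL: the data has to be loaded into a real table first.
CREATE TABLE users (id integer, name text);
COPY users FROM '/data/users.csv' WITH CSV;
```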
There's a work in progress on this topic at
http://pgfoundry.org/projects/pgexternaltable/ , but it's extremely
rough at this point and I'm not sure if it's even moving in the right
direction--the main project would do something like this via SQL/MED,
and I don't think that's how the prototype is being built at all.
The only PostgreSQL-based product I'm aware of that has working external
table support already is Greenplum DB.
--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com www.2ndQuadrant.com
Greg Smith <greg@2ndquadrant.com> writes:
Craig Ringer wrote:
For those non-Oracle users among us, what's an external table?
External tables let you map a text file directly to a table without
explicitly loading it. In PostgreSQL, if you have data in a CSV file,
usually you'd import it with COPY before you'd use it. If external
tables were available, you'd just say there's an external table as a CSV
file and you could start running queries against it.
I'm finding it hard to visualize a use-case for that. We must postulate
that the table is so big that you don't want to import it, and yet you
don't feel a need to have any index on it. Which among other things
implies that every query will seqscan the whole table. Where's the
savings?
regards, tom lane
On Mon, Jan 18, 2010 at 2:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I'm finding it hard to visualize a use-case for that. We must postulate
that the table is so big that you don't want to import it, and yet you
don't feel a need to have any index on it. Which among other things
implies that every query will seqscan the whole table. Where's the
savings?
I think it's usually more "my data is updated by other tools and it
would be hard/impossible/annoying to insert another step into the
pipeline to copy it to yet another place". The main benefit is that
you can access the authoritative data directly without having to copy
it and have some sort of process in place to do that regularly.
Text files are kind of useless by themselves, but they're a baseline bit
of functionality on top of which to add more sophisticated external
forms, such as data available at some URL or over some kind of RPC -- to
which various conditions could be pushed using external indexes -- or
ultimately data in another database, to which whole joins could be pushed.
--
greg
On Mon, Jan 18, 2010 at 09:57:02AM -0500, Tom Lane wrote:
Greg Smith <greg@2ndquadrant.com> writes:
Craig Ringer wrote:
For those non-Oracle users among us, what's an external table?
External tables let you map a text file directly to a table without
explicitly loading it. In PostgreSQL, if you have data in a CSV file,
usually you'd import it with COPY before you'd use it. If external
tables were available, you'd just say there's an external table as a CSV
file and you could start running queries against it.
I'm finding it hard to visualize a use-case for that. We must postulate
that the table is so big that you don't want to import it, and yet you
don't feel a need to have any index on it. Which among other things
implies that every query will seqscan the whole table. Where's the
savings?
I've mostly wanted something like it when I've been dealing with
externally maintained data. The best idea I've seen so far has just
been a function similar to:
copy_csv_from(filename text) returns setof text[]
to be used as:
SELECT d[1]::int AS id, d[2] AS name
FROM copy_csv_from('/tmp/usernames.csv') d;
(Note PostgreSQL arrays are 1-based by default.) This could be wrapped
in a VIEW, giving what I'd expect to be similar semantics to an
"external table"; however, I've never used one so I could be missing
something.
It's possible to write this function at the moment, but it's somewhat
suboptimal: the CSV file is completely read before anything else
happens, so it's only good for small files.
--
Sam http://samason.me.uk/
On Mon, Jan 18, 2010 at 7:57 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Greg Smith <greg@2ndquadrant.com> writes:
Craig Ringer wrote:
For those non-Oracle users among us, what's an external table?
External tables let you map a text file directly to a table without
explicitly loading it. In PostgreSQL, if you have data in a CSV file,
usually you'd import it with COPY before you'd use it. If external
tables were available, you'd just say there's an external table as a CSV
file and you could start running queries against it.
I'm finding it hard to visualize a use-case for that. We must postulate
that the table is so big that you don't want to import it, and yet you
don't feel a need to have any index on it. Which among other things
implies that every query will seqscan the whole table. Where's the
savings?
I've used it mostly for importing in the past. Saves the step of
loading a large file into a table with no constraints as a middle
step.
Tom Lane wrote:
I'm finding it hard to visualize a use-case for that. We must postulate
that the table is so big that you don't want to import it, and yet you
don't feel a need to have any index on it. Which among other things
implies that every query will seqscan the whole table. Where's the
savings?
I've mainly seen it used for data loading where there's some sort of
transformation going on, typically to cleanup junk fields that would
fail a constraint or derive new columns. If you have external tables,
there's no need to load the data into a temporary table if all you're
going to do is modify a few things and then write the result to
somewhere else. Most of these use cases process the whole file anyway,
so having to do a whole scan isn't an issue. I used to run an app that
imported gigabytes a day of text files, dumped from another server, that
used a weird date format I had to process via a pl/pgsql function. Having
to pass them through COPY and then INSERT processed versions somewhere
else was really a drag, given that there was no use for the
intermediate data.
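That load-with-transformation pattern might look roughly like this (the
table names, the date format, and the external table "events_ext" are
all hypothetical):

```sql
-- With an external table, the transform reads the file directly;
-- no intermediate staging table is needed.
INSERT INTO events (id, happened_at, payload)
SELECT id,
       to_timestamp(weird_date, 'YYYYMMDD"T"HH24MISS'),  -- cleanup step
       payload
FROM events_ext                -- hypothetical external table over the dump
WHERE weird_date IS NOT NULL;  -- drop junk rows that would fail a constraint
```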
It can also be handy for bootstrapping apps that are converting stuff
out of a legacy system. Just make the mainframe/whatever dump a new
text file periodically into where the external table looks for its data,
and you skip having to schedule reloads when the content changes. That can
make your life easier while running the two systems in parallel initially.
--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com www.2ndQuadrant.com
2010/1/18 Tom Lane <tgl@sss.pgh.pa.us>:
Greg Smith <greg@2ndquadrant.com> writes:
Craig Ringer wrote:
For those non-Oracle users among us, what's an external table?
External tables let you map a text file directly to a table without
explicitly loading it. In PostgreSQL, if you have data in a CSV file,
usually you'd import it with COPY before you'd use it. If external
tables were available, you'd just say there's an external table as a CSV
file and you could start running queries against it.
I'm finding it hard to visualize a use-case for that. We must postulate
that the table is so big that you don't want to import it, and yet you
don't feel a need to have any index on it. Which among other things
implies that every query will seqscan the whole table. Where's the
savings?
regards, tom lane
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Another case, Tom, could be when the file is updated from a non-DB
application and you need to synchronize
the data with other DB applications ...
--
Vincenzo Romano
NotOrAnd Information Technologies
NON QVIETIS MARIBVS NAVTA PERITVS
On 19/01/2010 1:13 AM, Vincenzo Romano wrote:
Another case, Tom, could be when the file is updated from a non-DB
application and you need to synchronize
the data with other DB applications ...
How can that work without a transactional file system, though? If the
external process writes to the file while you're half-way through
reading it, what's the database to do? In general, how do external
tables cope with the fact that they're on non-transactional storage?
--
Craig Ringer
Having 'external tables' lets us avoid the step of loading data from a
file into the table. We do not have to check whether a load job has run
successfully, whether the data in the table really corresponds to the data
in the file, etc. It also lets us decide how many rejected rows are
allowed, and so forth.
http://www.adp-gmbh.ch/ora/misc/ext_table.html
Other than that, I have not found any advantage.
Jayadevan
From: Tom Lane <tgl@sss.pgh.pa.us>
To: Greg Smith <greg@2ndquadrant.com>
Cc: Craig Ringer <craig@postnewspapers.com.au>, Amy Smith
<vah123@gmail.com>, pgsql-general@postgresql.org
Date: 01/18/2010 08:25 PM
Subject: Re: [GENERAL] postgres external table
Sent by: pgsql-general-owner@postgresql.org
DISCLAIMER:
"The information in this e-mail and any attachment is intended only for
the person to whom it is addressed and may contain confidential and/or
privileged material. If you have received this e-mail in error, kindly
contact the sender and destroy all copies of the original communication.
IBS makes no warranty, express or implied, nor guarantees the accuracy,
adequacy or completeness of the information contained in this email or any
attachment and is not liable for any errors, defects, omissions, viruses
or for resultant loss or damage, if any, direct or indirect."
On Tue, Jan 19, 2010 at 4:41 AM, Craig Ringer
<craig@postnewspapers.com.au> wrote:
How can that work without a transactional file system, though? If the
external process writes to the file while you're half-way through reading
it, what's the database to do? In general, how do external tables cope with
the fact that they're on non-transactional storage?
Well, if you use mv to replace the old file with the new one, then it
should be safe. Unless your query involves opening the table multiple
times or your transactions are more complex than a single query...
--
greg
Yes. We get quite a few files as 'feeds' from external systems. Once the
files are in our network, we know that no changes will happen to those
files. We access them using Oracle external tables and process them (the
data, after some processing, end up in other real tables). If external
tables were not there, we would have had to schedule some job to load
these files.
Jayadevan
From: Greg Stark <gsstark@mit.edu>
To: Craig Ringer <craig@postnewspapers.com.au>
Cc: Vincenzo Romano <vincenzo.romano@notorand.it>, Tom Lane
<tgl@sss.pgh.pa.us>, Greg Smith <greg@2ndquadrant.com>, Amy Smith
<vah123@gmail.com>, pgsql-general@postgresql.org
Date: 01/19/2010 04:37 PM
Subject: Re: [GENERAL] postgres external table
Sent by: pgsql-general-owner@postgresql.org
On 19 Jan 2010, at 12:16, Jayadevan M wrote:
Yes. We get quite a few files as 'feeds' from external systems. Once the files are in our network, we know that no changes will happen to those files. We access them using Oracle external tables and process them (the data, after some processing, end up in other real tables). If external tables were not there, we would have had to schedule some job to load these files.
I'm surprised you even need to download them to your network. I suspect it would be quite possible to implement 'external tables' so that they could be read from, e.g., an RSS feed. That would be a pretty slick feature.
There are a gazillion data formats that people would want supported, though: CSV, probably Excel, RSS feeds, web services, etc.
Most of that should probably be implemented by means of stored procedures (using the unsafe languages -- pl/pythonu, pl/perlu, etc.) returning TABLE (...), which means this is in fact already possible, I think? It's just that nobody's (publicly) thought of doing this so far.
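Something along these lines might work -- a sketch only, with the URL,
column layout, and function name all made up, and no error handling:

```sql
-- Hedged sketch: a PL/PythonU set-returning function exposing a remote
-- CSV as rows (plpythonu is an untrusted language; superuser required).
CREATE OR REPLACE FUNCTION remote_csv(url text)
RETURNS TABLE(id integer, name text) LANGUAGE plpythonu AS $$
    import csv
    import urllib2
    for row in csv.reader(urllib2.urlopen(url)):
        yield (int(row[0]), row[1])
$$;

-- Wrapped in a view, it behaves much like an external table:
CREATE VIEW remote_users AS
    SELECT * FROM remote_csv('http://example.com/users.csv');
```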
Alban Hertroys
--
Screwing up is the best way to attach something to the ceiling.
Hi,
We use Oracle's external tables to process the files (so no need to look
for the other alternatives you mentioned). We do not have access to the
data providers' network to access the files. That is why we process them
after we receive the files.
If there were a database-level implementation of 'external tables' for
the formats you mentioned (CSV, Excel, RSS feeds, web services, etc.),
that would be great. Most ETL tools already accept feeds of the types
you mentioned and load them into tables.
Jayadevan
From: Alban Hertroys <dalroi@solfertje.student.utwente.nl>
To: Jayadevan M <Jayadevan.Maymala@ibsplc.com>
Cc: pgsql-general@postgresql.org
Date: 01/19/2010 05:31 PM
Subject: Re: [GENERAL] postgres external table
Craig Ringer wrote:
On 19/01/2010 1:13 AM, Vincenzo Romano wrote:
Another case, Tom, could be when the file is updated from a non-DB
application and you need to synchronize
the data with other DB applications ...
How can that work without a transactional file system, though? If the
external process writes to the file while you're half-way through
reading it, what's the database to do? In general, how do external
tables cope with the fact that they're on non-transactional storage?
With Oracle's implementation, you version the input files (timestamp or
something) and then do ALTER TABLE to change the location the external
table points to. That will block waiting for exclusive access before it
fires, and then you get a clean switch to the new location. There is no
built-in smartness to cope with weird behavior--I expect it will just
crash the query. One thing you always have to be careful about when
using these is that an external table might return a weird error under
odd circumstances such as you describe, which you wouldn't normally
expect from a simple SELECT.
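A hedged sketch of that Oracle-side pattern (the table name and
timestamped file name are made up; the new file is written outside the
database first):

```sql
-- Atomically repoint the external table at the freshly written file.
-- The ALTER waits for an exclusive lock, so in-flight queries finish
-- against the old file before the switch happens.
ALTER TABLE feed_ext LOCATION ('feed_20100119.csv');

-- Queries started after the ALTER read the new file:
SELECT count(*) FROM feed_ext;
```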
--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com www.2ndQuadrant.com