dblink() from GridSQL

Started by Sam Nelsonalmost 15 years ago5 messagesgeneral
Jump to latest
#1Sam Nelson
samn@consistentstate.com

Hi List,

We have a customer who is trying to migrate a few PostgresPlus instances to
GridSQL clusters. They have a process that pulls data from another server
using dblink every night, and we're trying to replicate that on the GridSQL
instance, but grid is being a bit of a pain.

Grid doesn't seem to allow functions in from statements, and, of course, it
spits out errors about functions returning records being called in the wrong
context if we just try "select dblink(foo, bar);" (we had to try it).

Has anyone else run into this specific issue? Is there a known workaround?
Any ideas on what else we should try?
---
===========================
Samuel Nelson
Consistent State
www.consistentstate.com
303-955-0509
===========================

#2Merlin Moncure
mmoncure@gmail.com
In reply to: Sam Nelson (#1)
Re: dblink() from GridSQL

On Thu, May 5, 2011 at 12:13 PM, Sam Nelson <samn@consistentstate.com> wrote:

Hi List,
We have a customer who is trying to migrate a few PostgresPlus instances to
GridSQL clusters.  They have a process that pulls data from another server
using dblink every night, and we're trying to replicate that on the GridSQL
instance, but grid is being a bit of a pain.
Grid doesn't seem to allow functions in from statements, and, of course, it
spits out errors about functions returning records being called in the wrong
context if we just try "select dblink(foo, bar);" (we had to try it).
Has anyone else run into this specific issue?  Is there a known workaround?
 Any ideas on what else we should try?

have you considered wrapping the output of the dblink query in a view?

merlin

#3Scott Mead
scottm@openscg.com
In reply to: Merlin Moncure (#2)
Re: dblink() from GridSQL

On Thu, May 5, 2011 at 1:40 PM, Merlin Moncure <mmoncure@gmail.com> wrote:

On Thu, May 5, 2011 at 12:13 PM, Sam Nelson <samn@consistentstate.com>
wrote:

Hi List,
We have a customer who is trying to migrate a few PostgresPlus instances

to

GridSQL clusters. They have a process that pulls data from another

server

using dblink every night, and we're trying to replicate that on the

GridSQL

instance, but grid is being a bit of a pain.
Grid doesn't seem to allow functions in from statements, and, of course,

it

spits out errors about functions returning records being called in the

wrong

context if we just try "select dblink(foo, bar);" (we had to try it).
Has anyone else run into this specific issue?

GridSQL itself doesn't support functions.

Is there a known workaround?

Any ideas on what else we should try?

You'd have to present the data to be partitioned to the gsql controller for
partitioning to happen properly, or use the high-speed import that it comes
with. Could you dump the data to an intermediary csv and then push it at
the import utility?

--Scott

Show quoted text

have you considered wrapping the output of the dblink query in a view?

merlin

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#4Sam Nelson
samn@consistentstate.com
In reply to: Scott Mead (#3)
Re: dblink() from GridSQL

Grid passes functions off to underlying databases. Unfortunately, it
doesn't do so when the functions are in the from clause. If it did, that
would work.

But I digress. We're attempting to try either the csv import (which would
require a new script, but no biggie) or a data pull on the underlying
database to a table that only exists on one node.
---
===========================
Samuel Nelson
Consistent State
www.consistentstate.com
303-955-0509
===========================

On Thu, May 5, 2011 at 12:36 PM, Scott Mead <scottm@openscg.com> wrote:

Show quoted text

On Thu, May 5, 2011 at 1:40 PM, Merlin Moncure <mmoncure@gmail.com> wrote:

On Thu, May 5, 2011 at 12:13 PM, Sam Nelson <samn@consistentstate.com>
wrote:

Hi List,
We have a customer who is trying to migrate a few PostgresPlus instances

to

GridSQL clusters. They have a process that pulls data from another

server

using dblink every night, and we're trying to replicate that on the

GridSQL

instance, but grid is being a bit of a pain.
Grid doesn't seem to allow functions in from statements, and, of course,

it

spits out errors about functions returning records being called in the

wrong

context if we just try "select dblink(foo, bar);" (we had to try it).
Has anyone else run into this specific issue?

GridSQL itself doesn't support functions.

Is there a known workaround?

Any ideas on what else we should try?

You'd have to present the data to be partitioned to the gsql controller for
partitioning to happen properly, or use the high-speed import that it comes
with. Could you dump the data to an intermediary csv and then push it at
the import utility?

--Scott

have you considered wrapping the output of the dblink query in a view?

merlin

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#5Mason S
masonlists@gmail.com
In reply to: Sam Nelson (#4)
Re: dblink() from GridSQL

Hi List,
We have a customer who is trying to migrate a few PostgresPlus instances

to

GridSQL clusters. They have a process that pulls data from another

server

using dblink every night, and we're trying to replicate that on the

GridSQL

instance, but grid is being a bit of a pain.
Grid doesn't seem to allow functions in from statements, and, of course,

it

spits out errors about functions returning records being called in the

wrong

context if we just try "select dblink(foo, bar);" (we had to try it).
Has anyone else run into this specific issue?

GridSQL itself doesn't support functions.

GridSQL supports most PostgreSQL built-in functions. In terms of other
user-defined functions, one can create them directly on the data nodes, then
teach the coordinator about them by putting the function signature in the
GridSQL config file, and restarting the Coordinator. (This will not solve
the functions in FROM clause issue of course.)

I suspect there will be issues with dblink, however.

I would not recommend using dblink for this anyway. Are there only inserts
going on here, or updates and deletes, too? If only adding data, I would
just dump out data periodically and then load via COPY to the GridSQL
coordinator, as Scott mentions. It will run much, much faster.

If you really need dblink, and you are clear about the
distribution/partitioning, you could run this directly from the underlying
nodes, but you can really mess things up if you pull in data to the wrong
node. I would avoid this.

Another workaround would be to create a dummy database in PostgreSQL, access
the source data via dblink, then insert it into GridSQL.

Mason

Show quoted text

Is there a known workaround?

Any ideas on what else we should try?

You'd have to present the data to be partitioned to the gsql controller for
partitioning to happen properly, or use the high-speed import that it comes
with. Could you dump the data to an intermediary csv and then push it at
the import utility?

--Scott

have you considered wrapping the output of the dblink query in a view?

merlin

--
Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general