Is there a way to avoid hard coding database connection info into views?

Started by Mike Christensenalmost 14 years ago7 messagesgeneral
Jump to latest
#1Mike Christensen
mike@kitchenpc.com

I often manually pull in production data into my test database so I
can test new code on realistic data, as well as test upgrade scenarios
or repro data specific bugs. To do this, I've setup a `VIEW` for each
production table in my test database. These views look something like
this:

CREATE VIEW ProdLink.Users AS
select * from dblink(
'hostaddr=123.123.123.123 dbname=ProductionDB user=ROUser
password=secret',
'select * from users') as t1(userid uuid, email varchar(50),
alias varchar(50), fullname varchar(50), password varchar(100));

Now, on my production database I can run:

SELECT * FROM ProdLink.Users;

And see all users on my production database. I can then do things like:

INSERT INTO Users SELECT * FROM ProdLink.Users L WHERE NOT EXISTS
(select 1 from Users where Users.UserId = L.UserId);

Allowing me to pull in every user from production that doesn't already
exist in test.

I have about 30 of these views to *proxy* the production data, however
I find it somewhat hacky to have to hardcode in the production
database connection info into each view.

Is there a good way to avoid hardcoding, or at least duplicating this
connection info on each view? Can I use database level variables,
environment variables, or anything else instead?

PS - I also posted this question on StackOverflow for anyone who wants
points: http://stackoverflow.com/questions/10605622/postgres-9-x-is-there-a-way-to-avoid-hard-coding-database-connection-info-into

#2Merlin Moncure
mmoncure@gmail.com
In reply to: Mike Christensen (#1)
Re: Is there a way to avoid hard coding database connection info into views?

On Tue, May 15, 2012 at 2:28 PM, Mike Christensen <mike@kitchenpc.com> wrote:

I often manually pull in production data into my test database so I
can test new code on realistic data, as well as test upgrade scenarios
or repro data specific bugs.  To do this, I've setup a `VIEW` for each
production table in my test database.  These views look something like
this:

   CREATE VIEW ProdLink.Users AS
      select * from dblink(
         'hostaddr=123.123.123.123 dbname=ProductionDB user=ROUser
password=secret',
         'select * from users') as t1(userid uuid, email varchar(50),
alias varchar(50), fullname varchar(50), password varchar(100));

Now, on my production database I can run:

   SELECT * FROM ProdLink.Users;

And see all users on my production database.  I can then do things like:

   INSERT INTO Users SELECT * FROM ProdLink.Users L WHERE NOT EXISTS
(select 1 from Users where Users.UserId = L.UserId);

Allowing me to pull in every user from production that doesn't already
exist in test.

I have about 30 of these views to *proxy* the production data, however
I find it somewhat hacky to have to hardcode in the production
database connection info into each view.

Is there a good way to avoid hardcoding, or at least duplicating this
connection info on each view?  Can I use database level variables,
environment variables, or anything else instead?

sure: why don't you set up a named connection and just make sure it's
established before hitting any of the views?

merlin

#3Mike Christensen
mike@kitchenpc.com
In reply to: Merlin Moncure (#2)
Re: Is there a way to avoid hard coding database connection info into views?

Thanks!

I've never done that in PG before, but I've used named connections
with Oracle. Is it the same sort of deal? There's a file on the disk
somewhere with the connection info? Either way, I'm sure it's a RTFM
thing so I'll look into it.

Mike

Show quoted text

On Tue, May 15, 2012 at 12:45 PM, Merlin Moncure <mmoncure@gmail.com> wrote:

On Tue, May 15, 2012 at 2:28 PM, Mike Christensen <mike@kitchenpc.com> wrote:

I often manually pull in production data into my test database so I
can test new code on realistic data, as well as test upgrade scenarios
or repro data specific bugs.  To do this, I've setup a `VIEW` for each
production table in my test database.  These views look something like
this:

   CREATE VIEW ProdLink.Users AS
      select * from dblink(
         'hostaddr=123.123.123.123 dbname=ProductionDB user=ROUser
password=secret',
         'select * from users') as t1(userid uuid, email varchar(50),
alias varchar(50), fullname varchar(50), password varchar(100));

Now, on my production database I can run:

   SELECT * FROM ProdLink.Users;

And see all users on my production database.  I can then do things like:

   INSERT INTO Users SELECT * FROM ProdLink.Users L WHERE NOT EXISTS
(select 1 from Users where Users.UserId = L.UserId);

Allowing me to pull in every user from production that doesn't already
exist in test.

I have about 30 of these views to *proxy* the production data, however
I find it somewhat hacky to have to hardcode in the production
database connection info into each view.

Is there a good way to avoid hardcoding, or at least duplicating this
connection info on each view?  Can I use database level variables,
environment variables, or anything else instead?

sure: why don't you set up a named connection and just make sure it's
established before hitting any of the views?

merlin

#4Merlin Moncure
mmoncure@gmail.com
In reply to: Mike Christensen (#3)
Re: Is there a way to avoid hard coding database connection info into views?

On Tue, May 15, 2012 at 2:51 PM, Mike Christensen <mike@kitchenpc.com> wrote:

Thanks!

I've never done that in PG before, but I've used named connections
with Oracle.  Is it the same sort of deal?  There's a file on the disk
somewhere with the connection info?  Either way, I'm sure it's a RTFM
thing so I'll look into it.

yeah, there's a good example in the docs here:
http://www.postgresql.org/docs/9.1/static/contrib-dblink-connect.html

btw, if you have a structure in test that matches production, then you
can use a composite type trick to avoid having to specify fields as
long as you keep those structures in sync (which you have to do
anyways). try:

select (u).* from dblink(
'hostaddr=123.123.123.123 dbname=ProductionDB user=ROUser
password=secret',
'select u from users u') as t1(u users);

it should work as long as users exists on both sides and has exactly
the same structure. using that method it's trivial to make a dblink
wrapper that could query any table but you couldn't wrap it into a
single view obviously.

merlin

#5Mike Christensen
mike@kitchenpc.com
In reply to: Merlin Moncure (#4)
Re: Is there a way to avoid hard coding database connection info into views?

I've never done that in PG before, but I've used named connections
with Oracle.  Is it the same sort of deal?  There's a file on the disk
somewhere with the connection info?  Either way, I'm sure it's a RTFM
thing so I'll look into it.

yeah, there's a good example in the docs here:
http://www.postgresql.org/docs/9.1/static/contrib-dblink-connect.html

btw, if you have a structure in test that matches production, then you
can use a composite type trick to avoid  having to specify fields as
long as you keep those structures in sync (which you have to do
anyways). try:

select (u).* from dblink(
        'hostaddr=123.123.123.123 dbname=ProductionDB user=ROUser
password=secret',
        'select u from users u') as t1(u users);

it should work as long as users exists on both sides and has exactly
the same structure. using that method it's trivial to make a dblink
wrapper that could query any table but you couldn't wrap it into a
single view obviously.

Ah ok, now I'm following.. Yea, I had read up on the dblink_connect()
function, however it seemed like an extra step to have to open this
connection every time. It would avoid duplicating the connection info
across multiple views though. What I was hoping for was the ability
to store this information somewhere. Doesn't PG allow custom
variables for sessions, users, and databases? Or is this something
that could be stored in pg_*.conf or as an environment variable?

Eventually, this whole thing might evolve into a Python script or
something that can migrate data over, and at that point the script
could open the connection and migrate the appropriate data over.

Also, thanks for the tip using the "users" type rather than explicitly
spelling out each column. The majority of my tables are identical in
schema, so this would work great. However, sometimes I add new
columns in the development schema, and I have to account for that
difference on the INSERT line, such as:

INSERT INTO Recipes SELECT *, false as SomeNewColumn FROM
ProdLink.Recipes WHERE ...

So, this may or may not work. Still, good to know!

Mike

#6Merlin Moncure
mmoncure@gmail.com
In reply to: Mike Christensen (#5)
Re: Is there a way to avoid hard coding database connection info into views?

On Tue, May 15, 2012 at 3:16 PM, Mike Christensen <mike@kitchenpc.com> wrote:

I've never done that in PG before, but I've used named connections
with Oracle.  Is it the same sort of deal?  There's a file on the disk
somewhere with the connection info?  Either way, I'm sure it's a RTFM
thing so I'll look into it.

yeah, there's a good example in the docs here:
http://www.postgresql.org/docs/9.1/static/contrib-dblink-connect.html

btw, if you have a structure in test that matches production, then you
can use a composite type trick to avoid  having to specify fields as
long as you keep those structures in sync (which you have to do
anyways). try:

select (u).* from dblink(
        'hostaddr=123.123.123.123 dbname=ProductionDB user=ROUser
password=secret',
        'select u from users u') as t1(u users);

it should work as long as users exists on both sides and has exactly
the same structure. using that method it's trivial to make a dblink
wrapper that could query any table but you couldn't wrap it into a
single view obviously.

Ah ok, now I'm following..  Yea, I had read up on the dblink_connect()
function, however it seemed like an extra step to have to open this
connection every time.  It would avoid duplicating the connection info
across multiple views though.  What I was hoping for was the ability
to store this information somewhere.  Doesn't PG allow custom
variables for sessions, users, and databases?  Or is this something
that could be stored in pg_*.conf or as an environment variable?

yes, they are called 'tables' :-). stick your connection string in a
table somewhere and do:

create view v as
select (u).* from dblink((select connstr from yadda where yadda), ...);

merlin

#7Mike Christensen
mike@kitchenpc.com
In reply to: Merlin Moncure (#6)
Re: Is there a way to avoid hard coding database connection info into views?

I've never done that in PG before, but I've used named connections
with Oracle.  Is it the same sort of deal?  There's a file on the disk
somewhere with the connection info?  Either way, I'm sure it's a RTFM
thing so I'll look into it.

yeah, there's a good example in the docs here:
http://www.postgresql.org/docs/9.1/static/contrib-dblink-connect.html

btw, if you have a structure in test that matches production, then you
can use a composite type trick to avoid  having to specify fields as
long as you keep those structures in sync (which you have to do
anyways). try:

select (u).* from dblink(
        'hostaddr=123.123.123.123 dbname=ProductionDB user=ROUser
password=secret',
        'select u from users u') as t1(u users);

it should work as long as users exists on both sides and has exactly
the same structure. using that method it's trivial to make a dblink
wrapper that could query any table but you couldn't wrap it into a
single view obviously.

Ah ok, now I'm following..  Yea, I had read up on the dblink_connect()
function, however it seemed like an extra step to have to open this
connection every time.  It would avoid duplicating the connection info
across multiple views though.  What I was hoping for was the ability
to store this information somewhere.  Doesn't PG allow custom
variables for sessions, users, and databases?  Or is this something
that could be stored in pg_*.conf or as an environment variable?

yes, they are called 'tables' :-). stick your connection string in a
table somewhere and do:

create view v as
 select (u).* from dblink((select connstr from yadda where yadda), ...);

That's definitely an approach. I think I know the possible options
anyway. I just wanted to make sure there wasn't anything I was
missing. Thanks for your help!

Mike