Is there a way to avoid hard coding database connection info into views?
I often manually pull in production data into my test database so I
can test new code on realistic data, as well as test upgrade scenarios
or repro data specific bugs. To do this, I've setup a `VIEW` for each
production table in my test database. These views look something like
this:
CREATE VIEW ProdLink.Users AS
select * from dblink(
'hostaddr=123.123.123.123 dbname=ProductionDB user=ROUser
password=secret',
'select * from users') as t1(userid uuid, email varchar(50),
alias varchar(50), fullname varchar(50), password varchar(100));
Now, on my production database I can run:
SELECT * FROM ProdLink.Users;
And see all users on my production database. I can then do things like:
INSERT INTO Users SELECT * FROM ProdLink.Users L WHERE NOT EXISTS
(select 1 from Users where Users.UserId = L.UserId);
Allowing me to pull in every user from production that doesn't already
exist in test.
I have about 30 of these views to *proxy* the production data, however
I find it somewhat hacky to have to hardcode in the production
database connection info into each view.
Is there a good way to avoid hardcoding, or at least duplicating this
connection info on each view? Can I use database level variables,
environment variables, or anything else instead?
PS - I also posted this question on StackOverflow for anyone who wants
points: http://stackoverflow.com/questions/10605622/postgres-9-x-is-there-a-way-to-avoid-hard-coding-database-connection-info-into
On Tue, May 15, 2012 at 2:28 PM, Mike Christensen <mike@kitchenpc.com> wrote:
I often manually pull in production data into my test database so I
can test new code on realistic data, as well as test upgrade scenarios
or repro data specific bugs. To do this, I've setup a `VIEW` for each
production table in my test database. These views look something like
this:CREATE VIEW ProdLink.Users AS
select * from dblink(
'hostaddr=123.123.123.123 dbname=ProductionDB user=ROUser
password=secret',
'select * from users') as t1(userid uuid, email varchar(50),
alias varchar(50), fullname varchar(50), password varchar(100));Now, on my production database I can run:
SELECT * FROM ProdLink.Users;
And see all users on my production database. I can then do things like:
INSERT INTO Users SELECT * FROM ProdLink.Users L WHERE NOT EXISTS
(select 1 from Users where Users.UserId = L.UserId);Allowing me to pull in every user from production that doesn't already
exist in test.I have about 30 of these views to *proxy* the production data, however
I find it somewhat hacky to have to hardcode in the production
database connection info into each view.Is there a good way to avoid hardcoding, or at least duplicating this
connection info on each view? Can I use database level variables,
environment variables, or anything else instead?
sure: why don't you set up a named connection and just make sure it's
established before hitting any of the views?
merlin
Thanks!
I've never done that in PG before, but I've used named connections
with Oracle. Is it the same sort of deal? There's a file on the disk
somewhere with the connection info? Either way, I'm sure it's a RTFM
thing so I'll look into it.
Mike
Show quoted text
On Tue, May 15, 2012 at 12:45 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
On Tue, May 15, 2012 at 2:28 PM, Mike Christensen <mike@kitchenpc.com> wrote:
I often manually pull in production data into my test database so I
can test new code on realistic data, as well as test upgrade scenarios
or repro data specific bugs. To do this, I've setup a `VIEW` for each
production table in my test database. These views look something like
this:CREATE VIEW ProdLink.Users AS
select * from dblink(
'hostaddr=123.123.123.123 dbname=ProductionDB user=ROUser
password=secret',
'select * from users') as t1(userid uuid, email varchar(50),
alias varchar(50), fullname varchar(50), password varchar(100));Now, on my production database I can run:
SELECT * FROM ProdLink.Users;
And see all users on my production database. I can then do things like:
INSERT INTO Users SELECT * FROM ProdLink.Users L WHERE NOT EXISTS
(select 1 from Users where Users.UserId = L.UserId);Allowing me to pull in every user from production that doesn't already
exist in test.I have about 30 of these views to *proxy* the production data, however
I find it somewhat hacky to have to hardcode in the production
database connection info into each view.Is there a good way to avoid hardcoding, or at least duplicating this
connection info on each view? Can I use database level variables,
environment variables, or anything else instead?sure: why don't you set up a named connection and just make sure it's
established before hitting any of the views?merlin
On Tue, May 15, 2012 at 2:51 PM, Mike Christensen <mike@kitchenpc.com> wrote:
Thanks!
I've never done that in PG before, but I've used named connections
with Oracle. Is it the same sort of deal? There's a file on the disk
somewhere with the connection info? Either way, I'm sure it's a RTFM
thing so I'll look into it.
yeah, there's a good example in the docs here:
http://www.postgresql.org/docs/9.1/static/contrib-dblink-connect.html
btw, if you have a structure in test that matches production, then you
can use a composite type trick to avoid having to specify fields as
long as you keep those structures in sync (which you have to do
anyways). try:
select (u).* from dblink(
'hostaddr=123.123.123.123 dbname=ProductionDB user=ROUser
password=secret',
'select u from users u') as t1(u users);
it should work as long as users exists on both sides and has exactly
the same structure. using that method it's trivial to make a dblink
wrapper that could query any table but you couldn't wrap it into a
single view obviously.
merlin
I've never done that in PG before, but I've used named connections
with Oracle. Is it the same sort of deal? There's a file on the disk
somewhere with the connection info? Either way, I'm sure it's a RTFM
thing so I'll look into it.yeah, there's a good example in the docs here:
http://www.postgresql.org/docs/9.1/static/contrib-dblink-connect.htmlbtw, if you have a structure in test that matches production, then you
can use a composite type trick to avoid having to specify fields as
long as you keep those structures in sync (which you have to do
anyways). try:select (u).* from dblink(
'hostaddr=123.123.123.123 dbname=ProductionDB user=ROUser
password=secret',
'select u from users u') as t1(u users);it should work as long as users exists on both sides and has exactly
the same structure. using that method it's trivial to make a dblink
wrapper that could query any table but you couldn't wrap it into a
single view obviously.
Ah ok, now I'm following.. Yea, I had read up on the dblink_connect()
function, however it seemed like an extra step to have to open this
connection every time. It would avoid duplicating the connection info
across multiple views though. What I was hoping for was the ability
to store this information somewhere. Doesn't PG allow custom
variables for sessions, users, and databases? Or is this something
that could be stored in pg_*.conf or as an environment variable?
Eventually, this whole thing might evolve into a Python script or
something that can migrate data over, and at that point the script
could open the connection and migrate the appropriate data over.
Also, thanks for the tip using the "users" type rather than explicitly
spelling out each column. The majority of my tables are identical in
schema, so this would work great. However, sometimes I add new
columns in the development schema, and I have to account for that
difference on the INSERT line, such as:
INSERT INTO Recipes SELECT *, false as SomeNewColumn FROM
ProdLink.Recipes WHERE ...
So, this may or may not work. Still, good to know!
Mike
On Tue, May 15, 2012 at 3:16 PM, Mike Christensen <mike@kitchenpc.com> wrote:
I've never done that in PG before, but I've used named connections
with Oracle. Is it the same sort of deal? There's a file on the disk
somewhere with the connection info? Either way, I'm sure it's a RTFM
thing so I'll look into it.yeah, there's a good example in the docs here:
http://www.postgresql.org/docs/9.1/static/contrib-dblink-connect.htmlbtw, if you have a structure in test that matches production, then you
can use a composite type trick to avoid having to specify fields as
long as you keep those structures in sync (which you have to do
anyways). try:select (u).* from dblink(
'hostaddr=123.123.123.123 dbname=ProductionDB user=ROUser
password=secret',
'select u from users u') as t1(u users);it should work as long as users exists on both sides and has exactly
the same structure. using that method it's trivial to make a dblink
wrapper that could query any table but you couldn't wrap it into a
single view obviously.Ah ok, now I'm following.. Yea, I had read up on the dblink_connect()
function, however it seemed like an extra step to have to open this
connection every time. It would avoid duplicating the connection info
across multiple views though. What I was hoping for was the ability
to store this information somewhere. Doesn't PG allow custom
variables for sessions, users, and databases? Or is this something
that could be stored in pg_*.conf or as an environment variable?
yes, they are called 'tables' :-). stick your connection string in a
table somewhere and do:
create view v as
select (u).* from dblink((select connstr from yadda where yadda), ...);
merlin
I've never done that in PG before, but I've used named connections
with Oracle. Is it the same sort of deal? There's a file on the disk
somewhere with the connection info? Either way, I'm sure it's a RTFM
thing so I'll look into it.yeah, there's a good example in the docs here:
http://www.postgresql.org/docs/9.1/static/contrib-dblink-connect.htmlbtw, if you have a structure in test that matches production, then you
can use a composite type trick to avoid having to specify fields as
long as you keep those structures in sync (which you have to do
anyways). try:select (u).* from dblink(
'hostaddr=123.123.123.123 dbname=ProductionDB user=ROUser
password=secret',
'select u from users u') as t1(u users);it should work as long as users exists on both sides and has exactly
the same structure. using that method it's trivial to make a dblink
wrapper that could query any table but you couldn't wrap it into a
single view obviously.Ah ok, now I'm following.. Yea, I had read up on the dblink_connect()
function, however it seemed like an extra step to have to open this
connection every time. It would avoid duplicating the connection info
across multiple views though. What I was hoping for was the ability
to store this information somewhere. Doesn't PG allow custom
variables for sessions, users, and databases? Or is this something
that could be stored in pg_*.conf or as an environment variable?yes, they are called 'tables' :-). stick your connection string in a
table somewhere and do:create view v as
select (u).* from dblink((select connstr from yadda where yadda), ...);
That's definitely an approach. I think I know the possible options
anyway. I just wanted to make sure there wasn't anything I was
missing. Thanks for your help!
Mike