Isolation of multiple databse instances provided by a single postgres server

Started by stanover 6 years ago11 messagesgeneral
Jump to latest
#1stan
stanb@panix.com

I am working on a fairly small application to use for managing a companies
business.

I have a "production" instance hosted by one of the cloud providers, and 2
other instances. This is fairly new to me. In the past, I have created
applications by keeping a set of scripts that can be used to rebuild the
database, and pg_dump to restore the date. Based on some recommendations I
am using pg_basebackup to backup the production instance nightly. My
background is primarily Oracle. I realize looking at the way pg_basebackup
works that multiple database instances, provided by one server are actually
stored in the same physical OS files.

We have traditionally (in the Postgres world) had a sandbox, that we used
for upgrades, and testing development methodologies, and this seems to be
supported pretty well by pg_dump.

Now that I know "too much" I am concerned about hosting the sandbox on the
same Postgres instance.

Recognizing that this is a fairly small application, what are wiser folks
than I recommendations?

Should I run the sandbox from different Postgres server, possibly even on a
different machine? Is pg_dump still good way to move the production
instance to the sandbox, and perhaps even the other way around?

--
"They that would give up essential liberty for temporary safety deserve
neither liberty nor safety."
-- Benjamin Franklin

#2Ron
ronljohnsonjr@gmail.com
In reply to: stan (#1)
Re: Isolation of multiple databse instances provided by a single postgres server

On 11/20/19 4:03 PM, stan wrote:

I am working on a fairly small application to use for managing a companies
business.

I have a "production" instance hosted by one of the cloud providers, and 2
other instances. This is fairly new to me. In the past, I have created
applications by keeping a set of scripts that can be used to rebuild the
database, and pg_dump to restore the date. Based on some recommendations I
am using pg_basebackup to backup the production instance nightly. My
background is primarily Oracle. I realize looking at the way pg_basebackup
works that multiple database instances, provided by one server are actually
stored in the same physical OS files.

We have traditionally (in the Postgres world) had a sandbox, that we used
for upgrades, and testing development methodologies, and this seems to be
supported pretty well by pg_dump.

Now that I know "too much" I am concerned about hosting the sandbox on the
same Postgres instance.

What specifically do you mean by "instance"?  (I know what it means in the
SQL Server world, and in Postgres all the databases accessible via a single
$PGDATA are called a *cluster*.)

Recognizing that this is a fairly small application, what are wiser folks
than I recommendations?

Should I run the sandbox from different Postgres server, possibly even on a
different machine? Is pg_dump still good way to move the production
instance to the sandbox, and perhaps even the other way around?

Running CAT, STG, UAT, DEV, etc on different VMs is certainly one solution,
isolating them from each other.

OTOH, you can initdb multiple clusters on the same host, accessing them via
different $PGDATA variables and port numbers.

--
Angular momentum makes the world go 'round.

#3stan
stanb@panix.com
In reply to: Ron (#2)
Re: Isolation of multiple databse instances provided by a single postgres server

On Wed, Nov 20, 2019 at 04:24:40PM -0600, Ron wrote:

On 11/20/19 4:03 PM, stan wrote:

I am working on a fairly small application to use for managing a companies
business.

I have a "production" instance hosted by one of the cloud providers, and 2
other instances. This is fairly new to me. In the past, I have created
applications by keeping a set of scripts that can be used to rebuild the
database, and pg_dump to restore the date. Based on some recommendations I
am using pg_basebackup to backup the production instance nightly. My
background is primarily Oracle. I realize looking at the way pg_basebackup
works that multiple database instances, provided by one server are actually
stored in the same physical OS files.

We have traditionally (in the Postgres world) had a sandbox, that we used
for upgrades, and testing development methodologies, and this seems to be
supported pretty well by pg_dump.

Now that I know "too much" I am concerned about hosting the sandbox on the
same Postgres instance.

What specifically do you mean by "instance"??? (I know what it means in the
SQL Server world, and in Postgres all the databases accessible via a single
$PGDATA are called a *cluster*.)

Sorry for my incorrect terminology. I am probably confused about the
technology here. Let me try to explain what I think I understand.

It seems to me that I can have one Postgres "server" running listening on a
single port on a single machine. It appears that the data files for this
"server" are managed internally by the Postgres server instance, and I
have no control of what is stored where in them. In an Oracle world, I can
create tablespaces, which have a relationship to OS files, and I can
explicitly control what objects are stored in which tablespaces (OS file),
thus, for example, when I do a hot backup, I put a specific tablespaces in
backup mode, and can then safely copy this OS file (yes I have to properly
deal with archive logs). Thus I would be somewhat comfortable have to
distinct "instance: provided by that one Oracle "server".

It appears to me that, within this one Postgres "instance", there are 2
levels of "isolation", which are database, and schemas. Is this correct? If
so, how does this cores pond to physical on disk storage?

Recognizing that this is a fairly small application, what are wiser folks
than I recommendations?

Should I run the sandbox from different Postgres server, possibly even on a
different machine? Is pg_dump still good way to move the production
instance to the sandbox, and perhaps even the other way around?

Running CAT, STG, UAT, DEV, etc on different VMs is certainly one solution,
isolating them from each other.

Makes sense.

OTOH, you can initdb multiple clusters on the same host, accessing them via
different $PGDATA variables and port numbers.

That is consistent with what I thought I understood.

Thanks for taking time to educate me.
--
"They that would give up essential liberty for temporary safety deserve
neither liberty nor safety."
-- Benjamin Franklin

#4Ron
ronljohnsonjr@gmail.com
In reply to: stan (#3)
Re: Isolation of multiple databse instances provided by a single postgres server

On 11/21/19 5:55 AM, stan wrote:

On Wed, Nov 20, 2019 at 04:24:40PM -0600, Ron wrote:

On 11/20/19 4:03 PM, stan wrote:

I am working on a fairly small application to use for managing a companies
business.

I have a "production" instance hosted by one of the cloud providers, and 2
other instances. This is fairly new to me. In the past, I have created
applications by keeping a set of scripts that can be used to rebuild the
database, and pg_dump to restore the date. Based on some recommendations I
am using pg_basebackup to backup the production instance nightly. My
background is primarily Oracle. I realize looking at the way pg_basebackup
works that multiple database instances, provided by one server are actually
stored in the same physical OS files.

We have traditionally (in the Postgres world) had a sandbox, that we used
for upgrades, and testing development methodologies, and this seems to be
supported pretty well by pg_dump.

Now that I know "too much" I am concerned about hosting the sandbox on the
same Postgres instance.

What specifically do you mean by "instance"??? (I know what it means in the
SQL Server world, and in Postgres all the databases accessible via a single
$PGDATA are called a *cluster*.)

Sorry for my incorrect terminology. I am probably confused about the
technology here. Let me try to explain what I think I understand.

It seems to me that I can have one Postgres "server" running listening on a
single port on a single machine. It appears that the data files for this
"server" are managed internally by the Postgres server instance, and I
have no control of what is stored where in them. In an Oracle world, I can
create tablespaces, which have a relationship to OS files, and I can
explicitly control what objects are stored in which tablespaces (OS file),

Same in Postgres.

https://www.postgresql.org/docs/9.6/sql-createtablespace.html

CREATE TABLESPACEtablespace_name
[ OWNER {new_owner | CURRENT_USER | SESSION_USER } ]
LOCATION 'directory'
[ WITH (tablespace_option =value [, ... ] ) ]

thus, for example, when I do a hot backup, I put a specific tablespaces in
backup mode, and can then safely copy this OS file (yes I have to properly
deal with archive logs). Thus I would be somewhat comfortable have to
distinct "instance: provided by that one Oracle "server".

There's no ability to backup a single tablespace in Postgres, because the
purpose behind them is not the same in Oracle and Postgres.

It appears to me that, within this one Postgres "instance", there are 2
levels of "isolation", which are database, and schemas. Is this correct?

Yes, but ... schema in Postgres are /different/ from Oracle schema.  In
Postgres, the CREATE SCHEMA command creates a schema, whereas CREATE USER
creates a schema in Oracle.

If
so, how does this cores pond to physical on disk storage?

It corresponds not at all.

The directory that a table's files go in is solely dependent on the
tablespace it lives in (typically "pg_default", who's location is $PGDATA).

Recognizing that this is a fairly small application, what are wiser folks
than I recommendations?

Should I run the sandbox from different Postgres server, possibly even on a
different machine? Is pg_dump still good way to move the production
instance to the sandbox, and perhaps even the other way around?

Running CAT, STG, UAT, DEV, etc on different VMs is certainly one solution,
isolating them from each other.

Makes sense.

OTOH, you can initdb multiple clusters on the same host, accessing them via
different $PGDATA variables and port numbers.

That is consistent with what I thought I understood.

Thanks for taking time to educate me.

--
Angular momentum makes the world go 'round.

#5stan
stanb@panix.com
In reply to: Ron (#4)
Re: Isolation of multiple databse instances provided by a single postgres server

On Thu, Nov 21, 2019 at 08:18:21AM -0600, Ron wrote:

On 11/21/19 5:55 AM, stan wrote:

On Wed, Nov 20, 2019 at 04:24:40PM -0600, Ron wrote:

On 11/20/19 4:03 PM, stan wrote:

I am working on a fairly small application to use for managing a companies
business.

I have a "production" instance hosted by one of the cloud providers, and 2
other instances. This is fairly new to me. In the past, I have created
applications by keeping a set of scripts that can be used to rebuild the
database, and pg_dump to restore the date. Based on some recommendations I
am using pg_basebackup to backup the production instance nightly. My
background is primarily Oracle. I realize looking at the way pg_basebackup
works that multiple database instances, provided by one server are actually
stored in the same physical OS files.

We have traditionally (in the Postgres world) had a sandbox, that we used
for upgrades, and testing development methodologies, and this seems to be
supported pretty well by pg_dump.

Now that I know "too much" I am concerned about hosting the sandbox on the
same Postgres instance.

What specifically do you mean by "instance"??? (I know what it means in the
SQL Server world, and in Postgres all the databases accessible via a single
$PGDATA are called a *cluster*.)

Sorry for my incorrect terminology. I am probably confused about the
technology here. Let me try to explain what I think I understand.

It seems to me that I can have one Postgres "server" running listening on a
single port on a single machine. It appears that the data files for this
"server" are managed internally by the Postgres server instance, and I
have no control of what is stored where in them. In an Oracle world, I can
create tablespaces, which have a relationship to OS files, and I can
explicitly control what objects are stored in which tablespaces (OS file),

Same in Postgres.

https://www.postgresql.org/docs/9.6/sql-createtablespace.html

CREATE TABLESPACEtablespace_name
[ OWNER {new_owner | CURRENT_USER | SESSION_USER } ]
LOCATION 'directory'
[ WITH (tablespace_option =value [, ... ] ) ]

OH, that does look familiar, thanks for taking time to educate me.

thus, for example, when I do a hot backup, I put a specific tablespaces in
backup mode, and can then safely copy this OS file (yes I have to properly
deal with archive logs). Thus I would be somewhat comfortable have to
distinct "instance: provided by that one Oracle "server".

There's no ability to backup a single tablespace in Postgres, because the
purpose behind them is not the same in Oracle and Postgres.

Got that.

It appears to me that, within this one Postgres "instance", there are 2
levels of "isolation", which are database, and schemas. Is this correct?

Yes, but ... schema in Postgres are /different/ from Oracle schema.?? In
Postgres, the CREATE SCHEMA command creates a schema, whereas CREATE USER
creates a schema in Oracle.

I am still struggling with the schema - role - user relationship in Postgres.
It appears to me there is one more layer than is needed/useful.

If
so, how does this cores pond to physical on disk storage?

It corresponds not at all.

Understand that now.

The directory that a table's files go in is solely dependent on the
tablespace it lives in (typically "pg_default", who's location is $PGDATA).

Recognizing that this is a fairly small application, what are wiser folks
than I recommendations?

Should I run the sandbox from different Postgres server, possibly even on a
different machine? Is pg_dump still good way to move the production
instance to the sandbox, and perhaps even the other way around?

Running CAT, STG, UAT, DEV, etc on different VMs is certainly one solution,
isolating them from each other.

Makes sense.

OTOH, you can initdb multiple clusters on the same host, accessing them via
different $PGDATA variables and port numbers.

That is consistent with what I thought I understood.

Thanks for taking time to educate me.

--
Angular momentum makes the world go 'round.

Thanks again.

BTW, I do appreciate your taking time to clarify things I am confused on,
Hopefully I can pass on this knowledge to to others.

I have decide to isolate my sandbox instance by running on a physically
different machine BTW, which was what the original question was about, but I
am learning other things that I knew I did not have a full understanding of,
and I appreciate that.

--
"They that would give up essential liberty for temporary safety deserve
neither liberty nor safety."
-- Benjamin Franklin

#6Laurenz Albe
laurenz.albe@cybertec.at
In reply to: stan (#3)
Re: Isolation of multiple databse instances provided by a single postgres server

On Thu, 2019-11-21 at 06:55 -0500, stan wrote:

It seems to me that I can have one Postgres "server" running listening on a
single port on a single machine. It appears that the data files for this
"server" are managed internally by the Postgres server instance, and I
have no control of what is stored where in them. In an Oracle world, I can
create tablespaces, which have a relationship to OS files, and I can
explicitly control what objects are stored in which tablespaces (OS file),
thus, for example, when I do a hot backup, I put a specific tablespaces in
backup mode, and can then safely copy this OS file (yes I have to properly
deal with archive logs). Thus I would be somewhat comfortable have to
distinct "instance: provided by that one Oracle "server".

It appears to me that, within this one Postgres "instance", there are 2
levels of "isolation", which are database, and schemas. Is this correct? If
so, how does this cores pond to physical on disk storage?

You can use tablespaces in PostgreSQL, which are directories on a
different file system, to put your data elsewhere.
But that has very limited use-cases, and normally you don't
create a tablespace.

About isolation:
- The different databases in a cluster are physically located in
the same tablespace, but they are logically strictly separated.
You cannot connect to one database and access another database from there.

- There can be several schemas in a database.
You can access a table in a schema if you have the required privilege
on both the schema and the table.

This is entirely independent of physical storage, which is provided
by tablespaces. Tables from different databases can be located in the
same tablespace and vice versa.

Think of "database" and "schema" as a logical separation in SQL.

You cannot backup and restore an individual tablespace, only the
whole cluster.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

#7Ron
ronljohnsonjr@gmail.com
In reply to: stan (#5)
Re: Isolation of multiple databse instances provided by a single postgres server

On 11/21/19 8:42 AM, stan wrote:

On Thu, Nov 21, 2019 at 08:18:21AM -0600, Ron wrote:

[snip]

Yes, but ... schema in Postgres are /different/ from Oracle schema.?? In
Postgres, the CREATE SCHEMA command creates a schema, whereas CREATE USER
creates a schema in Oracle.

I am still struggling with the schema - role - user relationship in Postgres.
It appears to me there is one more layer than is needed/useful.

For one thing, there are no users in Postgres anymore.

https://www.postgresql.org/docs/9.6/sql-createuser.html

CREATE USERis now an alias forCREATE ROLE
<https://www.postgresql.org/docs/9.6/sql-createrole.html&gt;. The only
difference is that when the command is spelledCREATE USER,LOGINis assumed by
default, whereasNOLOGINis assumed when the command is spelledCREATE ROLE.

--
Angular momentum makes the world go 'round.

#8Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Laurenz Albe (#6)
Re: Isolation of multiple databse instances provided by a single postgres server

On 11/21/19 6:51 AM, Laurenz Albe wrote:

On Thu, 2019-11-21 at 06:55 -0500, stan wrote:

You can use tablespaces in PostgreSQL, which are directories on a
different file system, to put your data elsewhere.
But that has very limited use-cases, and normally you don't
create a tablespace.

About isolation:
- The different databases in a cluster are physically located in
the same tablespace, but they are logically strictly separated.
You cannot connect to one database and access another database from there.

dblink(https://www.postgresql.org/docs/11/dblink.html) and
FDW(https://www.postgresql.org/docs/11/sql-createforeigntable.html)
would beg to differ.

- There can be several schemas in a database.

There are several schemas in a database. In a new database:

test=# \dnS
List of schemas
Name | Owner
--------------------+----------
information_schema | postgres
pg_catalog | postgres
pg_temp_1 | postgres
pg_toast | postgres
pg_toast_temp_1 | postgres
public | postgres

You can access a table in a schema if you have the required privilege
on both the schema and the table.

This is entirely independent of physical storage, which is provided
by tablespaces. Tables from different databases can be located in the
same tablespace and vice versa.

Think of "database" and "schema" as a logical separation in SQL.

You cannot backup and restore an individual tablespace, only the
whole cluster.

Yours,
Laurenz Albe

--
Adrian Klaver
adrian.klaver@aklaver.com

#9Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Laurenz Albe (#6)
Re: Isolation of multiple databse instances provided by a single postgres server

On 11/21/19 6:51 AM, Laurenz Albe wrote:

On Thu, 2019-11-21 at 06:55 -0500, stan wrote:

You can use tablespaces in PostgreSQL, which are directories on a
different file system, to put your data elsewhere.
But that has very limited use-cases, and normally you don't
create a tablespace.

About isolation:
- The different databases in a cluster are physically located in
the same tablespace, but they are logically strictly separated.
You cannot connect to one database and access another database from there.

dblink(https://www.postgresql.org/docs/11/dblink.html) and
FDW(https://www.postgresql.org/docs/11/sql-createforeigntable.html)
would beg to differ.

- There can be several schemas in a database.

There are several schemas in a database. In a new database:

test=# \dnS
List of schemas
Name | Owner
--------------------+----------
information_schema | postgres
pg_catalog | postgres
pg_temp_1 | postgres
pg_toast | postgres
pg_toast_temp_1 | postgres
public | postgres

You can access a table in a schema if you have the required privilege
on both the schema and the table.

This is entirely independent of physical storage, which is provided
by tablespaces. Tables from different databases can be located in the
same tablespace and vice versa.

Think of "database" and "schema" as a logical separation in SQL.

You cannot backup and restore an individual tablespace, only the
whole cluster.

Yours,
Laurenz Albe

--
Adrian Klaver
adrian.klaver@aklaver.com

#10Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Ron (#4)
Re: Isolation of multiple databse instances provided by a single postgres server

On 2019-11-21 08:18:21 -0600, Ron wrote:

It appears to me that, within this one Postgres "instance", there are 2
levels of "isolation", which are database, and schemas. Is this correct?

[...]

If so, how does this cores pond to physical on disk storage?

It corresponds not at all.

That's not quite true. Each database has a specific directory (per
tablespace) to keep its files in. Schemas on the other hand do not
correspond to anything on the filesystem.

hp

--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"

#11Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Adrian Klaver (#8)
Re: Isolation of multiple databse instances provided by a single postgres server

On 2019-11-21 07:21:10 -0800, Adrian Klaver wrote:

On 11/21/19 6:51 AM, Laurenz Albe wrote:

- The different databases in a cluster are physically located in
the same tablespace, but they are logically strictly separated.
You cannot connect to one database and access another database from there.

dblink(https://www.postgresql.org/docs/11/dblink.html) and
FDW(https://www.postgresql.org/docs/11/sql-createforeigntable.html) would
beg to differ.

I was thinking of FDW.

But I think Laurenz had the SQL syntax in mind which does include the
database in the fully qualified form of a tablename. Something like

db2=# select * from db2.public.t3;
╔════╤═══════════════════════╗
║ id │ t ║
╟────┼───────────────────────╢
║ 1 │ this is db2.public.t3 ║
╚════╧═══════════════════════╝
(1 row)

works. But when you try to access a table in a different database you
get an error:

db2=# select * from db1.public.t1;
ERROR: cross-database references are not implemented: "db1.public.t1"
LINE 1: select * from db1.public.t1;

hp

--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"