Adding a concept of TEMPORARY TABLESPACE for the use in temp_tablespaces

Started by Mitarabout 7 years ago5 messages

mmitar@gmail.com

about 7 years ago

Hi!

I have read around the Internet a lot about the idea of using /dev/shm
for a tablespace to put tables in and issues with that. But I still
have not managed to get a good grasp why would that be a bad idea for
using it for temporary objects. I understand that for regular tables
this might prevent database startup and recovery because tables and
all files associated with tables would be missing. While operations
for those tables could reside in the oplog. (Not sure if this means
that unlogged tables can be stored on such tablesspace.)

I have experimented a bit and performance really improves if /dev/shm
is used. I have experimented with creating temporary tables inside a
regular (SSD backed) tablespace /dev/shm and I have seen at least 2x
improvement in time it takes for a set of modification+select queries
to complete.

I have also tested what happens if I kill all processes with KILL and
restart it. There is noise in logs about missing files, but it does
start up. Dropping and recreating the tablespace works.

So I wonder, should we add a TEMPORARY flag to a TABLESPACE which
would mark a tablespace such that if at startup its location is empty,
it is automatically recreated, without warnings/errors? (Maybe some
other term could be used for this.)

Ideally, such tablespace could be set as temp_tablespaces and things
should work out: PostgreSQL should recreate the tablespace before
trying to use temp_tablespaces for the first time.

We could even make it so that only temporary objects are allowed to be
created in a TEMPORARY TABLESPACE, to make sure user does not make a
mistake.

Mitar

--
http://mitar.tnode.com/
https://twitter.com/mitar_m

Mitar

mmitar@gmail.com

about 7 years ago

In reply to: Mitar (#1)

Re: Adding a concept of TEMPORARY TABLESPACE for the use in temp_tablespaces

Hi!

On Sun, Jan 6, 2019 at 11:01 AM Mitar <mmitar@gmail.com> wrote:

I have experimented a bit and performance really improves if /dev/shm
is used. I have experimented with creating temporary tables inside a
regular (SSD backed) tablespace /dev/shm and I have seen at least 2x
improvement in time it takes for a set of modification+select queries
to complete.

I also tried just to increase temp_buffers to half the memory, and
things are better, but not to the same degree as using a /dev/shm
tablespace. Why is that? (All my temporary objects in my experiments
are small, few 10k rows, few MBs.)

Mitar

--
http://mitar.tnode.com/
https://twitter.com/mitar_m

Bruce Momjian

bruce@momjian.us

almost 7 years ago

In reply to: Mitar (#1)

Re: Adding a concept of TEMPORARY TABLESPACE for the use in temp_tablespaces

On Sun, Jan 6, 2019 at 11:01:52AM -0800, Mitar wrote:

Hi!

I have read around the Internet a lot about the idea of using /dev/shm
for a tablespace to put tables in and issues with that. But I still
have not managed to get a good grasp why would that be a bad idea for
using it for temporary objects. I understand that for regular tables
this might prevent database startup and recovery because tables and
all files associated with tables would be missing. While operations
for those tables could reside in the oplog. (Not sure if this means
that unlogged tables can be stored on such tablesspace.)

I have experimented a bit and performance really improves if /dev/shm
is used. I have experimented with creating temporary tables inside a
regular (SSD backed) tablespace /dev/shm and I have seen at least 2x
improvement in time it takes for a set of modification+select queries
to complete.

I have also tested what happens if I kill all processes with KILL and
restart it. There is noise in logs about missing files, but it does
start up. Dropping and recreating the tablespace works.

So I wonder, should we add a TEMPORARY flag to a TABLESPACE which
would mark a tablespace such that if at startup its location is empty,
it is automatically recreated, without warnings/errors? (Maybe some
other term could be used for this.)

Ideally, such tablespace could be set as temp_tablespaces and things
should work out: PostgreSQL should recreate the tablespace before
trying to use temp_tablespaces for the first time.

We could even make it so that only temporary objects are allowed to be
created in a TEMPORARY TABLESPACE, to make sure user does not make a
mistake.

I wrote a blog entry about this:

https://momjian.us/main/blogs/pgblog/2017.html#June_2_2017

This is certainly an area we can improve, but it would require changes
in several parts of the system to handle cases where the tablespace
disappears.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +

Mitar

mmitar@gmail.com

almost 7 years ago

In reply to: Bruce Momjian (#3)

Re: Adding a concept of TEMPORARY TABLESPACE for the use in temp_tablespaces

Hi!

On Fri, Jan 25, 2019 at 2:32 PM Bruce Momjian <bruce@momjian.us> wrote:

I wrote a blog entry about this:

https://momjian.us/main/blogs/pgblog/2017.html#June_2_2017

This is certainly an area we can improve, but it would require changes
in several parts of the system to handle cases where the tablespace
disappears.

Yes, I read the discussion thread you point at the end of your blog
post. [1]/messages/by-id/20170529185308.GB28209@momjian.us This is why I posted an e-mail to the mailing list because
some statements from that thread do not hold anymore. For example, in
the thread it is stated:

"Just pointing the tablespace to non'restart'safe storage will get you
an installation that fails to boot after a restart, since there's a
tree structure that is expected to survive, and when it's not found,
postgres just fails to boot."

This does not seem to be true (anymore?) based on my testing. You get
noise in logs, but installation boots without a problem.

So maybe we are closer to this than we realize?

[1]: /messages/by-id/20170529185308.GB28209@momjian.us

Mitar

--
http://mitar.tnode.com/
https://twitter.com/mitar_m

Bruce Momjian

bruce@momjian.us

almost 7 years ago

In reply to: Mitar (#4)

Re: Adding a concept of TEMPORARY TABLESPACE for the use in temp_tablespaces

On Thu, Mar 14, 2019 at 12:53:02AM -0700, Mitar wrote:

Hi!

On Fri, Jan 25, 2019 at 2:32 PM Bruce Momjian <bruce@momjian.us> wrote:

I wrote a blog entry about this:

https://momjian.us/main/blogs/pgblog/2017.html#June_2_2017

This is certainly an area we can improve, but it would require changes
in several parts of the system to handle cases where the tablespace
disappears.

Yes, I read the discussion thread you point at the end of your blog
post. [1] This is why I posted an e-mail to the mailing list because
some statements from that thread do not hold anymore. For example, in
the thread it is stated:

"Just pointing the tablespace to non'restart'safe storage will get you
an installation that fails to boot after a restart, since there's a
tree structure that is expected to survive, and when it's not found,
postgres just fails to boot."

This does not seem to be true (anymore?) based on my testing. You get
noise in logs, but installation boots without a problem.

So maybe we are closer to this than we realize?

Interesting. What happens when you references objects that were in the
tablespace? What would we want to happen?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +