Storage Inefficiency In PostgreSQL

Started by Ray Cheungalmost 7 years ago3 messagesgeneral
Jump to latest
#1Ray Cheung
ray.cheung@silverpowersystems.com

Hi ,

We are currently contemplating switching from MySQL to PostgreSQL, the main
attraction being the use of the TimescaleDB extension. Having done much of
the ground investigation there is one area of significant concern - the
storage requirement of PostgreSQL. Put simply, comparing like for like for a
set of tables, PostgreSQL consumes far more storage space than MySQL:

- MySQL (5.6): 156 MB
- PostgreSQL (11.2): 246 MB
- PostgreSQL + TimescaleDB (partitioned/chunked data): 324 MB

I've also submitted this in stackoverflow:
https://stackoverflow.com/questions/55655272/how-to-reduce-postgresql-databa
se-size.

I can rearrange the table/column-alignment to save 6 bytes per row of the
main table, with a saving of a few mega-bytes. Not enough to make any real
difference. Does anyone know:

- Why PostgreSQL is so storage inefficient in comparison?
- What existing methods can be used to reduce the storage consumption (I've
already tried realignment and vacuum full)?
- Are there any plans to address this storage consumption inefficiency (in
comparison to MySQL) problem?

Many thanks,

sps-ray

---
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus

#2Chris Travers
chris.travers@gmail.com
In reply to: Ray Cheung (#1)
Re: Storage Inefficiency In PostgreSQL

On Mon, Apr 15, 2019 at 10:43 AM Ray Cheung <
ray.cheung@silverpowersystems.com> wrote:

Hi ,

We are currently contemplating switching from MySQL to PostgreSQL, the main
attraction being the use of the TimescaleDB extension. Having done much of
the ground investigation there is one area of significant concern - the
storage requirement of PostgreSQL. Put simply, comparing like for like for
a
set of tables, PostgreSQL consumes far more storage space than MySQL:

- MySQL (5.6): 156 MB
- PostgreSQL (11.2): 246 MB
- PostgreSQL + TimescaleDB (partitioned/chunked data): 324 MB

I've also submitted this in stackoverflow:

https://stackoverflow.com/questions/55655272/how-to-reduce-postgresql-databa
se-size
<https://stackoverflow.com/questions/55655272/how-to-reduce-postgresql-database-size&gt;
.

I can rearrange the table/column-alignment to save 6 bytes per row of the
main table, with a saving of a few mega-bytes. Not enough to make any real
difference. Does anyone know:

- Why PostgreSQL is so storage inefficient in comparison?

The storage strategies are different enough you can't really assume direct
comparisons.

Long story short, iMySQL is optimized for two things: primary key lookups,
and reducing disk I/O from updates to heavily indexed tables.
PostgreSQL is optimized for a lot of things, including access through
secondary indexes and sequential scans. This means that both tables and
indexes are structured differently.

- What existing methods can be used to reduce the storage consumption (I've
already tried realignment and vacuum full)?

You could take a look at extensions that give you foreign data wrappers for
columnar stores, but note this has a number of important tradeoffs in
performance and is not recommended for OLTP systems. However if space is
your primary concern, I would assume you are trying to set up some sort of
OLAP system?

- Are there any plans to address this storage consumption inefficiency (in
comparison to MySQL) problem?

Long run pluggable storage should give people a different set of options
and choices to make here.

Many thanks,

sps-ray

---
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus

--
Best Wishes,
Chris Travers

Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor
lock-in.
http://www.efficito.com/learn_more

#3Kevin Wilkinson
w.kevin.wilkinson@gmail.com
In reply to: Ray Cheung (#1)
Re: Storage Inefficiency In PostgreSQL

we were able to use a brin index for our time-series data and that saved
a lot of space. basically, we used a btree for recent data and, once the
data was "stable", reclustered the data to get high correlation on the
brin index, created a brin index and dropped the btree. it works well.

Show quoted text

On 4/15/2019 1:42 AM, Ray Cheung wrote:

Hi ,

We are currently contemplating switching from MySQL to PostgreSQL, the main
attraction being the use of the TimescaleDB extension. Having done much of
the ground investigation there is one area of significant concern - the
storage requirement of PostgreSQL. Put simply, comparing like for like for a
set of tables, PostgreSQL consumes far more storage space than MySQL:

- MySQL (5.6): 156 MB
- PostgreSQL (11.2): 246 MB
- PostgreSQL + TimescaleDB (partitioned/chunked data): 324 MB

I've also submitted this in stackoverflow:
https://stackoverflow.com/questions/55655272/how-to-reduce-postgresql-databa
se-size.

I can rearrange the table/column-alignment to save 6 bytes per row of the
main table, with a saving of a few mega-bytes. Not enough to make any real
difference. Does anyone know:

- Why PostgreSQL is so storage inefficient in comparison?
- What existing methods can be used to reduce the storage consumption (I've
already tried realignment and vacuum full)?
- Are there any plans to address this storage consumption inefficiency (in
comparison to MySQL) problem?

Many thanks,

sps-ray

---
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus