pg_largeobject

Started by Sridhar N Bamandlapally · almost 10 years ago · 10 messages
#1Sridhar N Bamandlapally
sridhar.bn1@gmail.com

Hi

pg_largeobject is creating performance issues as it grows, because it
is a single point of storage for all tables.

Is there any alternative apart from bytea?

For example, configuring the large-object table at the table-column
level, with the oid primary key still stored in pg_largeobject.

Thanks
Sridhar

#2John R Pierce
pierce@hogranch.com
In reply to: Sridhar N Bamandlapally (#1)
Re: pg_largeobject


I would just as soon use an NFS file store for larger files like
images, audio, video, or whatever, and use SQL for the relational
metadata.

just sayin'....

--
john r pierce, recycling bits in santa cruz


#3Sridhar N Bamandlapally
sridhar.bn1@gmail.com
In reply to: John R Pierce (#2)
Re: pg_largeobject

All media files are stored in the database, with sizes varying from
1MB to 5GB.

Based on media file type and user group, we store them in different
tables, but PostgreSQL stores all OIDs/large objects in a single table
(pg_largeobject); 90% of the database size is in pg_largeobject.

Due to its size limitation, BYTEA was not considered.
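
A quick way to see that share, for example (pg_total_relation_size
includes the table's indexes and TOAST data):

    SELECT pg_size_pretty(pg_total_relation_size('pg_largeobject')) AS lo_size,
           pg_size_pretty(pg_database_size(current_database()))     AS db_size;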

Thanks
Sridhar


#4Alvaro Aguayo Garcia-Rada
aaguayo@opensysperu.com
In reply to: Sridhar N Bamandlapally (#3)
Re: pg_largeobject

Some time ago I had to set up a replicated file system between multiple
Linux servers. I tried everything I could based on postgres, including
large objects, but everything was significantly slower than a regular
filesystem.

My conclusion: postgres is not suitable for storing large files
efficiently.

Do you need that for replication, or just for file storage?

Alvaro Aguayo
Jefe de Operaciones
Open Comb Systems E.I.R.L.

Oficina: (+51-1) 3377813 | RPM: #034252 / (+51) 995540103  | RPC: (+51) 954183248
Website: www.ocs.pe



#5Sridhar N Bamandlapally
sridhar.bn1@gmail.com
In reply to: Alvaro Aguayo Garcia-Rada (#4)
Re: pg_largeobject

We are migrating an application/database to be PostgreSQL-compatible
on the cloud; DR/replication is also planned.

At present I feel the need for configurable multi-table storage
instead of pg_largeobject only.

Thanks
Sridhar


#6Alvaro Aguayo Garcia-Rada
aaguayo@opensysperu.com
In reply to: Sridhar N Bamandlapally (#5)
Re: pg_largeobject

Amongst all my tries, I also tried that. I created two tables, one for
basic file data, and another for file content (split into pages, as in
large objects), but the performance was almost the same as with
pg_largeobject; the great difference was that, with my own tables, I
could replicate without problems with pgpool2, which was troublesome
with large objects.

Based on my experience, I would seriously recommend searching for
another solution, as postgres may not be suitable for large file
storage. In my case, I ended up using MS DFS-R, but there are other
solutions like Ceph, GlusterFS, and many others. Also, I've recently
heard about MongoDB, which has its own database-backed filesystem
optimized for large files; I've never tried it, but it may be worth a
try.

Regards,

Alvaro Aguayo
Jefe de Operaciones
Open Comb Systems E.I.R.L.

Oficina: (+51-1) 3377813 | RPM: #034252 / (+51) 995540103 | RPC: (+51) 954183248
Website: www.ocs.pe


#7Daniel Verite
daniel@manitou-mail.org
In reply to: Sridhar N Bamandlapally (#3)
Re: pg_largeobject

Sridhar N Bamandlapally wrote:

Due to its size limitation, BYTEA was not considered.

You could adopt for a custom table the same kind of structure that
pg_largeobject has, that is, an ordered series of BYTEA chunks.

# \d pg_largeobject
   Table "pg_catalog.pg_largeobject"
 Column |  Type   | Modifiers
--------+---------+-----------
 loid   | oid     | not null
 pageno | integer | not null
 data   | bytea   |

Say you create a table looking like this (the name is arbitrary):

    CREATE TABLE my_object (
      object_id integer NOT NULL,
      pageno    integer NOT NULL,
      data      bytea,
      UNIQUE (object_id, pageno)
    );

and keep octet_length(data) under a reasonable max size, such as 1MB;
then a 5GB object is just 5*1024 rows in that table, one row per
pageno.

It's really a good plan if your client code cooperates by streaming
contents ordered by pageno instead of handling the blob as
a monolith.
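
A minimal sketch of that reader side, assuming the illustrative table
above (the object id is made up):

    BEGIN;
    DECLARE blob_cur CURSOR FOR
      SELECT data
        FROM my_object
       WHERE object_id = 42
       ORDER BY pageno;
    -- repeat the FETCH until it returns no rows, appending each
    -- chunk to the output stream
    FETCH 10 FROM blob_cur;
    CLOSE blob_cur;
    COMMIT;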

About the chunk size, by comparison, the large object facility limits
pg_largeobject.data to a quarter of a page, or 2048 bytes per row
(=LOBLKSIZE, see comments around
http://doxygen.postgresql.org/large__object_8h_source.html#l00072 )

Having your own table has several advantages:
- it contains many fewer rows for the same contents, if the chosen
  chunk size is much larger than 2048 bytes.
- TOAST storage is enabled, so the main relation is way smaller.
- it can be partitioned.
- it can have triggers (good for custom replication; a sketch follows).
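
As a sketch of the trigger idea (all names here are illustrative, and
a journal table is just one way to feed a custom replication process):

    CREATE TABLE my_object_journal (
      object_id integer,
      pageno    integer,
      op        text,
      logged_at timestamptz DEFAULT now()
    );

    CREATE FUNCTION my_object_log() RETURNS trigger
    LANGUAGE plpgsql AS $$
    BEGIN
      -- record which chunk changed and how; a replication daemon
      -- can later replay the journal on the other side
      IF TG_OP = 'DELETE' THEN
        INSERT INTO my_object_journal (object_id, pageno, op)
          VALUES (OLD.object_id, OLD.pageno, TG_OP);
      ELSE
        INSERT INTO my_object_journal (object_id, pageno, op)
          VALUES (NEW.object_id, NEW.pageno, TG_OP);
      END IF;
      RETURN NULL;  -- result is ignored for AFTER triggers
    END $$;

    CREATE TRIGGER my_object_replicate
      AFTER INSERT OR UPDATE OR DELETE ON my_object
      FOR EACH ROW EXECUTE PROCEDURE my_object_log();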

The drawback is that your application has to provide equivalents of the
lo_* client-side and server-side functions that it needs. But that's
relatively easy work for a programmer, especially if the blobs happen
to be immutable, as is often the case.
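
For instance, a server-side equivalent of lo_get() could look roughly
like this (names are illustrative; this naive variant only works for
objects below the 1GB bytea limit, larger ones have to be fetched
chunk by chunk as shown above):

    CREATE FUNCTION my_object_get(p_id integer) RETURNS bytea
    LANGUAGE sql STABLE AS $$
      SELECT string_agg(data, ''::bytea ORDER BY pageno)
        FROM my_object
       WHERE object_id = p_id;
    $$;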

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


#8Jerome Wagner
jerome.wagner@laposte.net
In reply to: Sridhar N Bamandlapally (#5)
Re: pg_largeobject

I am not saying that this will solve your problem (I never tried it,
even though I keep it on my radar), but this project seems to implement
something close to what Daniel is describing:

https://github.com/andreasbaumann/pgfuse

Plus, it gives you a FUSE wrapper, so the client can use filesystem
calls.

The proposed schema is here:
https://github.com/andreasbaumann/pgfuse/blob/master/schema.sql


#9Sridhar N Bamandlapally
sridhar.bn1@gmail.com
In reply to: Daniel Verite (#7)
Re: pg_largeobject

Is there any way we can change the segment file size?

I am trying to look into the possibility of the segment file size vs.
the bytea size limitation.

PostgreSQL installation

step 1: ./configure --enable-largefile --with-segsize (this throws the
error "configure: error: Large file support is not enabled. Segment
size cannot be larger than 1GB")

Thanks
Sridhar


#10Andreas Kretschmer
akretschmer@spamfence.net
In reply to: Sridhar N Bamandlapally (#9)
Re: pg_largeobject


Check whether your filesystem supports large files:

getconf FILESIZEBITS /some/path

If the result is 64, LFS is supported.

Regards, Andreas Kretschmer
--
Andreas Kretschmer
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
