Speed of lo_unlink vs. DELETE on BYTEA
Hi, everyone. I'm working with someone who has a database application currently running under PostgreSQL 8.3. Among other things, there is a main table that is referenced by a number of other tables via a foreign key. One of those tables has a field of type "oid," which then points to a large object. When a record in the main table is deleted, there is a rule (yes, a rule -- not a trigger) on the referencing table that performs an lo_unlink on the associated object.
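
Roughly speaking, the rule looks something like this (just a sketch with illustrative table and column names, not the actual schema):

    CREATE RULE unlink_large_object AS
        ON DELETE TO referencing_table
        DO ALSO SELECT lo_unlink(OLD.blob_oid);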

This means that for every DELETE we perform on the main table, we're doing an lo_unlink on the large objects. It also means that if we do a mass delete from that main table, we're executing lo_unlink once for every deleted row, which is taking a heckuva long time. I ran EXPLAIN ANALYZE, and a good 40-50 percent of our time spent deleting is in the execution of this rule.

I just want to check that my intuition is correct: Wouldn't it be way faster and more efficient for us to use BYTEA columns to store the data (which can get into the 20-50 MB range), and for us to just depend on ON DELETE CASCADE, rather than a rule? Or are we going to encounter performance issues regardless of which technique we use, so that we need to find a way to delete these large pieces of data in the background? Or should we keep using large objects, and find a way other than a rule to deal with deleting them on this sort of scale? Or (of course) am I missing another good option?

Thanks for any and all advice, as usual!

Reuven
--
Reuven M. Lerner -- Web development, consulting, and training
Mobile: +972-54-496-8405 * US phone: 847-230-9795
Skype/AIM: reuvenlerner

Hi again, everyone. I'm replying to my own posting to add some information: I decided to do some of my own benchmarking. And if my benchmarks are at all accurate, then I'm left wondering why people use large objects at all, given their clunky API and their extremely slow speed. I'm posting my benchmarks as a sanity test, because I'm blown away by the results.

I basically tried three different scenarios, each with 1,000 and 10,000 records. In each scenario, there was a table named MasterTable containing a SERIAL "id" column and a "one_value" integer column (filled from generate_series), and a second table named SecondaryTable containing its own SERIAL "id" column, a "one_value" column (from generate_series, identical to the "id" column), and a "master_value" column that's a foreign key back to the main table. Here's the definition of the tables in the 10,000-record benchmark:

CREATE TABLE MasterTable (
    id SERIAL NOT NULL,
    one_value INTEGER NOT NULL,

    PRIMARY KEY(id)
);

INSERT INTO MasterTable (one_value) VALUES (generate_series(1,10000));

CREATE TABLE SecondaryTable (
    id SERIAL NOT NULL,
    one_value INTEGER NOT NULL,
    master_value INTEGER NOT NULL REFERENCES MasterTable ON DELETE CASCADE,

    PRIMARY KEY(id)
);

INSERT INTO SecondaryTable (master_value, one_value)
    (SELECT s.a, s.a FROM generate_series(1,10000) AS s(a));

I also had two other versions of SecondaryTable: in one scenario, there is a my_blob column of type BYTEA containing 5 million 'x' characters. The final version instead had a 5-million-'x'-character document loaded into a large object referenced from SecondaryTable.
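
Roughly, those two variants looked like this (a sketch rather than my exact benchmark code; the large-object version loads each object with lo_import, and the file path is just a placeholder for a file containing 5 million 'x' characters):

    -- bytea variant: the blob lives in a my_blob column in SecondaryTable
    CREATE TABLE SecondaryTable (
        id           SERIAL NOT NULL,
        one_value    INTEGER NOT NULL,
        master_value INTEGER NOT NULL REFERENCES MasterTable ON DELETE CASCADE,
        my_blob      BYTEA,
        PRIMARY KEY(id)
    );
    INSERT INTO SecondaryTable (master_value, one_value, my_blob)
        (SELECT s.a, s.a, decode(repeat('x', 5000000), 'escape')
           FROM generate_series(1,10000) AS s(a));

    -- large-object variant: SecondaryTable only stores an oid, and each row
    -- gets its own large object, created server-side from a file of 'x' characters
    CREATE TABLE SecondaryTable (
        id           SERIAL NOT NULL,
        one_value    INTEGER NOT NULL,
        master_value INTEGER NOT NULL REFERENCES MasterTable ON DELETE CASCADE,
        my_blob_oid  OID,
        PRIMARY KEY(id)
    );
    INSERT INTO SecondaryTable (master_value, one_value, my_blob_oid)
        (SELECT s.a, s.a, lo_import('/tmp/five_million_x.txt')
           FROM generate_series(1,10000) AS s(a));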

The idea was simple: I wanted to see how much faster or slower it was to delete (not truncate) all of the records in MasterTable, given these different data types. Would bytea be significantly faster than large objects? How would the cascading delete affect things? And how long does it take to pg_dump with large objects around?

Here are the results, which were pretty dramatic. Basically, large objects always seem to be far, far slower than BYTEA columns. Again, I'm wondering whether I'm doing something wrong here, or if this explains why, in my many years of using PostgreSQL, I've neither used nor been tempted to use large objects before.

1.1 1,000 records
==================

                  Delete    Dump
 ---------------+---------+--------
 Empty content    0.172s    0.057s
 bytea            0.488s    0.066s
 large object    30.833s    9.275s


1.2 10,000 records
===================

                  Delete      Dump
 ---------------+-----------+-----------
 Empty content    8.162s      0.064s
 bytea            1m0.417s    0.157s
 large object     4m44.501s   1m38.454s

Any ideas? If this is true, should we be warning people away from large objects in the documentation, and toward bytea?

Reuven

Reuven M. Lerner wrote:
> When a record in the main table is deleted, there is a rule (yes, a rule -- not a trigger)
> on the referencing table that performs an lo_unlink on the associated object.
>
> I just want to check that my intuition is correct: Wouldn't it be way faster and more efficient
> for us to use BYTEA columns to store the data (which can get into the 20-50 MB range),
> and for us to just depend on ON DELETE CASCADE, rather than a rule?

[followed by dramatic performance numbers]

Could you try with a trigger instead of a rule and see if the performance is better?
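
For instance, something along these lines (an untested sketch; the table and column names here are assumptions):

    CREATE FUNCTION unlink_blob() RETURNS trigger AS $$
    BEGIN
        -- remove the large object that the deleted row pointed to
        PERFORM lo_unlink(OLD.my_blob_oid);
        RETURN OLD;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER unlink_blob_trg
        AFTER DELETE ON SecondaryTable
        FOR EACH ROW EXECUTE PROCEDURE unlink_blob();
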
Yours,
Laurenz Albe

Hi, everyone. Laurenz Albe wrote:

> Could you try with a trigger instead of a rule and see if the
> performance is better?

Great idea. I did that, and here are the results for 10,000 records:

|                            | Delete    | Dump      |
|----------------------------+-----------+-----------|
| Empty content              | 8.162s    | 0.064s    |
| bytea                      | 1m0.417s  | 0.157s    |
| large object with rule     | 4m44.501s | 1m38.454s |
| large object with trigger  | 7m42.208s | 1m48.369s |

Ideas, anyone?

Reuven

Reuven M. Lerner wrote:

> 1.1 1,000 records
> ==================
>
>                  Delete    Dump
> ---------------+---------+--------
> Empty content    0.172s    0.057s
> bytea            0.488s    0.066s
> large object    30.833s    9.275s

How much bytea are you dumping for it to take only 0.066s?
The fact that it takes about the same time as dumping the "empty content"
looks very suspicious.

On my desktop machine, if I create a table with 1000 blobs containing strings
of 5 million 'x', which is what I understood you basically did (perhaps I
misunderstood?), then it takes about 200s to dump it with pg_dump -Fc.
Best regards,
--
Daniel
PostgreSQL-powered mail user agent and storage: http://www.manitou-mail.org

Hi, everyone. Daniel Verite <daniel@manitou-mail.org> wrote:

> How much bytea are you dumping for it to take only 0.066s?
> The fact that it takes about the same time as dumping the "empty content"
> looks very suspicious.
>
> On my desktop machine, if I create a table with 1000 blobs containing strings
> of 5 million 'x', which is what I understood you basically did (perhaps I
> misunderstood?), then it takes about 200s to dump it with pg_dump -Fc.

OK, this is an egg-on-my-face moment with my benchmarks: I added the pg_dump timing after the "delete" timing, and so I was actually dumping the database when it was empty! Not very effective, to say the least.

I've updated my benchmark, and updated the results as well:

** 1,000 records

|                            | Delete    | Dump      | Database size | Dump size |
|----------------------------+-----------+-----------+---------------+-----------|
| Empty content              | 0m0.151s  | 0m38.875s | 88 kB         | 11K       |
| bytea                      | 0m0.505s  | 1m59.565s | 57 MB         | 4.7M      |
| large object with rule     | 0m31.438s | 2m42.079s | 88 kB         | 4.7M      |
| large object with trigger  | 0m28.612s | 2m17.156s | 88 kB         | 4.7M      |


** 10,000 records

|                            | Delete    | Dump       | Database size | Dump size |
|----------------------------+-----------+------------+---------------+-----------|
| Empty content              | 0m7.436s  | 0m0.089s   | 680 kB        | 66K       |
| bytea                      | 1m5.771s  | 20m40.842s | 573 MB        | 47M       |
| large object with rule     | 5m26.254s | 21m7.695s  | 680 kB        | 47M       |
| large object with trigger  | 5m13.718s | 20m56.195s | 680 kB        | 47M       |

It would thus appear that there's a slight edge for dumping bytea, but nothing super-amazing. Deleting, however, is still much faster with bytea than large objects.

I've put my benchmark code up on GitHub for people to run and play with, to see if they can reproduce my results:

https://github.com/reuven/pg-delete-benchmarks

Reuven
At 07:43 25/09/2011, Reuven M. Lerner wrote:
> It would thus appear that there's a slight edge for dumping bytea, but nothing
> super-amazing. Deleting, however, is still much faster with bytea than large objects.

The problem you are seeing is compression/decompression of the large objects. If you look at the sizes, you get 680 kB for the large objects and 573 MB for bytea. PostgreSQL needs to decompress them before the dump; even worse, if your dump is compressed, postgres decompresses each large object, dumps it, and recompresses it. For this test, try switching off compression on the large objects/TOAST.
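
For the bytea column, that would be something like this minimal sketch (table and column names are taken from the benchmark earlier in the thread; EXTERNAL storage only applies to newly stored values, so the test data would have to be reloaded afterwards):

    -- EXTERNAL = out-of-line TOAST storage without compression
    ALTER TABLE SecondaryTable ALTER COLUMN my_blob SET STORAGE EXTERNAL;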

In the long term, perhaps a request to the postgresql-hackers list to dump the already-compressed large objects directly. TOAST may be more difficult, because it covers not only the obviously big columns but any column whose size is bigger than a threshold (I don't remember exactly; 1-2 kB or similar).

EFME