heads up on large text fields.

Started by Rob Sargentover 7 years ago5 messagesgeneral
Jump to latest
#1Rob Sargent
robjsargent@gmail.com

Playing around with files-in-text-field.  I can happily slam a 10M file
into a text field in a table defined as

gtdb=# \d gt.ld
                    Table "gt.ld"
    Column    | Type | Collation | Nullable | Default
--------------+------+-----------+----------+---------
 id           | uuid |           | not null |
 name         | text |           |          |
 markerset_id | uuid |           | not null |
 ld           | text |           |          |
Indexes:
    "ld_pkey" PRIMARY KEY, btree (id)
Foreign-key constraints:
    "ld_markerset_id_fkey" FOREIGN KEY (markerset_id) REFERENCES
base.markerset(id)

gtdb=# select id, length(ld), substring(ld,  300000, 100) from gt.ld;
                  id                  | length  | substring
--------------------------------------+---------+------------------------------
 28f8dc94-c9d1-4c45-b504-fda585b497f8 | 6742760
|                             +
                                      |         | 3 2
rs1858447|5852230|10+
                                      |         |  0.500000
0.500000         +
                                      |         | 3 2
rs1567706|5853767|10+
                                      |         |  0.500000 0.500000
(1 row)

And I can regenerate the file using java (with jOOQ) in respectable time.

However, I get into deep dodo when I try redirecting psql output such as

select ld from gt.ld\g /tmp/regen.file

"/tmp/regen.file" gets very large, very fast and I have to
pg_terminate_backend.  Tried this three times, once using "\o test.blob"
instead.

h009357:loader$ ls -ltr
total 2048
-rwxr-xr-x 1 u0138544 camplab 10002460672 Sep 21 15:49 test.blob

Frankly, I'm suspicious of that ls (it's an smb mount of 25T partition)
but that's what's in the emacs shell buffer!
The re-direct isn't a must-have, but was hoping that would be an easy
way to get a file back.

Have I simply gone too far with text type?

h009357:share$ psql --version
psql (PostgreSQL) 10.5 (Ubuntu 10.5-0ubuntu0.18.04)

postgres=# select version();
version
---------------------------------------------------------------------------------------------------------
 PostgreSQL 10.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC)
4.8.5 20150623 (Red Hat 4.8.5-28), 64-bit
(1 row)

#2Andres Freund
andres@anarazel.de
In reply to: Rob Sargent (#1)
Re: heads up on large text fields.

On 2018-09-21 18:28:37 -0600, Rob Sargent wrote:

"/tmp/regen.file" gets very large, very fast and I have to
pg_terminate_backend.� Tried this three times, once using "\o test.blob"
instead.

h009357:loader$ ls -ltr
total 2048
-rwxr-xr-x 1 u0138544 camplab 10002460672 Sep 21 15:49 test.blob

I suspect the layouting of such wide columns problably creates a lot of
pain. I'd try \copy and doing the query after \a.

Greetings,

Andres Freund

#3Rob Sargent
robjsargent@gmail.com
In reply to: Andres Freund (#2)
Re: heads up on large text fields.

On Sep 21, 2018, at 7:59 PM, Andres Freund <andres@anarazel.de> wrote:

On 2018-09-21 18:28:37 -0600, Rob Sargent wrote:
"/tmp/regen.file" gets very large, very fast and I have to
pg_terminate_backend. Tried this three times, once using "\o test.blob"
instead.

h009357:loader$ ls -ltr
total 2048
-rwxr-xr-x 1 u0138544 camplab 10002460672 Sep 21 15:49 test.blob

I suspect the layouting of such wide columns problably creates a lot of
pain. I'd try \copy and doing the query after \a.

Greetings,

Andres Freund

The formatting could be an issue for sure: there are a couple of very long lines early and late in the file. But my real concern is the unending output stream. If I haven’t made some obvious mistake, psql redirect of large text values may have an issue.

#4Andreas Kretschmer
andreas@a-kretschmer.de
In reply to: Rob Sargent (#1)
Re: heads up on large text fields.

Am 22.09.2018 um 02:28 schrieb Rob Sargent:

However, I get into deep dodo when I try redirecting psql output such as

select ld from gt.ld\g /tmp/regen.file

works for me if i start psql with -t -A -o /path/to/file
(pg 10.5, but psql from 11beta3)

Regards, Andreas

--
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com

#5Rob Sargent
robjsargent@gmail.com
In reply to: Andreas Kretschmer (#4)
Re: heads up on large text fields.

On 09/22/2018 06:00 AM, Andreas Kretschmer wrote:

Am 22.09.2018 um 02:28 schrieb Rob Sargent:

However, I get into deep dodo when I try redirecting psql output such as

    select ld from gt.ld\g /tmp/regen.file

works for me if i start psql with -t -A -o /path/to/file
(pg 10.5, but psql from 11beta3)

Regards, Andreas

OK, I'm a little slow on the uptake.  The few very wide lines (728035
characters) demand that all the other lines be padded and with 132236
lines you end up with a 96G file (with out the smarts provided about).