constant crashing

Started by jack · about 2 years ago · 23 messages · general
#1 jack
jack4pg@a7q.com

Hello,
I am trying to load about 1.4 billion records into a PostgreSQL table.
The data is in about 100 individual CSV files which I load individually into the table using the COPY command.
After the data is loaded I perform updates to the data, and this is when I get errors. Some updates work, some crash.
I am using psql and an SQL script file (update.sql) to perform the updating (UPDATE table SET field=UPPER(field), etc.).

The errors are all "Server closed the connection unexpectedly"

The problem is that the errors are never the same and occur at different times and at different steps in the process.
If I update the machine (apt update/upgrade), reboot and try again, the errors occur at different places.
The error messages and logs are not helpful, and I have not been able to determine the cause.

I decided to load the CSV files into temporary tables and then perform the updates on the temporary tables before moving the data to the main table.
This works for most of the tables, but there are always 1 or 2 that crash when I try to update them.
In my last attempt, all tables updated successfully except the last one, with about 100 million records.

I tried changing these settings to 60 minutes and it still crashes.
SET statement_timeout = '60min';
SET idle_in_transaction_session_timeout = '60min';

Then I tried to update the data in blocks of 100,000 records and it crashed 4 times on 4 different blocks.
So I narrowed the first crashed block down to blocks of 10 records, until one crashed.
Then I updated each of those 10 records individually to identify the problematic record, but all 10 records then updated without crashing!
Pure insanity!
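The block-halving described above can be sketched as follows, assuming rows can be addressed by a sequential number; `try_update` is a hypothetical stand-in for running the UPDATE over rows lo..hi and returning False when the connection is lost:

```python
def bisect_bad_rows(lo, hi, try_update):
    """Narrow a failing range [lo, hi] until about 10 rows remain."""
    while hi - lo > 9:
        mid = (lo + hi) // 2
        if not try_update(lo, mid):
            hi = mid        # the failure is in the lower half
        elif not try_update(mid + 1, hi):
            lo = mid + 1    # the failure is in the upper half
        else:
            break           # both halves pass: the crash is not tied to one row
    return lo, hi
```

If the crash is not deterministic, the final branch fires, which matches the behaviour described here: the individual rows then update fine.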

I am using a dedicated machine which is a 10-core i9 with 128 GB of RAM and 2 x 4 TB NVMe drives.
There is only 1 user on this stand-alone machine, me, and the machine is not connected to the internet or a network.
I have reinstalled Ubuntu 22.04 Server many times, wiping the disks and starting over.
My last attempts have been with Ubuntu 22.04.4 and PostgreSQL 16.
But the errors persist.

I have been working on this for just over 1 year now, documenting every step, and I am still unable to get this to work without it crashing somewhere along the way.
I am beginning to wonder if postgreSQL is bi-polar.

Any help would be greatly appreciated.
Thank you

#2 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: jack (#1)
Re: constant crashing

On 4/14/24 07:24, jack wrote:

Hello,
I am trying to load about 1.4 billion records into a postgresql table.
The data is in about 100 individual CSV files which I load individually
into the table using the COPY command.
After the data is loaded I perform updates to the data, and this is when
I get errors. Some updates work, some crash.
I am using psql and an SQL script file (update.sql) to perform the
updating (UPDATE table SET field=UPPER(field), etc.).

The errors are all "Server closed the connection unexpectedly"

I have been working on this for just over 1 year now, documenting every
step, and I am still unable to get this to work without it crashing
somewhere along the way.
I am beginning to wonder if postgreSQL is bi-polar.

More information needed, to start:

1) Complete example of the code in the script.

2) Table definition(s) of those being copied into.

3) Sample of the data being copied.

4) The error message(s) generated.

5) Database locale

Any help would be greatly appreciated.
Thank you

--
Adrian Klaver
adrian.klaver@aklaver.com

#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: jack (#1)
Re: constant crashing

jack <jack4pg@a7q.com> writes:

I am trying to load about 1.4 billion records into a postgresql table.
The data is in about 100 individual CSV files which I load individually into the table using the COPY command.
After the data is loaded I perform updates to the data, and this is when I get errors. Some updates work, some crash.
I am using psql and an SQL script file (update.sql) to perform the updating (UPDATE table SET field=UPPER(field), etc.).

The errors are all "Server closed the connection unexpectedly"

What appears in the postmaster log when this happens?

It would also be instructive to run "top" while you trigger
the problem, and see if there is obvious distress like the
server process's memory consumption ballooning.

It smells like a resource-exhaustion problem. Which probably
shouldn't be happening on a machine with 128G RAM, but maybe
something is misconfigured to give up way below that. In any
case, you'll never find the cause from the client side.

Last attempts have been with ubuntu 22.04.04 and postgreSQL 16.

16 dot which?

regards, tom lane

#4 jack
jack4pg@a7q.com
In reply to: Tom Lane (#3)
re: constant crashing

The full error reads:
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
error: connection to server was lost
PostgreSQL 16.2
I also believe it is a resource issue which can be rectified with a setting, but which setting?
If you were updating 100 million records what settings would you adjust?

Here are the updates I am performing on the 100 million records:
UPDATE table SET category_modified = UPPER(category);
UPDATE table SET category_modified = REGEXP_REPLACE(REPLACE(REPLACE(category_modified, '''','-'), '`', '-'), '\s{2,}', ' ', 'g') WHERE LENGTH(category_modified)>1 AND POSITION('--' IN category_modified)>0;
UPDATE table SET category_modified = REPLACE(category_modified,' ','-');
UPDATE table SET category_modified = CASE WHEN category_modified IS NOT NULL THEN regexp_replace(category_modified, '[^a-zA-Z]$', '') ELSE NULL END;
UPDATE table SET category_modified = regexp_replace(category_modified, '-{2,}', '-', 'g');
UPDATE table SET category_modified = SUBSTRING(category_modified FROM 1 FOR LENGTH(category_modified) - 1) WHERE LENGTH(category_modified)>1 AND category_modified LIKE '%-';
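For what it's worth, the six statements can be replayed on a single value in Python, which makes it easy to test the transformation on suspect rows outside the server. This is a sketch of one reading of the SQL: the stray AND after WHERE in the second statement is assumed to be a paste error, and `None` models SQL NULL.

```python
import re

def transform_category(category):
    """Apply the six UPDATE steps, in order, to one value.

    None models SQL NULL; every step skips NULL, matching the strict
    behaviour of the SQL functions involved.
    """
    if category is None:
        return None
    v = category.upper()                        # step 1: UPPER(category)
    if len(v) > 1 and '--' in v:                # step 2: quote/backtick to dash,
        v = re.sub(r'\s{2,}', ' ',              #         collapse whitespace runs
                   v.replace("'", '-').replace('`', '-'))
    v = v.replace(' ', '-')                     # step 3: spaces to dashes
    v = re.sub(r'[^a-zA-Z]$', '', v)            # step 4: drop one trailing non-letter
    v = re.sub(r'-{2,}', '-', v)                # step 5: collapse dash runs
    if len(v) > 1 and v.endswith('-'):          # step 6: drop a trailing dash
        v = v[:-1]
    return v
```

Running sample category values through this function is a cheap way to see what the server-side pipeline would produce, without touching the 100 million rows.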

#5 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: jack (#4)
Re: constant crashing

On 4/14/24 09:20, jack wrote:

The full error reads:
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
error: connection to server was lost

Look at the OS system log.

PostgreSQL 16.2

I also believe it is a resource issue which can be rectified with a
setting, but which setting?
If you were updating 100 million records what settings would you adjust?

Here are the updates I am performing on the 100 million records:
UPDATE table SET category_modified = UPPER(category);
UPDATE table SET category_modified =
REGEXP_REPLACE(REPLACE(REPLACE(category_modified, '''','-'), '`', '-'),
'\s{2,}', ' ', 'g') WHERE LENGTH(category_modified)>1 AND
POSITION('--' IN category_modified)>0;
UPDATE table SET category_modified = REPLACE(category_modified,' ','-');
UPDATE table SET category_modified = CASE WHEN category_modified IS NOT
NULL THEN regexp_replace(category_modified, '[^a-zA-Z]$', '') ELSE NULL END;
UPDATE table SET category_modified = regexp_replace(category_modified,
'-{2,}', '-', 'g');
UPDATE table SET category_modified = SUBSTRING(category_modified FROM 1
FOR LENGTH(category_modified) - 1) WHERE LENGTH(category_modified)>1 AND
category_modified LIKE '%-';

Is the above all being done in one script/transaction?

Again what are the table definitions for the tables being copied into
and/or modified?

--
Adrian Klaver
adrian.klaver@aklaver.com

#6 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Adrian Klaver (#5)
Re: constant crashing

Adrian Klaver <adrian.klaver@aklaver.com> writes:

On 4/14/24 09:20, jack wrote:

The full error reads:
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
error: connection to server was lost

Look at the OS system log.

Really the most detail would be in postgres' own log. I'd only
expect the system log to contain relevant info if the problem
turns out to be an overeager OOM killer --- but if that's what's
happening, we should be able to diagnose from the postmaster's
log too (because it'd show up as the backend dying from SIGKILL).

regards, tom lane

#7 Francisco Olarte
folarte@peoplecall.com
In reply to: jack (#4)
Re: constant crashing

Jack:

On Sun, 14 Apr 2024 at 18:20, jack <jack4pg@a7q.com> wrote:

The full error reads:
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
error: connection to server was lost

This is a CLIENT-side error. What people are asking you for is the
corresponding error in the SERVER log. When the server hits a severe
error (probably resource exhaustion, which normally leads to abnormal
termination, as has been pointed out previously), the connection gets
closed, and you have to go to the server to find the underlying cause.

PostgreSQL 16.2
I also believe it is a resource issue which can be rectified with a setting, but which setting?

Not all resource issues can be solved; sometimes one asks for more
than the available hardware can deliver. Anyway, as pointed out,
posting the configuration and the server-side error, and perhaps
monitoring memory (the usual culprit) with top, may help someone.
Look at memory in particular, because a common cause of this kind of
thing on Linux is having overcommit on (overcommit is nice, but it
basically lies to Postgres: it tells the server it can have X GB and
then, when the server tries to use them, kills it with -9).

If you were updating 100 million records what settings would you adjust?

From what you said earlier (loading from CSV), and with ....

Here are the updates I am performing on the 100 million records:
UPDATE table SET category_modified = UPPER(category);
UPDATE table SET category_modified = REGEXP_REPLACE(REPLACE(REPLACE(category_modified, '''','-'), '`', '-'), '\s{2,}', ' ', 'g') WHERE LENGTH(category_modified)>1 AND POSITION('--' IN category_modified)>0;
UPDATE table SET category_modified = REPLACE(category_modified,' ','-');
UPDATE table SET category_modified = CASE WHEN category_modified IS NOT NULL THEN regexp_replace(category_modified, '[^a-zA-Z]$', '') ELSE NULL END;
UPDATE table SET category_modified = regexp_replace(category_modified, '-{2,}', '-', 'g');
UPDATE table SET category_modified = SUBSTRING(category_modified FROM 1 FOR LENGTH(category_modified) - 1) WHERE LENGTH(category_modified)>1 AND category_modified LIKE '%-';

this kind of update, I would recommend stream-filtering the data on
load. I do not know how you do it presently, but those are six hundred
million row updates (unless some are suppressed, which is what I do
not remember now), which makes for huge transactions and is asking for
problems (or maybe only 100,000,000 per transaction if you are
autocommitting).

If Postgres does not suppress redundant updates (lots of people here
can answer that), I would start by conditioning the updates (UPDATE
table SET category_modified = UPPER(category) WHERE category_modified
IS NOT NULL AND category_modified <> UPPER(category);); this can shave
time and resource usage.

But if your updates are of this kind (transformations of the data in
the current row, and your samples touch a single value of a row, which
is even easier) without touching other things in the database, I would
encourage you to use a small perl/awk/python/whatever program to
filter and transform the data before loading it. It has several
advantages: speed, minimal server load, and testability (pipe sample
lines to the filter, check the result; when satisfied, pipe the full
file through the filter and the result to psql). Choosing the right
language (I would say Perl; it was designed practically for this sort
of thing) makes the filtering really easy. And it seems you are
already close: I see NULL handling in some updates but not in others,
which hints at relying on strict behaviour (functions returning NULL
on NULL input). The only apparent difficulty is parsing CSV, but there
are plenty of modules that parse the dialect Postgres understands (and
if you generate the data yourself, the default text format for
Postgres is much easier to parse; in Perl a simple split// call
normally does the trick).

Doing it with a filtering stream would let you process gigabytes of
data using a few megabytes of RAM in the client. I am not sure about
the server side, but stream filtering lends itself to very easy
batching of copies, and from what I read your server is beefy.

Francisco Olarte.

#8 David G. Johnston
david.g.johnston@gmail.com
In reply to: Francisco Olarte (#7)
Re: constant crashing

On Sun, Apr 14, 2024 at 10:20 AM Francisco Olarte <folarte@peoplecall.com>
wrote:

If postgres does not supress redundant updates ( lots of people can
answer that )

It does not. My other usual piece of advice along these lines, if doing
the transform outside the database is not desirable, is to at least ensure
you are doing them on a staging table that is defined as either being
temporary or unlogged. WAL production probably isn't causing the crash but
can definitely be an issue and is pointless to incur during most
transformation processing.

David J.

#9 Vijaykumar Jain
vijaykumarjain.github@gmail.com
In reply to: jack (#4)
Re: constant crashing

On Sun, 14 Apr 2024 at 21:50, jack <jack4pg@a7q.com> wrote:

The full error reads:
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
error: connection to server was lost

PostgreSQL 16.2

I also believe it is a resource issue which can be rectified with a
setting, but which setting?
If you were updating 100 million records what settings would you adjust?

Here are the updates I am performing on the 100 million records:
UPDATE table SET category_modified = UPPER(category);
UPDATE table SET category_modified =
REGEXP_REPLACE(REPLACE(REPLACE(category_modified, '''','-'), '`', '-'),
'\s{2,}', ' ', 'g') WHERE LENGTH(category_modified)>1 AND
POSITION('--' IN category_modified)>0;
UPDATE table SET category_modified = REPLACE(category_modified,' ','-');
UPDATE table SET category_modified = CASE WHEN category_modified IS NOT
NULL THEN regexp_replace(category_modified, '[^a-zA-Z]$', '') ELSE NULL END;
UPDATE table SET category_modified = regexp_replace(category_modified,
'-{2,}', '-', 'g');
UPDATE table SET category_modified = SUBSTRING(category_modified FROM 1
FOR LENGTH(category_modified) - 1) WHERE LENGTH(category_modified)>1 AND
category_modified LIKE '%-';

Independent of best practices, I just want to check whether there is a
leak. I created a sample table with text data and ran updates like
yours, and I could not see memory growth, but I have a small VM and of
course your category_modified field might be more complex than simple
text fields of 30-40 characters.

Can you grab the PID of your psql backend and (if you have pidstat
installed) monitor resource usage for that PID?

postgres@pg:~/udemy/16$ psql
psql (16.2 (Ubuntu 16.2-1.pgdg22.04+1))
Type "help" for help.

postgres=# select pg_backend_pid();
pg_backend_pid
----------------
1214
(1 row)

# pidstat 2 100 -rud -h -p 1214
(get all stats for that PID) That might help figure out whether there
is a leak, or whether the server has other things competing for memory
and your updates were picked off by the OOM killer.

Linux 5.15.0-101-generic (pg) 04/15/24 _x86_64_ (1 CPU)

# Time UID PID %usr %system %guest %wait %CPU CPU
minflt/s majflt/s VSZ RSS %MEM kB_rd/s kB_wr/s kB_ccwr/s
iodelay Command
00:40:25 113 1214 0.00 0.00 0.00 0.00 0.00 0
0.00 0.00 354112 220940 24.18 0.00 0.00 0.00
0 postgres

# Time UID PID %usr %system %guest %wait %CPU CPU
minflt/s majflt/s VSZ RSS %MEM kB_rd/s kB_wr/s kB_ccwr/s
iodelay Command
00:40:27 113 1214 0.00 0.00 0.00 0.00 0.00 0
0.00 0.00 354112 220940 24.18 0.00 0.00 0.00
0 postgres
....

Of course, if there is a genuine leak, more digging may be needed:
"Finding memory leaks in Postgres C code"
https://www.enterprisedb.com/blog/finding-memory-leaks-postgres-c-code
Otherwise the OOM killer may simply kill the process that requests
more memory than is available: "Memory context: how PostgreSQL
allocates memory"
https://www.cybertec-postgresql.com/en/memory-context-for-postgresql-memory-management/

--
Thanks,
Vijay
LinkedIn - Vijaykumar Jain <https://www.linkedin.com/in/vijaykumarjain/>

#10 jack
jack4pg@a7q.com
In reply to: Vijaykumar Jain (#9)
re: constant crashing

Here is an excerpt of /var/log/postgresql/postgresql-16-main.log

2024-04-14 12:17:42.321 EDT [7124] LOG: checkpoint starting: wal
2024-04-14 12:17:43.153 EDT [1227] LOG: server process (PID 7289) was terminated by signal 11: Segmentation fault
2024-04-14 12:17:43.153 EDT [1227] DETAIL: Failed process was running: UPDATE main SET category_modified = UPPER(category), sub_category=UPPER(sub_category), code1=UPPER(code1), code2=UPPER(code2);
2024-04-14 12:17:43.153 EDT [1227] LOG: terminating any other active server processes
2024-04-14 12:17:43.166 EDT [1227] LOG: all server processes terminated; reinitializing
2024-04-14 12:17:43.206 EDT [7320] LOG: database system was interrupted; last known up at 2024-04-14 12:17:42 EDT
2024-04-14 12:17:43.484 EDT [7320] LOG: database system was not properly shut down; automatic recovery in progress
2024-04-14 12:17:43.487 EDT [7320] LOG: redo starts at 1260/66400608
2024-04-14 12:17:44.985 EDT [7320] LOG: unexpected pageaddr 1260/4A7CA000 in WAL segment 00000001000012600000008C, LSN 1260/8C7CA000, offset 8167424
2024-04-14 12:17:44.986 EDT [7320] LOG: redo done at 1260/8C7C9F50 system usage: CPU: user: 1.20 s, system: 0.28 s, elapsed: 1.49 s
2024-04-14 12:17:44.999 EDT [7321] LOG: checkpoint starting: end-of-recovery immediate wait
2024-04-14 12:17:45.251 EDT [7321] LOG: checkpoint complete: wrote 16284 buffers (99.4%); 0 WAL file(s) added, 0 removed, 38 recycled; write=0.082 s, sync=0.143 s, total=0.252 s; sync files=6, longest=0.075 s, average=0.024 s; distance=626470 kB, estimate=626470 kB; lsn=1260/8C7CA048, redo lsn=1260/8C7CA048
2024-04-14 12:17:45.264 EDT [1227] LOG: database system is ready to accept connections
2024-04-14 12:28:32.526 EDT [7321] LOG: checkpoint starting: wal
2024-04-14 12:28:45.066 EDT [7321] LOG: checkpoint complete: wrote 12 buffers (0.1%); 0 WAL file(s) added, 0 removed, 33 recycled; write=12.371 s, sync=0.144 s, total=12.541 s; sync files=11, longest=0.121 s, average=0.013 s; distance=536816 kB, estimate=617505 kB; lsn=1260/CB4790B8, redo lsn=1260/AD406208
2024-04-14 12:28:45.887 EDT [7321] LOG: checkpoints are occurring too frequently (13 seconds apart)
2024-04-14 12:28:45.887 EDT [7321] HINT: Consider increasing the configuration parameter "max_wal_size".
2024-04-14 12:28:45.887 EDT [7321] LOG: checkpoint starting: wal

Should I increase max_wal_size to 2GB?

#11 Vijaykumar Jain
vijaykumarjain.github@gmail.com
In reply to: Vijaykumar Jain (#9)
Re: constant crashing

Ignore my thread; I guess there might be a bug, given that it segfaulted.


#12 Tom Lane
tgl@sss.pgh.pa.us
In reply to: jack (#10)
Re: constant crashing

jack <jack4pg@a7q.com> writes:

Here is an excerpt of /var/log/postgresql/postgresql-16-main.log
2024-04-14 12:17:43.153 EDT [1227] LOG: server process (PID 7289) was terminated by signal 11: Segmentation fault

That's pretty interesting; it suggests an actual PG bug.
Can you get a stack trace from that? See

https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend

Also, it's probably time you showed us the complete DDL definition
for the tables you're working with. It's fairly hard to believe
this would happen without something rather out-of-the-ordinary
in the definitions. Or ... do you have any PG extensions installed?

regards, tom lane

#13 jack
jack4pg@a7q.com
In reply to: Tom Lane (#12)
re: constant crashing

Here is the table structure:

The fields being updated are the ones that are NOT named field##.

Except for "3fc", which I left as-is to show that it is named differently, in the unlikely chance that this could be causing problems.

CREATE TABLE main (
field01 character(10) COLLATE pg_catalog."default",
field02 integer,
field03 character varying(100) COLLATE pg_catalog."default",
field04 character varying(50) COLLATE pg_catalog."default",
field05 character varying(100) COLLATE pg_catalog."default",
category character varying(100) COLLATE pg_catalog."default",
field07 character varying(100) COLLATE pg_catalog."default",
category_modified character varying(100) COLLATE pg_catalog."default",
field09 text COLLATE pg_catalog."default",
field10 character varying(100) COLLATE pg_catalog."default",
field11 character(1) COLLATE pg_catalog."default",
"3fc" character(3) COLLATE pg_catalog."default",
field12 text COLLATE pg_catalog."default",
field13 text COLLATE pg_catalog."default",
field14 text COLLATE pg_catalog."default",
field15 text COLLATE pg_catalog."default",
field16 text COLLATE pg_catalog."default", -- type was missing in the original post; text assumed
sub_category character(10) COLLATE pg_catalog."default",
field17 character varying(100) COLLATE pg_catalog."default",
field18 character varying(100) COLLATE pg_catalog."default",
field19 character varying(100) COLLATE pg_catalog."default",
field20 character varying(50) COLLATE pg_catalog."default",
code1 character(5) COLLATE pg_catalog."default",
code2 character(10) COLLATE pg_catalog."default",
field21 character varying(100) COLLATE pg_catalog."default",
field22 character varying(50) COLLATE pg_catalog."default",
field23 character varying(50) COLLATE pg_catalog."default",
field24 character varying(50) COLLATE pg_catalog."default",
field25 character varying(50) COLLATE pg_catalog."default",
field26 character varying(50) COLLATE pg_catalog."default",
field27 character varying(50) COLLATE pg_catalog."default",
field28 character varying(50) COLLATE pg_catalog."default",
field29 character varying(50) COLLATE pg_catalog."default",
field31 character varying(50) COLLATE pg_catalog."default",
field32 character varying(10) COLLATE pg_catalog."default",
field33 character varying(10) COLLATE pg_catalog."default",
field34 character varying(10) COLLATE pg_catalog."default",
field35 character varying(10) COLLATE pg_catalog."default",
field36 character varying(50) COLLATE pg_catalog."default",
field37 character(1) COLLATE pg_catalog."default",
field38 character varying(50) COLLATE pg_catalog."default",
field39 boolean,
field40 boolean,
field41 boolean);

#14 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: jack (#10)
Re: constant crashing

On 4/14/24 12:22, jack wrote:

Here is an excerpt of /var/log/postgresql/postgresql-16-main.log

Where and how are the CSV files being produced?

What is the database locale?

--
Adrian Klaver
adrian.klaver@aklaver.com

#15 jack
jack4pg@a7q.com
In reply to: Adrian Klaver (#14)
re: constant crashing

To show you how bi-polar this is really becoming, I tried a workaround...
I took the table called us113, with 113 million records, and tried to break it down into 10 smaller tables of roughly 10 million records each, using the following code:

\set ECHO all
\set ON_ERROR_STOP on
-- Create 10 new tables (us113_01 to us113_10) similar to the original table (us113)
CREATE TABLE us113_01 AS SELECT * FROM ( SELECT *, ROW_NUMBER() OVER () AS row_num FROM us113) sub WHERE row_num <= 10000000;
CREATE TABLE us113_02 AS SELECT * FROM ( SELECT *, ROW_NUMBER() OVER () AS row_num FROM us113) sub WHERE row_num > 10000000 AND row_num <= 20000000;
CREATE TABLE us113_03 AS SELECT * FROM ( SELECT *, ROW_NUMBER() OVER () AS row_num FROM us113) sub WHERE row_num > 20000000 AND row_num <= 30000000;
CREATE TABLE us113_04 AS SELECT * FROM ( SELECT *, ROW_NUMBER() OVER () AS row_num FROM us113) sub WHERE row_num > 30000000 AND row_num <= 40000000;
CREATE TABLE us113_05 AS SELECT * FROM ( SELECT *, ROW_NUMBER() OVER () AS row_num FROM us113) sub WHERE row_num > 40000000 AND row_num <= 50000000;
CREATE TABLE us113_06 AS SELECT * FROM ( SELECT *, ROW_NUMBER() OVER () AS row_num FROM us113) sub WHERE row_num > 50000000 AND row_num <= 60000000;
CREATE TABLE us113_07 AS SELECT * FROM ( SELECT *, ROW_NUMBER() OVER () AS row_num FROM us113) sub WHERE row_num > 60000000 AND row_num <= 70000000;
CREATE TABLE us113_08 AS SELECT * FROM ( SELECT *, ROW_NUMBER() OVER () AS row_num FROM us113) sub WHERE row_num > 70000000 AND row_num <= 80000000;
CREATE TABLE us113_09 AS SELECT * FROM ( SELECT *, ROW_NUMBER() OVER () AS row_num FROM us113) sub WHERE row_num > 80000000 AND row_num <= 90000000;
CREATE TABLE us113_10 AS SELECT * FROM ( SELECT *, ROW_NUMBER() OVER () AS row_num FROM us113) sub WHERE row_num > 90000000;
...and of course it crashed after creating 7 tables.

2024-04-14 15:59:12.294 EDT [1212] LOG: database system is ready to accept connections
2024-04-14 16:00:39.326 EDT [1668] postgres@lf ERROR: could not access status of transaction 3687904299
2024-04-14 16:00:39.326 EDT [1668] postgres@lf DETAIL: Could not open file "pg_xact/0DBD": No such file or directory.
2024-04-14 16:00:39.326 EDT [1668] postgres@lf STATEMENT: CREATE TABLE us113_08 AS SELECT * FROM ( SELECT *, ROW_NUMBER() OVER () AS row_num FROM us113) sub WHERE row_num > 70000000 AND row_num <= 80000000;
This is what I am dealing with.
Every turn I take to get around a problem, I get more errors.

I am not sure if this makes a difference, but the machine actually has 144 GB of RAM, not 128 GB. I know that Windows may have an issue with that, but I would not think Ubuntu would. I thought I'd throw that into the mix anyway.

#16 jack
jack4pg@a7q.com
In reply to: jack (#15)
re: constant crashing

The CSV files are being produced by another system, a Windows app on a Windows machine. I then copy them to a USB key and copy them onto the Ubuntu machine. The data is then imported via the COPY command.

COPY master (field01,field02..fieldX) FROM '/data/file.text' DELIMITER E'\t';
The fields are tab delimited.

But importing the data works. I can get all the data into a single table without any problems. The issue is only when I start to update the single table. And that is why I started using smaller temporary tables for each CSV file, to do the updates in the smaller tables before I move them all to a single large table.

After all the data is loaded and updated, I run PHP programs on the large table to generate reports. All of that works well EXCEPT for performing the updates on the data. And I do not want to use Perl or any outside tool. I want it all done in SQL because I am required to document all my steps so that someone else can take over, so everything needs to be as simple as possible.

#17 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: jack (#16)
Re: constant crashing

On 4/14/24 13:18, jack wrote:

The CSV files are being produced by another system, a WIndows app on a
Windows machine. I then copy them to a USB key and copy them onto the
ubuntu machine. The data is then imported via the COPY command.

The app?

The locale in use on the Windows machine?

The locale in use in the database?

COPY master (field01,field02..fieldX) FROM '/data/file.text' DELIMITER E'\t'
The fields are tab delimited.

But importing the data works. I can get all the data into a single table
without any problems. The issue is only when I start to update the
single table. And that is why I started using smaller temporary tables
for each CSV file, to do the updates in the smaller tables before I move
them all to a single large table.

The import is just dumping the data in, my suspicion is the problem is
related to using string functions on the data.

After all the data is loaded and updated, I run php programs on the
large table to generate reports. All of which works well EXCEPT for
performing the updates on the data. And I do not want to use perl or any
outside tool. I want it all one in SQL because I am required to document
all my steps so that someone else can take over, so everything needs to
be as simple as possible.

--
Adrian Klaver
adrian.klaver@aklaver.com

#18 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: jack (#16)
Re: constant crashing

On 4/14/24 14:50, jack wrote:

Reply to list also
Ccing list

Hello,
I am not sure what "locale" means.

Go to the settings App for whatever version of Windows you are on and
search for locale.

The Windows app is an in-house application which uses Actian Zen SQL.
The data is exported to simple ASCII in a tab-delimited format similar to CSV.

And you know it is ASCII for a fact?

Those files are then imported into the PostgreSQL table using COPY.
Importing the data is not an issue.
I am able to load all the data without any problems, even into 1 table which ends up with about 1.2 billion records.
But when I try to update the data in that table I get many errors, essentially crashes.

Repeating what has been asked and answered is not really going anywhere.

There may be some control characters (garbage) in the data but that should not crash postgresql, especially if it can import the data without issues.

Unless it does. That is the point of the questions, getting to what is
actually causing the issue. Until the problem can be boiled down to a
reproducible test case there really is not much hope of anything more
than the 'yes you have a problem' answer. And there is a difference
between dumping data into a table and then doing an UPDATE where the
data strings are manipulated by functions.
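[Editor's note: the suspected "control characters (garbage)" mentioned above can be located directly before COPY ever runs. A hedged sketch, again assuming the `/data/file.text` path from the COPY command; `-P` requires GNU grep, which Ubuntu ships:

```shell
# List lines containing ASCII control characters other than tab (the
# delimiter) and newline. -n prefixes matching line numbers.
grep -nP '[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]' /data/file.text | head
```

If this prints nothing, stray control bytes can be ruled out as the input problem.]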

Anyway, I hope I answered your questions.
Thanks for your help.


--
Adrian Klaver
adrian.klaver@aklaver.com

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Adrian Klaver (#18)
Re: constant crashing

Adrian Klaver <adrian.klaver@aklaver.com> writes:

Unless it does. That is the point of the questions, getting to what is
actually causing the issue. Until the problem can be boiled down to a
reproducible test case there really is not much hope of anything more
than the 'yes you have a problem' answer.

IIUC, the crashes are not reproducible for the OP either. I hesitate
to suggest "maybe the hardware is flaky" because that seems like
passing the buck. Still, from the info we have the table definitions
are extremely simple and so are the queries, making it hard to believe
that some rare Postgres bug is being tripped over. It makes no sense
to be seeing this level of instability in mainstream usage on a stable
OS and stable Postgres release. So I'm starting to think that a
hardware problem might be the most plausible explanation.

In any case, there is no information we have that suggests any
plausible line of investigation towards a software problem.
If we can see a crash stack trace or two, maybe that would change.

regards, tom lane

#20jack
jack4pg@a7q.com
In reply to: Adrian Klaver (#18)
Re: constant crashing

I wrote the windows app.
I export all data to simple ASCII text where fields are delimited with a tab and then run the file through a UTF8 converter (convertcp_v8.3_x86).
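[Editor's note: whether the converter's output really is valid UTF-8 can be verified independently of the converter itself; iconv exits nonzero at the first invalid byte sequence. A minimal sketch, assuming the `/data/file.text` path from the COPY command earlier in the thread:

```shell
# Round-trip the file through iconv as a UTF-8 validity check; a
# nonzero exit status means it contains an invalid UTF-8 sequence.
iconv -f UTF-8 -t UTF-8 /data/file.text > /dev/null && echo "valid UTF-8" || echo "invalid UTF-8"
```
]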

I will try the entire process on a Xeon E5-1620 and let it run during the night to see what happens. But the current i9 machine is only 4 years old and should have no issues.


#21David Rowley
dgrowleyml@gmail.com
In reply to: jack (#1)
#22David G. Johnston
david.g.johnston@gmail.com
In reply to: jack (#20)
#23Francisco Olarte
folarte@peoplecall.com
In reply to: jack (#16)