Intermittent errors when fetching cursor rows on PostgreSQL 16
Good day.
My name is Enrico Schenone, from Genoa, Italy.
I'm a software architect working at Cleis Tech - Genoa - Italy -
http://gruppocleis.it
My team and I are using PostgreSQL v12 to v16 on Debian 10-12 and
Ubuntu Server 22.04 LTS in a non-clustered configuration.
Our applications are developed with the 4Js Genero platform (classified as
LCAP) - https://4js.com
I wish to report an issue for which I can't say whether it happens on the
server side, the client side, or both.
The problem occurs occasionally and only when fetching rows from a
server-side cursor. The related query may be complex, with joins, or very
simple (just one static table with 86 rows and no WHERE conditions).
I have set the PostgreSQL log verbosity to "debug5" and, out of millions
of log lines, extracted those belonging to the separate failing
sessions/connections.
At the same time I have extracted the related application log.
For each failure reported in the client-side application log, I have a
distinct, detailed PostgreSQL log.
Then I merged the client-side and server-side logs along the timeline
and observed what the client and the server do.
For example (S: means PostgreSQL Server log, while C: means Client log):
S:
2024-12-16 17:27:14.406 CET [2214722] cleistech@hh24odds_prod - 192.168.16.17900000676054e0.21cb42 STATEMENT: fetch forward 50 from cu6
C:
ERROR ; 2024-12-16 17:27:14.407; PID: 104257; User: genero; Ricerca quote evento 1433958 fallita. General SQL error, check SQLCA.SQLERRD[2]. - SQLSTATE: XX001
[Italian text: "Odds search for event 1433958 failed."]
S:
2024-12-16 17:27:14.407 CET [2214722] cleistech@hh24odds_prod - 192.168.16.17908006676054e0.21cb42 LOG: 08006: could not receive data from client: Connection reset by peer
2024-12-16 17:27:14.407 CET [2214722] cleistech@hh24odds_prod - 192.168.16.17908006676054e0.21cb42 LOCATION: pq_recvbuf, pqcomm.c:953
2024-12-16 17:27:14.407 CET [2214722] cleistech@hh24odds_prod - 192.168.16.17908003676054e0.21cb42 DEBUG: 08003: unexpected EOF on client connection
C:
ERROR ; 2024-12-16 17:27:14.408; PID: 104257; User: genero; ver_quote: ERRORE in foreach ricerca bettype con spread. SQLSTATE: XX000 - SQLCODE: -6372 - -1 - no connection to the server - abbandono validazione.
[Italian text: "ERROR in foreach bettype search with spread ... abandoning validation."]
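The timeline merge described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the tooling actually used for this report: the function names and the shortened messages are made up for the example, and only the timestamps and codes come from the excerpt.

```python
from datetime import datetime
from heapq import merge

TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def parse_ts(text):
    """Parse a timestamp such as '2024-12-16 17:27:14.406'."""
    return datetime.strptime(text, TS_FORMAT)


def merge_timeline(client, server):
    """Merge two timestamp-sorted lists of (timestamp, message) pairs.

    Each entry is tagged with "C" (client) or "S" (server). When two
    timestamps are equal, the client entry is emitted first.
    """
    tagged_client = (("C", ts, msg) for ts, msg in client)
    tagged_server = (("S", ts, msg) for ts, msg in server)
    return list(merge(tagged_client, tagged_server,
                      key=lambda entry: parse_ts(entry[1])))


# Shortened versions of the log lines quoted above.
client_log = [
    ("2024-12-16 17:27:14.407", "SQLSTATE XX001 on fetch"),
    ("2024-12-16 17:27:14.408", "SQLSTATE XX000, no connection to the server"),
]
server_log = [
    ("2024-12-16 17:27:14.406", "STATEMENT: fetch forward 50 from cu6"),
    ("2024-12-16 17:27:14.407", "LOG: 08006: could not receive data from client"),
    ("2024-12-16 17:27:14.407", "DEBUG: 08003: unexpected EOF on client connection"),
]

timeline = merge_timeline(client_log, server_log)
for source, ts, msg in timeline:
    print(f"{source}: {ts} {msg}")
```

Both logs have millisecond resolution, so entries that share a timestamp (17:27:14.407 here) cannot be ordered reliably. The sketch simply puts the client entry first, which should not be read as proof of which side acted first.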
Before failing on the reported cursor, the program successfully creates
and uses other cursors.
When the issue is detected on the client side, the program terminates
cleanly (without aborting) and is re-instantiated within seconds or
minutes by a scheduler.
After a variable time (normally some minutes) and several failures, it
runs to completion without errors.
What I reported in the body of this mail is only a subset of the PostgreSQL
and application logs. I can send several log files, each reporting a
distinct and complete connection ID history.
I tried to reproduce the issue in an "in vitro" environment, with
one to many instances of the same program (up to 99 parallel
instances), and executed more than half a million test processes
without errors.
This problem commonly happens only in production environments.
Production environments can be:
* Distinct application server and DB server on distinct subnets (no
dropped packet detected on firewall, no memory/disk/network failure
detected by "nmon" tool)
* Distinct application server and DB server on same subnet (no firewall)
* Same server for PostgreSQL and applications
The VM running the PostgreSQL instance that I used for my test is an
Ubuntu Server 22.04 LTS with 16 CPUs and 64 GB of RAM.
For client applications I use Ubuntu Server 22.04 LTS.
The postgresql.conf file is attached to this e-mail.
I can detect that there is an error, but I am going mad trying to find
it. It is like a phantom that I know exists but can't bring to light.
I kindly ask you to help me understand what the problem is, where it is,
and how to solve it.
I hope you can help me, or point me to someone who can.
Thanks in advance.
Enrico
--
*Enrico Schenone*
Software Architect
*Cleis Tech s.r.l.* - www.gruppocleis.it
Sede di Genova, Via Paolo Emilio Bensa, 2 - 16124 Genova, ITALY
Tel: +39-0104071400 Fax: +39-0104073276
Mobile: +39-320 7709352
E-mail: eschenone@cleistech.it
On Wed, Dec 18, 2024 at 5:01 AM Enrico Schenone <eschenone@cleistech.it>
wrote:
> The problem occurs occasionally and only when fetching rows from a
> server-side cursor. The related query may be complex with joins or very
> easy (just one static table with 86 rows without WHERE conditions).
Can you replicate the error in Prod using psql and cursors?
See
https://www.postgresql.org/docs/current/plpgsql-cursors.html#PLPGSQL-CURSOR-USING
Section 41.7.3.5.
--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!
On 12/17/24 08:30, Enrico Schenone wrote:
> I wish to report an issue where I can't say if it happens at server or
> client side (or both as well).
This:
"unexpected EOF on client connection"
makes me believe this is on the client side.
To be clear, the client is running on Ubuntu Server 22.04, correct?
Have you looked at the OS system log for relevant entries at the time
the error occurs?
If so, what are they?
This only happens in the production environment; is there anything in it
that is materially different from where you ran the test below?
> I tried to reproduce the issue on a "in-vitro" environment, with
> single-to-multiple instances of the same program (up to 99 parallel
> instances) and I have executed more than half million of test processes
> without errors.
--
Adrian Klaver
adrian.klaver@aklaver.com
Good day, Adrian.
First of all, thank you for your answer.
Yes, the client is running on Ubuntu 22.04 LTS in this case, but I have
observed the same problem on Debian 10 and 12 and Postgres versions 12,
13 and 16.
In the reported case no relevant information was logged in syslog
or kern.log on either the client or the server side.
What differs between the production and test environments in the
reported case is:
* the OS (Debian 12 on the test DB server and Debian 10 on the client server)
* the test DB is a clone of the production DB, but no other task was
running on the client side, while in production we have a lot of SQL
operations invoked by many batch (daemon) services running
on the production client.
This is a typical load of the DB server ...
top - 08:18:38 up 28 days, 13:53, 1 user, load average: 1,15, 1,39, 1,29
Tasks: 381 total, 2 running, 379 sleeping, 0 stopped, 0 zombie
%Cpu(s): 3,5 us, 1,7 sy, 0,0 ni, 94,1 id, 0,0 wa, 0,0 hi, 0,7 si, 0,0 st
MiB Mem : 64394,2 total, 35903,3 free, 1001,4 used, 27489,6 buff/cache
MiB Swap: 65536,0 total, 65519,1 free, 16,9 used. 58034,8 avail Mem
At the time the error occurs, dozens of other SQL sessions are active and
running on the DB server, and none of them reports any error at all (not
only fetch errors).
This sometimes also happens on systems under little or no load.
One of the things I don't understand is why, on the client side, I get the
XX001 error on the FETCH (normally the first fetch), while on the server
side I have no error related to the fetch forward.
Another is why, in the meantime, no other client application reports an
error, considering that there may be several parallel instances of the
same client application.
And finally, why does the same process, newly instantiated seconds or
minutes later, work with no more errors?
I suppose that the client closes the connection once it gets the XX001
error, but I can't say why it receives this error when it is not
reported on the server side and no block I/O error is reported.
Is it a false positive, or what?
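As a reading aid for the codes quoted in this thread, the small lookup below maps each SQLSTATE to the class and condition name given in PostgreSQL's error code table. It is only a sketch covering the five codes seen here. Whether the Genero driver passes server codes through unchanged or assigns some of them itself is an open question that this lookup does not answer.

```python
# SQLSTATE values that appear in this thread, with the condition names
# listed in the PostgreSQL documentation (Appendix A, Error Codes).
SQLSTATE_NAMES = {
    "00000": "successful_completion",
    "08003": "connection_does_not_exist",
    "08006": "connection_failure",
    "XX000": "internal_error",
    "XX001": "data_corrupted",
}

# The first two characters of a SQLSTATE identify its class.
SQLSTATE_CLASSES = {
    "00": "Successful Completion",
    "08": "Connection Exception",
    "XX": "Internal Error",
}


def describe(sqlstate):
    """Return (class name, condition name) for a five-character SQLSTATE."""
    cls = SQLSTATE_CLASSES.get(sqlstate[:2], "unknown class")
    name = SQLSTATE_NAMES.get(sqlstate, "unknown condition")
    return cls, name


for code in ("XX001", "XX000", "08006", "08003"):
    cls, name = describe(code)
    print(f"{code}: {cls} / {name}")
```

Read this way, the server-side codes (08006, 08003) both belong to the connection exception class, while the client-side codes (XX001, XX000) belong to the internal error class.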
Four Js support said: <We use the standard C API provided by the DB
vendor. In the case of PostgreSQL, we use the C API client -
https://www.postgresql.org/docs/current/libpq.html >
On the client side I have installed the following PostgreSQL packages ...
postgresql-client-16:amd64/jammy-pgdg 16.5-1.pgdg22.04+1 upgradeable to 16.6-1.pgdg22.04+1
postgresql-client-common:all/jammy-pgdg 262.pgdg22.04+1 upgradeable to 267.pgdg22.04+1
Best regards.
Enrico
On 12/18/24 23:52, Enrico Schenone wrote:
> One of things I don't understand is why at client side I get the XX001
> error on the FETCH (normally the first fetch) while at server side I
> have no error related to the fetch forward ?
Where are you fetching the client error messages from?
> Another is why in the meantime no other client application report an
> error, considering that there may be several parallel instances of the
> same client application ?
> And finally why after seconds or minutes the same process newly
> instantiated works with no more errors ?
Answers to this and the below are going to need the client code.
> I can suppose that the client closes the connection once got the XX001
> error, but I can't say why it receives this error while it is not
> reported at server side and not block i/o error is reported.
> Is it a false positive or what ?
--
Adrian Klaver
adrian.klaver@aklaver.com
Good day, Adrian.
I get the error inside the program by catching the exception and logging
it with the diagnostic info provided by the DVM (a runtime interpreter
similar in concept to a JVM) that embeds the PG driver.
This is the fragment of the source code where the error occurs
(comment lines start with #) ...
    # Compose the query string with a hardcoded WHERE part. In other cases
    # the query is parametric and values are passed with the FOREACH
    # (FOREACH ... USING <argument list> ...)
    LET l_qry = "SELECT * FROM quote_forn "||
                " WHERE evento_id = "||t_qtf.evento_id||
                " and bt_id = "||idhh_bt
    PREPARE q_stm FROM l_qry
    DECLARE c_cur CURSOR FOR q_stm
    TRY
        LET c_qtf = ar_qtforn.getLength()
        LET j = c_qtf
        # FOREACH is a construct that simplifies the OPEN/FETCH/CLOSE
        # sequence. At runtime it is translated to
        # OPEN cursor ... FETCH rows ... CLOSE cursor.
        FOREACH c_cur INTO r_qtf.*
            FOR v = 1 TO c_qtf
                IF ... ... THEN ... ... END IF
            END FOR
            LET j = j + 1
            LET ar_qtforn[j].id = r_qtf.id
            ... ...
            ... ...
        END FOREACH
    CATCH
        # SQLSTATE is a predefined variable containing the native PG SQLSTATE
        LET str_msg = "Some error message ... event: ", t_qtf.evento_id,
                      " has failed. ", SQLERRMESSAGE, " - SQLSTATE: ", SQLSTATE
        # Write an application log line (the one I sent you in the
        # cross-log comparison along the timeline)
        CALL GesLog(NULL, 1, str_msg)
    END TRY
    ... ...
    ... ...
After that, the program tries a single-row SELECT on a table just to
check whether it is still able to get data from the DB, and it fails with
SQLSTATE XX000 (what you see in the log fragment one millisecond
after the server log reports error 08003).
In some cases the query is very complex and in others very simple; it
does not seem to matter.
As you can see, the code is very simple, but 999 times it works fine and
one time it fails, returning XX001 for minutes, while in the meantime a
lot of SQL operations, including cursor FETCHes, work well.
I can't give you information on what the DVM does at a low level, but I
can send you the distinct full session log fragment from the server side,
where it is quite easy to understand how the DVM translates the program's
SQL queries and what PostgreSQL does.
Can I give you any other information?
Do you think it would be useful to include the 4Js Support people in
this thread?
Thanks again and best regards.
Enrico
On 12/19/24 10:11, Enrico Schenone wrote:
> I get the error inside the program by catching the exception and logging
> it with diagnostic info provided by the DVM (a runtime interpreter
> similar in concept to a JVM) that embed the PG driver.
DVM is this?:
https://www.geeksforgeeks.org/what-is-dvmdalvik-virtual-machine/
In other words an Android client?
> I can't give you info on what the DVM does at low level, but I can send
> you the distinct full session log fragment at server side, where it is
> quite easy to understand how the DVM translates the program's SQL
> queries end what PostgreSQL does.
That might be useful.
> May I give you any other info ?
Not at the moment.
> Do you think it can be useful to include in this thread the 4Js Support
> guys ?
I could see filing an issue and pointing at this thread:
/messages/by-id/446423eb-4a4e-4135-bbb8-4d0e5c7aac3b@cleistech.it
--
Adrian Klaver
adrian.klaver@aklaver.com
Hello, my answers are inline in your message ...
Thanks a lot again.
Enrico
On 19/12/24 19:27, Adrian Klaver wrote:
> DVM is this?:
> https://www.geeksforgeeks.org/what-is-dvmdalvik-virtual-machine/
The 4Js DVM (Dynamic Virtual Machine) is this one:
https://4js.com/online_documentation/fjs-gas-manual-html/index.html#gas-topics/c_gas_what_is_dvm.html
> In other words an Android client?
No, it is a runtime interpreter for Linux, Windows, IBM AIX, macOS and
other Unix-like OSs. It ensures the portability of 4Js Genero compiled
programs (p-code) across several OS platforms.
4Js Genero is a Low Code Application Platform. The programming language,
named "BDL - Business Development Language", is an evolution of
Informix-4GL.
Compiled programs need a runtime interpreter (the DVM) to be executed.
The DVM embeds, at a low level, the DB drivers provided by several vendors,
and at the BDL level the application program can easily connect to the
major DBs on the market thanks to its ODI (Open Database Interface).
> That might be useful.
Please take a look at the attached text file, which is the full failing
session log (filtered from the debug5 PostgreSQL server log).
On 12/19/24 11:40 AM, Enrico Schenone wrote:
> The DVM embeds at low-level the DB drivers provided by several vendors,
From previous post you mentioned:
"Four Js support said <We use the standard C API provided by the DB
vendor. In the case of PostgreSQL, we use the C API client "
So are they building their own driver over libpq?
> Please take a look to the attached text file, that is the full failing
> session log (filtered from the debug5 PostgreSQL server log).
This is where it falls off the rails, but I can't see why:
2024-12-16 17:27:14.406 CET [2214722] cleistech@hh24odds_prod - 192.168.16.17900000676054e0.21cb42 LOCATION: ShowTransactionStateRec, xact.c:5510
2024-12-16 17:27:14.406 CET [2214722] cleistech@hh24odds_prod - 192.168.16.17900000676054e0.21cb42 STATEMENT: fetch forward 50 from cu6
2024-12-16 17:27:14.406 CET [2214722] cleistech@hh24odds_prod - 192.168.16.17900000676054e0.21cb42 LOG: 00000: statement: fetch forward 50 from cu6
2024-12-16 17:27:14.406 CET [2214722] cleistech@hh24odds_prod - 192.168.16.17900000676054e0.21cb42 LOCATION: exec_simple_query, postgres.c:1073
2024-12-16 17:27:14.406 CET [2214722] cleistech@hh24odds_prod - 192.168.16.17900000676054e0.21cb42 DEBUG: 00000: CommitTransaction(1) name: unnamed; blockState: STARTED; state: INPROGRESS, xid/subid/cid: 0/1/0
2024-12-16 17:27:14.406 CET [2214722] cleistech@hh24odds_prod - 192.168.16.17900000676054e0.21cb42 LOCATION: ShowTransactionStateRec, xact.c:5510
2024-12-16 17:27:14.406 CET [2214722] cleistech@hh24odds_prod - 192.168.16.17900000676054e0.21cb42 STATEMENT: fetch forward 50 from cu6
2024-12-16 17:27:14.407 CET [2214722] cleistech@hh24odds_prod - 192.168.16.17908006676054e0.21cb42 LOG: 08006: could not receive data from client: Connessione interrotta dal corrispondente [Connection reset by peer]
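The long field after the dash in each log line appears to be three values written back to back: the client host, the five-character SQLSTATE, and the session ID. That layout is inferred from the excerpt, not confirmed from the attached postgresql.conf. Under that assumption, a short Python sketch can split the field and decode the session ID, which PostgreSQL documents as the backend start time and process ID in hexadecimal. The function name and pattern are illustrative only.

```python
import re
from datetime import datetime, timezone

# Assumed layout of the fused field: client host, SQLSTATE, session ID,
# with no separators between them (inferred from the excerpt above).
FUSED = re.compile(
    r"(?P<host>.+?)(?P<sqlstate>[0-9A-Z]{5})"
    r"(?P<start>[0-9a-f]{8})\.(?P<pid>[0-9a-f]+)"
)


def split_prefix(field):
    """Split the fused field into host, SQLSTATE, session start (UTC), PID."""
    m = FUSED.fullmatch(field)
    if m is None:
        raise ValueError(f"unrecognised field: {field!r}")
    # Session ID = hex start time (epoch seconds) + "." + hex process ID.
    start = datetime.fromtimestamp(int(m["start"], 16), tz=timezone.utc)
    return m["host"], m["sqlstate"], start, int(m["pid"], 16)


host, sqlstate, start, pid = split_prefix("192.168.16.17908006676054e0.21cb42")
print(host, sqlstate, start.isoformat(), pid)
```

The decoded process ID equals the bracketed [2214722] in the same log lines, which supports the assumed layout. Every line of the excerpt carries the same session ID, so they all belong to one backend session.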
--
Adrian Klaver
adrian.klaver@aklaver.com
On 19/12/24 22:47, Adrian Klaver wrote:
> From previous post you mentioned:
> "Four Js support said <We use the standard C API provided by the DB
> vendor. In the case of PostgreSQL, we use the C API client "
> So are they building their own driver over libpq?
I think so.
They wrote ...
<The error "no connection to the server" is definitively a PostgreSQL
error:
./src/interfaces/libpq/fe-exec.c: libpq_append_conn_error(conn, "no connection to the server");
It is not normal that PostgreSQL client can connect to the server, do
some SQL with success and then the SQL connection gets dropped at the
next SQL statement execution. This is really suspicious.>
> This is where it falls off the rails, but I can't see why?:
> 2024-12-16 17:27:14.407 CET [2214722] cleistech@hh24odds_prod - 192.168.16.17908006676054e0.21cb42 LOG: 08006: could not receive data from client: Connessione interrotta dal corrispondente [Connection reset by peer]
Yes, at 2024-12-16 17:27:14.407.
This seems to match exactly the XX001 error reported by the client
application.
Best regards.
Enrico
Hi, Adrian.
Today I collected a tcpdump on the client side of the traffic between
the application server and the DB server, while the issue was occurring
once per second in another program.
I am sending you two files.
The first one is a zipped tarball (.tgz) containing a text
representation of the tcpdump, starting at the point where it reports the
declaration of the failing cursor ("cu4", as you can see in the first
line of the file) and the subsequent fetch. Note that the client
application log recorded the XX001 error on the first FETCH of the
cursor at 2024-12-20 12:17:35.175
The second file (also a zipped tarball, .tgz) is too big to be sent as an
attachment, so I provide a link where it can be downloaded. It is the
portion of the tcpdump recorded during the program failure (which occurred
several times). It is in .pcap format, so it can be opened with
Wireshark or tcpdump -A -r
Anyone interested can download it at
https://cleislabs.cleistech.it/downloads/tcpdump_out009.pcap.tgz
Note that during the dump several different cursors were declared with
the name "cu4", but the failing one is the one in the first line.
Maybe an expert (I'm not one) can see whether the disconnection is
really initiated by the client and/or whether the data returned by the
server is really corrupted, as the XX001 SQLSTATE suggests.
Best regards.
Enrico
Il 19/12/24 22:47, Adrian Klaver ha scritto:
Show quoted text
On 12/19/24 11:40 AM, Enrico Schenone wrote:
Hello, my answers in line along your message ...
Thanks a lot again.Enrico
On 12/19/24 10:11, Enrico Schenone wrote:
Good day, Adrian.
I get the error inside the program by catching the exception and
logging it with diagnostic info provided by the DVM (a runtime
interpreter similar in concept to a JVM) that embeds the PG driver.
The 4Js DVM (Dynamic Virtual Machine) is this one:
https://4js.com/online_documentation/fjs-gas-manual-html/index.html#gas-topics/c_gas_what_is_dvm.html
In other words an Android client?
No, it is a runtime interpreter for Linux, Windows, IBM AIX, macOS
and other Unix-like OSs. It ensures the portability of 4Js Genero
compiled programs (p-code) across several OS platforms.
4Js Genero is a Low Code Application Platform. The programming
language, named "BDL - Business Development Language", is an
evolution of Informix-4GL.
Compiled programs need a runtime interpreter (DVM) to be executed.
The DVM embeds at low level the DB drivers provided by several vendors,
and at the BDL high level the application program can easily connect to
the major DBs on the market thanks to its ODI (Open Database Interface).
From previous post you mentioned:
"Four Js support said <We use the standard C API provided by the DB
vendor. In the case of PostgreSQL, we use the C API client "
So are they building their own driver over libpq?
I can't give you info on what the DVM does at low level, but I can
send you the distinct full session log fragment at server side,
where it is quite easy to understand how the DVM translates the
program's SQL queries and what PostgreSQL does.
That might be useful.
Please take a look at the attached text file, which is the full
failing session log (filtered from the debug5 PostgreSQL server log).
This is where it falls off the rails, but I can't see why?:
2024-12-16 17:27:14.406 CET [2214722] cleistech@hh24odds_prod -
192.168.16.17900000676054e0.21cb42 LOCATION: ShowTransactionStateRec,
xact.c:5510
2024-12-16 17:27:14.406 CET [2214722] cleistech@hh24odds_prod -
192.168.16.17900000676054e0.21cb42 STATEMENT: fetch forward 50 from cu6
2024-12-16 17:27:14.406 CET [2214722] cleistech@hh24odds_prod -
192.168.16.17900000676054e0.21cb42 LOG: 00000: statement: fetch
forward 50 from cu6
2024-12-16 17:27:14.406 CET [2214722] cleistech@hh24odds_prod -
192.168.16.17900000676054e0.21cb42 LOCATION: exec_simple_query,
postgres.c:1073
2024-12-16 17:27:14.406 CET [2214722] cleistech@hh24odds_prod -
192.168.16.17900000676054e0.21cb42 DEBUG: 00000: CommitTransaction(1)
name: unnamed; blockState: STARTED; state: INPROGRESS, xid/subid/cid:
0/1/0
2024-12-16 17:27:14.406 CET [2214722] cleistech@hh24odds_prod -
192.168.16.17900000676054e0.21cb42 LOCATION: ShowTransactionStateRec,
xact.c:5510
2024-12-16 17:27:14.406 CET [2214722] cleistech@hh24odds_prod -
192.168.16.17900000676054e0.21cb42 STATEMENT: fetch forward 50 from cu6
2024-12-16 17:27:14.407 CET [2214722] cleistech@hh24odds_prod -
192.168.16.17908006676054e0.21cb42 LOG: 08006: could not receive data
from client: Connessione interrotta dal corrispondente
Thanks again and best regards.
Enrico
Attachments:
tcpdump_out_009_cu4.tgz (application/x-compressed-tar)
On 12/19/24 23:57, Enrico Schenone wrote:
At 19/12/24 22:47, Adrian Klaver wrote:
So are they building their own driver over libpq?
I think so.
They wrote ...
<The error "no connection to the server" is definitively a PostgreSQL
error:
./src/interfaces/libpq/fe-exec.c: libpq_append_conn_error(conn, "no connection to the server");
It is not normal that PostgreSQL client can connect to the server, do
some SQL with success and then the SQL connection gets dropped at the
next SQL statement execution. This is really suspicious.>
They must work in a perfect world where networks never fail. One that
would render the blog below unneeded:
https://www.cybertec-postgresql.com/en/tcp-keepalive-for-a-better-postgresql-experience/
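
For reference, the client side of a libpq connection can request TCP keepalives through the `keepalives`, `keepalives_idle`, `keepalives_interval` and `keepalives_count` connection parameters. A minimal sketch of assembling such a connection string follows; the helper name and the timing values are illustrative assumptions, and whether the 4Js DVM lets an application pass these parameters is not established in this thread.

```python
def keepalive_conninfo(host, dbname, user, idle=60, interval=10, count=5):
    """Build a libpq key/value connection string with TCP keepalives enabled.

    Values containing spaces would need quoting, which this sketch omits.
    """
    params = {
        "host": host,
        "dbname": dbname,
        "user": user,
        "keepalives": 1,                  # turn client-side keepalives on
        "keepalives_idle": idle,          # seconds of inactivity before the first probe
        "keepalives_interval": interval,  # seconds between unanswered probes
        "keepalives_count": count,        # lost probes before the connection is dropped
    }
    return " ".join(f"{key}={value}" for key, value in params.items())
```

Keepalives help with connections silently dropped while idle; they would not by themselves explain a reset that arrives in the middle of an active FETCH.
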
Best regards.
Enrico
--
Adrian Klaver
adrian.klaver@aklaver.com
On 12/20/24 07:02, Enrico Schenone wrote:
Maybe an expert (I'm not one) can see whether the disconnection is
really made by the client and/or whether the data returned by the server are
really corrupted, as the XX001 SQLSTATE suggests.
This is beyond me, someone else will need to chime in.
Best regards.
Enrico
--
Adrian Klaver
adrian.klaver@aklaver.com
I'm getting a strange error message when I try to insert a date using the view/edit grid in pgadmin. See below. I've tried quotes, no quotes and various formats. The column type is clearly "date."
Mark Brady, Ph.D.
Deputy Chief Data Officer, TRMC
https://amazon.com/author/markjbrady
On 12/20/24 12:05, mark bradley wrote:
I'm getting a strange error message when I try to insert a date using
the view/edit grid in pgadmin. See below. I've tried quotes, no quotes
and various formats. The column type is clearly "date."
Don't hijack a thread, start a new one.
--
Adrian Klaver
adrian.klaver@aklaver.com
Hi, Adrian.
I'm arranging a test program with two nested cursors in two versions:
1. 4Js Genero BDL language
2. pure C with libpq
I'll run both programs under stress in the production
environment, watching for some hours how they behave.
Possible outcomes are:
1. neither throws an error
2. only the 4Js Genero version throws an error
3. only the pure C version throws an error
4. both versions throw the error
This stress test should direct further investigation.
I'll keep you informed.
Regards.
Enrico Schenone
On 12/24/24 14:23, Enrico Schenone wrote:
Hi, Adrian.
I'm arranging a test program with two nested cursors in two versions:
1. 4Js Genero BDL language
2. pure C with libpq
I'll run both programs under stress in the production
environment, watching for some hours how they behave.
Possible outcomes are:
1. neither throws an error
2. only the 4Js Genero version throws an error
3. only the pure C version throws an error
4. both versions throw the error
This stress test should direct further investigation.
I'll keep you informed.
Yes, would like to see how this turns out.
Regards.
Enrico Schenone
--
Adrian Klaver
adrian.klaver@aklaver.com
Hello, Adrian.
As I said some days ago, I have arranged a kind of stress test in the
production environment.
I wrote a program that creates a temporary table, loads 2049 rows into
it from a baseline_table and finally declares two nested cursors.
The first cursor is on the temp table as parent, while the second is on a
lookup table as child.
The program reproduces a fragment of several production programs that
were failing on cursors, and is intended as a POC only.
The program has been written both in pure C with libpq (see attached
source program) and in the 4Js Genero language.
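
The attached sources are not reproduced in this archive. As a rough illustration of the statement flow such a parent/child cursor test generates, here is a minimal sketch that only builds the SQL text. Apart from `baseline_table` and the `fetch forward 50` batch size seen in the server log, every name (`tmp_parent`, `lookup_table`, `cu_parent`, `cu_child`, the `id` and `parent_id` columns) is hypothetical, and the explicit transaction is an assumption.

```python
def nested_cursor_script(parent_keys, batch=50):
    """Yield, in order, the SQL statements for a parent/child cursor loop.

    parent_keys stands in for the key values the parent cursor would return.
    """
    yield "BEGIN"
    yield ("CREATE TEMP TABLE tmp_parent AS "
           f"SELECT * FROM baseline_table LIMIT {len(parent_keys)}")
    yield "DECLARE cu_parent CURSOR FOR SELECT id FROM tmp_parent"
    for start in range(0, len(parent_keys), batch):
        # One round trip fetches up to `batch` parent rows.
        yield f"FETCH FORWARD {batch} FROM cu_parent"
        for key in parent_keys[start:start + batch]:
            # The child cursor is declared again for every parent row.
            yield ("DECLARE cu_child CURSOR FOR "
                   f"SELECT * FROM lookup_table WHERE parent_id = {int(key)}")
            yield f"FETCH FORWARD {batch} FROM cu_child"
            yield "CLOSE cu_child"
    yield "CLOSE cu_parent"
    yield "COMMIT"
```

With 2049 parent keys this produces 2049 child cursor declarations per run. A real client would also issue a final FETCH that returns no rows, which the sketch leaves out.
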
Each program was executed by a shell script loop that ran the
program 10 times each minute with a 1 second sleep between runs (see attachment).
An automatic scheduler continuously submitted 4 parallel tasks (two
for the C version and two for the 4Js version).
The test was started on Dec 29, 2024 and was kept running for
many days directly in the production environment.
In total, nearly a billion child test cursors were executed while all
other production tasks were running (normally 20 to 30 concurrent batch
services out of a pool of 100).
And, well, I'm quite confused: no error at all has been detected, not
only in the test programs but in the whole production system. The error
has completely disappeared.
Then I stopped the four tasks of the stress test, leaving all other
services running for a week, and again no error at all.
No setup was changed, no server was rebooted, and no infrastructure was
upgraded during the test period.
As a result, at the moment I understand neither why and where the
error was occurring, nor why it has disappeared.
Anyone should feel free to give me their opinion.
For the moment I'll make no further tests unless the error comes knocking
at my door again.
*Enrico Schenone*
Software Architect
*Cleis Tech s.r.l.* - www.gruppocleis.it
Sede di Genova, Via Paolo Emilio Bensa, 2 - 16124 Genova, ITALY
Tel: +39-0104071400 Fax: +39-0104073276
Mobile: +39-320 7709352
E-mail: eschenone@cleistech.it
https://gruppocleis.it
On 1/13/25 00:45, Enrico Schenone wrote:
Hello, Adrian.
As I said some days ago, I have arranged a kind of stress test in the
production environment.
I wrote a program that creates a temporary table, loads 2049 rows into
it from a baseline_table and finally declares two nested cursors.
The first cursor is on the temp table as parent, while the second is on a
lookup table as child.
The program reproduces a fragment of several production programs that
were failing on cursors, and is intended as a POC only.
And, well, I'm quite confused: no error at all has been detected, not
only in the test programs but in the whole production system. The error
has completely disappeared.
Then I stopped the four tasks of the stress test, leaving all other
services running for a week, and again no error at all.
No setup was changed, no server was rebooted, and no infrastructure was
upgraded during the test period.
You are absolutely sure about the above?
As a result, at the moment I understand neither why and where the
error was occurring, nor why it has disappeared.
Errors that 'fix' themselves are the most frustrating kind, as you know
in the back of your mind they will likely pop up again.
Anyone should feel free to give me their opinion.
For the moment I'll make no further tests unless the error comes knocking
at my door again.
That is all you can do.
*Enrico Schenone*
Software Architect
--
Adrian Klaver
adrian.klaver@aklaver.com
On 13/01/25 17:19, Adrian Klaver wrote:
On 1/13/25 00:45, Enrico Schenone wrote:
Hello, Adrian.
As I said some days ago, I have arranged a kind of stress test in the
production environment.
I wrote a program that creates a temporary table, loads 2049 rows into
it from a baseline_table and finally declares two nested cursors.
The first cursor is on the temp table as parent, while the second is
on a lookup table as child.
The program reproduces a fragment of several production programs that
were failing on cursors, and is intended as a POC only.
And, well, I'm quite confused: no error at all has been detected, not
only in the test programs but in the whole production system. The
error has completely disappeared.
Then I stopped the four tasks of the stress test, leaving all
other services running for a week, and again no error at all.
No setup was changed, no server was rebooted, and no infrastructure was
upgraded during the test period.
You are absolutely sure about the above?
I can say yes. All test operations have been logged and verified against
the PostgreSQL log.
The only component not under my control is the provider's
infrastructure, but the infrastructure admin assured me that no
operation at all has been performed. I believe him because he is a reliable
technician and a well-known person.
As a result, at the moment I understand neither why and where
the error was occurring, nor why it has disappeared.
Errors that 'fix' themselves are the most frustrating kind, as you
know in the back of your mind they will likely pop up again.
True, knocking at my door again ... I still can't believe it.
Anyone should feel free to give me their opinion.
For the moment I'll make no further tests unless the error comes knocking
at my door again.
That is all you can do.
*Enrico Schenone*
Software Architect
Thanks a lot for your interest in sharing my strange experience.
Best regards.
Enrico
*Enrico Schenone*
Software Architect