How different is AWS-RDS postgres?

Started by Rob Sargent · almost 5 years ago · 19 messages · general
#1Rob Sargent
robjsargent@gmail.com

I have what purports to be Postgres 12 (PostgreSQL 12.5 on
x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat
4.8.5-11), 64-bit) running on RDS.  I'm testing a new release of my
service with an admittedly large data set (which may be swamping my AWS
bandwidth).  But my app logs tell me a table was created and loaded (and
later, read).

From my tomcat log

00:09:58.567 [https-jsse-nio-10.0.2.28-15002-exec-3] DEBUG edu.utah.camplab.jx.PayloadFromMux - STAGING TABLE CREATED: bulk."rjs_GEV15_15_ff3ba73c_d927_431c_bd29_9687a47f1594"
00:09:58.585 [https-jsse-nio-10.0.2.28-15002-exec-3] INFO edu.utah.camplab.jx.PayloadFromMux - ff3ba73c-d927-431c-bd29-9687a47f1594: started COPY work at 1621987798585
00:09:58.586 [ff3ba73c-d927-431c-bd29-9687a47f1594] INFO edu.utah.camplab.jx.PayloadWriterThread - bulk."rjs_GEV15_15_ff3ba73c_d927_431c_bd29_9687a47f1594": Begin bulk copy segment
00:10:01.597 [https-jsse-nio-10.0.2.28-15002-exec-11] ERROR edu.utah.camplab.servlet.PayloadSaveServlet - received payload
00:10:01.598 [https-jsse-nio-10.0.2.28-15002-exec-11] DEBUG org.jooq.impl.DefaultConnectionProvider - setting auto commit : false
00:10:01.599 [https-jsse-nio-10.0.2.28-15002-exec-11] DEBUG edu.utah.camplab.jx.PayloadFromMux - 074d449b-c3ba-499f-83e3-f48427fe0156: start transaction at 1621987801599
00:10:01.599 [https-jsse-nio-10.0.2.28-15002-exec-11] DEBUG org.jooq.impl.DefaultConnectionProvider - set savepoint
00:10:07.511 [ff3ba73c-d927-431c-bd29-9687a47f1594] INFO edu.utah.camplab.jx.PayloadWriterThread - bulk transfer of 2528447 took 8.925s
00:10:07.511 [ff3ba73c-d927-431c-bd29-9687a47f1594] DEBUG edu.utah.camplab.jx.PayloadWriterThread - staged in 8925 ms
00:10:07.567 [https-jsse-nio-10.0.2.28-15002-exec-3] INFO edu.utah.camplab.jx.PayloadFromMux - ff3ba73c-d927-431c-bd29-9687a47f1594: Total segment save took 9486 ms
00:10:07.567 [https-jsse-nio-10.0.2.28-15002-exec-3] INFO edu.utah.camplab.jx.AbstractPayload - ff3ba73c-d927-431c-bd29-9687a47f1594: closing process ff3ba73c-d927-431c-bd29-9687a47f1594
00:10:07.608 [https-jsse-nio-10.0.2.28-15002-exec-3] DEBUG org.jooq.impl.DefaultConnectionProvider - release savepoint
00:10:07.609 [https-jsse-nio-10.0.2.28-15002-exec-3] DEBUG edu.utah.camplab.jx.PayloadFromMux - ff3ba73c-d927-431c-bd29-9687a47f1594: end transaction at 1621987807609

Which claims to have written 2,528,447 records in roughly 9 seconds into
the newly created table
bulk."rjs_GEV15_15_ff3ba73c_d927_431c_bd29_9687a47f1594". Nice.

However, no such table exists, though later processing renames it by
appending "_done" to the name (being careful to remain under 64 chars).

My middleware does receive an exception notice

00:10:55.101 [https-jsse-nio-10.0.2.28-15002-exec-3] ERROR edu.utah.camplab.jx.AbstractPayload - run ff3ba73c-d927-431c-bd29-9687a47f1594: Exception from db write: SQL [insert into segment select * from bulk."rjs_GEV15_15_ff3ba73c_d927_431c_bd29_9687a47f1594" as s where s.probandset_id >= 'a0000000-0000-0000-0000-000000000000' and s.probandset_id < 'b0000000-0000-0000-0000-000000000000' ]; An I/O error occurred while sending to the backend.: {}

which confirms(?) that the table /was/ there and had been read from 10
times prior, since I copy from this temporary, bulk-loaded table to the
actual target in 16 chunks by diddling the first hex digit from 0
through f.  Here zero through nine apparently didn't have a problem.
These 16 inserts are in a single transaction, separate from the bulk
copy.  (There are a dozen more of these files that were processed and
disappeared.)
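
For illustration, the chunking scheme described above as a minimal plain-JDBC sketch. The table and column names are taken from the logs in this thread; the method shape and commit handling are assumptions, not the app's actual code:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.util.UUID;

    public class ChunkedCopy {
        // Copy the bulk-loaded staging table into "segment" in 16 chunks,
        // partitioned on the first hex digit of probandset_id.  Assumes a
        // connection with autocommit off, so all 16 inserts share one
        // transaction, separate from the earlier bulk COPY.
        static void copyInChunks(Connection conn, String stagingTable) throws Exception {
            String sql = "insert into segment select * from bulk.\"" + stagingTable
                       + "\" as s where s.probandset_id >= ? and s.probandset_id < ?";
            String tail = "0000000-0000-0000-0000-000000000000";
            String hex = "0123456789abcdef";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                for (int i = 0; i < 16; i++) {
                    ps.setObject(1, UUID.fromString(hex.charAt(i) + tail));
                    // Upper bound is the next hex digit; the last chunk is capped at
                    // the all-f uuid (which the strict '<' excludes -- close enough
                    // for a sketch).
                    ps.setObject(2, i < 15
                            ? UUID.fromString(hex.charAt(i + 1) + tail)
                            : UUID.fromString("ffffffff-ffff-ffff-ffff-ffffffffffff"));
                    ps.executeUpdate();
                }
            }
            conn.commit();
        }
    }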

My question is:
Should I be surprised that the initial bulk-loaded table is nowhere to
be seen, given the thumbs-up from the logs?  Is this frailty inherent in
AWS/RDS infrastructure?

Since this is an academic exercise, I have minimal AWS support, which
has yet to chime in on this matter. My access to the logs is um, er,
constrained.

#2Ron
ronljohnsonjr@gmail.com
In reply to: Rob Sargent (#1)
Re: How different is AWS-RDS postgres?

The big differences I notice are:

1. "postgres" is not a superuser,
2. viewing logs is a hassle.

Otherwise, they're really similar.  We've pumped about 6TB of data into an
instance, and it's been rock solid.  JBoss is quite happy, and there haven't
been any problems.

--
Angular momentum makes the world go 'round.

#3Rob Sargent
robjsargent@gmail.com
In reply to: Ron (#2)
Re: How different is AWS-RDS postgres?

Good to hear.
I pushed this same data set into our previous server (version 10), so I
was surprised it blew up this time, though in that case these separate
files were more spread out in arrival time.  Nothing in the (available)
logs other than lots of auto-vacuuming and the suggestion to increase
WAL size (which I thought a little surprising).  I really would like to
know just what went south, but that isn't obvious yet - OOM? Network?
Disk full? I can't find any smoking gun on the AWS/RDS pages.

rjs

#4Ron
ronljohnsonjr@gmail.com
In reply to: Rob Sargent (#3)
Re: How different is AWS-RDS postgres?

The difficulty in viewing logs needs to be emphasized.

--
Angular momentum makes the world go 'round.

#5Ian Harding
harding.ian@gmail.com
In reply to: Ron (#4)
Re: How different is AWS-RDS postgres?

There is an option to send the logs to CloudWatch, which makes it less
awful to look at them.

#6Rob Sargent
robjsargent@gmail.com
In reply to: Ian Harding (#5)
Re: How different is AWS-RDS postgres?

I have that, but precious little of interest there. Lots of autovac, a smattering of hints to increase WAL size!? I have yet to spot anything which corresponds to the "I/O failure" which the middleware gets.

I don't have query logging on, but I do see the errors from my psql-session fat-fingering.

As to the logs UI, the search is pretty feeble; I don't understand why there are four channels of logs; and the graphs are wearing the same rose-coloured glasses as the logs. And 24 hours without a peep from AWS support. (I don't call mailing me what I sent them "contact".)

My guess right now is that the entire tomcat connection pool is in a single transaction? That's the only way the tables could disappear. I am making separate calls to JDBC getConnection() for each doPost.
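
For reference, a minimal sketch of the per-doPost pattern described here, with an explicit commit so a pool reclaim can't silently undo finished work. The names and structure are assumptions, not the actual servlet code:

    import java.sql.Connection;
    import javax.sql.DataSource;

    public class PerRequest {
        // Hypothetical per-request shape: each doPost borrows its own pooled
        // connection and commits its own work.  If the pool decides a borrowed
        // connection is "abandoned" before commit, everything rolls back.
        static void handlePayload(DataSource pool) throws Exception {
            try (Connection conn = pool.getConnection()) {
                conn.setAutoCommit(false);
                try {
                    // ... create staging table, bulk COPY, chunked inserts ...
                    conn.commit();
                } catch (Exception e) {
                    conn.rollback();
                    throw e;
                }
            }
        }
    }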

#7Philip Semanchuk
philip@americanefficient.com
In reply to: Rob Sargent (#6)
Re: How different is AWS-RDS postgres?

We used Aurora (AWS hosted Postgres) and I agree that Cloudwatch search is pretty limited. I wrote a Python script to download cloudwatch logs to my laptop where I can use proper tools like grep to search them. It’s attached to this email. It’s hacky but not too terrible. I hope you find it useful.

Cheers
Philip

Attachments:

get_aws_logs.py (text/x-python-script)
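
Philip's script itself is attached above rather than inlined. For a rough idea of the same approach in Java terms, a sketch against the AWS SDK v2 CloudWatch Logs client might look like the following; the log group name is an assumption based on the /aws/rds/instance/<name>/postgresql groups RDS uses when log export is enabled:

    import software.amazon.awssdk.services.cloudwatchlogs.CloudWatchLogsClient;
    import software.amazon.awssdk.services.cloudwatchlogs.model.GetLogEventsRequest;
    import software.amazon.awssdk.services.cloudwatchlogs.model.GetLogEventsResponse;
    import software.amazon.awssdk.services.cloudwatchlogs.model.OutputLogEvent;

    // Sketch only: dump one log stream to stdout so it can be grepped locally.
    public class DumpLogs {
        public static void main(String[] args) {
            try (CloudWatchLogsClient logs = CloudWatchLogsClient.create()) {
                String token = null;
                while (true) {
                    GetLogEventsRequest.Builder req = GetLogEventsRequest.builder()
                            .logGroupName("/aws/rds/instance/mydb/postgresql") // assumption
                            .logStreamName(args[0])
                            .startFromHead(true);
                    if (token != null) {
                        req = req.nextToken(token);
                    }
                    GetLogEventsResponse resp = logs.getLogEvents(req.build());
                    for (OutputLogEvent e : resp.events()) {
                        System.out.println(e.message());
                    }
                    // getLogEvents pages forward forever; stop once the token stops moving.
                    if (resp.nextForwardToken().equals(token)) {
                        break;
                    }
                    token = resp.nextForwardToken();
                }
            }
        }
    }
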
#8Rob Sargent
robjsargent@gmail.com
In reply to: Philip Semanchuk (#7)
Re: How different is AWS-RDS postgres?

Thank you.


#9Rob Sargent
robjsargent@gmail.com
In reply to: Philip Semanchuk (#7)
Re: How different is AWS-RDS postgres?

I may have found another difference: JDBC connections are not logged?!
I just reproduced my report, and the CloudWatch view of the logs shows
some psql interaction from before and after the test, but no mention of
losing 7.5M records.
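
Whether connections show up in the Postgres log at all is governed by log_connections, which on RDS is changed through the instance's DB parameter group rather than postgresql.conf. A minimal probe of the relevant settings (connection URL left as an argument):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class ShowLogging {
        public static void main(String[] args) throws Exception {
            // args[0]: a jdbc:postgresql:// URL for the RDS instance
            try (Connection c = DriverManager.getConnection(args[0]);
                 Statement st = c.createStatement();
                 ResultSet rs = st.executeQuery(
                         "select name, setting from pg_settings where name in "
                         + "('log_connections','log_disconnections','log_statement')")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1) + " = " + rs.getString(2));
                }
            }
        }
    }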

#10Sam Gendler
sgendler@ideasculptor.com
In reply to: Rob Sargent (#9)
Re: How different is AWS-RDS postgres?

The same JDBC connection that is resulting in lost data? Sounds to me like
you aren't connecting to the DB you think you are connecting to.


#11Rob Sargent
robjsargent@gmail.com
In reply to: Sam Gendler (#10)
Re: How different is AWS-RDS postgres?

I almost wish that were true.

However, looking at AWS "Performance Insights" I see the SQL statements
generated by my app being executed on the server.  Not coincidentally,
this is from the "Top SQL (10)/Load by waits" view. Now that view does
NOT, in point of fact, name the database in which the SQL is running,
but the rest of my environment pretty much rules out silently switching
tracks on myself.  I have to read from the correct database, using a
UUID, to get data to analyze, then save the results of the analysis
back. I'm using my wonderful webapp to run this and I've successfully
analyzed and stored results for small starting data sets.

I just noticed the "Top database" tab on that screen:  there is only one
and it's the correct one.

I've reproduced the behaviour.  I'm pretty convinced it's a size thing,
but which part of the system is being maxed out is not clear.  Not CPU,
but that's the only resource "Performance Insights" mentions
(suggesting this UI wasn't designed by a database person).

The loss of the staging table is most spectacular.  It was filled from a
file which has 7.5M records. It's clear in the tomcat logs that it was
created and written to, one record read.  Then the "top sql" happens:

insert into segment select * from bulk."rjs_GEV15_01_c9e224ca_85d2_40b4_ad46_327cfb9f0ac6" as s where s.probandset_id >= ? and s.probandset_id < ?

the "bulk" table is the staging table. The params are filled in with a
quasi-uuid which grab roughly 1/16th of the data.  In the stack trace on
my tomcat server I get

Caused by: org.jooq.exception.DataAccessException: SQL [insert into segment select * from bulk."rjs_GEV15_01_c9e224ca_85d2_40b4_ad46_327cfb9f0ac6" as s where s.probandset_id >= '30000000-0000-0000-0000-000000000000' and s.probandset_id < '40000000-0000-0000-0000-000000000000' ]; An I/O error occurred while sending to the backend.

So this would have been the fourth such insert statement read from that
staging table.

That table is not deleted by my code.  It is renamed after the last
insert into segment, by appending "_done" to the name.  But that table,
by either name, is nowhere to be seen on the server.

For those scoring at home, the trouble in the tomcat log starts with

20:02:11.452 [https-jsse-nio-10.0.2.28-15002-exec-7] INFO edu.utah.camplab.jx.PayloadFromMux - bulk."rjs_GEV15_01_c9e224ca_85d2_40b4_ad46_327cfb9f0ac6": Begin transfer from bulk to segment
27-May-2021 20:02:50.338 FINE [Catalina-utility-2] org.apache.catalina.session.ManagerBase.processExpires Start expire sessions StandardManager at 1622145770338 sessioncount 0
27-May-2021 20:02:50.338 FINE [Catalina-utility-2] org.apache.catalina.session.ManagerBase.processExpires End expire sessions StandardManager processingTime 0 expired sessions: 0
27-May-2021 20:02:50.476 FINE [Catalina-utility-2] org.apache.catalina.session.ManagerBase.processExpires Start expire sessions StandardManager at 1622145770476 sessioncount 1
27-May-2021 20:02:50.476 FINE [Catalina-utility-2] org.apache.catalina.session.ManagerBase.processExpires End expire sessions StandardManager processingTime 0 expired sessions: 0
27-May-2021 20:02:51.847 WARNING [Tomcat JDBC Pool Cleaner[1731185718:1622133381802]] org.apache.tomcat.jdbc.pool.ConnectionPool.abandon Connection has been abandoned PooledConnection[org.postgresql.jdbc.PgConnection@1622ead9]:java.lang.Exception
        at org.apache.tomcat.jdbc.pool.ConnectionPool.getThreadDump(ConnectionPool.java:1163)
        at org.apache.tomcat.jdbc.pool.ConnectionPool.borrowConnection(ConnectionPool.java:816)
        at org.apache.tomcat.jdbc.pool.ConnectionPool.borrowConnection(ConnectionPool.java:660)
        at org.apache.tomcat.jdbc.pool.ConnectionPool.getConnection(ConnectionPool.java:198)
        at org.apache.tomcat.jdbc.pool.DataSourceProxy.getConnection(DataSourceProxy.java:132)
        at org.apache.tomcat.jdbc.pool.DataSourceProxy.getConnection(DataSourceProxy.java:90)
        at edu.utah.camplab.servlet.AbstractSGSServlet.getDbConn(AbstractSGSServlet.java:123)
        at edu.utah.camplab.servlet.AbstractSGSServlet.getDbConn(AbstractSGSServlet.java:105)
        at edu.utah.camplab.servlet.PayloadSaveServlet.doPost(PayloadSaveServlet.java:50)
....
20:02:51.854 [https-jsse-nio-10.0.2.28-15002-exec-7] DEBUG org.jooq.impl.DefaultConnectionProvider - rollback to savepoint
20:02:51.855 [https-jsse-nio-10.0.2.28-15002-exec-7] DEBUG org.jooq.impl.DefaultConnectionProvider - rollback
20:02:51.855 [https-jsse-nio-10.0.2.28-15002-exec-7] DEBUG org.jooq.impl.DefaultConnectionProvider - setting auto commit : true
20:02:51.855 [https-jsse-nio-10.0.2.28-15002-exec-7] ERROR edu.utah.camplab.jx.AbstractPayload - run c9e224ca-85d2-40b4-ad46-327cfb9f0ac6: Exception from db write: SQL [insert into segment select * from bulk."rjs_GEV15_01_c9e224ca_85d2_40b4_ad46_327cfb9f0ac6" as s where s.probandset_id >= '30000000-0000-0000-0000-000000000000' and s.probandset_id < '40000000-0000-0000-0000-000000000000' ]; An I/O error occurred while sending to the backend.: {}
org.jooq.exception.DataAccessException: SQL [insert into segment select * from bulk."rjs_GEV15_01_c9e224ca_85d2_40b4_ad46_327cfb9f0ac6" as s where s.probandset_id >= '30000000-0000-0000-0000-000000000000' and s.probandset_id < '40000000-0000-0000-0000-000000000000' ]; An I/O error occurred while sending to the backend.
        at org.jooq_3.14.7.POSTGRES.debug(Unknown Source)
        at org.jooq.impl.Tools.translate(Tools.java:2880)
        at org.jooq.impl.DefaultExecuteContext.sqlException(DefaultExecuteContext.java:757)
        at org.jooq.impl.AbstractQuery.execute(AbstractQuery.java:389)
        at edu.utah.camplab.jx.PayloadFromMux.insertAllSegments(PayloadFromMux.java:177)
        at edu.utah.camplab.jx.PayloadFromMux.lambda$completeSegmentSave$3(PayloadFromMux.java:165)
        at org.jooq.impl.DefaultDSLContext$3.run(DefaultDSLContext.java:615)
        at org.jooq.impl.DefaultDSLContext$3.run(DefaultDSLContext.java:612)
        at org.jooq.impl.DefaultDSLContext.lambda$transactionResult0$0(DefaultDSLContext.java:543)
        at org.jooq.impl.Tools$35$1.block(Tools.java:5203)
        at java.base/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3137)
        at org.jooq.impl.Tools$35.get(Tools.java:5200)
        at org.jooq.impl.DefaultDSLContext.transactionResult0(DefaultDSLContext.java:595)
        at org.jooq.impl.DefaultDSLContext.transactionResult(DefaultDSLContext.java:512)
        at org.jooq.impl.DefaultDSLContext.transaction(DefaultDSLContext.java:612)
        at edu.utah.camplab.jx.PayloadFromMux.completeSegmentSave(PayloadFromMux.java:164)
        at edu.utah.camplab.jx.PayloadFromMux.writedb(PayloadFromMux.java:45)
        at edu.utah.camplab.jx.AbstractPayload.lambda$write$0(AbstractPayload.java:77)
        at org.jooq.impl.DefaultDSLContext$3.run(DefaultDSLContext.java:615)

...
        Suppressed: org.jooq.exception.DataAccessException: Cannot rollback transaction
                at org.jooq.impl.DefaultConnectionProvider.rollback(DefaultConnectionProvider.java:142)
                at org.jooq.impl.DefaultTransactionProvider.rollback(DefaultTransactionProvider.java:223)
                at org.jooq.impl.DefaultDSLContext.lambda$transactionResult0$0(DefaultDSLContext.java:567)
                ... 45 common frames omitted
        Caused by: java.sql.SQLException: Connection has already been closed.
                at org.apache.tomcat.jdbc.pool.ProxyConnection.invoke(ProxyConnection.java:117)
                at org.apache.tomcat.jdbc.pool.JdbcInterceptor.invoke(JdbcInterceptor.java:108)
                at org.apache.tomcat.jdbc.pool.interceptor.AbstractCreateStatementInterceptor.invoke(AbstractCreateStatementInterceptor.java:79)
                at org.apache.tomcat.jdbc.pool.JdbcInterceptor.invoke(JdbcInterceptor.java:108)
                at org.apache.tomcat.jdbc.pool.DisposableConnectionFacade.invoke(DisposableConnectionFacade.java:81)
                at com.sun.proxy.$Proxy4.rollback(Unknown Source)
                at org.jooq.impl.DefaultConnectionProvider.rollback(DefaultConnectionProvider.java:139)
                ... 47 common frames omitted
        Suppressed: org.jooq.exception.DataAccessException: Cannot set autoCommit
                at org.jooq.impl.DefaultConnectionProvider.setAutoCommit(DefaultConnectionProvider.java:222)
                at org.jooq.impl.DefaultTransactionProvider.brace(DefaultTransactionProvider.java:246)
                at org.jooq.impl.DefaultTransactionProvider.rollback(DefaultTransactionProvider.java:229)
                at org.jooq.impl.DefaultDSLContext.lambda$transactionResult0$0(DefaultDSLContext.java:567)
                ... 33 common frames omitted
        Caused by: java.sql.SQLException: Connection has already been closed.
                at org.apache.tomcat.jdbc.pool.ProxyConnection.invoke(ProxyConnection.java:117)
                at org.apache.tomcat.jdbc.pool.JdbcInterceptor.invoke(JdbcInterceptor.java:108)
                at org.apache.tomcat.jdbc.pool.interceptor.AbstractCreateStatementInterceptor.invoke(AbstractCreateStatementInterceptor.java:79)
                at org.apache.tomcat.jdbc.pool.JdbcInterceptor.invoke(JdbcInterceptor.java:108)
                at org.apache.tomcat.jdbc.pool.DisposableConnectionFacade.invoke(DisposableConnectionFacade.java:81)
                at com.sun.proxy.$Proxy4.setAutoCommit(Unknown Source)
                at org.jooq.impl.DefaultConnectionProvider.setAutoCommit(DefaultConnectionProvider.java:219)
                ... 36 common frames omitted
Caused by: org.postgresql.util.PSQLException: An I/O error occurred while sending to the backend.
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:350)
        at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:481)
        at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:401)
        at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:164)
        at org.postgresql.jdbc.PgPreparedStatement.execute(PgPreparedStatement.java:153)
        at jdk.internal.reflect.GeneratedMethodAccessor25.invoke(Unknown Source)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:564)
        at org.apache.tomcat.jdbc.pool.StatementFacade$StatementProxy.invoke(StatementFacade.java:114)
        at com.sun.proxy.$Proxy14.execute(Unknown Source)
        at org.jooq.tools.jdbc.DefaultPreparedStatement.execute(DefaultPreparedStatement.java:214)
        at org.jooq.impl.AbstractQuery.execute(AbstractQuery.java:458)
        at org.jooq.impl.AbstractQuery.execute(AbstractQuery.java:375)
        ... 50 common frames omitted
Caused by: javax.net.ssl.SSLException: Socket closed
        at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:127)
        at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:369)
        at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:312)
        at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:307)
        at java.base/sun.security.ssl.SSLSocketImpl.handleException(SSLSocketImpl.java:1680)
        at java.base/sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:1054)
        at org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:161)
        at org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:128)

#12Ron
ronljohnsonjr@gmail.com
In reply to: Rob Sargent (#11)
Re: How different is AWS-RDS postgres?

Is there any way to replicate this in psql?

--
Angular momentum makes the world go 'round.

#13Rob Sargent
robjsargent@gmail.com
In reply to: Ron (#12)
Re: How different is AWS-RDS postgres?

I have the json file which feeds the staging table, and I have code
which is designed to load such files.  I suppose an sql file with those
16 insert statements would approximate the app's calls.  Let me give
that a shot.


#14Sam Gendler
sgendler@ideasculptor.com
In reply to: Rob Sargent (#11)
Re: How different is AWS-RDS postgres?

That sure looks like something is causing your connection to have a
transaction rollback. I haven't worked in Java in far too long, but it
seems like your connection pool is under the impression your connection
was abandoned, so it reclaims it and rolls back the transaction, which
would explain why you aren't seeing the table when all is said and
done - all of the work is being undone at the end.

One possibility, based on the catalina log you provided - if you have
either end of the connection set up to automatically close idle
connections after a period of time, then you might receive a closed
connection from the pool, which will just error out when you attempt to
run a query. In that case, you need to set up your connection pool to
test a connection before it returns it to the requester. Usually
something as simple as "select 2" will be sufficient to determine if
the database connection is open. I can just about guarantee that your
connection pool has a parameter which allows you to specify a query to
execute when a connection is requested.
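
Sam's suggestion in concrete form: a minimal sketch of the relevant tomcat-jdbc settings (URL and values are placeholders, not the app's actual configuration):

    import org.apache.tomcat.jdbc.pool.DataSource;
    import org.apache.tomcat.jdbc.pool.PoolProperties;

    public class PoolConfig {
        static DataSource configure() {
            PoolProperties p = new PoolProperties();
            p.setUrl("jdbc:postgresql://host:5432/db");   // placeholder
            p.setDriverClassName("org.postgresql.Driver");
            p.setValidationQuery("SELECT 1");  // run on borrow when testOnBorrow is set
            p.setTestOnBorrow(true);
            p.setRemoveAbandoned(true);        // the setting that reclaims "abandoned" connections
            p.setRemoveAbandonedTimeout(60);   // seconds a connection may be held before reclaim
            return new DataSource(p);
        }
    }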


#15Sam Gendler
sgendler@ideasculptor.com
In reply to: Rob Sargent (#13)
Re: How different is AWS-RDS postgres?

The docs for the connection pool you are using are at
https://tomcat.apache.org/tomcat-7.0-doc/jdbc-pool.html, and a
screenshot of the relevant configuration params is attached.

Attachments:

Screen Shot 2021-05-27 at 3.29.47 PM.png (image/png)
#16Rob Sargent
robjsargent@gmail.com
In reply to: Sam Gendler (#14)
Re: How different is AWS-RDS postgres?

Well I /was/ doing

contextResource.setProperty("validationQuery", "SELECT 1");

but I see that I lost that when I switched to using a properties file.
Thanks for pointing me there.

The loop of 16 insert statements is in a single transaction, on a single
connection, so I'm not sure who's choking first.  Is the connection idle
after the I/O error, or is the I/O error from a dead connection?  (Small
disclaimer:  there is no catalina involved here, just an embedded tomcat
instance.)

#17Sam Gendler
sgendler@ideasculptor.com
In reply to: Rob Sargent (#16)
Re: How different is AWS-RDS postgres?

Unless something has changed in recent years, the core servlet engine of
tomcat IS catalina. Embedded tomcat is embedded catalina. It looks like
the I/O error is a result of attempting to send a query on an already dead
connection. I'd look for something that is limiting how long a connection
can be open - either an explicit or default value for a timeout in the
connection pool or on the server side. If you don't get the same behaviour
when running against a database locally, I'd definitely look at the default
settings in RDS. It may be automatically closing connections if they are
idle for even a brief period.


#18Rob Sargent
robjsargent@gmail.com
In reply to: Sam Gendler (#17)
Re: How different is AWS-RDS postgres?

Not important who does what with whom re Catalina/Tomcat ;)

I will indeed re-examine timeouts and such, as inserting hundreds of thousands of records is not instantaneous by any stretch. Tomcat is the new kid on my block: prior to this release I managed a naked port with a Selector, and that had no trouble with this same test data. My biggest concern is the lack of server-side indications. I also need to confirm that the table create/bulk-copy and the update to the target table (segment) are in separate transactions. Doesn't look like it, but they're supposed to be. I want the data written to the db in bulk, then come back round and load to the final table in chunks.


#19Rob Sargent
robjsargent@gmail.com
In reply to: Sam Gendler (#17)
Re: How different is AWS-RDS postgres?

Found it.  Though there are indeed separate transactions for the bulk
copy v. the target update, I had missed that there is an over-arching
transaction in play, and of course when things go south, /all/ things go
south.

I get away with this, for now, by bumping the tomcat idle timeout to an
hour, which is serious overkill.
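
A sketch of that separation in jOOQ terms, for illustration only (method names are hypothetical). By default jOOQ turns nested transaction() calls into savepoints on the enclosing transaction, so keeping the two steps as top-level transactions lets the staging table survive a failure in the chunked inserts:

    import org.jooq.DSLContext;

    public class TwoPhaseSave {
        static void saveBulkThenChunks(DSLContext ctx) {
            // First top-level transaction: create the staging table and COPY
            // the payload into it.  Commits independently.
            ctx.transaction(cfg -> {
                // ... create bulk."rjs_..." and bulk-copy into it ...
            });
            // Second top-level transaction: the 16 chunked inserts.  If this
            // fails, the committed staging table is still there to inspect.
            ctx.transaction(cfg -> {
                // ... insert into segment select * from bulk."rjs_..." ...
            });
        }
    }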