BUG #5395: UPDATE on shutdown overwrites table

Started by Lothar Bongartzabout 16 years ago3 messagesbugs

lotharbongartz@hotmail.com

about 16 years ago

The following bug has been logged online:

Bug reference: 5395
Logged by: Lothar Bongartz
Email address: lotharbongartz@hotmail.com
PostgreSQL version: 8.4.3
Operating system: Windows Server 2003 R2
Description: UPDATE on shutdown overwrites table
Details:

I have ported a community with about one million members from MS SQL to
Postgres.
While the overall performance is comparable, I have noticed, that
conditional bulk writing can be extremely slow in Postgres:
SELECT ... INTO ... WHERE
UPDATE ... FROM ... WHERE
For this reason the database is stalling from time to time and I have to
restart the server. For the second time I have detected, that Postgres
overwrites a table when shutting down. The table <onlineinfo> is only
updated with NOW() and only for a single matching row:
UPDATE onlineinfo SET date_end=NOW() WHERE memb_id=v_id
When this command is executed while Postgres shuts down, all rows contain a
"random" date like '2007-06-25' in the <date_end> column afterwards.
To repair this, I have created a table from a backup:
CREATE TEMP TABLE temp_onlineinfo_bak
(
memb_id integer NOT NULL,
date_end timestamp NOT NULL
)
After filling the temporary table from the backup, I use it to repair the
overwritten table:
UPDATE onlineinfo SET date_end=B.date_end
FROM onlineinfo O, temp_onlineinfo_bak B
WHERE O.date_end<'2010-03-21' AND B.memb_id=O.memb_id;
The WHERE condition is needed to prevent overwriting rows, which are updated
in the meantime.

Unfortunately this command needs several hours to complete, since this is
again a conditional bulk writing.

I tried all kind of variations like removing the index on
<onlineinfo><date_end> before the update or setting up an index on
<temp_onlineinfo_bak><memb_id>

Nothing helps. These are the data of the table:
CREATE TABLE onlineinfo
(
memb_id integer NOT NULL,
log_count integer NOT NULL,
date_end timestamp NOT NULL,
ip integer NOT NULL,
CONSTRAINT onlineinfo_pkey PRIMARY KEY (memb_id),
CONSTRAINT fk__onlineinf__memb___55009f39 FOREIGN KEY (memb_id)
REFERENCES member (memb_id) MATCH FULL
ON UPDATE NO ACTION ON DELETE NO ACTION
)
WITH (
OIDS=FALSE
);
ALTER TABLE onlineinfo OWNER TO postgres;

-- Index: onlineinfo_date_end

-- DROP INDEX onlineinfo_date_end;

CREATE INDEX onlineinfo_date_end
ON onlineinfo
USING btree
(date_end);

Tom Lane

tgl@sss.pgh.pa.us

about 16 years ago

In reply to: Lothar Bongartz (#1)

Re: BUG #5395: UPDATE on shutdown overwrites table

"Lothar Bongartz" <lotharbongartz@hotmail.com> writes:

For this reason the database is stalling from time to time and I have to
restart the server. For the second time I have detected, that Postgres
overwrites a table when shutting down. The table <onlineinfo> is only
updated with NOW() and only for a single matching row:
UPDATE onlineinfo SET date_end=NOW() WHERE memb_id=v_id
When this command is executed while Postgres shuts down, all rows contain a
"random" date like '2007-06-25' in the <date_end> column afterwards.

This is quite hard to believe. Can you provide a reproducible test
case?

I have seen cases where someone wrote what he thought was a single-row
update, but it turned out to be a whole-table update because the WHERE
clause actually reduced to constant TRUE. Your reference to "v_id"
makes me think that you are issuing this query inside a plpgsql
function. One of the common ways to shoot yourself in the foot like
that is to be careless about whether a name could match both a table
column and a plpgsql variable or parameter; could that have happened in
your situation?

BTW, there are easier ways to cancel a single query than restarting the
whole server.

regards, tom lane

Craig Ringer

craig@2ndquadrant.com

about 16 years ago

In reply to: Lothar Bongartz (#1)

Re: BUG #5395: UPDATE on shutdown overwrites table

On 30/03/2010 11:27 PM, Lothar Bongartz wrote:

For this reason the database is stalling from time to time and I have to
restart the server.

Are those stalls accompanied by any mention of checkpoints in the logs?

The default PostgreSQL configuration isn't tuned for larger workloads -
it's very minimalist. You probably need to increase your
checkpoint_segments . It may also be a good idea to look at shared_buffers .

Before just re-starting the server, you should look at `iostat',
`vmstat', `top', `iotop' or other suitable performance monitoring tools
and see if the server's actually DOING anything. You'll probably find
it's busy with disk I/O and not really stalled at all.

--
Craig Ringer