Exception table ...

Started by Hans-Jürgen Schönigover 22 years ago14 messages
#1Hans-Jürgen Schönig
postgres@cybertec.at

I have just seen a nice feature provided by DB2 which seems very useful
to me.
When importing huge amounts of data (dozens of gigs) with the help of
COPY errors might occur from time to time (especially when migrating).
The problem with COPY is that it stops after the first error. So if the
first problem occurs after 200.000.000 records it is somehow annoying to
do the entire stuff again. If we had an option telling COPY to log all
problems into a logtable or into a separate logfile we could finish the
import and rollback the transaction after trying to import everything.
This would help a lot when migrating or importing a lot of data because
all problems with an import could be fixed at once based on the
exception table.
Did anybody think about a feature like that in the past? Does it make
sense to the group?

Best regards,

Hans

--
Cybertec Geschwinde u Schoenig
Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria
Tel: +43/2952/30706; +43/664/233 90 75
www.cybertec.at, www.postgresql.at, kernel.cybertec.at

#2Paulo Scardine
paulos@cimed.ind.br
In reply to: Hans-Jürgen Schönig (#1)
SELECT FOR UPDATE NOWAIT

My boss is asking for something like Oracle's "SELECT FOR UPDATE NOWAIT".

Is there any such feature? If no, should I look forward into implementing
this? Any advice?

Thank you,
--
Paulo Scardine

---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.490 / Virus Database: 289 - Release Date: 16/6/2003

#3Rod Taylor
rbt@rbt.ca
In reply to: Paulo Scardine (#2)
Re: SELECT FOR UPDATE NOWAIT

On Fri, 2003-07-18 at 19:46, Paulo Scardine wrote:

My boss is asking for something like Oracle's "SELECT FOR UPDATE NOWAIT".

Is there any such feature? If no, should I look forward into implementing
this? Any advice?

Lookup STATEMENT_TIMEOUT and set it to a very short time.

#4Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Rod Taylor (#3)
Re: SELECT FOR UPDATE NOWAIT

Rod Taylor wrote:
-- Start of PGP signed section.

On Fri, 2003-07-18 at 19:46, Paulo Scardine wrote:

My boss is asking for something like Oracle's "SELECT FOR UPDATE NOWAIT".

Is there any such feature? If no, should I look forward into implementing
this? Any advice?

Lookup STATEMENT_TIMEOUT and set it to a very short time.

Some people have said they want to distinguish between a slow query
(busy system) and waiting on a lock. I can particulary see wanting to
do a NOWAIT only on exclusive locks --- not sure how many really want
that, though.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#5Christoph Haller
ch@rodos.fzk.de
In reply to: Bruce Momjian (#4)
Re: SELECT FOR UPDATE NOWAIT

Rod Taylor wrote:
-- Start of PGP signed section.

On Fri, 2003-07-18 at 19:46, Paulo Scardine wrote:

My boss is asking for something like Oracle's "SELECT FOR UPDATE

NOWAIT".

Is there any such feature? If no, should I look forward into

implementing

this? Any advice?

Lookup STATEMENT_TIMEOUT and set it to a very short time.

Some people have said they want to distinguish between a slow query
(busy system) and waiting on a lock. I can particulary see wanting to

do a NOWAIT only on exclusive locks --- not sure how many really want
that, though.

I think I'm a quite attentive to the SQL and HACKERS list, and I see
requests for a NOWAIT option at least once a month, and it's growing.
Regards, Christoph

#6Christoph Haller
ch@rodos.fzk.de
In reply to: Hans-Jürgen Schönig (#1)
Re: Exception table ...

I have just seen a nice feature provided by DB2 which seems very

useful

to me.
When importing huge amounts of data (dozens of gigs) with the help of
COPY errors might occur from time to time (especially when migrating).

The problem with COPY is that it stops after the first error. So if

the

first problem occurs after 200.000.000 records it is somehow annoying

to

do the entire stuff again. If we had an option telling COPY to log all

problems into a logtable or into a separate logfile we could finish

the

import and rollback the transaction after trying to import everything.

This would help a lot when migrating or importing a lot of data

because

all problems with an import could be fixed at once based on the
exception table.
Did anybody think about a feature like that in the past? Does it make
sense to the group?

The same goes for me, sounds very useful. And if I didn't dream it,
I'm pretty sure there have been requests for a feature like that before.

Regards, Christoph

#7Paulo Scardine
paulos@cimed.ind.br
In reply to: Christoph Haller (#5)
Re: SELECT FOR UPDATE NOWAIT

(Excuse my english, my native language is portuguese)

I think I'm a quite attentive to the SQL and HACKERS list, and I see
requests for a NOWAIT option at least once a month, and it's growing.
Regards, Christoph

Yes. I have done a little google about "pgsql select for update nowait" and
find lot of people looking for this feature.

If there is a lock and NOWAIT is specified, then raise error instead of
waiting for the lock to release. We have some long distance connections that
get very busy sometimes and is hard to distinguish between network problems
and locked rows.

Do you see it as something hard to implement? Any advice?
Is it just to add an "opt_nowait" to the "FOR UPDATE" clause in the parser
and checking for this option when trying to get the lock later?

TIA,
--
Paulo Scardine

----- Original Message -----
From: "Christoph Haller" <ch@rodos.fzk.de>
Sent: Wednesday, July 23, 2003 4:35 AM
Subject: Re: [HACKERS] SELECT FOR UPDATE NOWAIT
...

Lookup STATEMENT_TIMEOUT and set it to a very short time.

Some people have said they want to distinguish between a slow query
(busy system) and waiting on a lock. I can particulary see wanting to

do a NOWAIT only on exclusive locks --- not sure how many really want
that, though.

---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.490 / Virus Database: 289 - Release Date: 16/6/2003

#8Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Paulo Scardine (#7)
Re: SELECT FOR UPDATE NOWAIT

Paulo Scardine wrote:

(Excuse my english, my native language is portuguese)

I think I'm a quite attentive to the SQL and HACKERS list, and I see
requests for a NOWAIT option at least once a month, and it's growing.
Regards, Christoph

Yes. I have done a little google about "pgsql select for update nowait" and
find lot of people looking for this feature.

If there is a lock and NOWAIT is specified, then raise error instead of
waiting for the lock to release. We have some long distance connections that
get very busy sometimes and is hard to distinguish between network problems
and locked rows.

Do you see it as something hard to implement? Any advice?
Is it just to add an "opt_nowait" to the "FOR UPDATE" clause in the parser
and checking for this option when trying to get the lock later?

My guess is that we will implement it was a SET variable so it can
control FOR UPDATE/LOCK/UPDATE/DELETE.

Added to TODO:

Add GUC variable to prevent waiting on locks

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#9Paulo Scardine
paulos@cimed.ind.br
In reply to: Bruce Momjian (#8)
Re: SELECT FOR UPDATE NOWAIT

My guess is that we will implement it was a SET variable so it can
control FOR UPDATE/LOCK/UPDATE/DELETE.

Added to TODO:

Add GUC variable to prevent waiting on locks

Interesting.

I have a lot of potentially dumb questions:
- Is this easier to implement as a user variable than in the parser or is
some kind of police (SQL compliance, etc.)?
- These locks are grant by LockAcquire() or by higher level functions?
- Where is the best place to put this?

TIA,
--
Paulo Scardine

---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.490 / Virus Database: 289 - Release Date: 16/6/2003

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Paulo Scardine (#9)
Re: SELECT FOR UPDATE NOWAIT

"Paulo Scardine" <paulos@cimed.ind.br> writes:

- Where is the best place to put this?

I think it would be a really *bad* idea to put it in LockAcquire; that
risks breaking things that you don't want broken.

Whether it's special syntax or a GUC variable, the restriction should
only apply to SELECT FOR UPDATE row locks, perhaps user-commanded LOCK
TABLE operations, and maybe one or two other places that are known to
be used only for user-written operations and not for system-initiated
ones. Those places would need to check whether to do a conditional
or unconditional lock.

regards, tom lane

#11Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#10)
Re: SELECT FOR UPDATE NOWAIT

Tom Lane wrote:

"Paulo Scardine" <paulos@cimed.ind.br> writes:

- Where is the best place to put this?

I think it would be a really *bad* idea to put it in LockAcquire; that
risks breaking things that you don't want broken.

Whether it's special syntax or a GUC variable, the restriction should
only apply to SELECT FOR UPDATE row locks, perhaps user-commanded LOCK
TABLE operations, and maybe one or two other places that are known to
be used only for user-written operations and not for system-initiated
ones. Those places would need to check whether to do a conditional
or unconditional lock.

My original idea was to have it apply only for exclusive locks. It
seemed those were mostly the locks didn't want to wait for. Shared
locking pg_class is something that you would want to wait for.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#12Paulo Scardine
paulos@cimed.ind.br
In reply to: Bruce Momjian (#11)
Re: SELECT FOR UPDATE NOWAIT

LockAcquire has a "dontWait" parameter, which do just what I want.

The executor level calls "heap_open(relid, RowShareLock)" when doing "FOR
UPDATE"s.
Should we define something like RowShareLockNoWait, so heap_open() or other
lower level functions can call LockAcquire() with dontWait set?

By the way, is this kind of question on-topic for pgsql-hackers?

TIA,
--
Paulo Scardine

----- Original Message -----
From: "Bruce Momjian" <pgman@candle.pha.pa.us>
To: "Tom Lane" <tgl@sss.pgh.pa.us>
Cc: "Paulo Scardine" <paulos@cimed.ind.br>; <pgsql-hackers@postgresql.org>
Sent: Wednesday, July 23, 2003 4:30 PM
Subject: Re: [HACKERS] SELECT FOR UPDATE NOWAIT

Tom Lane wrote:

"Paulo Scardine" <paulos@cimed.ind.br> writes:

- Where is the best place to put this?

I think it would be a really *bad* idea to put it in LockAcquire; that
risks breaking things that you don't want broken.

Whether it's special syntax or a GUC variable, the restriction should
only apply to SELECT FOR UPDATE row locks, perhaps user-commanded LOCK
TABLE operations, and maybe one or two other places that are known to
be used only for user-written operations and not for system-initiated
ones. Those places would need to check whether to do a conditional
or unconditional lock.

My original idea was to have it apply only for exclusive locks. It
seemed those were mostly the locks didn't want to wait for. Shared
locking pg_class is something that you would want to wait for.

-- 
Bruce Momjian                        |  http://candle.pha.pa.us
pgman@candle.pha.pa.us               |  (610) 359-1001
+  If your life is a hard drive,     |  13 Roberts Road
+  Christ can be your backup.        |  Newtown Square, Pennsylvania

19073

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.490 / Virus Database: 289 - Release Date: 16/6/2003

#13Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Paulo Scardine (#12)
Re: SELECT FOR UPDATE NOWAIT

Paulo Scardine wrote:

LockAcquire has a "dontWait" parameter, which do just what I want.

The executor level calls "heap_open(relid, RowShareLock)" when doing "FOR
UPDATE"s.
Should we define something like RowShareLockNoWait, so heap_open() or other
lower level functions can call LockAcquire() with dontWait set?

By the way, is this kind of question on-topic for pgsql-hackers?

I think there are two issues with implementing nowait locking:

If we have special syntax for FOR UPDATE, we will need it for other
commands that need no wait behavior, and after a while they all carry
around that cruft --- SET seems easier and more useful.

Second, I don't think we want to carry around a NOWAIT boolean in all
our structures --- a SET would control it easier. The SET can be
checked right in the lock code, and I think having it control only
exclusive locks would do almost everything we want.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#14Christoph Haller
ch@rodos.fzk.de
In reply to: Bruce Momjian (#13)
Re: SELECT FOR UPDATE NOWAIT

Paulo Scardine wrote:

LockAcquire has a "dontWait" parameter, which do just what I want.

The executor level calls "heap_open(relid, RowShareLock)" when doing

"FOR

UPDATE"s.
Should we define something like RowShareLockNoWait, so heap_open()

or other

lower level functions can call LockAcquire() with dontWait set?

By the way, is this kind of question on-topic for pgsql-hackers?

I think there are two issues with implementing nowait locking:

If we have special syntax for FOR UPDATE, we will need it for other
commands that need no wait behavior, and after a while they all carry
around that cruft --- SET seems easier and more useful.

Second, I don't think we want to carry around a NOWAIT boolean in all
our structures --- a SET would control it easier. The SET can be
checked right in the lock code, and I think having it control only
exclusive locks would do almost everything we want.

Sounds reasonable to me. You'll have my vote for the SET way.
Regards, Christoph