sequence's plpgsql

Started by Tim McAuleyover 22 years ago67 messages
#1Tim McAuley
mcauleyt@tcd.ie

Hi,

I've hit a little problem and was wondering if anyone might be able to
give some help.

Set-up:

- JBoss appserver using entity beans to access database
- Using sequence pattern to update primary keys for several tables.
Basically it is a method of getting primary keys without hitting the
database. It actually hits the database every 10th go and updates the
counter by 10 in the database.

(now the bold bit)
- We have a stored procedure that actually updates one of the sequences
as well. It increments one at a time using this code:

SELECT INTO NewPK pkIndex FROM PrimaryKeyGenerator WHERE Name =
SequenceName FOR UPDATE;
UPDATE PrimaryKeyGenerator SET pkIndex = NewPK + 1 WHERE Name =
SequenceName;

I believe the "FOR UPDATE" won't actually do any good inside a plpgsql
call. Am I right?

Problem:

I have just called this stored procedure from outside the system using
10 threads and have got some errors due to duplicate entries on the
unique index. If it was only the stored procedures using this pk
generator then I could use a postgresql sequence but it isn't. If the
entity beans were to use the sequence, they'd have to make a database
call every time.

Any thoughts?

I'm thinking I may need to switch to using a sequence because the entity
beans don't actually update this particular table very often but would
prefer not to for portability reasons.

Thanks,

Tim

#2Richard Huxton
dev@archonet.com
In reply to: Tim McAuley (#1)
Re: sequence's plpgsql

On Wednesday 24 September 2003 17:40, Tim McAuley wrote:

Hi,

I've hit a little problem and was wondering if anyone might be able to
give some help.

Set-up:

- JBoss appserver using entity beans to access database
- Using sequence pattern to update primary keys for several tables.
Basically it is a method of getting primary keys without hitting the
database. It actually hits the database every 10th go and updates the
counter by 10 in the database.

You might want to try just using sequences - PG does some caching of generated
values for you. Sorry - can't remember how you alter the cache size, but try
SELECT * FROM my_sequence;
to see the details of its settings.

Use sequences, and from your sequence-holding class do something like:
SELECT nextval('myseq'),nextval('myseq'),...10 times...
That will give you a block of 10 sequence values in one go, and off you go.

If you'd rather have the values in one column, create a single-column table
"seq_count" and populate with values 1..10 then:
SELECT nextval('myseq'),seq_count.id FROM seq_count;

That any use?
--
Richard Huxton
Archonet Ltd

#3Tim McAuley
mcauleyt@tcd.ie
In reply to: Richard Huxton (#2)
Re: sequence's plpgsql

Richard Huxton wrote:

On Wednesday 24 September 2003 17:40, Tim McAuley wrote:

Hi,

I've hit a little problem and was wondering if anyone might be able to
give some help.

Set-up:

- JBoss appserver using entity beans to access database
- Using sequence pattern to update primary keys for several tables.
Basically it is a method of getting primary keys without hitting the
database. It actually hits the database every 10th go and updates the
counter by 10 in the database.

You might want to try just using sequences - PG does some caching of generated
values for you. Sorry - can't remember how you alter the cache size, but try
SELECT * FROM my_sequence;
to see the details of its settings.

Use sequences, and from your sequence-holding class do something like:
SELECT nextval('myseq'),nextval('myseq'),...10 times...
That will give you a block of 10 sequence values in one go, and off you go.

If you'd rather have the values in one column, create a single-column table
"seq_count" and populate with values 1..10 then:
SELECT nextval('myseq'),seq_count.id FROM seq_count;

That any use?

Thanks for that. I investigated using your suggestion but actually
discovered that these tables were only being updated by the stored
procedures so this meant it was safe to use sequences there. I have now
set these up however I am now getting "deadlock detection" errors.

I see from a previous email on the list that someone was able to get
decent debug out of the locks, i.e.

Aug 10 14:19:36 thunder postgres[18735]: [2-1] ERROR: deadlock detected

Aug 10 14:19:36 thunder postgres[18735]: [2-2] DETAIL: Proc 18735 waits
for AccessExclusiveLock on relation 18028 of database 17140; blocked by

How can I get this? I must be overlooking something because I've set the debug level to debug5 and still only get a single message saying
ERROR: deadlock detected
DEBUG: AbortCurrentTransaction

I've switched from using Postgresql 7.3.2 on a linux server to 7.3.4 running on my own machine through cygwin.

I've got output from "select * from pg_locks;" but am not getting very far with this. The only locks marked as false do not give a table oid, only the transaction id. i.e.
| | 11515901 | 30440 | ShareLock | f

Will continue working away to see if I can locate the deadlock.

Tim

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tim McAuley (#3)
Re: sequence's plpgsql

Tim McAuley <mcauleyt@tcd.ie> writes:

I see from a previous email on the list that someone was able to get
decent debug out of the locks, i.e.

Aug 10 14:19:36 thunder postgres[18735]: [2-1] ERROR: deadlock detected
Aug 10 14:19:36 thunder postgres[18735]: [2-2] DETAIL: Proc 18735 waits
for AccessExclusiveLock on relation 18028 of database 17140; blocked by

How can I get this?

This display is a new feature in 7.4. Can you try your problem on a
7.4 beta release?

regards, tom lane

#5Tim McAuley
mcauleyt@tcd.ie
In reply to: Tom Lane (#4)
Re: sequence's plpgsql

I see from a previous email on the list that someone was able to get
decent debug out of the locks, i.e.

Aug 10 14:19:36 thunder postgres[18735]: [2-1] ERROR: deadlock detected
Aug 10 14:19:36 thunder postgres[18735]: [2-2] DETAIL: Proc 18735 waits
for AccessExclusiveLock on relation 18028 of database 17140; blocked by

How can I get this?

This display is a new feature in 7.4. Can you try your problem on a
7.4 beta release?

Ah, that would make sense.

Another question now. I am unable to compile Postgresql 7.4 beta 3 under
cygwin (Windows 2K, using cgyipc 2).

I am getting the error:
"
creating information schema... ERROR: end-of-copy marker does not match
previous newline style
CONTEXT: COPY FROM, line 361
"

Any ideas?

Thanks,

Tim

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tim McAuley (#5)
initdb failure (was Re: [GENERAL] sequence's plpgsql)

Tim McAuley <mcauleyt@tcd.ie> writes:

Another question now. I am unable to compile Postgresql 7.4 beta 3 under
cygwin (Windows 2K, using cgyipc 2).

I am getting the error:
"
creating information schema... ERROR: end-of-copy marker does not match
previous newline style
CONTEXT: COPY FROM, line 361
"

That's interesting. COPY is complaining because the \. terminator in
the file it's been fed has a different kind of newline after it than
the newlines earlier in the file (LF vs CR/LF, no doubt). The part
of the initdb script that must be causing this is

echo "COPY information_schema.sql_features (feature_id, feature_name, sub_feature_id, sub_feature_name, is_supported, comments) FROM STDIN;"
cat "$datadir"/sql_features.txt
echo "\."
) \
| "$PGPATH"/postgres $PGSQL_OPT template1 > /dev/null || exit_nicely
echo "ok"

so it appears that cygwin's "echo" generates a different newline style
than what got put into sql_features.txt. A possible way to fix this is
to put the "\." line into sql_features.txt, but maybe there's a cleaner
answer. Peter, any thoughts?

regards, tom lane

#7Matthew T. O'Connor
matthew@zeut.net
In reply to: Tom Lane (#6)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

On Fri, 2003-09-26 at 11:01, Tom Lane wrote:

so it appears that cygwin's "echo" generates a different newline style
than what got put into sql_features.txt. A possible way to fix this is
to put the "\." line into sql_features.txt, but maybe there's a cleaner
answer. Peter, any thoughts?

Does cygwin still have the install time option of what type of line feed
to use? I know at one point (a long time ago) when I installed cygwin,
and chose windows line feeds (CRLF) that it caused problems with several
applications. So the problem might be that with CYGWIN you could have
either type of line feed depending on what the user selected during
install.

#8Michael Meskes
meskes@postgresql.org
In reply to: Tom Lane (#6)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

On Fri, Sep 26, 2003 at 11:01:34AM -0400, Tom Lane wrote:

I am getting the error:
"
creating information schema... ERROR: end-of-copy marker does not match
previous newline style
CONTEXT: COPY FROM, line 361
"

That's interesting. COPY is complaining because the \. terminator in
the file it's been fed has a different kind of newline after it than
the newlines earlier in the file (LF vs CR/LF, no doubt). The part

Well actually I'm not so sure. I tried installing 7.4 from Oliver's
Debian packages and experienced exactly the same. We have yet to find
out why, but I doubt newlines changed between his Debian system and mine.

of the initdb script that must be causing this is

echo "COPY information_schema.sql_features (feature_id, feature_name, sub_feature_id, sub_feature_name, is_supported, comments) FROM STDIN;"
cat "$datadir"/sql_features.txt
echo "\."
) \
| "$PGPATH"/postgres $PGSQL_OPT template1 > /dev/null || exit_nicely
echo "ok"

so it appears that cygwin's "echo" generates a different newline style
than what got put into sql_features.txt. A possible way to fix this is
to put the "\." line into sql_features.txt, but maybe there's a cleaner
answer. Peter, any thoughts?

Putting the "\." line into sql_features.txt did help me, too.

Michael
--
Michael Meskes
Email: Michael at Fam-Meskes dot De
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: meskes@jabber.org
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!

#9Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#6)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Tom Lane writes:

so it appears that cygwin's "echo" generates a different newline style
than what got put into sql_features.txt. A possible way to fix this is
to put the "\." line into sql_features.txt, but maybe there's a cleaner
answer. Peter, any thoughts?

There's no clean answer to this on Cygwin. This specific case is just a
little problem that we could solve locally, but in general you'll just end
up annoying people if you require them to use consistent line endings on
Cygwin.

--
Peter Eisentraut peter_e@gmx.net

#10Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Peter Eisentraut (#9)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Peter Eisentraut wrote:

Tom Lane writes:

so it appears that cygwin's "echo" generates a different newline style
than what got put into sql_features.txt. A possible way to fix this is
to put the "\." line into sql_features.txt, but maybe there's a cleaner
answer. Peter, any thoughts?

There's no clean answer to this on Cygwin. This specific case is just a
little problem that we could solve locally, but in general you'll just end
up annoying people if you require them to use consistent line endings on
Cygwin.

This error is coming from the new 7.4 COPY code that allows \r\n as a
line terminator. Requiring the end-of-line to be consistent seemed to
be the only way to be sure we were not eating a literal carriage return
in the data stream. Let's put the "\." into sql_features.txt and see if
that fixes it, or can we use echo -c "\.\n" in initdb?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#9)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Peter Eisentraut <peter_e@gmx.net> writes:

Tom Lane writes:

so it appears that cygwin's "echo" generates a different newline style
than what got put into sql_features.txt. A possible way to fix this is
to put the "\." line into sql_features.txt, but maybe there's a cleaner
answer. Peter, any thoughts?

There's no clean answer to this on Cygwin. This specific case is just a
little problem that we could solve locally, but in general you'll just end
up annoying people if you require them to use consistent line endings on
Cygwin.

Yeah, I was wondering whether you wouldn't propose dropping the newline
consistency check. I'm not very comfortable with that, but maybe we
should. Bruce?

regards, tom lane

#12Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#11)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Tom Lane wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

Tom Lane writes:

so it appears that cygwin's "echo" generates a different newline style
than what got put into sql_features.txt. A possible way to fix this is
to put the "\." line into sql_features.txt, but maybe there's a cleaner
answer. Peter, any thoughts?

There's no clean answer to this on Cygwin. This specific case is just a
little problem that we could solve locally, but in general you'll just end
up annoying people if you require them to use consistent line endings on
Cygwin.

Yeah, I was wondering whether you wouldn't propose dropping the newline
consistency check. I'm not very comfortable with that, but maybe we
should. Bruce?

I posted on that a few minutes ago. Yea, we can drop it, but we risk
eating carraige returns as data values. I am not sure how consistently
we output literal carriage returns in old dumps, nor how many apps
produce on literal carriage returns in COPY. If we conditionally eat
them, we run the risk of discarding some of their data without warning.
Perhaps we can throw a warning rather than an error, and adjust initdb
to be consistent.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#13Andrew Dunstan
andrew@dunslane.net
In reply to: Bruce Momjian (#12)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Bruce Momjian wrote:

I posted on that a few minutes ago. Yea, we can drop it, but we risk
eating carraige returns as data values. I am not sure how consistently
we output literal carriage returns in old dumps, nor how many apps
produce on literal carriage returns in COPY. If we conditionally eat
them, we run the risk of discarding some of their data without warning.
Perhaps we can throw a warning rather than an error, and adjust initdb
to be consistent.

Would the best longterm solution be to require escaping CR in a data
value? (Yes I know this might cause backwards compatibility problems, at
least for a while).

cheers

andrew

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#12)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Perhaps we can throw a warning rather than an error, and adjust initdb
to be consistent.

I like the idea of reducing the newline consistency check to a warning.
There is one thing we'd have to watch for though (it's already an issue
but would become a bigger one): client-side COPY code had better be
prepared to absorb backend Notice messages while processing COPY IN.
Currently libpq doesn't read input data at all during a COPY IN loop,
which means that if the COPY generates more than a few K of warning
messages, the backend gets blocked on a full pipe and the whole
operation locks up. I have been meaning to fix that in libpq anyway,
but what other client libraries might have the same issue? Anyone know
whether JDBC would need a similar fix?

regards, tom lane

#15Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Andrew Dunstan (#13)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Andrew Dunstan wrote:

Bruce Momjian wrote:

I posted on that a few minutes ago. Yea, we can drop it, but we risk
eating carraige returns as data values. I am not sure how consistently
we output literal carriage returns in old dumps, nor how many apps
produce on literal carriage returns in COPY. If we conditionally eat
them, we run the risk of discarding some of their data without warning.
Perhaps we can throw a warning rather than an error, and adjust initdb
to be consistent.

Would the best longterm solution be to require escaping CR in a data
value? (Yes I know this might cause backwards compatibility problems, at
least for a while).

Yes, we do have that documented, but we supported it in earlier
releases, so we can't be sure who is using it, and it probably exists in
older dumps.

The particular problem is not literal carriage returns, which we don't
support any more, but carriage returns that happen to be at the end of
the line, right up against \n. If we relax this, we will conditionally
strip off the \r.

In fact, we can do that now:

create table xx(text);

Right now, if all the column value end with \r, we will silently eat it.
But if some have \r and some do not, we will throw an error. (Above,
when I say \r, I mean literal carriage return, not the "\r" string,
which we have always handled cleanly.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#16Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#14)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Perhaps we can throw a warning rather than an error, and adjust initdb
to be consistent.

I like the idea of reducing the newline consistency check to a warning.
There is one thing we'd have to watch for though (it's already an issue
but would become a bigger one): client-side COPY code had better be
prepared to absorb backend Notice messages while processing COPY IN.
Currently libpq doesn't read input data at all during a COPY IN loop,
which means that if the COPY generates more than a few K of warning
messages, the backend gets blocked on a full pipe and the whole
operation locks up. I have been meaning to fix that in libpq anyway,
but what other client libraries might have the same issue? Anyone know
whether JDBC would need a similar fix?

Wow, that sounds big. The ERROR will only happen once, while the
WARNING could happen a lot --- we could add code to throw the WARNING
only once per COPY command --- that would probably make sense. I don't
see how we could get all clients to handle this for 7.4, particularly
clients from previous releases.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#17Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Peter Eisentraut (#9)
1 attachment(s)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Peter Eisentraut wrote:

Tom Lane writes:

so it appears that cygwin's "echo" generates a different newline style
than what got put into sql_features.txt. A possible way to fix this is
to put the "\." line into sql_features.txt, but maybe there's a cleaner
answer. Peter, any thoughts?

There's no clean answer to this on Cygwin. This specific case is just a
little problem that we could solve locally, but in general you'll just end
up annoying people if you require them to use consistent line endings on
Cygwin.

Here is a little diff to make initdb behave if we decide to keep the
COPY check.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Attachments:

/bjm/difftext/plainDownload
Index: src/bin/initdb/initdb.sh
===================================================================
RCS file: /cvsroot/pgsql-server/src/bin/initdb/initdb.sh,v
retrieving revision 1.201
diff -c -c -r1.201 initdb.sh
*** src/bin/initdb/initdb.sh	7 Sep 2003 03:36:03 -0000	1.201
--- src/bin/initdb/initdb.sh	26 Sep 2003 18:23:16 -0000
***************
*** 1087,1094 ****
    echo "COPY information_schema.sql_features (feature_id, feature_name, sub_feature_id, sub_feature_name, is_supported, comments) FROM STDIN;"
    cat "$datadir"/sql_features.txt
    echo "\."
! ) \
! 	| "$PGPATH"/postgres $PGSQL_OPT template1 > /dev/null || exit_nicely
  echo "ok"
  
  $ECHO_N "vacuuming database template1... "$ECHO_C
--- 1087,1095 ----
    echo "COPY information_schema.sql_features (feature_id, feature_name, sub_feature_id, sub_feature_name, is_supported, comments) FROM STDIN;"
    cat "$datadir"/sql_features.txt
    echo "\."
! ) |
! tr -d '\r' | # make newlines consistent for Win32
! "$PGPATH"/postgres $PGSQL_OPT template1 > /dev/null || exit_nicely
  echo "ok"
  
  $ECHO_N "vacuuming database template1... "$ECHO_C
#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#16)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Bruce Momjian <pgman@candle.pha.pa.us> writes:

... we could add code to throw the WARNING
only once per COPY command --- that would probably make sense.

Seems like a bit of a kluge, but perhaps the best compromise. It would
be quite likely that you'd get the same warning on many lines of a COPY,
and that probably isn't really going to help people.

regards, tom lane

#19scott.marlowe
scott.marlowe@ihs.com
In reply to: Bruce Momjian (#12)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

On Fri, 26 Sep 2003, Bruce Momjian wrote:

Tom Lane wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

Tom Lane writes:

so it appears that cygwin's "echo" generates a different newline style
than what got put into sql_features.txt. A possible way to fix this is
to put the "\." line into sql_features.txt, but maybe there's a cleaner
answer. Peter, any thoughts?

There's no clean answer to this on Cygwin. This specific case is just a
little problem that we could solve locally, but in general you'll just end
up annoying people if you require them to use consistent line endings on
Cygwin.

Yeah, I was wondering whether you wouldn't propose dropping the newline
consistency check. I'm not very comfortable with that, but maybe we
should. Bruce?

I posted on that a few minutes ago. Yea, we can drop it, but we risk
eating carraige returns as data values. I am not sure how consistently
we output literal carriage returns in old dumps, nor how many apps
produce on literal carriage returns in COPY. If we conditionally eat
them, we run the risk of discarding some of their data without warning.
Perhaps we can throw a warning rather than an error, and adjust initdb
to be consistent.

I'm running into issues where 7.4's pg_dump/pg_dumpall from a 7.2 database
to a 7.4beta3 database is producing some errors like this:

ERROR: literal newline found in data
HINT: Use "\n" to represent newline.
CONTEXT: COPY FROM, line 59

ERROR: literal carriage return found in data
HINT: Use "\r" to represent carriage return.
CONTEXT: COPY FROM, line 41

These show up with little or no context, only the "line number" of the
dump file. Since I'm wrapping these up in pg_dumpall, I don't have the
dump file so I don't know where the error is really occuring. It would be
nice to have such occurances echo the table / row they are getting the
error on, or maybe just the first 20 or so characters, so they'd be easier
to identify.

Is this related to this issue?

#20Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#11)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Tom Lane writes:

Yeah, I was wondering whether you wouldn't propose dropping the newline
consistency check. I'm not very comfortable with that, but maybe we
should. Bruce?

I don't mind if we keep it on pure-POSIX platforms. But one of the nicer
developments on Windows in recent(?) times is that you can actually use
any kind of line separator and most programs still work correctly (with
the notable exception of Notepad). The drawback of this is that you never
really know what kind of line separator a program is actually going to
write, especially when you start mixing Cygwin, MinGW, and native. So
this newline consistency check is not only going to be pretty annoying on
Windows, it's going to cancel a feature of the operating system
environment.

--
Peter Eisentraut peter_e@gmx.net

#21Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#18)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

... we could add code to throw the WARNING
only once per COPY command --- that would probably make sense.

Seems like a bit of a kluge, but perhaps the best compromise. It would
be quite likely that you'd get the same warning on many lines of a COPY,
and that probably isn't really going to help people.

I can't think of any cases where we throw _tons_ of warnings at people,
like we could with COPY.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: scott.marlowe (#19)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

"scott.marlowe" <scott.marlowe@ihs.com> writes:

I'm running into issues where 7.4's pg_dump/pg_dumpall from a 7.2 database
to a 7.4beta3 database is producing some errors like this:

ERROR: literal newline found in data
HINT: Use "\n" to represent newline.
CONTEXT: COPY FROM, line 59

ERROR: literal carriage return found in data
HINT: Use "\r" to represent carriage return.
CONTEXT: COPY FROM, line 41

Really? 7.2 should dump data \r or \n as the backslash versions ...
and does in my tests. Can you make a reproducible test case?

It would be nice to have such occurances echo the table / row they are
getting the error on, or maybe just the first 20 or so characters, so
they'd be easier to identify.

That's not a bad idea. I think it would be fairly easy now for the
CONTEXT line of the error message to include the input data line:

CONTEXT: COPY FROM, line 41: "data here ...."

at least up through the field where the error gets thrown, and with some
limit on the length of the data that will get echoed. If people like
that idea I'll see about making it happen.

regards, tom lane

#23Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#22)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Tom Lane wrote:

ERROR: literal carriage return found in data
HINT: Use "\r" to represent carriage return.
CONTEXT: COPY FROM, line 41

Really? 7.2 should dump data \r or \n as the backslash versions ...
and does in my tests. Can you make a reproducible test case?

It would be nice to have such occurances echo the table / row they are
getting the error on, or maybe just the first 20 or so characters, so
they'd be easier to identify.

That's not a bad idea. I think it would be fairly easy now for the
CONTEXT line of the error message to include the input data line:

CONTEXT: COPY FROM, line 41: "data here ...."

at least up through the field where the error gets thrown, and with some
limit on the length of the data that will get echoed. If people like
that idea I'll see about making it happen.

Also, he wanted table name too.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#20)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Peter Eisentraut <peter_e@gmx.net> writes:

Tom Lane writes:

Yeah, I was wondering whether you wouldn't propose dropping the newline
consistency check. I'm not very comfortable with that, but maybe we
should. Bruce?

I don't mind if we keep it on pure-POSIX platforms. But one of the nicer
developments on Windows in recent(?) times is that you can actually use
any kind of line separator and most programs still work correctly (with
the notable exception of Notepad).

Not sure if we should make the behavior Windows-specific though. And
didn't Michael report seeing the same initdb failure on Debian? That
confuses me a bit --- why would there be a newline discrepancy on Debian?

regards, tom lane

#25Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#24)
Re: initdb failure

Tom Lane wrote:

I don't mind if we keep it on pure-POSIX platforms. But one of the nicer
developments on Windows in recent(?) times is that you can actually use
any kind of line separator and most programs still work correctly (with
the notable exception of Notepad).

Not sure if we should make the behavior Windows-specific though. And
didn't Michael report seeing the same initdb failure on Debian? That
confuses me a bit --- why would there be a newline discrepancy on Debian?

Is this affected by the fact that the platform where a dump is made
might have a different line-end discipline from the one where the dump
is restored?

cheers

andrew

#26scott.marlowe
scott.marlowe@ihs.com
In reply to: Tom Lane (#22)
1 attachment(s)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

On Fri, 26 Sep 2003, Tom Lane wrote:

"scott.marlowe" <scott.marlowe@ihs.com> writes:

I'm running into issues where 7.4's pg_dump/pg_dumpall from a 7.2 database
to a 7.4beta3 database is producing some errors like this:

ERROR: literal newline found in data
HINT: Use "\n" to represent newline.
CONTEXT: COPY FROM, line 59

ERROR: literal carriage return found in data
HINT: Use "\r" to represent carriage return.
CONTEXT: COPY FROM, line 41

Really? 7.2 should dump data \r or \n as the backslash versions ...
and does in my tests. Can you make a reproducible test case?

The attached file produces this problem. Note it's a blank trailing field
that looks to be causing it. The error for this .sql file is:

ERROR: literal carriage return found in data
HINT: Use "\r" to represent carriage return.
CONTEXT: COPY FROM, line 2

Note that loading this into pico and saving it back out fixes the problem.

If I remove the preceding row that doesn't end in a blank field, I get a
different error, this one:

ERROR: end-of-copy marker does not match previous newline style
CONTEXT: COPY FROM, line 2

It would be nice to have such occurances echo the table / row they are
getting the error on, or maybe just the first 20 or so characters, so
they'd be easier to identify.

That's not a bad idea. I think it would be fairly easy now for the
CONTEXT line of the error message to include the input data line:

CONTEXT: COPY FROM, line 41: "data here ...."

at least up through the field where the error gets thrown, and with some
limit on the length of the data that will get echoed. If people like
that idea I'll see about making it happen.

table name too, like Bruce said. The bothersome bit is that in pg_dump,
it says the line, relative to just this part of the copy command, so you
don't even know which table is giving the error.

Attachments:

people2.sqltext/plain; charset=US-ASCII; name=people2.sqlDownload
#27Bruce Momjian
pgman@candle.pha.pa.us
In reply to: scott.marlowe (#26)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

scott.marlowe wrote:

The attached file produces this problem. Note it's a blank trailing field
that looks to be causing it. The error for this .sql file is:

ERROR: literal carriage return found in data
HINT: Use "\r" to represent carriage return.
CONTEXT: COPY FROM, line 2

Note that loading this into pico and saving it back out fixes the problem.

If I remove the preceding row that doesn't end in a blank field, I get a
different error, this one:

ERROR: end-of-copy marker does not match previous newline style
CONTEXT: COPY FROM, line 2

OK, 'vi' shows it as:

COPY people2 (id, persons) FROM stdin;
59 Chance Terry--S
60 ^M
\.

which is _exactly the case the error was supposed to catch. Now, the
big question is where did this dump come from? Pg version? OS platform?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#28scott.marlowe
scott.marlowe@ihs.com
In reply to: Bruce Momjian (#27)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

On Fri, 26 Sep 2003, Bruce Momjian wrote:

scott.marlowe wrote:

The attached file produces this problem. Note it's a blank trailing field
that looks to be causing it. The error for this .sql file is:

ERROR: literal carriage return found in data
HINT: Use "\r" to represent carriage return.
CONTEXT: COPY FROM, line 2

Note that loading this into pico and saving it back out fixes the problem.

If I remove the preceding row that doesn't end in a blank field, I get a
different error, this one:

ERROR: end-of-copy marker does not match previous newline style
CONTEXT: COPY FROM, line 2

OK, 'vi' shows it as:

COPY people2 (id, persons) FROM stdin;
59 Chance Terry--S
60 ^M
\.

which is _exactly the case the error was supposed to catch. Now, the
big question is where did this dump come from? Pg version? OS platform?

The originating system is a RedHat 7.2 box with postgresql 7.2.x running
on it.

The destination system is a RedHat 7.2 box with postgresql 7.4 beta3
running on it.

The data likely came out of a (gasp, horrors) windows box.

#29Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Bruce Momjian (#27)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Bruce Momjian wrote:

scott.marlowe wrote:

The attached file produces this problem. Note it's a blank trailing field
that looks to be causing it. The error for this .sql file is:

ERROR: literal carriage return found in data
HINT: Use "\r" to represent carriage return.
CONTEXT: COPY FROM, line 2

Note that loading this into pico and saving it back out fixes the problem.

If I remove the preceding row that doesn't end in a blank field, I get a
different error, this one:

ERROR: end-of-copy marker does not match previous newline style
CONTEXT: COPY FROM, line 2

OK, 'vi' shows it as:

COPY people2 (id, persons) FROM stdin;
59 Chance Terry--S
60 ^M
\.

The significant issue here is that we don't want to throw a warning in
this case; it should be an error --- this is clearly a data value we
don't want to discard as part of the end-of-line.

The argument that you want a warning because you might have mixed
newlines in the file seems less likely than this case where they are
using a literal carriage return as a data value at the end of the line.

Seems we should just fix initdb as my patch does and see if we get any
more mixed-newline cases or literal carriage-return data cases. My
guess is that we will get more of the latter.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#30Bruce Momjian
pgman@candle.pha.pa.us
In reply to: scott.marlowe (#28)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

scott.marlowe wrote:

OK, 'vi' shows it as:

COPY people2 (id, persons) FROM stdin;
59 Chance Terry--S
60 ^M
\.

which is _exactly the case the error was supposed to catch. Now, the
big question is where did this dump come from? Pg version? OS platform?

The originating system is a RedHat 7.2 box with postgresql 7.2.x running
on it.

The destination system is a RedHat 7.2 box with postgresql 7.4 beta3
running on it.

The data likely came out of a (gasp, horrors) windows box.

OK, try this on your 7.2:

test=> create table test(x text);
CREATE TABLE
test=> insert into test values ('\r');
INSERT 17158 1
test=> copy test to '/tmp/out';
COPY

Then 'vi' /tmp/out. It should show \r, not ^M. Please report back.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#31Michael Meskes
meskes@postgresql.org
In reply to: Tom Lane (#24)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

On Fri, Sep 26, 2003 at 03:05:58PM -0400, Tom Lane wrote:

Not sure if we should make the behavior Windows-specific though. And
didn't Michael report seeing the same initdb failure on Debian? That
confuses me a bit --- why would there be a newline discrepancy on Debian?

I take it there are no special characters in that COPY, right? The only
difference between the two machines we found was that my was running
under UTF-8.

Michael
--
Michael Meskes
Email: Michael at Fam-Meskes dot De
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: meskes@jabber.org
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!

#32scott.marlowe
scott.marlowe@ihs.com
In reply to: Bruce Momjian (#30)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

On Fri, 26 Sep 2003, Bruce Momjian wrote:

scott.marlowe wrote:

OK, 'vi' shows it as:

COPY people2 (id, persons) FROM stdin;
59 Chance Terry--S
60 ^M
\.

which is _exactly the case the error was supposed to catch. Now, the
big question is where did this dump come from? Pg version? OS platform?

The originating system is a RedHat 7.2 box with postgresql 7.2.x running
on it.

The destination system is a RedHat 7.2 box with postgresql 7.4 beta3
running on it.

The data likely came out of a (gasp, horrors) windows box.

OK, try this on your 7.2:

test=> create table test(x text);
CREATE TABLE
test=> insert into test values ('\r');
INSERT 17158 1
test=> copy test to '/tmp/out';
COPY

Then 'vi' /tmp/out. It should show \r, not ^M. Please report back.

I'm not much of a vi guy, so the out file thing didn't tell me much, but
if I try and dump it from the 7.4 beta3 box into a new table, I get:

ERROR: end-of-copy marker does not match previous newline style
CONTEXT: COPY FROM, line 2

#33scott.marlowe
scott.marlowe@ihs.com
In reply to: Bruce Momjian (#30)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

On Fri, 26 Sep 2003, Bruce Momjian wrote:

scott.marlowe wrote:

OK, 'vi' shows it as:

COPY people2 (id, persons) FROM stdin;
59 Chance Terry--S
60 ^M
\.

which is _exactly the case the error was supposed to catch. Now, the
big question is where did this dump come from? Pg version? OS platform?

The originating system is a RedHat 7.2 box with postgresql 7.2.x running
on it.

The destination system is a RedHat 7.2 box with postgresql 7.4 beta3
running on it.

The data likely came out of a (gasp, horrors) windows box.

OK, try this on your 7.2:

test=> create table test(x text);
CREATE TABLE
test=> insert into test values ('\r');
INSERT 17158 1
test=> copy test to '/tmp/out';
COPY

Then 'vi' /tmp/out. It should show \r, not ^M. Please report back.

Figured out iv (-b switch, man pages rock) and it's a ^M

#34Peter Eisentraut
peter_e@gmx.net
In reply to: scott.marlowe (#26)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

scott.marlowe writes:

table name too, like Bruce said. The bothersome bit is that in pg_dump,
it says the line, relative to just this part of the copy command, so you
don't even know which table is giving the error.

I don't see the problem. Can't you identify the failing command by the
line number that psql gives you?

--
Peter Eisentraut peter_e@gmx.net

#35Peter Eisentraut
peter_e@gmx.net
In reply to: Bruce Momjian (#29)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Bruce Momjian writes:

The argument that you want a warning because you might have mixed
newlines in the file seems less likely than this case where they are
using a literal carriage return as a data value at the end of the line.

I don't agree with that assessment. Who actually has CRs in their data?

--
Peter Eisentraut peter_e@gmx.net

#36scott.marlowe
scott.marlowe@ihs.com
In reply to: Peter Eisentraut (#34)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

On Fri, 26 Sep 2003, Peter Eisentraut wrote:

scott.marlowe writes:

table name too, like Bruce said. The bothersome bit is that in pg_dump,
it says the line, relative to just this part of the copy command, so you
don't even know which table is giving the error.

I don't see the problem. Can't you identify the failing command by the
line number that psql gives you?

OK, here's the output from pg_dump while it's running

ERROR: function "odbc_user" already exists with same argument types
REVOKE
REVOKE
GRANT
You are now connected as new user "ayousuff".
ERROR: literal newline found in data
HINT: Use "\n" to represent newline.
CONTEXT: COPY FROM, line 59
You are now connected as new user "smarlowe".
You are now connected as new user "ayousuff".
CREATE INDEX
CREATE INDEX
CREATE INDEX
CREATE INDEX
CREATE INDEX
You are now connected as new user "smarlowe".
CREATE INDEX
CREATE INDEX
CREATE INDEX
CREATE INDEX

So how am I supposed to figure out which table and which row broke? Keep
in mind, I'm doing this:

pg_dump -h otherserver dbname|psql -h desthost dbname

but I get basically the same thing if I dump it to a .sql file and do:

psql dbname <dbname.sql

#37scott.marlowe
scott.marlowe@ihs.com
In reply to: Peter Eisentraut (#35)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

On Fri, 26 Sep 2003, Peter Eisentraut wrote:

Bruce Momjian writes:

The argument that you want a warning because you might have mixed
newlines in the file seems less likely than this case where they are
using a literal carriage return as a data value at the end of the line.

I don't agree with that assessment. Who actually has CRs in their data?

I do. And so do lots of other people.

#38Larry Rosenman
ler@lerctr.org
In reply to: scott.marlowe (#37)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

--On Friday, September 26, 2003 14:36:08 -0600 "scott.marlowe"
<scott.marlowe@ihs.com> wrote:

On Fri, 26 Sep 2003, Peter Eisentraut wrote:

Bruce Momjian writes:

The argument that you want a warning because you might have mixed
newlines in the file seems less likely than this case where they are
using a literal carriage return as a data value at the end of the line.

I don't agree with that assessment. Who actually has CRs in their data?

I do. And so do lots of other people.

I store E-Mail type comments in some of my DB's and they may contain cr's.

Don't ASSUME that we don't do stuff like this with a database.

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 972-414-9812 E-Mail: ler@lerctr.org
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749

#39Peter Eisentraut
peter_e@gmx.net
In reply to: scott.marlowe (#36)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

scott.marlowe writes:

but I get basically the same thing if I dump it to a .sql file and do:

psql dbname <dbname.sql

Use psql -f dbname.sql instead.

--
Peter Eisentraut peter_e@gmx.net

#40scott.marlowe
scott.marlowe@ihs.com
In reply to: Peter Eisentraut (#39)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

On Fri, 26 Sep 2003, Peter Eisentraut wrote:

scott.marlowe writes:

but I get basically the same thing if I dump it to a .sql file and do:

psql dbname <dbname.sql

Use psql -f dbname.sql instead.

and the output is:

psql:webport.sql:803: ERROR: function "odbc_user" already exists with
same argument types
REVOKE
REVOKE
GRANT
You are now connected as new user "ayousuff".
psql:webport.sql:869: ERROR: literal newline found in data
HINT: Use "\n" to represent newline.
CONTEXT: COPY FROM, line 59
You are now connected as new user "smarlowe".
You are now connected as new user "ayousuff".
CREATE INDEX
CREATE INDEX
CREATE INDEX

the same. It doesn't tell me which table in my dump caused the problem,
and it certainly isn't line 59 of the dump file, but of the table
producing the error.

#41scott.marlowe
scott.marlowe@ihs.com
In reply to: scott.marlowe (#40)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

On Fri, 26 Sep 2003, Tom Lane wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

scott.marlowe writes:

but I get basically the same thing if I dump it to a .sql file and do:
psql dbname <dbname.sql

Use psql -f dbname.sql instead.

This doesn't seem like a good argument not to add more information to
the CONTEXT line for COPY errors. Sure, in theory the existing info
should be sufficient, but what if the information is not coming in
through psql? (For instance, maybe the COPY data is being generated
on-the-fly by some other program.) Or what if the dump file is so large
you can't easily edit it to determine which line number is in question?
There are plenty of scenarios where it's not all that convenient to
triangulate on a problem from outside information. Minimalism isn't
really a virtue in error reports anyway.

I'm thinking maybe:

CONTEXT: COPY tablename, line 41: "data ..."

would serve the purpose nicely.

Yeah, just having the table name and line number would be plenty for me.
It's the lack of a table name that makes it so frustrating. I had to
basically dump / restore the tables one at a time to figure out which one
was causing the error. On a database with hundreds of tables, that could
be painful.

#42Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#39)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Peter Eisentraut <peter_e@gmx.net> writes:

scott.marlowe writes:

but I get basically the same thing if I dump it to a .sql file and do:
psql dbname <dbname.sql

Use psql -f dbname.sql instead.

This doesn't seem like a good argument not to add more information to
the CONTEXT line for COPY errors. Sure, in theory the existing info
should be sufficient, but what if the information is not coming in
through psql? (For instance, maybe the COPY data is being generated
on-the-fly by some other program.) Or what if the dump file is so large
you can't easily edit it to determine which line number is in question?
There are plenty of scenarios where it's not all that convenient to
triangulate on a problem from outside information. Minimalism isn't
really a virtue in error reports anyway.

I'm thinking maybe:

CONTEXT: COPY tablename, line 41: "data ..."

would serve the purpose nicely.

regards, tom lane

#43Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#35)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Peter Eisentraut <peter_e@gmx.net> writes:

Bruce Momjian writes:

The argument that you want a warning because you might have mixed
newlines in the file seems less likely than this case where they are
using a literal carriage return as a data value at the end of the line.

I don't agree with that assessment. Who actually has CRs in their data?

The question isn't so much "who has CRs in their data" as "who is trying
to import data files in which CRs aren't correctly represented as \r" ?
Not anyone upgrading from a recent PG release ... though 7.1 or before
would have an issue.

If we eliminate the mixed-newline check then we will silently discard
CRs that appear at the tail end of the last column of a data line. This
is somewhat worrisome because the failure will not be obvious. But in
most cases (CR in the middle of a data field, or in a field that isn't
last), we will detect and report an error due to field count mismatch.
I think that that might be good enough.

If we were designing COPY from scratch today, we'd certainly not think
that a mixed-newline check was appropriate; it's only because of our
historical handling of literal CRs that the question comes up at all.

regards, tom lane

#44Peter Eisentraut
peter_e@gmx.net
In reply to: scott.marlowe (#40)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

scott.marlowe writes:

psql:webport.sql:803: ERROR: function "odbc_user" already exists with
same argument types
REVOKE
REVOKE
GRANT
You are now connected as new user "ayousuff".
psql:webport.sql:869: ERROR: literal newline found in data
HINT: Use "\n" to represent newline.
CONTEXT: COPY FROM, line 59
You are now connected as new user "smarlowe".
You are now connected as new user "ayousuff".
CREATE INDEX
CREATE INDEX
CREATE INDEX

the same. It doesn't tell me which table in my dump caused the problem,
and it certainly isn't line 59 of the dump file, but of the table
producing the error.

The table is the one the COPY command on line 869 in file webport.sql
mentions and the problem data is on line 928 of that file.

--
Peter Eisentraut peter_e@gmx.net

#45Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#43)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

I said:

The question isn't so much "who has CRs in their data" as "who is trying
to import data files in which CRs aren't correctly represented as \r" ?
Not anyone upgrading from a recent PG release ... though 7.1 or before
would have an issue.

Actually, checking the CVS logs shows that 7.2.1 was the first release
that would emit a data CR or LF as \r or \n rather than literally. If
Scott is dumping from an original 7.2 installation rather than a dot-
release, that would explain his report.

I talked to Bruce about this on the phone, and we now both feel that
that change was too recent to assume that people won't be trying to load
dumps containing bare CRs into 7.4. Accordingly, it seems that we do
need to throw an error or at least a warning for mixed newlines.
Eventually we can get rid of that behavior and just treat all newline
sequences alike, but I think we can't do it until 7.2.0 is in the
"ancient history" category.

regards, tom lane

#46Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#42)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Tom Lane writes:

This doesn't seem like a good argument not to add more information to
the CONTEXT line for COPY errors. Sure, in theory the existing info
should be sufficient, but what if the information is not coming in
through psql? (For instance, maybe the COPY data is being generated
on-the-fly by some other program.)

Then you look into the code of that other program. Or you look into the
server log, where you should log the statements you generate if you are
testing your code.

Or what if the dump file is so large you can't easily edit it to
determine which line number is in question?

The line number is already contained in the available information about
the error. If the file is too large to load into an editor, you could use

perl -npi -e '$. == 123456 and s/\r/\\r/;'

or something along those lines.

There are plenty of scenarios where it's not all that convenient to
triangulate on a problem from outside information.

Maybe, but the ones I've seen mentioned so far are not among them. I'm
not opposed to making errors more easy to identify, but considering that
someone else in this thread didn't even know about psql's option to print
line numbers of errors, I think some people haven't done their homework
yet.

Minimalism isn't really a virtue in error reports anyway.

I'm thinking maybe:

CONTEXT: COPY tablename, line 41: "data ..."

would serve the purpose nicely.

The only thing that would really help in the general case is the number of
the character where the error occurs. If you print the actual data you
lose if the data is repeated within the, er, data and/or if the section of
the data that you print contains crazy characters that mess up the
display.

--
Peter Eisentraut peter_e@gmx.net

#47Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Peter Eisentraut (#34)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Peter Eisentraut wrote:

scott.marlowe writes:

table name too, like Bruce said. The bothersome bit is that in pg_dump,
it says the line, relative to just this part of the copy command, so you
don't even know which table is giving the error.

I don't see the problem. Can't you identify the failing command by the
line number that psql gives you?

If you are loading from pg_dump, you have lots of copy commands, so how
do you know which COPY command caused the failure. You just have the
line number of _a_ copy.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#48Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Peter Eisentraut (#35)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Peter Eisentraut wrote:

Bruce Momjian writes:

The argument that you want a warning because you might have mixed
newlines in the file seems less likely than this case where they are
using a literal carriage return as a data value at the end of the line.

I don't agree with that assessment. Who actually has CRs in their data?

Uh, people storing Win32 text documents.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#49Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#46)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Peter Eisentraut <peter_e@gmx.net> writes:

Minimalism isn't really a virtue in error reports anyway.
I'm thinking maybe:
CONTEXT: COPY tablename, line 41: "data ..."
would serve the purpose nicely.

The only thing that would really help in the general case is the number of
the character where the error occurs. If you print the actual data you
lose if the data is repeated within the, er, data and/or if the section of
the data that you print contains crazy characters that mess up the
display.

A messed-up display would be useful information in itself; and it's not
like we have no precedent for repeating the erroneous data in error
messages. Consider pg_atoi:

regression=# create table foo (f1 int);
CREATE TABLE
regression=# copy foo from stdin;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.

z44
\.

ERROR: invalid input syntax for integer: "z44"
CONTEXT: COPY FROM, line 1
regression=#

I was chewing this over with Bruce on the phone just now, and we refined
the idea a little. Some errors (primarily those detected inside the
datatype input procedures) can be clearly traced to a specific column,
whereas others (such as too many fields on an input line) really can't
be nailed down more tightly than a line. So we were thinking about two
different flavors of context info:

CONTEXT: COPY tablename, line n, field colname: "column data"

versus

CONTEXT: COPY tablename, line n: "line data"

where in each case the data display would be truncated at 100 characters
or so (and would have to be omitted in the COPY BINARY case anyway).

If you're really concerned about funny characters messing up the report,
we could imagine replacing them by backslash sequences or some such, but
I suspect that would create more confusion than it would solve.

regards, tom lane

#50Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Peter Eisentraut (#9)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Peter Eisentraut wrote:

Tom Lane writes:

so it appears that cygwin's "echo" generates a different newline style
than what got put into sql_features.txt. A possible way to fix this is
to put the "\." line into sql_features.txt, but maybe there's a cleaner
answer. Peter, any thoughts?

There's no clean answer to this on Cygwin. This specific case is just a
little problem that we could solve locally, but in general you'll just end
up annoying people if you require them to use consistent line endings on
Cygwin.

At this point, we have one mixed-EOL case (initdb, fixable), and one \r
in data case (from 7.2). Seems the safest solution is to fix initdb and
see what other failure reports we get.

If we downgrade it to a warning, we will not know about the failures;
someone could miss a warning in a dump, but probably will not mess an
error and an empty table. If we get more reports like initdb, we can
reevaluate, but it seems the safest course is to keep our existing code,
which only removes carriage returns when it quite sure.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#51Jon Jensen
jon@endpoint.com
In reply to: Tom Lane (#49)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

On Fri, 26 Sep 2003, Tom Lane wrote:

I was chewing this over with Bruce on the phone just now, and we refined
the idea a little. Some errors (primarily those detected inside the
datatype input procedures) can be clearly traced to a specific column,
whereas others (such as too many fields on an input line) really can't
be nailed down more tightly than a line. So we were thinking about two
different flavors of context info:

CONTEXT: COPY tablename, line n, field colname: "column data"

versus

CONTEXT: COPY tablename, line n: "line data"

Those would be very helpful bits of information.

where in each case the data display would be truncated at 100 characters
or so (and would have to be omitted in the COPY BINARY case anyway).

If you're really concerned about funny characters messing up the report,
we could imagine replacing them by backslash sequences or some such, but
I suspect that would create more confusion than it would solve.

I hate to mention it, but would it be useful/non-overkill to make either
of those things (context message maximum length and funny character
escaping) configurable somehow?

Jon

#52Kris Jurka
books@ejurka.com
In reply to: Tom Lane (#14)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

On Fri, 26 Sep 2003, Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Perhaps we can throw a warning rather than an error, and adjust initdb
to be consistent.

I like the idea of reducing the newline consistency check to a warning.
There is one thing we'd have to watch for though (it's already an issue
but would become a bigger one): client-side COPY code had better be
prepared to absorb backend Notice messages while processing COPY IN.
Currently libpq doesn't read input data at all during a COPY IN loop,
which means that if the COPY generates more than a few K of warning
messages, the backend gets blocked on a full pipe and the whole
operation locks up. I have been meaning to fix that in libpq anyway,
but what other client libraries might have the same issue? Anyone know
whether JDBC would need a similar fix?

JDBC does not support the COPY protocol at the moment.

Kris Jurka

#53Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Jon Jensen (#51)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Jon Jensen wrote:

On Fri, 26 Sep 2003, Tom Lane wrote:

I was chewing this over with Bruce on the phone just now, and we refined
the idea a little. Some errors (primarily those detected inside the
datatype input procedures) can be clearly traced to a specific column,
whereas others (such as too many fields on an input line) really can't
be nailed down more tightly than a line. So we were thinking about two
different flavors of context info:

CONTEXT: COPY tablename, line n, field colname: "column data"

versus

CONTEXT: COPY tablename, line n: "line data"

Those would be very helpful bits of information.

where in each case the data display would be truncated at 100 characters
or so (and would have to be omitted in the COPY BINARY case anyway).

If you're really concerned about funny characters messing up the report,
we could imagine replacing them by backslash sequences or some such, but
I suspect that would create more confusion than it would solve.

I hate to mention it, but would it be useful/non-overkill to make either
of those things (context message maximum length and funny character
escaping) configurable somehow?

Overkill. Sorry.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#54Oliver Elphick
olly@lfix.co.uk
In reply to: Bruce Momjian (#47)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

On Fri, 2003-09-26 at 23:18, Bruce Momjian wrote:

If you are loading from pg_dump, you have lots of copy commands, so how
do you know which COPY command caused the failure. You just have the
line number of _a_ copy.

I would recommend using
psql -e
so that the sql commands are output too.

Here is a documentation patch:

Index: backup.sgml
===================================================================
RCS file: /projects/cvsroot/pgsql-server/doc/src/sgml/backup.sgml,v
retrieving revision 2.30
diff -c -r2.30 backup.sgml
*** backup.sgml	31 Aug 2003 17:32:18 -0000	2.30
--- backup.sgml	27 Sep 2003 05:45:04 -0000
***************
*** 100,106 ****
      be read in by the <application>psql</application> program. The
      general command form to restore a dump is
  <synopsis>
! psql <replaceable class="parameter">dbname</replaceable> &lt; <replaceable class="parameter">infile</replaceable>
  </synopsis>
      where <replaceable class="parameter">infile</replaceable> is what
      you used as <replaceable class="parameter">outfile</replaceable>
--- 100,106 ----
      be read in by the <application>psql</application> program. The
      general command form to restore a dump is
  <synopsis>
! psql -f <replaceable class="parameter">infile</replaceable> -d <replaceable class="parameter">dbname</replaceable> 
  </synopsis>
      where <replaceable class="parameter">infile</replaceable> is what
      you used as <replaceable class="parameter">outfile</replaceable>
***************
*** 112,117 ****
--- 112,124 ----
      <application>psql</> supports similar options to <application>pg_dump</> 
      for controlling the database server location and the user name. See
      its reference page for more information.
+    </para>
+ 
+    <para>
+     With a large dump, it may be difficult to identify where any errors are
+     occurring.  You may use the -e option to psql to print the SQL commands
+     as they are run, so that it is easy to see precisely which commands are
+     causing errors.
     </para>

<para>

--
Oliver Elphick Oliver.Elphick@lfix.co.uk
Isle of Wight, UK http://www.lfix.co.uk/oliver
GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839 932A 614D 4C34 3E1D 0C1C
========================================
"My brethren, count it all joy when ye fall into
various trials, Knowing that the testing of your faith
produces endurance." James 1:2,3

#55Peter Eisentraut
peter_e@gmx.net
In reply to: Jon Jensen (#51)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Jon Jensen writes:

I hate to mention it, but would it be useful/non-overkill to make either
of those things (context message maximum length and funny character
escaping) configurable somehow?

I'm thinking the same. We have conflicting needs of different groups:
Users on "real" Unix platforms converting old data need the checks, users
on Windows platforms with new data don't want them. (I wonder what people
on Mac OS that mix Unix-style and native applications would want.)

--
Peter Eisentraut peter_e@gmx.net

#56Peter Eisentraut
peter_e@gmx.net
In reply to: Oliver Elphick (#54)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Oliver Elphick writes:

+    <para>
+     With a large dump, it may be difficult to identify where any errors are
+     occurring.  You may use the -e option to psql to print the SQL commands
+     as they are run, so that it is easy to see precisely which commands are
+     causing errors.
</para>

That is just not true. If you use -f, it will tell you the line number of
the command causing the error. Add the line number of the COPY error
message, there you have it.

--
Peter Eisentraut peter_e@gmx.net

#57Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Bruce Momjian (#50)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Bruce Momjian wrote:

Peter Eisentraut wrote:

Tom Lane writes:

so it appears that cygwin's "echo" generates a different newline style
than what got put into sql_features.txt. A possible way to fix this is
to put the "\." line into sql_features.txt, but maybe there's a cleaner
answer. Peter, any thoughts?

There's no clean answer to this on Cygwin. This specific case is just a
little problem that we could solve locally, but in general you'll just end
up annoying people if you require them to use consistent line endings on
Cygwin.

At this point, we have one mixed-EOL case (initdb, fixable), and one \r
in data case (from 7.2). Seems the safest solution is to fix initdb and
see what other failure reports we get.

If we downgrade it to a warning, we will not know about the failures;
someone could miss a warning in a dump, but probably will not mess an
error and an empty table. If we get more reports like initdb, we can
reevaluate, but it seems the safest course is to keep our existing code,
which only removes carriage returns when it quite sure.

OK, I have applied a patch to make initdb create a COPY file with
consistent end-of-line termination, so it passes the COPY EOL checking.

If we find more cases where load files have intended inconsistent EOL
termination, we can revisit the COPY error for inconsistent EOL
termination.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#58Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Peter Eisentraut (#56)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Peter Eisentraut wrote:

Oliver Elphick writes:

+    <para>
+     With a large dump, it may be difficult to identify where any errors are
+     occurring.  You may use the -e option to psql to print the SQL commands
+     as they are run, so that it is easy to see precisely which commands are
+     causing errors.
</para>

That is just not true. If you use -f, it will tell you the line number of
the command causing the error. Add the line number of the COPY error
message, there you have it.

You are assuming it is easy to find what is on a specific line of the
dump file. I am not sure that is always easy for people with limited
Unix skills, or MSWin folks. I am not sure I would have thought to add
the file offset to find the problem COPY line. I guess I would have
eventually, but it wouldn't have been my first idea, and I might _not_
have used -f on the load, and if the load took an hour, I would have to
run it again to get that line number.

I can see the point that the table name is only really valuable when you
are loading a dump, and not valuabvle when you are just doing a copy.
However, copy is used enought in dumps that the exta word (the table
name) doesn't see to hurt.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#59Peter Eisentraut
peter_e@gmx.net
In reply to: Bruce Momjian (#58)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Bruce Momjian writes:

You are assuming it is easy to find what is on a specific line of the
dump file. I am not sure that is always easy for people with limited
Unix skills, or MSWin folks. I am not sure I would have thought to add
the file offset to find the problem COPY line. I guess I would have
eventually, but it wouldn't have been my first idea, and I might _not_
have used -f on the load, and if the load took an hour, I would have to
run it again to get that line number.

That is all besides the point. If adding -f to the command line is for
some reason prohibitive, then the same applies to -e. That is all.

--
Peter Eisentraut peter_e@gmx.net

#60Peter Eisentraut
peter_e@gmx.net
In reply to: Bruce Momjian (#57)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Bruce Momjian writes:

OK, I have applied a patch to make initdb create a COPY file with
consistent end-of-line termination, so it passes the COPY EOL checking.

Maybe instead we can change the COPY command to read the file directly and
not via stdin? Then we don't need the \. marker.

--
Peter Eisentraut peter_e@gmx.net

#61Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Peter Eisentraut (#59)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Peter Eisentraut wrote:

Bruce Momjian writes:

You are assuming it is easy to find what is on a specific line of the
dump file. I am not sure that is always easy for people with limited
Unix skills, or MSWin folks. I am not sure I would have thought to add
the file offset to find the problem COPY line. I guess I would have
eventually, but it wouldn't have been my first idea, and I might _not_
have used -f on the load, and if the load took an hour, I would have to
run it again to get that line number.

That is all besides the point. If adding -f to the command line is for
some reason prohibitive, then the same applies to -e. That is all.

I see, both -e give query before error, -f gives line number before
error. I suppose the -e is clearer because you don't have to find the
line in the file, but the -e output makes it more likely they would miss
an error line in the output.

Seems we should recommend -f rather than "<" for restores anyway, right?

Reporting the table with the error is clearer, but this brings up
another case --- what happens with pg_dumpall? Do we print the database
name or will they know the database name from the table name?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#62Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Peter Eisentraut (#60)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Peter Eisentraut wrote:

Bruce Momjian writes:

OK, I have applied a patch to make initdb create a COPY file with
consistent end-of-line termination, so it passes the COPY EOL checking.

Maybe instead we can change the COPY command to read the file directly and
not via stdin? Then we don't need the \. marker.

Yes, we need a temp table then. The single-line fix seemed easier.

Also, the file is a mix of things from a file and echo's, it isn't just
the \.:

echo "UPDATE information_schema.sql_implementation_info SET character_value = '$combined_version' WHERE implementation_info_name = 'DBMS VERSION';"

echo "COPY information_schema.sql_features (feature_id, feature_name, sub_feature_id, sub_feature_name, is_supported, comments) FROM STDIN;"
cat "$datadir"/sql_features.txt
echo "\."
) |
tr -d '\r' | # make newlines consistent for Win32

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#63Alvaro Herrera
alvherre@dcc.uchile.cl
In reply to: Bruce Momjian (#61)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

On Sat, Sep 27, 2003 at 01:08:37PM -0400, Bruce Momjian wrote:

Peter Eisentraut wrote:

That is all besides the point. If adding -f to the command line is for
some reason prohibitive, then the same applies to -e. That is all.

Seems we should recommend -f rather than "<" for restores anyway, right?

Please keep in mind that it is not always possible to use -f. In my
case I had a file larger than 2GB and psql was compiled without large
file support. Detecting an error in that situation would have required
recompiling psql.

Also I don't see the point in not extending the context message of the
error. It's not like it's going to take too much processing power, nor
screen estate, so what is it?

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Right now the sectors on the hard disk run clockwise, but I heard a rumor that
you can squeeze 0.2% more throughput by running them counterclockwise.
It's worth the effort. Recommended." (Gerry Pourwelle)

#64Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#62)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Peter Eisentraut wrote:

Maybe instead we can change the COPY command to read the file directly and
not via stdin? Then we don't need the \. marker.

I like that idea too.

Yes, we need a temp table then. The single-line fix seemed easier.

Why a temp table? The COPY command is only sourcing the contents of
"$datadir"/sql_features.txt.

regards, tom lane

#65Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#55)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Peter Eisentraut <peter_e@gmx.net> writes:

Jon Jensen writes:

I hate to mention it, but would it be useful/non-overkill to make either
of those things (context message maximum length and funny character
escaping) configurable somehow?

I'm thinking the same. We have conflicting needs of different groups:
Users on "real" Unix platforms converting old data need the checks, users
on Windows platforms with new data don't want them. (I wonder what people
on Mac OS that mix Unix-style and native applications would want.)

I wouldn't object to adding a GUC variable that turns off the
mixed-newlines error (as long as it defaults to enabled ;-)).

Our intention to remove the check in the future might then reduce
to simply flipping the factory default for this variable.

regards, tom lane

#66Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#64)
Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Peter Eisentraut wrote:

Maybe instead we can change the COPY command to read the file directly and
not via stdin? Then we don't need the \. marker.

I like that idea too.

Yes, we need a temp table then. The single-line fix seemed easier.

Why a temp table? The COPY command is only sourcing the contents of
"$datadir"/sql_features.txt.

Sorry, I meant temp file. I have SQL on the brain. :-)

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#67Christopher Browne
cbbrowne@libertyrms.info
In reply to: Peter Eisentraut (#35)
Re: initdb failure

pgman@candle.pha.pa.us (Bruce Momjian) writes:

Peter Eisentraut wrote:

Bruce Momjian writes:

The argument that you want a warning because you might have mixed
newlines in the file seems less likely than this case where they are
using a literal carriage return as a data value at the end of the line.

I don't agree with that assessment. Who actually has CRs in their data?

Uh, people storing Win32 text documents.

Or people storing XML data that may contain all sorts of whitespace
that may include various inter-line "metadata."
--
let name="cbbrowne" and tld="libertyrms.info" in String.concat "@" [name;tld];;
<http://dev6.int.libertyrms.com/&gt;
Christopher Browne
(416) 646 3304 x124 (land)