COPY error: pqReadData() -- backend closed the channel unexpectedly

Started by Lee Joramoover 25 years ago15 messagesgeneral
Jump to latest
#1Lee Joramo
lee.list@joramo.com

I am running to the following error when copying a file to a table:
"backend closed the channel unexpectedly"
____________

My basic system info:
[PostgreSQL 6.5.2 on powerpc-unknown-linux-gnu, compiled by gcc 2.95.2]
____________

The Table:
able    = classifieds
+----------------------------------+-
---------------------------------+-------+
|              Field               |              Type                |
Length|
+----------------------------------+-
---------------------------------+-------+
| category_number                  | int4                             | 
   4 |
| bullet                           | char()                           | 
   1 |
| border                           | bool                             | 
   1 |
| image                            | varchar()                        | 
  32 |
| rmc                              | bool                             | 
   1 |
| adcopy                           | text                             | 
 var |
+----------------------------------+-
---------------------------------+-------+
Indices:  classifieds_adcopy
          classifieds_bullet
          classifieds_category_number
          classifieds_image
____________

A copy of the psql command and error message:

rmcars2=> DELETE FROM classifieds;
rmcars2=> COPY classifieds FROM '/path/classifieds.dat';
pqReadData() -- backend closed the channel unexpectedly.
This probably means the backend terminated abnormally
before or while processing the request.
We have lost the connection to the backend, so further processing is
impossible. Terminating.
____________

The 'classifieds.dat' consists of about 2200 lines. I have determined
that the problem is caused by just the following line (literal tabs have
been replaced with <TAB>):

825<TAB><TAB>f<TAB><TAB>f<TAB>Need more growing room ? Cozy up by one of
2 fireplaces, and stay warm this winter! This beautifully maintained
rancher, in paradise hills, offers 3 bedrooms, 2 baths, dining and family
rooms, and a large eat-in kitchen loaded with cupboards. 1844 square feet
of comfortable living on a large corner lot make this the perfect home
for the growing family. All these amenities and more at an affordable
price of only $144,900. Listing#00-5968 Call Shirley McGuiness 255-3810
or 254-8074 Keller Williams Realty
____________

Additional information:

The 'classifieds.dat' file is generated from our classified ad management
system. I am pulling this data into postgre for publication on the web.
The 'classifieds.dat' files generate from two previous weeks work just
fine. (And still do!)

If I remove the offending line form the file, the COPY command works just
fine.

After I isolated the line causing the problems, I assumed that I would
quickly find the source of the problem, but I have not. Here is what I
have done to isolated the problem:
* I inspected the line for wayward quotes or escape characters.
none found
* I have tried truncating the line until it work.
Results were strangely inconsistent
* I substituted letters for punctuation.
no effect
* substituted letters and numbers with the letter 'a'
It worked!
* substituted spaces with '_'
It worked!!!!! (I don't understand this!)

The above where each performed on a copy of the original bad line.

I have searched the mailing list archive, I did find number of messages
regarding the same error message, but none that seem to apply to my
situation. I also learned in the archive that when a 'backend closed'
error occurs I should be able to find a 'core' file located in the
database's directory. But I have not found a core file.

thanks
--
Lee A. Joramo ljoramo@nickads.com
The Nickel Want Ads www.nickads.com
Internet Manager 970-242-5555

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Lee Joramo (#1)
Re: COPY error: pqReadData() -- backend closed the channel unexpectedly

Lee Joramo <lee.list@joramo.com> writes:

[PostgreSQL 6.5.2 on powerpc-unknown-linux-gnu, compiled by gcc 2.95.2]

Hm. Did you compile at -O0? Pre-7.1 PG is known to have a lot of
problems on PPC if compiled with any optimization at all.

The 'classifieds.dat' consists of about 2200 lines. I have determined
that the problem is caused by just the following line (literal tabs have
been replaced with <TAB>):

Are there any lines with more than 2700 characters worth of ad copy?
Pre-7.1 PG has a limit of 1/3 page or about 2700 bytes for any indexed
column ... and 6.5 tends to fall over rather than give an error if you
exceed the limit :-(

This particular line is well below that, but you could still see the
problem appear or disappear depending on which entries happen to fall
on the same disk page, so subtracting a line that isn't directly causing
the problem might be enough to mask the bug.

If that's not it, I'm not sure ... but I'd still recommend updating to
7.0.3 just on general principles.

I also learned in the archive that when a 'backend closed'
error occurs I should be able to find a 'core' file located in the
database's directory. But I have not found a core file.

On many Linuxes, processes started from boot scripts are by default
started with "ulimit -c 0", which prevents creation of core files.
You may need to say "ulimit -c unlimited" in the postmaster startup
script to allow creation of corefiles.

regards, tom lane

#3Lee Joramo
lee.list@joramo.com
In reply to: Tom Lane (#2)
Re: COPY error: pqReadData() -- backend closed the channel unexpectedly

Thanks for the helpful feedback.

[PostgreSQL 6.5.2 on powerpc-unknown-linux-gnu, compiled by gcc 2.95.2]

Hm. Did you compile at -O0? Pre-7.1 PG is known to have a lot of
problems on PPC if compiled with any optimization at all.

I don't remember. I installed it nine months ago. I normally keep notes
of what I do, but I don't seem to have recorded what I did while
installing postgre. Most likely I do not have specfic notes for the
postgre installation because because it was part of the initial LinuxPCC
2000 installation from CD.

The 'classifieds.dat' consists of about 2200 lines. I have determined
that the problem is caused by just the following line (literal tabs have
been replaced with <TAB>):

Are there any lines with more than 2700 characters worth of ad copy?
Pre-7.1 PG has a limit of 1/3 page or about 2700 bytes for any indexed
column ... and 6.5 tends to fall over rather than give an error if you
exceed the limit :-(

The longest line in the 'classifieds.dat' file is 1152 characters and the
corrosponding adcopy field is 1140 characters. So I should be well under
this problem. (That being said, I am glad to hear about this problem so
that I can catch it before it occurs.)

This particular line is well below that, but you could still see the
problem appear or disappear depending on which entries happen to fall
on the same disk page, so subtracting a line that isn't directly causing
the problem might be enough to mask the bug.

Except, that the file loads just fine without the offending line, _AND_
when the the offending line is loaded by itself it causes the 'backend
closed' error. I beleive, that the way that I isolated the bad line also
precludes your idea. I found the bad line by repeatedly cutting the
'classifieds.dat' file in halfs, until I isolated the bad line.

If that's not it, I'm not sure ... but I'd still recommend updating to
7.0.3 just on general principles.

I sure agree with that. Unfortunately, our public web server which is a
Cobalt RaQ3 that runs 6.5.2 as well. I am not going to attempt an upgrade
of the Cobalt, since much of the systems admin interface is driven by
postgre. (Plus this machine is hosted by a far away company.) I am
beginning to work up plans to change this situation, but until then I am
stuck with 6.5.2

On many Linuxes, processes started from boot scripts are by default
started with "ulimit -c 0", which prevents creation of core files.
You may need to say "ulimit -c unlimited" in the postmaster startup
script to allow creation of corefiles.

Thanks, I give that a try.

--
Lee A. Joramo ljoramo@nickads.com
The Nickel Want Ads www.nickads.com
Internet Manager 970-242-5555

#4Lee Joramo
lee.list@joramo.com
In reply to: Tom Lane (#2)
Re: COPY error: pqReadData() -- backend closed the channel unexpectedly

Thanks for the feed back.

Just a guess, but is there an embedded tab in the last field that you are
mistaking for a space? I've done that before and pulled my hair out trying
to figure out the issue.

Nope. When I quoted the bad line in my orignal message, I replaced
literal tabs with <TAB> with search and replace in my text editor. The
tabs are just where they should be. (The bad line is quoted at the end of
this message.)

Any other info in the log file that might give you a clue? Typically I find
some entry about a failed atoi (ascii to int) or something that can get me
pointed in the right direction.

I have looked for a proper log file, and can find own. On inspecting the
startup script it looks like postmaster is being involved with the
following command:

su -l postgres -c '/usr/bin/postmaster -i -S -D/var/lib/pgsql'

For a log file, shouldn't it be something like:

su -l postgres -c '/usr/bin/postmaster -i -S -D/var/lib/pgsql >
path/logfile'

I have felt your pain before. It's something simple that you will kick
yourself over. Best of luck.

I have felt this pain before too. I hope that I end up kicking myself and
not cursing postgre ;-)

--
Lee A. Joramo ljoramo@nickads.com
The Nickel Want Ads www.nickads.com
Internet Manager 970-242-5555

Show quoted text

825<TAB><TAB>f<TAB><TAB>f<TAB>Need more growing room ? Cozy up by one of
2 fireplaces, and stay warm this winter! This beautifully maintained
rancher, in paradise hills, offers 3 bedrooms, 2 baths, dining and family
rooms, and a large eat-in kitchen loaded with cupboards. 1844 square feet
of comfortable living on a large corner lot make this the perfect home
for the growing family. All these amenities and more at an affordable
price of only $144,900. Listing#00-5968 Call Shirley McGuiness 255-3810
or 254-8074 Keller Williams Realty

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Lee Joramo (#4)
Re: Re: COPY error: pqReadData() -- backend closed the channel unexpectedly

Lee Joramo <lee.list@joramo.com> writes:

I have looked for a proper log file, and can find own. On inspecting the
startup script it looks like postmaster is being involved with the
following command:

su -l postgres -c '/usr/bin/postmaster -i -S -D/var/lib/pgsql'

-S suppresses all log output, so first you need to get rid of that.
A more useful startup script would go like

su -l postgres -c \
'/usr/bin/postmaster -i -D/var/lib/pgsql >/path/to/logfile 2>&1 &'

Note the trailing & to get it to run in background --- you need that
when you don't use -S.

regards, tom lane

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Lee Joramo (#3)
Re: COPY error: pqReadData() -- backend closed the channel unexpectedly

Lee Joramo <lee.list@joramo.com> writes:

Most likely I do not have specfic notes for the
postgre installation because because it was part of the initial LinuxPCC
2000 installation from CD.

If you can confirm that the server you are running is from the LinuxPPC
2000 CD, then I will try to duplicate the problem here ... I have that
same Linux distro running on a laptop, but never bothered to fire up
its PgSQL, since it's so out of date :-(.

I will need the full declaration of the table (most reliable way to show
this is "pg_dump -s -t tablename databasename").

regards, tom lane

#7Lee Joramo
lee.list@joramo.com
In reply to: Tom Lane (#5)
Re: Re: COPY error: pqReadData() -- backend closed the channel unexpectedly

And the solution is:

DELETE INDEX classifieds_adcopy

By deleteing the index, everything started to work correctly. Even after
recreating the index, everyting worked after multiple tests.

I came to this conclusion after much examination of the 'bad line' that
was causing my problems. I created a simple Python script to loop over
the 'adcopy' field of the bad line. For every loop it would add another
character of the adcopy, save it to a file and then issue a "COPY
classifieds FROM 'file'" to import the single line. For example the
following 8 lines represent the first 8 files that were generated:

825<TAB><TAB>f<TAB><TAB>f<TAB>N
825<TAB><TAB>f<TAB><TAB>f<TAB>Ne
825<TAB><TAB>f<TAB><TAB>f<TAB>Nee
825<TAB><TAB>f<TAB><TAB>f<TAB>Need
825<TAB><TAB>f<TAB><TAB>f<TAB>Need
825<TAB><TAB>f<TAB><TAB>f<TAB>Need m
825<TAB><TAB>f<TAB><TAB>f<TAB>Need mo
825<TAB><TAB>f<TAB><TAB>f<TAB>Need more

I also preformed this test with many other lines of data that I believed
to be good. The results of these tests confirmed to me that only the one
bad line was responsible. The good lines always loaded, the bad line
always failed. However as I repeatedly preformed this test on 'bad line',
it failed in an inconsistent way:

Failure 1:
825<TAB><TAB>f<TAB><TAB>f<TAB>Need more growin
Failure 2:
825<TAB><TAB>f<TAB><TAB>f<TAB>Nee
Failure 3:
825<TAB><TAB>f<TAB><TAB>f<TAB>Need more growing room ? Cozy up by one of
And so on ...

At this point, I thought that maybe the index was corrupted.

But this raises other questions. Should I be concerned about other
corruption to the database? What can I do to prevent this from happening
again. Should I delete the indexes before I preform a 'COPY table FROM'
operation, and then recreate the indexes?

Thanks very much for your help.

--
Lee A. Joramo ljoramo@nickads.com
The Nickel Want Ads www.nickads.com
Internet Manager 970-242-5555

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Lee Joramo (#7)
Re: Re: COPY error: pqReadData() -- backend closed the channel unexpectedly

Lee Joramo <lee.list@joramo.com> writes:

And the solution is:
DELETE INDEX classifieds_adcopy
By deleteing the index, everything started to work correctly. Even after
recreating the index, everyting worked after multiple tests.

So that index had gotten corrupted at some time in the past, and just
emptying and refilling the table didn't fix it. Doesn't surprise me
a whole lot. (I had assumed you were destroying and rebuilding the
whole database, else I'd have suggested this sooner...)

It's possible that the index corruption was triggered by having inserted
an overlength adcopy value at some time in the past. Or maybe it's just
a garden-variety bug; 6.5 has its share of 'em. I do urge you to update
to 7.0.3 as soon as possible. (BTW, there are LinuxPPC RPMs of 7.0.3
available from our FTP server, so if it's having to compile from source
that's scaring you, you don't need to.)

regards, tom lane

#9Tod McQuillin
devin@spamcop.net
In reply to: Tom Lane (#5)
Re: Re: COPY error: pqReadData() -- backend closed the channel unexpectedly

On Tue, 9 Jan 2001, Tom Lane wrote:

-S suppresses all log output, so first you need to get rid of that.

The postgres.1 man page from postgresql-7.0.3/doc/man.tar.gz says:

-S SortSize
Specifies the amount of memory to be used by inter-
nal sorts and hashes before resorting to temporary
disk files. The value is specified in kilobytes,
and defaults to 512 kilobytes. Note that for a com-
plex query, several sorts and/or hashes might be
running in parallel, and each one will be allowed
to use as much as SortSize kilobytes before it
starts to put data into temporary files.

Who is right?
--
Tod McQuillin

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tod McQuillin (#9)
Re: Re: COPY error: pqReadData() -- backend closed the channel unexpectedly

Tod McQuillin <devin@spamcop.net> writes:

On Tue, 9 Jan 2001, Tom Lane wrote:

-S suppresses all log output, so first you need to get rid of that.

The postgres.1 man page from postgresql-7.0.3/doc/man.tar.gz says:

-S SortSize

Postmaster and backend switches are two different animals. To specify
a backend switch to the postmaster, you need -o. Thus:

postmaster -S -o "-S 1000"

selects silent postmaster mode and 1000K sortsize for the backend.

In short: postmaster.1 != postgres.1 ...

regards, tom lane

#11Tod McQuillin
devin@spamcop.net
In reply to: Tom Lane (#10)
Re: Re: COPY error: pqReadData() -- backend closed the channel unexpectedly

On Tue, 9 Jan 2001, Tom Lane wrote:

Postmaster and backend switches are two different animals. To specify
a backend switch to the postmaster, you need -o. Thus:

postmaster -S -o "-S 1000"

selects silent postmaster mode and 1000K sortsize for the backend.

In short: postmaster.1 != postgres.1 ...

OOPS! My mistake. No wonder I was confused. I was basing my incorrect
assumptions on this:

devin@kanemochi /tools/pgsql/bin% ls -l post*
-r-xr-xr-x 1 devin wheel 1469829 Jan 4 03:21 postgres
lrwxrwxr-x 1 devin wheel 8 Jan 4 03:21 postmaster -> postgres

I was not aware the binary would behave differently depending on the name
it was invoked as.

Thanks for straightening me out.
--
Tod McQuillin

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tod McQuillin (#11)
Re: Re: COPY error: pqReadData() -- backend closed the channel unexpectedly

Tod McQuillin <devin@spamcop.net> writes:

I was not aware the binary would behave differently depending on the name
it was invoked as.

Right. We used to have two separate binaries, and the postmaster would
actually exec() the backend (postgres) binary for every session launch.
About two years ago, Bruce made them into a single executable so that
session launch only needs fork() and not exec(). But he kept the old
switch-parsing mechanisms for backwards compatibiliy with existing
startup scripts. So our current switch syntax is a dreadfully ugly
legacy thingy :-(

Peter E. has done some good work towards providing a less ugly switch
notation for 7.1.

regards, tom lane

#13Lee Joramo
lee.list@joramo.com
In reply to: Tod McQuillin (#11)
Re: Re: COPY error: pqReadData() -- backend closed the channel unexpectedly

general opinion on the Cobalt mailing lists is that upgrading Postgre is
a very delicate task.

[SNIP]

Check, at least not without practicing on a spare machine first ;-).

One thing to think about is that there's no fundamental reason why you'd
have to update the version of Postgres used by the OS. You could
install 7.0.* alongside 6.5, use 7.0 for your own purposes and leave the
OS untouched.

Yes, I do this myself with perl and python. And if I had a spare and
local Cobalt RaQ3, I'd try it with postgre. But again, I won't try this
with our public web server.

When it comes to installing software on various unix flavors, I am able
to compile from source. If program does not work right away I normally
can beat it into submission. Even if I was an order of magnitude more
skilled at installs, I don't think I would attempt this particular
installation.

lee.

#14Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#12)
Re: Re: COPY error: pqReadData() -- backend closed the channel unexpectedly

Tod McQuillin <devin@spamcop.net> writes:

I was not aware the binary would behave differently depending on the name
it was invoked as.

Right. We used to have two separate binaries, and the postmaster would
actually exec() the backend (postgres) binary for every session launch.
About two years ago, Bruce made them into a single executable so that
session launch only needs fork() and not exec(). But he kept the old
switch-parsing mechanisms for backwards compatibiliy with existing
startup scripts. So our current switch syntax is a dreadfully ugly
legacy thingy :-(

Peter E. has done some good work towards providing a less ugly switch
notation for 7.1.

Tom, can you be a little more specific on the changes? I am unsure of
the details.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#14)
Re: Re: COPY error: pqReadData() -- backend closed the channel unexpectedly

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Peter E. has done some good work towards providing a less ugly switch
notation for 7.1.

Tom, can you be a little more specific on the changes? I am unsure of
the details.

Peter would be the right person to explain the details, not me.

regards, tom lane