Very high load average but no cpu utilization ?

Started by Rajesh Kumar Mallah, about 24 years ago, 18 messages, hackers

mallah@trade-india.com

about 24 years ago

Hi Again,

10:22am up 15:06, 1 user, load average: 9.02, 9.02, 8.98
85 processes: 73 sleeping, 1 running, 11 zombie, 0 stopped
CPU states: 0.0% user, 0.4% system, 0.0% nice, 99.4% idle
Mem: 1028484K av, 1017488K used, 10996K free, 0K shrd, 8996K buff
Swap: 971004K av, 240344K used, 730660K free 760208K cached

On my PostgreSQL server the load average is very high, but the CPU is 99.4% idle.

This is not strictly a pgsql issue, but can anyone tell me how I can
find what is loading my server so heavily?

regds
Mallah.

--
Rajesh Kumar Mallah,
Project Manager (Development)
Infocom Network Limited, New Delhi
phone: +91(11)6152172 (221) (L) ,9811255597 (M)

Visit http://www.trade-india.com ,
India's Leading B2B eMarketplace.
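
A minimal sketch of the mismatch described in this message, not a diagnosis of this particular server. On Linux the load average counts tasks in uninterruptible sleep (state D, usually waiting on I/O) as well as runnable ones, so a high load with an idle CPU points at blocked rather than busy processes. The function names, the single-CPU default, and the 90% idle threshold are illustrative assumptions.

```python
import re


def parse_top_summary(text):
    """Extract the 1-minute load average and the idle CPU percentage
    from the header lines printed by top."""
    load = re.search(r"load average:\s*([\d.]+)", text)
    idle = re.search(r"([\d.]+)%\s*idle", text)
    if load is None or idle is None:
        raise ValueError("no load average or idle figure found")
    return float(load.group(1)), float(idle.group(1))


def load_without_cpu(load1, idle_pct, ncpus=1, idle_threshold=90.0):
    """True when the load exceeds the CPU count while the CPUs are mostly idle.

    ncpus and idle_threshold are assumed values; adjust them for the host.
    """
    return load1 > ncpus and idle_pct >= idle_threshold


# Header lines copied from the message.
sample = """10:22am up 15:06, 1 user, load average: 9.02, 9.02, 8.98
CPU states: 0.0% user, 0.4% system, 0.0% nice, 99.4% idle"""

load1, idle = parse_top_summary(sample)
print(load1, idle, load_without_cpu(load1, idle))  # prints: 9.02 99.4 True
```

When this check fires, the next step is usually to look at which processes are in state D, for example in the STAT column of ps.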

Rajesh Kumar Mallah

mallah@trade-india.com

about 24 years ago

In reply to: Rajesh Kumar Mallah (#1)

Further info : Very high load average but no cpu utilization ?

Hi,

I am sorry to bother you people again and again, but I guess this
is a bad patch for me which will soon pass ;-)

My postmaster is running but most of the backends are defunct,
and on connecting I get the following error message:

$ psql -h 130.94.22.209 -U tradein tradein_clients
psql: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
[rmallah@server rmallah]$

How do I bring down the postmaster safely?

ps output is as below.

[root@linux10320 root2]# ps auxwww| grep post
postgres 1131 0.0 0.0 139424 4 ? D May1004/usr/local/pgsql/bin/postmaster
postgres 1132 0.0 0.0 140412 4 ? D May10 0:13 postgres: stats buffer process
postgres 1133 0.0 0.0 139576 4 ? S May10 0:18 postgres: stats collector process
postgres 8046 0.0 0.0 238712 4 ? D 00:25 0:13 postgres: tradein tradein_clients 130.94.20.27 SELECT
postgres 8089 0.0 0.0 139812 4 ? D 00:26 0:00 postgres: checkpoint subprocess
postgres 11442 0.0 0.0 218152 4 ? D 04:25 0:03 postgres: tradein tradein_clients 130.94.20.27 SELECT
postgres 15453 0.1 0.0 0 0 ? Z 08:17 0:09 [postmaster <defunct>]
postgres 15455 0.0 0.0 0 0 ? Z 08:17 0:00 [postmaster <defunct>]
postgres 15456 0.0 0.0 0 0 ? Z 08:18 0:00 [postmaster <defunct>]
postgres 15457 0.0 0.0 0 0 ? Z 08:19 0:00 [postmaster <defunct>]
postgres 15462 0.0 0.0 0 0 ? Z 08:20 0:01 [postmaster <defunct>]
postgres 15463 0.0 0.0 0 0 ? Z 08:20 0:00 [postmaster <defunct>]
postgres 15465 0.0 0.0 0 0 ? Z 08:21 0:01 [postmaster <defunct>]
postgres 15466 0.0 0.0 0 0 ? Z 08:22 0:00 [postmaster <defunct>]
postgres 15491 0.0 0.0 0 0 ? Z 08:24 0:00 [postmaster <defunct>]
postgres 15494 0.0 0.0 0 0 ? Z 08:24 0:00 [postmaster <defunct>]
postgres 15496 0.0 0.0 0 0 ? Z 08:24 0:00 [postmaster <defunct>]
postgres 15510 0.2 10.2 238712 105008 ? D 08:25 0:20 postgres: tradein tradein_clients 130.94.20.27 SELECT
root 19268 0.0 0.0 1364 528 pts/1 S 10:42 0:00 grep post
[root@linux10320 root2]#
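
A small sketch for summarising a listing like the one in this message: it tallies the STAT column of `ps aux` output so that D (uninterruptible sleep) and Z (zombie) entries stand out. It assumes the usual `ps aux` column order, where STAT is the eighth field; count_states is a hypothetical helper name, and the sample is a subset of the lines in the message.

```python
from collections import Counter


def count_states(ps_text):
    """Tally the first letter of the STAT column (8th field of `ps aux`)."""
    states = Counter()
    for line in ps_text.splitlines():
        fields = line.split()
        # Skip blank or truncated lines and the header row.
        if len(fields) < 8 or fields[0] == "USER":
            continue
        states[fields[7][0]] += 1
    return states


sample = """\
postgres 1132 0.0 0.0 140412 4 ? D May10 0:13 postgres: stats buffer process
postgres 1133 0.0 0.0 139576 4 ? S May10 0:18 postgres: stats collector process
postgres 8089 0.0 0.0 139812 4 ? D 00:26 0:00 postgres: checkpoint subprocess
postgres 15453 0.1 0.0 0 0 ? Z 08:17 0:09 [postmaster <defunct>]
"""

print(dict(count_states(sample)))
```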

On Saturday 11 May 2002 10:07 am, Rajesh Kumar Mallah. wrote:

Hi Again,

10:22am up 15:06, 1 user, load average: 9.02, 9.02, 8.98
85 processes: 73 sleeping, 1 running, 11 zombie, 0 stopped
CPU states: 0.0% user, 0.4% system, 0.0% nice, 99.4% idle
Mem: 1028484K av, 1017488K used, 10996K free, 0K shrd, 8996K buff
Swap: 971004K av, 240344K used, 730660K free 760208K cached

On my PostgreSQL server the load average is very high, but the CPU is 99.4% idle.

This is not strictly a pgsql issue, but can anyone tell me how I can
find what is loading my server so heavily?

regds
Mallah.


Tom Lane

tgl@sss.pgh.pa.us

about 24 years ago

In reply to: Rajesh Kumar Mallah (#2)

Re: Further info : Very high load average but no cpu utilization ?

"Rajesh Kumar Mallah." <mallah@trade-india.com> writes:

[root@linux10320 root2]# ps auxwww| grep post
postgres 1131 0.0 0.0 139424 4 ? D May1004/usr/local/pgsql/bin/postmaster
postgres 1132 0.0 0.0 140412 4 ? D May10 0:13 postgres: stats buffer process
postgres 1133 0.0 0.0 139576 4 ? S May10 0:18 postgres: stats collector process
postgres 8046 0.0 0.0 238712 4 ? D 00:25 0:13 postgres: tradein tradein_clients 130.94.20.27 SELECT
postgres 8089 0.0 0.0 139812 4 ? D 00:26 0:00 postgres: checkpoint subprocess
postgres 11442 0.0 0.0 218152 4 ? D 04:25 0:03 postgres: tradein tradein_clients 130.94.20.27 SELECT
postgres 15453 0.1 0.0 0 0 ? Z 08:17 0:09 [postmaster <defunct>]
postgres 15455 0.0 0.0 0 0 ? Z 08:17 0:00 [postmaster <defunct>]
postgres 15456 0.0 0.0 0 0 ? Z 08:18 0:00 [postmaster <defunct>]
postgres 15457 0.0 0.0 0 0 ? Z 08:19 0:00 [postmaster <defunct>]
postgres 15462 0.0 0.0 0 0 ? Z 08:20 0:01 [postmaster <defunct>]

I think your postmaster is stuck; it should have reaped those defunct
subprocesses instantly. Given that you also seem to have a stuck
checkpoint process (8 hours to run a checkpoint?) there is probably
something hosed in the interprocess communication logic, but it's hard
to guess what from this amount of info.
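
For context on "reaped": a parent collects its exited children with wait(), and until it does they remain zombies (state Z in ps). The sketch below shows non-blocking reaping with os.waitpid. It illustrates the general Unix mechanism only and is not the postmaster's actual code; reap_children is a hypothetical name.

```python
import os


def reap_children():
    """Collect every child that has already exited, without blocking.

    Returns a list of (pid, raw_wait_status) tuples.
    """
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append((pid, status))
    return reaped
```

A parent that is itself blocked (for example in state D) never gets to make this call, which would be consistent with the zombies piling up in the listing.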

At this point probably your best bet is to kill all the running postgres
processes (try SIGTERM first, then SIGKILL if that doesn't work) and
launch a postmaster from a fresh start. Don't forget the ulimit this
time.

regards, tom lane
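
The "SIGTERM first, then SIGKILL" advice can be sketched as follows. This is an illustration under stated assumptions, not a PostgreSQL tool: it relies on the Linux /proc filesystem to tell whether a process is still running, and the names is_running and stop_process, the grace period, and the polling interval are all hypothetical.

```python
import os
import signal
import time


def is_running(pid):
    """True if pid exists and is not a zombie (Linux /proc assumed)."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            # The state letter follows the parenthesised command name.
            state = f.read().rsplit(")", 1)[1].split()[0]
    except (FileNotFoundError, ProcessLookupError):
        return False
    return state != "Z"


def stop_process(pid, grace=5.0, poll=0.1):
    """Send SIGTERM, wait up to `grace` seconds, then fall back to SIGKILL.

    Returns the name of the last signal sent, or None if the process was
    already gone. The process may still be running afterwards if it is
    stuck in an uninterruptible system call, so callers should re-check
    with is_running().
    """
    sent = None
    for sig in (signal.SIGTERM, signal.SIGKILL):
        if not is_running(pid):
            return sent
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            return sent
        sent = sig.name
        deadline = time.monotonic() + grace
        while time.monotonic() < deadline:
            if not is_running(pid):
                return sent
            time.sleep(poll)
    return sent
```

Calling stop_process on each backend PID before the postmaster's would follow the order of escalation suggested here; which PIDs to target is left to the operator.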

Rajesh Kumar Mallah

mallah@trade-india.com

about 24 years ago

In reply to: Tom Lane (#3)

Re: Further info : Very high load average but no cpu utilization ?

Hi there,

I have observed that it is nearly impossible to
get rid of the postmaster or its backends by any signal
when it decides not to quit.

Even the OS (Linux RH 6.2) refuses to reboot in such a situation,
and my system admin had to power off the system,
then fsck .... and stuff.

But this only happens when the postmaster is stuck for
some reason. I feel that the postmaster's log file filling up
was the reason my postmaster got stuck.

regds
mallah.

On Saturday 11 May 2002 09:29 pm, Tom Lane wrote:

"Rajesh Kumar Mallah." <mallah@trade-india.com> writes:

[root@linux10320 root2]# ps auxwww| grep post
postgres 1131 0.0 0.0 139424 4 ? D May1004/usr/local/pgsql/bin/postmaster
postgres 1132 0.0 0.0 140412 4 ? D May10 0:13 postgres: stats buffer process
postgres 1133 0.0 0.0 139576 4 ? S May10 0:18 postgres: stats collector process
postgres 8046 0.0 0.0 238712 4 ? D 00:25 0:13 postgres: tradein tradein_clients 130.94.20.27 SELECT
postgres 8089 0.0 0.0 139812 4 ? D 00:26 0:00 postgres: checkpoint subprocess
postgres 11442 0.0 0.0 218152 4 ? D 04:25 0:03 postgres: tradein tradein_clients 130.94.20.27 SELECT
postgres 15453 0.1 0.0 0 0 ? Z 08:17 0:09 [postmaster <defunct>]
postgres 15455 0.0 0.0 0 0 ? Z 08:17 0:00 [postmaster <defunct>]
postgres 15456 0.0 0.0 0 0 ? Z 08:18 0:00 [postmaster <defunct>]
postgres 15457 0.0 0.0 0 0 ? Z 08:19 0:00 [postmaster <defunct>]
postgres 15462 0.0 0.0 0 0 ? Z 08:20 0:01 [postmaster <defunct>]

I think your postmaster is stuck; it should have reaped those defunct
subprocesses instantly. Given that you also seem to have a stuck
checkpoint process (8 hours to run a checkpoint?) there is probably
something hosed in the interprocess communication logic, but it's hard
to guess what from this amount of info.

At this point probably your best bet is to kill all the running postgres
processes (try SIGTERM first, then SIGKILL if that doesn't work) and
launch a postmaster from a fresh start. Don't forget the ulimit this
time.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org


D'Arcy J.M. Cain

darcy@druid.net

about 24 years ago

In reply to: Rajesh Kumar Mallah (#4)

Re: Further info : Very high load average but no cpu utilization ?

On May 12, 2002 01:46 am, Rajesh Kumar Mallah. wrote:

Hi there,

I have observed that it is nearly impossible to
get rid of postmaster or backends by any signal
when it decides not to quit.

Even the OS( Linux rh62) refuses to reboot in such a situation.
and my system admin had to power off the system ,
then fsck .... and stuff.

Not even kill -9 worked? I had that happen too but I thought it was a
problem with AIX. Kill -9 is supposed to kill any process. It can't be
caught. Is it possible that PostgreSQL is doing something that makes it that
unkillable?

-- 
D'Arcy J.M. Cain <darcy@{druid|vex}.net>   |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 425 1212     (DoD#0082)    (eNTP)   |  what's for dinner.

Tom Lane

tgl@sss.pgh.pa.us

about 24 years ago

In reply to: D'Arcy J.M. Cain (#5)

Re: Further info : Very high load average but no cpu utilization ?

"D'Arcy J.M. Cain" <darcy@druid.net> writes:

Not even kill -9 worked? I had that happen too but I thought it was a
problem with AIX. Kill -9 is supposed to kill any process. It can't be
caught. Is it possible that PostgreSQL is doing something that makes it that
unkillable?

Could there be a kernel bug associated with processes that are trying to
write past the 2Gb limit? The postmaster is certainly not doing
anything deliberate to make itself unkillable, but on some platforms
kill -9 will not work on processes that are wedged in a system call...

regards, tom lane
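
The 2 GB figure mentioned here corresponds to the largest offset a signed 32-bit file offset can hold, 2^31 - 1 bytes, which applies to programs built without large-file support. A minimal sketch for spotting a log file approaching that size follows. The function names and the 90% margin are assumptions, and whether this limit actually caused the hang is not established in the thread.

```python
import os

# Largest file offset representable in a signed 32-bit integer.
LIMIT_32BIT = 2**31 - 1


def log_headroom(path, limit=LIMIT_32BIT):
    """Return the number of bytes left before `path` reaches `limit`."""
    return limit - os.path.getsize(path)


def needs_rotation(path, limit=LIMIT_32BIT, margin=0.9):
    """True once the file has grown to `margin` (a fraction) of `limit`."""
    return os.path.getsize(path) >= margin * limit
```

Running a check like this from cron and rotating the log before it fires is one way to keep the server's log output from hitting the limit.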

Rajesh Kumar Mallah

mallah@trade-india.com

about 24 years ago

In reply to: Tom Lane (#6)

Re: Further info : Very high load average but no cpu utilization ?

Well,

It is advocated, "don't kill -9 the postmaster", and I rarely do that.

The postmaster tends to be immortal,
and this is *not* only the case when the postmaster is trying to
write past the 2GB limit;
I have only recently started logging postmaster output to that extent.

regds
mallah.

On Sunday 12 May 2002 09:07 pm, Tom Lane wrote:

"D'Arcy J.M. Cain" <darcy@druid.net> writes:

Not even kill -9 worked? I had that happen too but I thought it was a
problem with AIX. Kill -9 is supposed to kill any process. It can't be
caught. Is it possible that PostgreSQL is doing something that makes it
that unkillable?

Could there be a kernel bug associated with processes that are trying to
write past the 2Gb limit? The postmaster is certainly not doing
anything deliberate to make itself unkillable, but on some platforms
kill -9 will not work on processes that are wedged in a system call...

regards, tom lane


D'Arcy J.M. Cain

darcy@druid.net

about 24 years ago

In reply to: Rajesh Kumar Mallah (#7)

Re: Further info : Very high load average but no cpu utilization ?

On May 13, 2002 12:50 am, Rajesh Kumar Mallah. wrote:

Its advocated "dont kill -9 the postmaster" and i rarely do that.

Advocated or not, kill -9 is supposed to be the last resort. If nothing else
works then kill -9 should kill any Unix process. As Tom says, if it doesn't
then it suggests an OS (probably driver) problem.

Now if only I could get IBM to understand that. They still claim that my
problem is that PostgreSQL (an "unsupported" application) is doing something
to catch SIGKILL.


Rajesh Kumar Mallah

mallah@trade-india.com

about 24 years ago

In reply to: D'Arcy J.M. Cain (#8)

Re: Further info : Very high load average but no cpu utilization ?

Hi,

I vaguely remember the postmaster not responding to
kill -9 even on Linux.

I can confirm next time my postmaster (God forbid)
goes crazy. ;-)

I feel lucky now that my postmaster has been up for more than
24 hrs.

(No offence intended; there have been instances of my postmaster
running for as long as 3 months.)

regds
mallah.

On Monday 13 May 2002 03:51 pm, D'Arcy J.M. Cain wrote:

On May 13, 2002 12:50 am, Rajesh Kumar Mallah. wrote:

Its advocated "dont kill -9 the postmaster" and i rarely do that.

Advocated or not, kill -9 is supposed to be the last resort. If nothing
else works then kill -9 should kill any Unix process. As Tom says, if it
doesn't then it suggests an OS (probably driver) problem.

Now if only I could get IBM to understand that. They still claim that my
problem is that PostgreSQL (an "unsupported" application) is doing
something to catch SIGKILL.


#10

Denis Braekhus

denis@startsiden.no

about 24 years ago

In reply to: Rajesh Kumar Mallah (#9)

Re: Further info : Very high load average but no cpu utilization ?

On Monday 13 May 2002 02:04 pm, you wrote:

Hi,

I vaguely remember postmaster not responding to
kill -9 even on Linux,

i can confirm next time when (god forbids) my postmaster
goes crazy. ;-)

i feel lucky now that my postmaster is up for more that
(24 hrs)

(no offence intended , there have been instances of my postmaster
running for as long as 3 months)

regds
mallah.

I've been reading this thread with interest, may I ask a few additional
questions ?

- What version of postgresql are you running, compiled from tarball source or
RPM version ?
- Is there any good reason to still run this server on a RedHat 6.2 (non
supported platform from RedHat) ?
- How large are your databases, and how much usage do you have ?
- What kinds of API's do you use to interface ?
- Is the application running locally or do you use IP connections remotely ?

Reason I am asking is that I still have never had postgresql go bad like
that.. I can always stop it properly, and to this date have had very few
problems with postgresql itself. I am interested in your problems because I'd
like to be aware of issues I can eventually run into.. (Better care before
than later ? :-)

My own configuration is like this :

- Postgresql 7.1.3 with OpenFTS, both compiled from source tarballs
- Debian Linux 2.2.x platform
- PHP and Perl applications accessing postgresql from remote machines via IP
- Not very large databases (100s of MBs) but very frequent read, and quite
frequent insert / update activity

For the record, server uptime now is equal to the last time I rebooted for a
kernel recompilation, and during that time I have been forced to restart
postgresql only because of hangups on the Application servers (apache w. php
/ perl) ..

Regards
--
Denis Braekhus

#11

Rajesh Kumar Mallah

mallah@trade-india.com

about 24 years ago

In reply to: Denis Braekhus (#10)

Re: Further info : Very high load average but no cpu utilization ?

Hi Denis,

Thanks for your interest, and I like your
idea of "Better care before later....."

I feel the best care you can take before it is too late is
to monitor the server: what is happening and when.

Basic parameters like load average and iostat output do reveal if
anything fishy is going on.

In my case I do have a heavily loaded webserver, but I do not
feel it was the load that brought the server to its knees.

It is more mismanagement on my part. I have not documented which
programs run and when, or how much load they cause; maybe
some wicked script is running a query that would never finish, etc.

It is not that my server crashed in an unexplained manner every time.
E.g. at one time I had redirected the postmaster log to a file which
ran out of space!

I feel that if you are concerned about server health you should install
software like sysstat to monitor various system parameters at
various times, plot charts, etc., and analyze them.

My postmaster is cool now, running for quite some time without going wild.

I have got sar installed on my system and nowadays I am
writing a GD CGI application to closely monitor
what is happening to the system and when.
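
As a rough illustration of this kind of monitoring (not a replacement for sar or sysstat), the sketch below samples the 1-minute load average at a fixed interval and picks out the samples above a threshold. The function names, the interval, and the threshold are assumptions; os.getloadavg is the standard-library call that reports the same three figures as uptime.

```python
import os
import time


def sample_load(count, interval=1.0, read=os.getloadavg):
    """Take `count` samples of the 1-minute load average.

    `read` is injectable so the function can be exercised without a
    real system; it must return a (1min, 5min, 15min) tuple.
    Returns a list of (unix_timestamp, load1) tuples.
    """
    samples = []
    for i in range(count):
        samples.append((time.time(), read()[0]))
        if i + 1 < count:
            time.sleep(interval)
    return samples


def peaks(samples, threshold):
    """Return only the samples whose load exceeds `threshold`."""
    return [(ts, load) for ts, load in samples if load > threshold]
```

Recording the timestamps alongside the load is what makes it possible to match a peak against whatever job was running at that moment.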

I have replied to your other questions point by point below:

On Thursday 23 May 2002 01:48 pm, Denis wrote:

I've been reading this thread with interest, may I ask a few additional
questions ?

- What version of postgresql are you running, compiled from tarball source
or RPM version ?

- Is there any good reason to still run this server on a RedHat 6.2 (non
supported platform from RedHat) ?

Not many. It costs bucks to upgrade because my server and ISP are in the US,
and I do not have physical access and am not too interested in giving my
ISP $$$.

- How large are your databases, and how much usage do you have ?

Not very large; $PGDATA is between 1.5 GB and 2.0 GB.

- What kinds of API's do you use to interface ?
- Is the application running locally or do you use IP connections remotely ?

Perl DBI , remote ip connections but in same network.

Reason I am asking is that I still have never had postgresql go bad like
that.. I can always stop it properly, and to this date have had very few
problems with postgresql itself. I am interested in your problems because
I'd like to be aware of issues I can eventually run into.. (Better care
before than later ? :-)

Even I did not have problems for months together.
And I feel there is/was a hard drive problem.

In my plots even now I see very high peaks at times,
and I have yet to investigate them.

My own configuration is like this :

- Postgresql 7.1.3 with OpenFTS, both compiled from source tarballs
- Debian Linux 2.2.x platform

I too used OpenFTS till recently but have now migrated to contrib/tsearch.
Hey, upgrade to PG 7.2.1; it's *really* worth it. Read the release notes.

- PHP and Perl applications accessing postgresql from remote machines via IP
- Not very large databases (100s of MBs) but very frequent read, and quite
frequent insert / update activity

For the record, server uptime now is equal to the last time I rebooted for
a kernel recompilation, and during that time I have been forced to restart
postgresql only because of hangups on the Application servers (apache w.
php / perl) ..

Regards


#12

Bruce Momjian

bruce@momjian.us

about 24 years ago

In reply to: D'Arcy J.M. Cain (#8)

Re: Further info : Very high load average but no cpu utilization

D'Arcy J.M. Cain wrote:

On May 13, 2002 12:50 am, Rajesh Kumar Mallah. wrote:

Its advocated "dont kill -9 the postmaster" and i rarely do that.

Advocated or not, kill -9 is supposed to be the last resort. If nothing else
works then kill -9 should kill any Unix process. As Tom says, if it doesn't
then it suggests an OS (probably driver) problem.

Now if only I could get IBM to understand that. They still claim that my
problem is that PostgreSQL (an "unsupported" application) is doing something
to catch SIGKILL.

First, an application can't catch SIGKILL. It never arrives to
applications. It is supposed to pull the process with no warning.

However, there are things processes can do to wedge themselves in a
system call so they don't see the SIGKILL. Of course, as soon as they
return from the system call, they die.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
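
The point that an application cannot catch SIGKILL can be checked from user space. In this sketch, Python's signal.signal refuses to install a handler for SIGKILL because the underlying system call fails, while SIGTERM accepts one. can_install_handler is a hypothetical helper, and the code must run in the main thread, as Python requires for signal.signal.

```python
import signal


def can_install_handler(sig):
    """Try to install a no-op handler for `sig`, then restore the old one.

    Returns False when the operating system rejects the request, which
    is what happens for SIGKILL and SIGSTOP.
    """
    try:
        previous = signal.signal(sig, lambda signum, frame: None)
    except OSError:
        return False
    if previous is not None:
        signal.signal(sig, previous)  # put the original disposition back
    return True


print(can_install_handler(signal.SIGTERM))  # prints: True
print(can_install_handler(signal.SIGKILL))  # prints: False
```

This only demonstrates that no handler can be registered. A process blocked in an uninterruptible system call, as described in this message, is a separate matter that no user-space code controls.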

#13

D'Arcy J.M. Cain

darcy@druid.net

about 24 years ago

In reply to: Bruce Momjian (#12)

Re: PostgreSQL on AIX

On June 5, 2002 12:33 pm, Bruce Momjian wrote:

D'Arcy J.M. Cain wrote:

On May 13, 2002 12:50 am, Rajesh Kumar Mallah. wrote:

Catching up on an old mailbox, Bruce? :-)

Now if only I could get IBM to understand that. They still claim that my
problem is that PostgreSQL (an "unsupported" application) is doing
something to catch SIGKILL.

First, an application can't catch SIGKILL. It never arrives to
applications. It is supposed to pull the process with no warning.

However, there are things processes can do to wedge themselves in a
system call so they don't see the SIGKILL. Of course, as soon as they
return from the system call, they die.

Exactly. What IBM was saying was that we were "catching" SIGKILL and I
could not convince the (supposedly technical) IBMers that they were talking
out their ass.

Anyway, I am pretty sure that PostgreSQL is not the culprit here. As it
happens this project is back on the table for me so it is interesting that
your email popped up now. I just compiled the latest version of PostgreSQL
on my AIX system and it generated lots of errors and then completed and
installed fine. Makes me sort of nervous. We'll see how it goes. Anyone
have any horror/success stories about PostgreSQL on AIX for me?

Changed subject and mailing list.


#14

Travis Hoyt

thoyt@npc.net

about 24 years ago

In reply to: D'Arcy J.M. Cain (#13)

Re: PostgreSQL on AIX

I've been using PostgreSQL 7.2 on AIX 4.3.3 with no problems at all.

-----Original Message-----
From: pgsql-sql-owner@postgresql.org
[mailto:pgsql-sql-owner@postgresql.org]On Behalf Of D'Arcy J.M. Cain
Sent: Thursday, June 06, 2002 6:35 AM
To: Bruce Momjian; pgsql-sql@postgresql.org
Cc: pgsql-sql@postgresql.org; pgsql-hackers@postgresql.org
Subject: Re: [SQL] PostgreSQL on AIX

On June 5, 2002 12:33 pm, Bruce Momjian wrote:

D'Arcy J.M. Cain wrote:

On May 13, 2002 12:50 am, Rajesh Kumar Mallah. wrote:

Catching up on an old mailbox, Bruce? :-)

Now if only I could get IBM to understand that. They still claim that my
problem is that PostgreSQL (an "unsupported" application) is doing
something to catch SIGKILL.

First, an application can't catch SIGKILL. It never arrives to
applications. It is supposed to pull the process with no warning.

However, there are things processes can do to wedge themselves in a
system call so they don't see the SIGKILL. Of course, as soon as they
return from the system call, they die.

Exactly. What IBM was saying was that we were "catching" SIGKILL and I
could not convince the (supposedly technical) IBMers that they were talking
out their ass.

Anyway, I am pretty sure that PostgreSQL is not the culprit here. As it
happens this project is back on the table for me so it is interesting that
your email popped up now. I just compiled the latest version of PostgreSQL
on my AIX system and it generated lots of errors and then completed and
installed fine. Makes me sort of nervous. We'll see how it goes. Anyone
have any horror/success stories about PostgreSQL on AIX for me?

Changed subject and mailing list.


#15

Zeugswetter Andreas SB SD

ZeugswetterA@spardat.at

about 24 years ago

In reply to: Travis Hoyt (#14)

Re: [HACKERS] PostgreSQL on AIX

Anyway, I am pretty sure that PostgreSQL is not the culprit here. As it
happens this project is back on the table for me so it is interesting that
your email popped up now. I just compiled the latest version of PostgreSQL
on my AIX system and it generated lots of errors and then completed and
installed fine. Makes me sort of nervous. We'll see how it goes. Anyone
have any horror/success stories about PostgreSQL on AIX for me?

The "errors" are mostly duplicate symbol warnings, that are part of generating
a shared lib on AIX (in a mostly gcc and xlc independent way), and can be safely
ignored.

IMHO the most needed effort for AIX would be to switch the TAS stuff from
cs() to fetch_and_or(), or to PowerPC assembler, or to the test_and_set() that is
undocumented/intended for the kernel; see the discussions from last year.

fetch_and_or() is a lot faster on multiprocessor systems but a little
slower on single-processor ones. But cs() is documented as deprecated, so ...

I might get round to doing this.

Andreas
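
To illustrate what "TAS on top of fetch-and-or" means, here is a toy model in Python. It is not the AIX API and not PostgreSQL's spinlock code: AtomicWord is a hypothetical class whose internal threading.Lock merely stands in for the atomicity a hardware primitive would provide, and all function names are assumptions. The idea shown is that atomically OR-ing 1 into the lock word and inspecting the previous value is a test-and-set.

```python
import threading
import time


class AtomicWord:
    """Model of a machine word with an atomic fetch-and-or operation."""

    def __init__(self):
        self._value = 0
        self._guard = threading.Lock()  # stands in for hardware atomicity

    def fetch_and_or(self, mask):
        """Atomically OR `mask` into the word and return the old value."""
        with self._guard:
            old = self._value
            self._value |= mask
            return old

    def clear(self):
        with self._guard:
            self._value = 0


def tas(word):
    """Test-and-set: True means the lock was already held by someone."""
    return word.fetch_and_or(1) != 0


def spin_lock(word):
    while tas(word):
        time.sleep(0)  # yield to other threads while waiting


def spin_unlock(word):
    word.clear()
```

A caller brackets its critical section with spin_lock(word) and spin_unlock(word); only the thread whose fetch_and_or saw 0 proceeds.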


#16

Bruce Momjian

bruce@momjian.us

about 24 years ago

In reply to: D'Arcy J.M. Cain (#13)

Re: PostgreSQL on AIX

D'Arcy J.M. Cain wrote:

On June 5, 2002 12:33 pm, Bruce Momjian wrote:

D'Arcy J.M. Cain wrote:

On May 13, 2002 12:50 am, Rajesh Kumar Mallah. wrote:

Catching up on an old mailbox, Bruce? :-)

Now if only I could get IBM to understand that. They still claim that my
problem is that PostgreSQL (an "unsupported" application) is doing
something to catch SIGKILL.

First, an application can't catch SIGKILL. It never arrives to
applications. It is supposed to pull the process with no warning.

However, there are things processes can do to wedge themselves in a
system call so they don't see the SIGKILL. Of course, as soon as they
return from the system call, they die.

Exactly. What IBM was saying was that we were "catching" SIGKILL and I
could not convince the (supposedly technical) IBMers that they were talking
out their ass.

Yes, they didn't know "catching" from "ignoring because in
uninterruptible system call".

Anyway, I am pretty sure that PostgreSQL is not the culprit here. As it
happens this project is back on the table for me so it is interesting that
your email popped up now. I just compiled the latest version of PostgreSQL
on my AIX system and it generated lots of errors and then completed and
installed fine. Makes me sort of nervous. We'll see how it goes. Anyone
have any horror/success stories about PostgreSQL on AIX for me?

Would you check those errors/warnings and send us patches or a list of
them? Sometimes different compilers, like the one on AIX, can show problems gcc
doesn't.


#17

Bruce Momjian

bruce@momjian.us

about 24 years ago

In reply to: Travis Hoyt (#14)

Re: PostgreSQL on AIX

Also, Tatsuo uses AIX a lot and knows all the issues.

---------------------------------------------------------------------------

Travis Hoyt wrote:

I've been using PostgreSQL 7.2 on AIX 4.3.3 with no problems at all.

-----Original Message-----
From: pgsql-sql-owner@postgresql.org
[mailto:pgsql-sql-owner@postgresql.org]On Behalf Of D'Arcy J.M. Cain
Sent: Thursday, June 06, 2002 6:35 AM
To: Bruce Momjian; pgsql-sql@postgresql.org
Cc: pgsql-sql@postgresql.org; pgsql-hackers@postgresql.org
Subject: Re: [SQL] PostgreSQL on AIX

On June 5, 2002 12:33 pm, Bruce Momjian wrote:

D'Arcy J.M. Cain wrote:

On May 13, 2002 12:50 am, Rajesh Kumar Mallah. wrote:

Catching up on an old mailbox, Bruce? :-)

Now if only I could get IBM to understand that. They still claim that my
problem is that PostgreSQL (an "unsupported" application) is doing
something to catch SIGKILL.

First, an application can't catch SIGKILL. It never arrives to
applications. It is supposed to pull the process with no warning.

However, there are things processes can do to wedge themselves in a
system call so they don't see the SIGKILL. Of course, as soon as they
return from the system call, they die.

Exactly. What IBM was saying was that we were "catching" SIGKILL and I
could not convince the (supposedly technical) IBMers that they were talking
out their ass.

Anyway, I am pretty sure that PostgreSQL is not the culprit here. As it
happens this project is back on the table for me so it is interesting that
your email popped up now. I just compiled the latest version of PostgreSQL
on my AIX system and it generated lots of errors and then completed and
installed fine. Makes me sort of nervous. We'll see how it goes. Anyone
have any horror/success stories about PostgreSQL on AIX for me?

Changed subject and mailing list.


#18

Bruce Momjian

bruce@momjian.us

about 24 years ago

In reply to: Zeugswetter Andreas SB SD (#15)

Re: [HACKERS] PostgreSQL on AIX

Zeugswetter Andreas SB SD wrote:

Anyway, I am pretty sure that PostgreSQL is not the culprit here. As it
happens this project is back on the table for me so it is interesting that
your email popped up now. I just compiled the latest version of PostgreSQL
on my AIX system and it generated lots of errors and then completed and
installed fine. Makes me sort of nervous. We'll see how it goes. Anyone
have any horror/success stories about PostgreSQL on AIX for me?

The "errors" are mostly duplicate symbol warnings, that are part of generating
a shared lib on AIX (in a mostly gcc and xlc independent way), and can be safely
ignored.

IMHO the most needed effort for AIX would be to switch the TAS stuff from
cs() to fetch_and_or(), or to PowerPC assembler, or to the test_and_set() that is
undocumented/intended for the kernel; see the discussions from last year.

Yes, TODO has:

* Evaluate AIX cs() spinlock macro for performance optimizations (Tatsuo)

fetch_and_or() is a lot faster on multiprocessor systems but a little
slower on single-processor ones. But cs() is documented as deprecated, so ...

Should I update the TODO item?
