strange behavior of UPDATE
Hi,
recently I tried to reproduce some benchmark results
when I discovered a very strange behavior. I did
my tests with the current snapshot of last week,
but other people who have performed the same bench-
mark with postgresql-6.4-2 reported the same problems.
The setup is pretty simple: one table with 13
integer and 7 char(20) columns. For every column
an index is created. The postmaster is started with
-o -F and before each query a 'vacuum analyze' is
performed.
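(For orientation, the setup boils down to something like the SQL below; the
column names follow the table layout posted later in this thread, the index
names are made up here, and the actual benchmark scripts may use different DDL:)

  create table bench (
      kseq  int4 not null,      -- 13 integer columns in total
      k500k int4 not null,
      k100  int4 not null,
      -- ... remaining integer columns ...
      s1    char(8)  not null,  -- char columns s1 .. s8
      s2    char(20) not null
      -- ... remaining char(20) columns ...
  );
  -- one index per column, for example:
  create index bench_k100_idx  on bench (k100);
  create index bench_k500k_idx on bench (k500k);
  -- ... and so on for every column, followed by:
  vacuum analyze;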
When loading 100.000 rows into the table
everything works ok. Selects and updates
are reasonably fast. But when loading
1.000.000 rows the select statements still
work, but a simple update statement
shows this strange behavior: never-ending
disk activity starts, memory consumption
increases up to the physical limit (384 MB),
whereas the postmaster uses only a few %
of CPU time. After 1 hour I killed the
postmaster.
It would be nice if this could be fixed.
People from the German UNIX magazine IX
benchmarked Oracle, Informix and Sybase on Linux
and claimed that Postgres is totally unusable
because of this problem.
If you need some additional info, just let me know.
Edmund
--
Edmund Mergl mailto:E.Mergl@bawue.de
Im Haldenhau 9 http://www.bawue.de/~mergl
70565 Stuttgart fon: +49 711 747503
Germany
Edmund Mergl <E.Mergl@bawue.de> writes:
When loading 100.000 rows into the table
everything works ok. Selects and updates
are reasonably fast. But when loading
1.000.000 rows the select statements still
work, but a simple update statement
shows this strange behavior.
Can you provide a script or something to reproduce this behavior?
There are a number of people using Postgres with large databases
and not reporting any such problem, so I think there has to be some
special triggering condition; it's not just a matter of things
breaking at a million rows. Before digging into it, I'd like to
eliminate variables like whether I have the right test case.
regards, tom lane
OK, can you attach to the running process and tell us what functions it
is running? That would help.
Hi,
recently I tried to reproduce some benchmark results
when I discovered a very strange behavior. I did
my tests with the current snapshot of last week,
but other people who have performed the same bench-
mark with postgresql-6.4-2 reported the same problems.
The setup is pretty simple: one table with 13
integer and 7 char(20) columns. For every column
an index is created. The postmaster is started with
-o -F and before each query a 'vacuum analyze' is
performed.
When loading 100.000 rows into the table
everything works ok. Selects and updates
are reasonably fast. But when loading
1.000.000 rows the select statements still
work, but a simple update statement
shows this strange behavior: never-ending
disk activity starts, memory consumption
increases up to the physical limit (384 MB),
whereas the postmaster uses only a few %
of CPU time. After 1 hour I killed the
postmaster.
It would be nice if this could be fixed.
People from the German UNIX magazine IX
benchmarked Oracle, Informix and Sybase on Linux
and claimed that Postgres is totally unusable
because of this problem.
If you need some additional info, just let me know.
Edmund
--
Edmund Mergl mailto:E.Mergl@bawue.de
Im Haldenhau 9 http://www.bawue.de/~mergl
70565 Stuttgart fon: +49 711 747503
Germany
--
Bruce Momjian | http://www.op.net/~candle
maillist@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Tom Lane wrote:
Edmund Mergl <E.Mergl@bawue.de> writes:
When loading 100.000 rows into the table
everything works ok. Selects and updates
are reasonably fast. But when loading
1.000.000 rows the select statements still
work, but a simple update statement
shows this strange behavior.
Can you provide a script or something to reproduce this behavior?
There are a number of people using Postgres with large databases
and not reporting any such problem, so I think there has to be some
special triggering condition; it's not just a matter of things
breaking at a million rows. Before digging into it, I'd like to
eliminate variables like whether I have the right test case.
regards, tom lane
the original benchmark can be found at
ftp://ftp.heise.de/pub/ix/benches/sqlb-21.tar
for a stripped-down version see the attachment.
For loading the database and running the first
and second part (selects and updates) just do
the following:
createdb test
./make_wnt 1000000 pgsql >make.out 2>&1 &
This needs about 700 MB of disk space.
On a PII-400 it takes about 40 minutes to
load the database, 20 minutes to create the indexes
and 20 minutes to run the first part of the
benchmark (make_sqs). For running the benchmark
in 20 minutes (without swapping) one needs 384 MB RAM.
The second part (make_nqs) contains update
statements which cannot be performed properly
using PostgreSQL.
For testing it is sufficient to initialize the
database and then to perform a query like
update bench set k500k = k500k + 1 where k100 = 30
Edmund
--
Edmund Mergl mailto:E.Mergl@bawue.de
Im Haldenhau 9 http://www.bawue.de/~mergl
70565 Stuttgart fon: +49 711 747503
Germany
Bruce Momjian wrote:
OK, can you attach to the running process and tell us what functions it
is running? That would help.
Hi,
recently I tried to reproduce some benchmark results
when I discovered a very strange behavior. I did
my tests with the current snapshot of last week,
but other people who have performed the same bench-
mark with postgresql-6.4-2 reported the same problems.
The setup is pretty simple: one table with 13
integer and 7 char(20) columns. For every column
an index is created. The postmaster is started with
-o -F and before each query a 'vacuum analyze' is
performed.
When loading 100.000 rows into the table
everything works ok. Selects and updates
are reasonably fast. But when loading
1.000.000 rows the select statements still
work, but a simple update statement
shows this strange behavior: never-ending
disk activity starts, memory consumption
increases up to the physical limit (384 MB),
whereas the postmaster uses only a few %
of CPU time. After 1 hour I killed the
postmaster.
It would be nice if this could be fixed.
People from the German UNIX magazine IX
benchmarked Oracle, Informix and Sybase on Linux
and claimed that Postgres is totally unusable
because of this problem.
If you need some additional info, just let me know.
Edmund
I can attach to the backend and print a backtrace.
Is this what you expect?
Edmund
(gdb) bt
#0 0x40186534 in __libc_read ()
#1 0x360 in ?? ()
#2 0x80de019 in mdread (reln=0x8222790, blocknum=1671,
buffer=0x40215c40 "_\a\024\b_\037") at md.c:413
#3 0x80dead5 in smgrread (which=0, reln=0x8222790, blocknum=1671,
buffer=0x40215c40 "_\a\024\b_\037") at smgr.c:231
#4 0x80d5863 in ReadBufferWithBufferLock (reln=0x8222790, blockNum=1671,
bufferLockHeld=0) at bufmgr.c:292
#5 0x80d5758 in ReadBuffer (reln=0x8222790, blockNum=1671) at bufmgr.c:170
#6 0x8073153 in _bt_getbuf (rel=0x8222790, blkno=1671, access=0)
at nbtpage.c:337
#7 0x8074659 in _bt_searchr (rel=0x8222790, keysz=1, scankey=0x8236470,
bufP=0xbfffa36c, stack_in=0x849e498) at nbtsearch.c:116
#8 0x8074547 in _bt_search (rel=0x8222790, keysz=1, scankey=0x8236470,
bufP=0xbfffa36c) at nbtsearch.c:52
#9 0x8070d31 in _bt_doinsert (rel=0x8222790, btitem=0x849e468,
index_is_unique=0 '\000', heapRel=0x8218c40) at nbtinsert.c:65
#10 0x8073aea in btinsert (rel=0x8222790, datum=0x849e420,
nulls=0x849e408 " ", ht_ctid=0x82367d4, heapRel=0x8218c40) at nbtree.c:369
#11 0x8109a13 in fmgr_c (finfo=0xbfffa40c, values=0xbfffa41c,
isNull=0xbfffa40b "") at fmgr.c:154
#12 0x8109cb7 in fmgr (procedureId=331) at fmgr.c:338
#13 0x806d540 in index_insert (relation=0x8222790, datum=0x849e420,
nulls=0x849e408 " ", heap_t_ctid=0x82367d4, heapRel=0x8218c40)
at indexam.c:190
#14 0x80953a2 in ExecInsertIndexTuples (slot=0x8233368, tupleid=0x82367d4,
estate=0x8231740, is_update=1) at execUtils.c:1210
#15 0x809293c in ExecReplace (slot=0x8233368, tupleid=0xbfffa540,
estate=0x8231740) at execMain.c:1472
#16 0x809255e in ExecutePlan (estate=0x8231740, plan=0x8231280,
operation=CMD_UPDATE, offsetTuples=0, numberTuples=0,
direction=ForwardScanDirection, destfunc=0x8236350) at execMain.c:1086
#17 0x8091cae in ExecutorRun (queryDesc=0x8231728, estate=0x8231740,
feature=3, limoffset=0x0, limcount=0x0) at execMain.c:359
#18 0x80e1098 in ProcessQueryDesc (queryDesc=0x8231728, limoffset=0x0,
limcount=0x0) at pquery.c:333
#19 0x80e10fe in ProcessQuery (parsetree=0x8220998, plan=0x8231280,
dest=Remote) at pquery.c:376
#20 0x80dfbb6 in pg_exec_query_dest (
query_string=0xbfffa67c "update bench set k500k = k500k + 1 where k100 = 30;
", dest=Remote, aclOverride=0) at postgres.c:742
#21 0x80dfab7 in pg_exec_query (
query_string=0xbfffa67c "update bench set k500k = k500k + 1 where k100 = 30;
") at postgres.c:642
#22 0x80e0abc in PostgresMain (argc=6, argv=0xbfffe704, real_argc=6,
real_argv=0xbffffcf4) at postgres.c:1610
---Type <return> to continue, or q <return> to quit---
#23 0x80cacaa in DoBackend (port=0x81c3c38) at postmaster.c:1584
#24 0x80ca77a in BackendStartup (port=0x81c3c38) at postmaster.c:1351
#25 0x80c9ec9 in ServerLoop () at postmaster.c:802
#26 0x80c9a0e in PostmasterMain (argc=6, argv=0xbffffcf4) at postmaster.c:596
#27 0x80a1836 in main (argc=6, argv=0xbffffcf4) at main.c:97
#28 0x400fbcb3 in __libc_start_main (main=0x80a17d0 <main>, argc=6,
argv=0xbffffcf4, init=0x8061878 <_init>, fini=0x810f13c <_fini>,
rtld_fini=0x4000a350 <_dl_fini>, stack_end=0xbffffcec)
at ../sysdeps/generic/libc-start.c:78
--
Edmund Mergl mailto:E.Mergl@bawue.de
Im Haldenhau 9 http://www.bawue.de/~mergl
70565 Stuttgart fon: +49 711 747503
Germany
The setup is pretty simple: one table with 13
integer and 7 char(20) columns. For every column
an index is created. The postmaster is started with
-o -F and before each query a 'vacuum analyze' is
performed.
Yes, this is what I wanted. Does the test use the DEFAULT clause? If
so, I may have just fixed the problem. If not, it may be another
problem with char() length not being padded properly.
When loading 100.000 rows into the table
everything works ok. Selects and updates
are reasonably fast. But when loading
1.000.000 rows the select statements still
work, but a simple update statement
shows this strange behavior: never-ending
disk activity starts, memory consumption
increases up to the physical limit (384 MB),
whereas the postmaster uses only a few %
of CPU time. After 1 hour I killed the
postmaster.
It would be nice if this could be fixed.
People from the German UNIX magazine IX
benchmarked Oracle, Informix and Sybase on Linux
and claimed that Postgres is totally unusable
because of this problem.
If you need some additional info, just let me know.
Edmund
I can attach to the backend and print a backtrace.
Is this what you expect?
Edmund
(gdb) bt
#0 0x40186534 in __libc_read ()
#1 0x360 in ?? ()
#2 0x80de019 in mdread (reln=0x8222790, blocknum=1671,
buffer=0x40215c40 "_\a\024\b_\037") at md.c:413
#3 0x80dead5 in smgrread (which=0, reln=0x8222790, blocknum=1671,
buffer=0x40215c40 "_\a\024\b_\037") at smgr.c:231
#4 0x80d5863 in ReadBufferWithBufferLock (reln=0x8222790, blockNum=1671,
bufferLockHeld=0) at bufmgr.c:292
#5 0x80d5758 in ReadBuffer (reln=0x8222790, blockNum=1671) at bufmgr.c:170
#6 0x8073153 in _bt_getbuf (rel=0x8222790, blkno=1671, access=0)
at nbtpage.c:337
#7 0x8074659 in _bt_searchr (rel=0x8222790, keysz=1, scankey=0x8236470,
bufP=0xbfffa36c, stack_in=0x849e498) at nbtsearch.c:116
#8 0x8074547 in _bt_search (rel=0x8222790, keysz=1, scankey=0x8236470,
bufP=0xbfffa36c) at nbtsearch.c:52
#9 0x8070d31 in _bt_doinsert (rel=0x8222790, btitem=0x849e468,
index_is_unique=0 '\000', heapRel=0x8218c40) at nbtinsert.c:65
#10 0x8073aea in btinsert (rel=0x8222790, datum=0x849e420,
nulls=0x849e408 " ", ht_ctid=0x82367d4, heapRel=0x8218c40) at nbtree.c:369
#11 0x8109a13 in fmgr_c (finfo=0xbfffa40c, values=0xbfffa41c,
isNull=0xbfffa40b "") at fmgr.c:154
#12 0x8109cb7 in fmgr (procedureId=331) at fmgr.c:338
#13 0x806d540 in index_insert (relation=0x8222790, datum=0x849e420,
nulls=0x849e408 " ", heap_t_ctid=0x82367d4, heapRel=0x8218c40)
at indexam.c:190
#14 0x80953a2 in ExecInsertIndexTuples (slot=0x8233368, tupleid=0x82367d4,
estate=0x8231740, is_update=1) at execUtils.c:1210
#15 0x809293c in ExecReplace (slot=0x8233368, tupleid=0xbfffa540,
estate=0x8231740) at execMain.c:1472
#16 0x809255e in ExecutePlan (estate=0x8231740, plan=0x8231280,
operation=CMD_UPDATE, offsetTuples=0, numberTuples=0,
direction=ForwardScanDirection, destfunc=0x8236350) at execMain.c:1086
#17 0x8091cae in ExecutorRun (queryDesc=0x8231728, estate=0x8231740,
feature=3, limoffset=0x0, limcount=0x0) at execMain.c:359
#18 0x80e1098 in ProcessQueryDesc (queryDesc=0x8231728, limoffset=0x0,
limcount=0x0) at pquery.c:333
#19 0x80e10fe in ProcessQuery (parsetree=0x8220998, plan=0x8231280,
dest=Remote) at pquery.c:376
#20 0x80dfbb6 in pg_exec_query_dest (
query_string=0xbfffa67c "update bench set k500k = k500k + 1 where k100 = 30;
", dest=Remote, aclOverride=0) at postgres.c:742
#21 0x80dfab7 in pg_exec_query (
query_string=0xbfffa67c "update bench set k500k = k500k + 1 where k100 = 30;
") at postgres.c:642
#22 0x80e0abc in PostgresMain (argc=6, argv=0xbfffe704, real_argc=6,
real_argv=0xbffffcf4) at postgres.c:1610
---Type <return> to continue, or q <return> to quit---
#23 0x80cacaa in DoBackend (port=0x81c3c38) at postmaster.c:1584
#24 0x80ca77a in BackendStartup (port=0x81c3c38) at postmaster.c:1351
#25 0x80c9ec9 in ServerLoop () at postmaster.c:802
#26 0x80c9a0e in PostmasterMain (argc=6, argv=0xbffffcf4) at postmaster.c:596
#27 0x80a1836 in main (argc=6, argv=0xbffffcf4) at main.c:97
#28 0x400fbcb3 in __libc_start_main (main=0x80a17d0 <main>, argc=6,
argv=0xbffffcf4, init=0x8061878 <_init>, fini=0x810f13c <_fini>,
rtld_fini=0x4000a350 <_dl_fini>, stack_end=0xbffffcec)
at ../sysdeps/generic/libc-start.c:78
--
Edmund Mergl mailto:E.Mergl@bawue.de
Im Haldenhau 9 http://www.bawue.de/~mergl
70565 Stuttgart fon: +49 711 747503
Germany
--
Bruce Momjian | http://www.op.net/~candle
maillist@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Edmund,
Here is what I got running that test on a dual PII 350 MHz, 256 MB RAM,
FreeBSD-3.1 elf release, current 6.5 cvs:
Start of inserting 1000000 rows: SB 22 MAJ 1999 13:14:58 MSD
Start of indexing 1000000 rows: SB 22 MAJ 1999 14:47:06 MSD
Start of SetQuery single user: SB 22 MAJ 1999 18:18:05 MSD
Start of NewQuery single user: SB 22 MAJ 1999 19:38:06 MSD
End of iX SQLBench 2.1: WS 23 MAJ 1999 11:05:25 MSD
It was so slow. Does FreeBSD have slow I/O, or is it a swapping
problem?
SB 22 MAJ 1999 22:07:26 MSD
Q8A
pqReadData() -- backend closed the channel unexpectedly.
This probably means the backend terminated abnormally
before or while processing the request.
We have lost the connection to the backend, so further processing is impossible.
Terminating.
pqReadData() -- backend closed the channel unexpectedly.
This probably means the backend terminated abnormally
before or while processing the request.
We have lost the connection to the backend, so further processing is impossible.
Terminating.
WS 23 MAJ 1999 11:05:24 MSD
Q8B
Connection to database 'test' failed.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
I did an upgrade of postgres and accidentally stopped the test - I didn't expect
it was still running!
After reinstalling postgres I did
test=> select count(*) from bench;
and it takes forever! Has anybody tried that test?
Here is a bt from gdb:
0x8078b8d in TransactionIdDidAbort ()
(gdb) bt
#0 0x8078b8d in TransactionIdDidAbort ()
#1 0x806b460 in heapgettup ()
#2 0x806bf3f in heap_getnext ()
#3 0x809c816 in SeqNext ()
#4 0x8097609 in ExecScan ()
#5 0x809c8cb in ExecSeqScan ()
#6 0x8095c56 in ExecProcNode ()
#7 0x80993dd in ExecAgg ()
#8 0x8095cd6 in ExecProcNode ()
#9 0x8094c10 in ExecutePlan ()
#10 0x809450b in ExecutorRun ()
#11 0x80ec121 in ProcessQueryDesc ()
#12 0x80ec19e in ProcessQuery ()
#13 0x80eaab3 in pg_exec_query_dest ()
#14 0x80ea994 in pg_exec_query ()
#15 0x80ebb28 in PostgresMain ()
#16 0x80d3608 in DoBackend ()
#17 0x80d30f3 in BackendStartup ()
#18 0x80d2716 in ServerLoop ()
#19 0x80d226f in PostmasterMain ()
#20 0x80a5517 in main ()
#21 0x80611fd in _start ()
(gdb)
I don't know exactly what this test does, but I was shocked by such
poor performance. We really need to find out what the problem is.
I'm about to start a new, very big Web+DB project and I'm a little bit
scared to do this with postgres :-)
Regards,
Oleg
On Sat, 22 May 1999, Edmund Mergl wrote:
Date: Sat, 22 May 1999 06:39:25 +0200
From: Edmund Mergl <E.Mergl@bawue.de>
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: PostgreSQL Hackers Mailinglist <pgsql-hackers@postgreSQL.org>
Subject: Re: [HACKERS] strange behavior of UPDATE
Tom Lane wrote:
Edmund Mergl <E.Mergl@bawue.de> writes:
When loading 100.000 rows into the table
everything works ok. Selects and updates
are reasonably fast. But when loading
1.000.000 rows the select statements still
work, but a simple update statement
shows this strange behavior.
Can you provide a script or something to reproduce this behavior?
There are a number of people using Postgres with large databases
and not reporting any such problem, so I think there has to be some
special triggering condition; it's not just a matter of things
breaking at a million rows. Before digging into it, I'd like to
eliminate variables like whether I have the right test case.
regards, tom lane
the original benchmark can be found at
ftp://ftp.heise.de/pub/ix/benches/sqlb-21.tar
for a stripped-down version see the attachment.
For loading the database and running the first
and second part (selects and updates) just do
the following:
createdb test
./make_wnt 1000000 pgsql >make.out 2>&1 &
This needs about 700 MB of disk space.
On a PII-400 it takes about 40 minutes to
load the database, 20 minutes to create the indexes
and 20 minutes to run the first part of the
benchmark (make_sqs). For running the benchmark
in 20 minutes (without swapping) one needs 384 MB RAM.
The second part (make_nqs) contains update
statements which cannot be performed properly
using PostgreSQL.
For testing it is sufficient to initialize the
database and then to perform a query like
update bench set k500k = k500k + 1 where k100 = 30
Edmund
--
Edmund Mergl mailto:E.Mergl@bawue.de
Im Haldenhau 9 http://www.bawue.de/~mergl
70565 Stuttgart fon: +49 711 747503
Germany
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
Going through the documentation I can find only a little about outer
joins. One statement is in the Changes doc about including syntax for
outer joins, but no code seems to have been implemented after
that.
Is it true that there are no outer joins yet? Any plans? Btw, what is the
syntax for outer joins? I know only Oracle's (+) operator.
Edmund Mergl <E.Mergl@bawue.de> writes:
When loading 100.000 rows into the table everything works ok. Selects
and updates are reasonably fast. But when loading 1.000.000 rows the
select statements still work, but a simple update statement shows this
strange behavior: never-ending disk activity starts, memory
consumption increases up to the physical limit (384 MB), whereas the
postmaster uses only a few % of CPU time. After 1 hour I killed the
postmaster.
I tried to reproduce this with current sources on a rather underpowered
Linux box (64Mb of memory, about 40Mb of which is locked down by a
high-priority data collection process). It took a *long* time, but
as far as I could see it was all disk activity, and that's hardly
surprising given the drastic shortage of buffer cache memory.
In particular I did not see any dramatic growth in the size of the
backend process. The test case
update bench set k500k = k500k + 1 where k100 = 30;
required a maximum of 10Mb.
Perhaps you could try it again with a current 6.5 snapshot and see
whether things are any better?
Also, I suspect that increasing the postmaster -B setting beyond its
default of 64 would be quite helpful.
regards, tom lane
Tom,
did you wait until the test finished?
I also tried to reproduce the test with current 6.5 cvs, Linux 2.0.36,
dual PPro, 256Mb. It's still running, it's extremely slow, but
memory usage was about 10-11Mb, CPU usage about 5-9%.
I used the -B 1024 option. No surprise people won't use Postgres
for large applications.
9:12[postgres@zeus]:~/test/sqlbench> cat L9905232104.txt
postgresql-6.5pre on linux-2.2.7
Start of inserting 1000000 rows: Sun May 23 21:04:32 MSD 1999
Start of indexing 1000000 rows: Mon May 24 00:09:47 MSD 1999
Start of SetQuery single user: Mon May 24 03:24:01 MSD 1999
Start of NewQuery single user: Mon May 24 05:23:41 MSD 1999
9:15[postgres@zeus]:~/test/sqlbench>gdb /usr/local/pgsql.65/bin/postgres 10130
GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.16 (i486-slackware-linux),
Copyright 1996 Free Software Foundation, Inc...
/usr2/u/postgres/test/sqlbench/10130: No such file or directory.
Attaching to program `/usr/local/pgsql.65/bin/postgres', process 10130
Reading symbols from /lib/libdl.so.1...done.
Reading symbols from /lib/libm.so.5...done.
Reading symbols from /lib/libtermcap.so.2...done.
Reading symbols from /lib/libncurses.so.3.0...done.
Reading symbols from /lib/libc.so.5...done.
Reading symbols from /lib/ld-linux.so.1...done.
0x400c0564 in __read ()
(gdb) bt
#0 0x400c0564 in __read ()
#1 0x80e5abb in FileRead ()
#2 0x80ec793 in mdread ()
#3 0x80ed3b5 in smgrread ()
#4 0x80e34d2 in ReadBufferWithBufferLock ()
#5 0x80e33b2 in ReadBuffer ()
#6 0x806ff28 in heap_fetch ()
#7 0x809ec19 in IndexNext ()
#8 0x809b3e9 in ExecScan ()
#9 0x809ed61 in ExecIndexScan ()
#10 0x8099a46 in ExecProcNode ()
#11 0x809d1bd in ExecAgg ()
#12 0x8099ab6 in ExecProcNode ()
#13 0x80989f0 in ExecutePlan ()
#14 0x80982eb in ExecutorRun ()
#15 0x80eff54 in ProcessQueryDesc ()
#16 0x80effce in ProcessQuery ()
#17 0x80ee783 in pg_exec_query_dest ()
#18 0x80ee664 in pg_exec_query ()
#19 0x80ef8d8 in PostgresMain ()
#20 0x80d7290 in DoBackend ()
#21 0x80d6dd3 in BackendStartup ()
#22 0x80d6496 in ServerLoop ()
---Type <return> to continue, or q <return> to quit---
#23 0x80d603c in PostmasterMain ()
#24 0x80a9287 in main ()
#25 0x806502e in _start ()
(gdb)
Top shows:
10130 postgres 7 0 11020 10M 9680 D 0 5.9 4.0 5:04 postmaster
Regards,
Oleg
On Sun, 23 May 1999, Tom Lane wrote:
Date: Sun, 23 May 1999 20:43:33 -0400
From: Tom Lane <tgl@sss.pgh.pa.us>
To: Edmund Mergl <E.Mergl@bawue.de>
Cc: PostgreSQL Hackers Mailinglist <pgsql-hackers@postgreSQL.org>
Subject: Re: [HACKERS] strange behavior of UPDATE
Edmund Mergl <E.Mergl@bawue.de> writes:
When loading 100.000 rows into the table everything works ok. Selects
and updates are reasonably fast. But when loading 1.000.000 rows the
select statements still work, but a simple update statement shows this
strange behavior: never-ending disk activity starts, memory
consumption increases up to the physical limit (384 MB), whereas the
postmaster uses only a few % of CPU time. After 1 hour I killed the
postmaster.
I tried to reproduce this with current sources on a rather underpowered
Linux box (64Mb of memory, about 40Mb of which is locked down by a
high-priority data collection process). It took a *long* time, but
as far as I could see it was all disk activity, and that's hardly
surprising given the drastic shortage of buffer cache memory.
In particular I did not see any dramatic growth in the size of the
backend process. The test case
update bench set k500k = k500k + 1 where k100 = 30;
required a maximum of 10Mb.
Perhaps you could try it again with a current 6.5 snapshot and see
whether things are any better?
Also, I suspect that increasing the postmaster -B setting beyond its
default of 64 would be quite helpful.
regards, tom lane
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
Going through the documentation I can find only a little about outer
joins. One statement is in the Changes doc about including syntax for
outer joins, but no code seems to have been implemented after
that.
Is it true that there are no outer joins yet? Any plans? Btw, what is the
syntax for outer joins? I know only Oracle's (+) operator.
There is a small amount of code inside of #ifdef ENABLE_OUTER_JOINS
but it is not even close to what needs to be present for anything to
run. Bruce and I were talking about an implementation, but it is
definitely not coming for v6.5.
- Thomas
Oh, the syntax has lots of variants, but the basic one is:
select * from t1 left|right|full outer join t2 on t1.x = t2.x;
or
select * from t1 left|right|full outer join t2 using (x);
--
Thomas Lockhart lockhart@alumni.caltech.edu
South Pasadena, California
Bruce Momjian wrote:
Not really. It required changing function calls all over, and probably
relies on 6.5 changes too. It is in the 6.5 beta right now. Are you
using that?
I did all the tests with the current snapshot of May 18th.
Fix was installed yesterday.
sorry, but the problem still persists. I used
-rw-r--r-- 1 1005 root 5954900 May 24 03:03 postgresql.snapshot.tar.gz
I started the query
update bench set k500k = k500k + 1 where k100 = 30;
and I killed it after half an hour of heavy disk activity.
The same query on the same machine with the same setup,
but using sybase-ase-11.0.3.3-1 takes less than 1 minute.
The table bench looks like the following:
Table = bench
+----------------------------------+----------------------------------+-------+
| Field | Type | Length|
+----------------------------------+----------------------------------+-------+
| kseq | int4 not null | 4 |
| k500k | int4 not null | 4 |
| k250k | int4 not null | 4 |
| k100k | int4 not null | 4 |
| k40k | int4 not null | 4 |
| k10k | int4 not null | 4 |
| k1k | int4 not null | 4 |
| k100 | int4 not null | 4 |
| k25 | int4 not null | 4 |
| k10 | int4 not null | 4 |
| k5 | int4 not null | 4 |
| k4 | int4 not null | 4 |
| k2 | int4 not null | 4 |
| s1 | char() not null | 8 |
| s2 | char() not null | 20 |
| s3 | char() not null | 20 |
| s4 | char() not null | 20 |
| s5 | char() not null | 20 |
| s6 | char() not null | 20 |
| s7 | char() not null | 20 |
| s8 | char() not null | 20 |
+----------------------------------+----------------------------------+-------+
The table is filled with 1.000.000 rows of random data
and on every field an index is created.
Edmund
--
Edmund Mergl mailto:E.Mergl@bawue.de
Im Haldenhau 9 http://www.bawue.de/~mergl
70565 Stuttgart fon: +49 711 747503
Germany
Edmund Mergl <E.Mergl@bawue.de> writes:
The table is filled with 1.000.000 rows of random data
and on every field an index is created.
BTW, do you happen to know just how random the data actually is?
I noticed that the update query
update bench set k500k = k500k + 1 where k100 = 30;
updates 10,000 rows. If this "random" data actually consists of
10,000 repetitions of only 100 distinct values in every column,
then a possible explanation for the problem would be that our
btree index code isn't very fast when there are large numbers of
identical keys. (Mind you, I have no idea if that's true or not,
I'm just trying to think of likely trouble spots. Anyone know
btree well enough to say whether that is likely to be a problem?)
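A quick way to check the distribution, sketched here with the benchmark's
table and column names, would be a simple aggregate; roughly 100 groups of
about 10,000 rows each would confirm that guess:
  select k100, count(*) from bench group by k100;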
regards, tom lane
I wrote:
then a possible explanation for the problem would be that our
btree index code isn't very fast when there are large numbers of
identical keys.
Ah-hah, a lucky guess! I went back and looked at the profile stats
I'd extracted from Edmund's "update" example. This Linux box has
the same semi-functional gprof as someone else was using a while
ago --- the timings are bogus, but the call counts seem right.
And what I find are entries like this:
                0.00    0.00  284977/284977      _bt_binsrch [3174]
[3177]   0.0    0.00    0.00  284977             _bt_firsteq [3177]
                0.00    0.00  21784948/24713758  _bt_compare [3169]

                0.00    0.00     426/35632       _bt_split [53]
                0.00    0.00   35206/35632       _bt_insertonpg [45]
[3185]   0.0    0.00    0.00   35632              _bt_findsplitloc [3185]
                0.00    0.00  5093972/8907411    _bt_itemcmp [3171]
In other words, _bt_firsteq is averaging almost 100 comparisons per
call, _bt_findsplitloc well over that. Both of these routines are
evidently designed on the assumption that there will be relatively
few duplicate keys --- they reduce to linear scans when there are
many duplicates.
_bt_firsteq shouldn't exist at all; the only routine that calls it
is _bt_binsrch, which does a fast binary search of the index page.
_bt_binsrch should be fixed so that the binary search logic does the
right thing for equal-key cases, rather than degrading to a linear
search. I am less confident that I understand _bt_findsplitloc's place
in the great scheme of things, but it could certainly be revised to use
a binary search rather than linear scan.
This benchmark is clearly overemphasizing the equal-key case, but
I think it ought to be fixed independently of whether we want to
look good on a dubious benchmark ... equal keys are not uncommon in
real-world scenarios after all.
Next question is do we want to risk twiddling this code so soon before
6.5, or wait till after?
regards, tom lane
Tom Lane wrote:
Edmund Mergl <E.Mergl@bawue.de> writes:
The table is filled with 1.000.000 rows of random data
and on every field an index is created.
BTW, do you happen to know just how random the data actually is?
I noticed that the update query
update bench set k500k = k500k + 1 where k100 = 30;
updates 10,000 rows. If this "random" data actually consists of
10,000 repetitions of only 100 distinct values in every column,
then a possible explanation for the problem would be that our
btree index code isn't very fast when there are large numbers of
identical keys. (Mind you, I have no idea if that's true or not,
I'm just trying to think of likely trouble spots. Anyone know
btree well enough to say whether that is likely to be a problem?)
regards, tom lane
the query:
update bench set k500k = k500k + 1 where k100 = 30;
affects about 10.000 rows. This can be determined by running
the query:
select k500k from bench where k100 = 30;
which takes about half a minute. That's the reason I
was talking about the strange UPDATE behavior of
PostgreSQL. If it can determine a specific number
of rows in a reasonable time, it should be able to
update these rows in the same time frame.
Edmund
--
Edmund Mergl mailto:E.Mergl@bawue.de
Im Haldenhau 9 http://www.bawue.de/~mergl
70565 Stuttgart fon: +49 711 747503
Germany
Edmund Mergl <E.Mergl@bawue.de> writes:
... That's the reason I
was talking about the strange UPDATE behavior of
PostgreSQL. If it can determine a specific number
of rows in a reasonable time, it should be able to
update these rows in the same time frame.
Not necessarily --- this table has a remarkably large number of indexes,
and all of them have to be updated when a tuple is replaced. So the
amount of work is significantly greater than simply finding the tuples
will require.
As I posted later, I think that much of the problem comes from poor
handling of equal-key cases in our btree index routines...
regards, tom lane
Tom Lane wrote:
Edmund Mergl <E.Mergl@bawue.de> writes:
... That's the reason I
was talking about the strange UPDATE behavior of
PostgreSQL. If it can determine a specific number
of rows in a reasonable time, it should be able to
update these rows in the same time frame.
Not necessarily --- this table has a remarkably large number of indexes,
and all of them have to be updated when a tuple is replaced. So the
amount of work is significantly greater than simply finding the tuples
will require.
As I posted later, I think that much of the problem comes from poor
handling of equal-key cases in our btree index routines...
regards, tom lane
if this is the case, these routines must be very poor.
Again some numbers:
1.000.000 rows:
- select * from bench where k100 = 30
with indexes 10 seconds
without indexes 28 seconds
- update bench set k500k = k500k + 1 where k100 = 30
with indexes unknown
without indexes 36 seconds
Still, the poor update routines do not explain the
strange behavior that the postmaster runs for
hours using at most 10% CPU, and all the time
heavy disk activity is observed. According to
top, there are over 80 MB of free memory and the postmaster
has been started with -o -F. Hence this disk activity
cannot be simple swapping.
Some more numbers:
database #rows inserts create make_sqs make_nqs
index selects updates
----------------------------------------------------------------------------
pgsql 10.000 00:24 00:09 00:16 00:25
pgsql 100.000 04:01 01:29 01:06 49:45
pgsql 1.000.000 39:24 20:49 23:42 ???
Whereas the increase in elapsed time is somewhat proportional
to the number of rows, for the update case it is rather
exponential.
Edmund
--
Edmund Mergl mailto:E.Mergl@bawue.de
Im Haldenhau 9 http://www.bawue.de/~mergl
70565 Stuttgart fon: +49 711 747503
Germany
Edmund Mergl <E.Mergl@bawue.de> writes:
Some more numbers:
database #rows inserts create make_sqs make_nqs
index selects updates
----------------------------------------------------------------------------
pgsql 10.000 00:24 00:09 00:16 00:25
pgsql 100.000 04:01 01:29 01:06 49:45
pgsql 1.000.000 39:24 20:49 23:42 ???
Whereas the increase in elapsed time is somewhat proportional
to the number of rows, for the update case it is rather
exponential.
Those are attention-getting numbers, all right. I think that the two
equal-key problems I found last night might partially explain them;
I suspect there are more that I have not found, too. I will look into
it some more.
Could you try the same queries with no indexes in place, and see what
the time scaling is like then? That would confirm or deny the theory
that it's an index-update problem.
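In terms of the benchmark table, that comparison amounts to roughly the
sketch below (the index name is made up here, since the benchmark scripts
pick their own names):
  drop index bench_k500k_idx;   -- hypothetical name; repeat for each index
  update bench set k500k = k500k + 1 where k100 = 30;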
Question for the hackers list: are we prepared to install purely
performance-related bug fixes at this late stage of the 6.5 beta cycle?
Bad as the above numbers are, I hesitate to twiddle the btree code and
risk breaking things with only a week of testing time to go...
regards, tom lane
Tom Lane wrote:
Edmund Mergl <E.Mergl@bawue.de> writes:
Some more numbers:
database #rows inserts create make_sqs make_nqs
index selects updates
----------------------------------------------------------------------------
pgsql 10.000 00:24 00:09 00:16 00:25
pgsql 100.000 04:01 01:29 01:06 49:45
pgsql 1.000.000 39:24 20:49 23:42 ???
Whereas the increase in elapsed time is somewhat proportional
to the number of rows, for the update case it is rather
exponential.
Those are attention-getting numbers, all right. I think that the two
equal-key problems I found last night might partially explain them;
I suspect there are more that I have not found, too. I will look into
it some more.
Could you try the same queries with no indexes in place, and see what
the time scaling is like then? That would confirm or deny the theory
that it's an index-update problem.
here they are, and yes, I double-checked them twice!
database #rows inserts create make_sqs make_nqs
index selects updates
----------------------------------------------------------------------------
pgsql 10.000 00:24 - 00:13 00:05
pgsql 100.000 04:01 - 00:83 00:32
pgsql 1.000.000 39:24 - 26:36 26:52
Question for the hackers list: are we prepared to install purely
performance-related bug fixes at this late stage of the 6.5 beta cycle?
Bad as the above numbers are, I hesitate to twiddle the btree code and
risk breaking things with only a week of testing time to go...
regards, tom lane
if there is anything else I can do, just let me know.
Edmund
--
Edmund Mergl mailto:E.Mergl@bawue.de
Im Haldenhau 9 http://www.bawue.de/~mergl
70565 Stuttgart fon: +49 711 747503
Germany
Tom Lane wrote:
Those are attention-getting numbers, all right. I think that the two
equal-key problems I found last night might partially explain them;
I suspect there are more that I have not found, too. I will look into
it some more.
Am I correct that update takes ~ 10% CPU with high disk activity?
(Unfortunately, there is no list archive after May 13, so I'm not able
to re-read the thread).
Remember that update inserts new index tuples, and most likely the
index scan will see these tuples and fetch the just-inserted
heap tuples.
Could you try the same queries with no indexes in place, and see what
the time scaling is like then? That would confirm or deny the theory
that it's an index-update problem.
Question for the hackers list: are we prepared to install purely
performance-related bug fixes at this late stage of the 6.5 beta cycle?
Bad as the above numbers are, I hesitate to twiddle the btree code and
risk breaking things with only a week of testing time to go...
Try to fix the problems and run Edmund's scripts to see whether things
are better than now. We can apply the fixes after 6.5.
Vadim
- update bench set k500k = k500k + 1 where k100 = 30
with indexes unknown
without indexes 36 seconds
can you run an: explain update bench set k500k = k500k + 1 where k100 = 30;
Still the poor update routines do not explain the
strange behavior, that the postmaster runs for
hours using at most 10% CPU, and all the time
heavy disk activity is observed.
I suspect it is doing a seq scan, thus explaining the heavy disk activity.
I have previously sent in a patch which will fix this, if someone applies it.
Andreas
ZEUGSWETTER Andreas IZ5 <Andreas.Zeugswetter@telecom.at> writes:
can you run an: explain update bench set k500k = k500k + 1 where k100 = 30;
I suspect it is doing a seq scan.
No, that's not it:
test=> explain update bench set k500k = k500k + 1 where k100 = 30;
NOTICE: QUERY PLAN:
Index Scan using k100 on bench (cost=179.05 rows=2082 width=154)
The benchmark loads the tables first and then builds indexes, and
in fact does a vacuum analyze after that! So the stats should be fine.
regards, tom lane
Edmund Mergl <E.Mergl@bawue.de> writes:
Could you try the same queries with no indexes in place, and see what
the time scaling is like then? That would confirm or deny the theory
that it's an index-update problem.
here they are, and yes, I double-checked them twice!
database #rows inserts create make_sqs make_nqs
index selects updates
----------------------------------------------------------------------------
pgsql 10.000 00:24 - 00:13 00:05
pgsql 100.000 04:01 - 00:83 00:32
pgsql 1.000.000 39:24 - 26:36 26:52
Oh dear ... so much for my theory that index updates are to blame for
the scaling problem. Back to the drawing board ...
regards, tom lane
select * from t1 left|right|full outer join t2 on t1.x = t2.x;
Will this be correct?
SELECT * FROM t1, t2, t3, t4 LEFT OUTER JOIN ON t1.x = t2.x,
t1.x = t3.x, t1.x = t4.x;
select * from t1 left|right|full outer join t2 on t1.x = t2.x;
Will this be correct?
SELECT * FROM t1, t2, t3, t4 LEFT OUTER JOIN ON t1.x = t2.x,
t1.x = t3.x, t1.x = t4.x;
Left outer joins will take the left-side table and null-fill entries
which do not have a corresponding match on the right-side table. If
your example is trying to get an output row for at least every input
row from t1, then perhaps the query would be
select * from t1 left join t2 using (x)
left join t3 using (x)
left join t4 using (x);
But since I haven't implemented it yet I don't have much experience
with the outer join syntax...
- Thomas
--
Thomas Lockhart lockhart@alumni.caltech.edu
South Pasadena, California
Left outer joins will take the left-side table and null-fill entries
which do not have a corresponding match on the right-side table. If
your example is trying to get an output row for at least every input
row from t1, then perhaps the query would be
select * from t1 left join t2 using (x)
left join t3 using (x)
left join t4 using (x);
But since I haven't implemented it yet I don't have much experience
with the outer join syntax...
You miss at least two points: the keyword OUTER and the column name
from t1. As far as I know, LEFT is the default, so it could be omitted.
Maybe
SELECT * FROM t1 USING (X) OUTER JOIN t2 USING (Y)
OUTER JOIN t3 USING (Z)
OUTER JOIN t4 using (t);
It should be possible to boil it down to
SELECT * FROM t1 USING (X) OUTER JOIN t2 USING (Y), t3 USING (Z), t4 using (t);
Left outer joins will take the left-side table and null-fill entries
which do not have a corresponding match on the right-side table. If
your example is trying to get an output row for at least every input
row from t1, then perhaps the query would be
select * from t1 left join t2 using (x)
left join t3 using (x)
left join t4 using (x);
But since I haven't implemented it yet I don't have much experience
with the outer join syntax...
You miss at least two points: the keyword OUTER and the column name
from t1. As far as I know, LEFT is the default, so it could be omitted.
"OUTER" conveys no additional information, and can be omitted. My copy
of Date and Darwen indicates that "LEFT JOIN" is the minimum required
to get a left outer join (i.e. the "LEFT" can not be omitted).
I'm not sure what you mean about missing something about "the column
name for t1". My hypothetical query is referring to column "x",
present in all four tables. Was there some other place a column for t1
should be mentioned?
Maybe
SELECT * FROM t1 USING (X) OUTER JOIN t2 USING (Y)
OUTER JOIN t3 USING (Z)
OUTER JOIN t4 using (t);
It should be possible to boil it down to
SELECT * FROM t1 USING (X) OUTER JOIN t2 USING (Y), t3 USING (Z), t4 using (t);
This doesn't resemble SQL92, but may have some similarity to outer
join syntaxes in Oracle, Sybase, etc. Don't know myself.
A (hypothetical) simple two table outer join can be written as
select * from t1 left join t2 using (x);
Introducing a third table to be "left outer joined" to this
intermediate result can be done as
select * from t1 left join t2 using (x)
left join t3 using (x);
where the second "x" refers to the column named "x" from the first
outer join, and the column named "x" from t3.
An alternate equivalent query would be
select * from t1 left join t2 on t1.x = t2.x
left join t3 on x = t3.x;
Hope this helps (and that I've got the details right now that I've
spouted off... :)
- Thomas
--
Thomas Lockhart lockhart@alumni.caltech.edu
South Pasadena, California
"OUTER" conveys no additional information, and can be omitted. My copy
Sorry. You're right. Just as long as you accept it.
I'm not sure what you mean about missing something about "the column
name for t1". My hypothetical query is referring to column "x",
present in all four tables. Was there some other place a column for t1
should be mentioned?
What if the column is named order_id in one table and ord_id in another?
select * from t1 left join t2 on t1.x = t2.x
left join t3 on x = t3.x;
OK, this will do it. You can have a t1.x = t2.y.
Added to TODO:
* Improve _bt_binsrch() to handle equal keys better, remove _bt_firsteq() (Tom)
I wrote:
then a possible explanation for the problem would be that our
btree index code isn't very fast when there are large numbers of
identical keys.
Ah-hah, a lucky guess! I went back and looked at the profile stats
I'd extracted from Edmund's "update" example. This Linux box has
the same semi-functional gprof as someone else was using a while
ago --- the timings are bogus, but the call counts seem right.
And what I find are entries like this:
                0.00    0.00  284977/284977      _bt_binsrch [3174]
[3177]   0.0    0.00    0.00  284977             _bt_firsteq [3177]
                0.00    0.00  21784948/24713758  _bt_compare [3169]

                0.00    0.00     426/35632       _bt_split [53]
                0.00    0.00   35206/35632       _bt_insertonpg [45]
[3185]   0.0    0.00    0.00   35632              _bt_findsplitloc [3185]
                0.00    0.00  5093972/8907411    _bt_itemcmp [3171]
In other words, _bt_firsteq is averaging almost 100 comparisons per
call, _bt_findsplitloc well over that. Both of these routines are
evidently designed on the assumption that there will be relatively
few duplicate keys --- they reduce to linear scans when there are
many duplicates.
_bt_firsteq shouldn't exist at all; the only routine that calls it
is _bt_binsrch, which does a fast binary search of the index page.
_bt_binsrch should be fixed so that the binary search logic does the
right thing for equal-key cases, rather than degrading to a linear
search. I am less confident that I understand _bt_findsplitloc's place
in the great scheme of things, but it could certainly be revised to use
a binary search rather than linear scan.
This benchmark is clearly overemphasizing the equal-key case, but
I think it ought to be fixed independently of whether we want to
look good on a dubious benchmark ... equal keys are not uncommon in
real-world scenarios after all.
Next question is do we want to risk twiddling this code so soon before
6.5, or wait till after?
regards, tom lane
--
Bruce Momjian | http://www.op.net/~candle
maillist@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026