PHPBuilder article -- Postgres vs MySQL
Thought this may be of interest to some...
http://www.phpbuilder.com/columns/tim20001112.php3
Michael Fork - CCNA - MCP - A+
Network Support - Toledo Internet Access - Toledo Ohio
At 11:38 PM 11/12/00 -0500, Michael Fork wrote:
Thought this may be of interest to some...
I just submitted it to slashdot, what the heck :)
- Don Baccus, Portland OR <dhogaza@pacifier.com>
Nature photos, on-line guides, Pacific Northwest
Rare Bird Alert Service and other goodies at
http://donb.photo.net.
And now it's on www.slashdot.org ...
http://slashdot.org/articles/00/11/13/1342208.shtml
Poul L. Christiansen
Michael Fork wrote:
Thought this may be of interest to some...
http://www.phpbuilder.com/columns/tim20001112.php3
Michael Fork - CCNA - MCP - A+
Network Support - Toledo Internet Access - Toledo Ohio
It would be nice if the fourth page were there. Am I the only one getting
"Not Found"?
Rob Nelson
rdnelson@co.centre.pa.us
And now it's on www.slashdot.org ...
http://slashdot.org/articles/00/11/13/1342208.shtml
Poul L. Christiansen
Michael Fork wrote:
Thought this may be of interest to some...
http://www.phpbuilder.com/columns/tim20001112.php3
Michael Fork - CCNA - MCP - A+
Network Support - Toledo Internet Access - Toledo Ohio
It would be nice if the fourth page were there. Am I the only one getting
"Not Found"?
They are getting lots of hits. Hit reload and eventually it will
appear. I am on page six now.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
I made it all the way through the article. I'll summarize it for you:
Postgres - hooray!
MySQL - boo!
Since this is an open source database article linked off of slashdot, I
imagine they're getting pounded.
David Boerwinkle
-----Original Message-----
From: Robert D. Nelson <RDNELSON@co.centre.pa.us>
To: Michael Fork <mfork@toledolink.com>; Poul L.Christiansen
<poulc@cs.auc.dk>
Cc: pgsql-general <pgsql-general@postgresql.org>; pgsql-hackers
<pgsql-hackers@postgresql.org>
Date: Monday, November 13, 2000 8:31 AM
Subject: RE: [GENERAL] PHPBuilder article -- Postgres vs MySQL
And now it's on www.slashdot.org ...
http://slashdot.org/articles/00/11/13/1342208.shtml
Poul L. Christiansen
Michael Fork wrote:
Thought this may be of interest to some...
http://www.phpbuilder.com/columns/tim20001112.php3
Michael Fork - CCNA - MCP - A+
Network Support - Toledo Internet Access - Toledo Ohio
It would be nice if the fourth page were there. Am I the only one getting
"Not Found"?
Rob Nelson
rdnelson@co.centre.pa.us
And now it's on www.slashdot.org ...
http://slashdot.org/articles/00/11/13/1342208.shtml
Poul L. Christiansen
Michael Fork wrote:
Thought this may be of interest to some...
http://www.phpbuilder.com/columns/tim20001112.php3
Michael Fork - CCNA - MCP - A+
Network Support - Toledo Internet Access - Toledo Ohio
I made it all the way through the article. I'll summarize it for you:
Postgres - hooray!
MySQL - boo!
Yeah, and that's about it. No analysis or anything. Disappointing, after
waiting so long for the pages to load.
Since this is an open source database article linked off of slashdot, I
imagine they're getting pounded.
Still...Regardless of what database they're running, either their
abstraction layer is shit or their queries really need optimized. Is that
perhaps why, even at 5 clients, the page views he shows never went
significantly above 10/sec?
Rob Nelson
rdnelson@co.centre.pa.us
Still...Regardless of what database they're running, either their
abstraction layer is shit or their queries really need optimized. Is that
perhaps why, even at 5 clients, the page views he shows never went
significantly above 10/sec?
I think this could be because they used real killer pages in the test,
and maybe that is also the reason PgSQL fared this well (I've always
been and I'm still a postgres fan, but looking at those results I've
been quite astonished!!). Have you looked at the spec? If I remember
well, Tim was talking about concurrently executing a page that
joined a dozen tables and another that was doing
update/select/insert on the same tables. Under these conditions, 10
pages/sec seems lightning-fast to me!!!!
bye!
/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
Fabrizio Ermini Alternate E-mail:
C.so Umberto, 7 faermini@tin.it
loc. Meleto Valdarno Mail on GSM: (keep it short!)
52020 Cavriglia (AR) faermini@sms.tin.it
On Lun 13 Nov 2000 13:22, Robert D. Nelson wrote:
Still...Regardless of what database they're running, either their
abstraction layer is shit or their queries really need optimized. Is that
perhaps why, even at 5 clients, the page views he shows never went
significantly above 10/sec?
In the article it was said that the queries were left unoptimized rather than
being tuned to get the best out of either database. Tim said that with some
changes to the queries they could have gotten much better results.
Saludos... :-)
--
"And I'm happy, because you make me feel good, about me." - Melvin Udall
-----------------------------------------------------------------
Martín Marqués email: martin@math.unl.edu.ar
Santa Fe - Argentina http://math.unl.edu.ar/~martin/
Administrador de sistemas en math.unl.edu.ar
-----------------------------------------------------------------
perhaps why, even at 5 clients, the page views he shows never went
significantly above 10/sec?
I think a lot of it has to do with the web server/db setup, not pg. They are
using Apache/PHP, and looking at their code, every page has the additional
overhead of making the db connection. Now if they had used AOLserver with
its persistent db connection pooling scheme they may have fared better ;)
_________________________________________________________________________
Get Your Private, Free E-mail from MSN Hotmail at http://www.hotmail.com.
Share information about yourself, create your own public profile at
http://profiles.msn.com.
carl garland wrote:
perhaps why, even at 5 clients, the page views he shows never went
significantly above 10/sec?
I think a lot of it has to do with the web server/db setup, not pg. They are
using Apache/PHP, and looking at their code, every page has the additional
overhead of making the db connection. Now if they had used AOLserver with
its persistent db connection pooling scheme they may have fared better ;)
PHP has a persistent PostgreSQL connect call, pg_pconnect(...), and it does make
a difference.
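For readers following along, a minimal sketch of the call being discussed, in
the PHP 4 pgsql module; the connection string, table, and query are invented
for illustration:

  <?php
  // pg_connect() would fork a fresh backend on every page view and
  // close it when the script ends; pg_pconnect() keeps the connection
  // open so the same httpd process can reuse it on later hits.
  $conn = pg_pconnect("dbname=music user=web");

  $result = pg_exec($conn, "SELECT name FROM artists LIMIT 5");
  $n = pg_numrows($result);
  for ($i = 0; $i < $n; $i++) {
      $row = pg_fetch_row($result, $i);
      echo $row[0] . "\n";
  }
  ?>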
I use postgres as a music database back-end for a PHP web server.
(Actually it is a web farm, with many instances of the database, one per
web server)
The one problem I have had with Postgres is its stubborn refusal to use
an index. I understand the reasons why it won't, but it is wrong, so I
sped it up by starting the backends with -fs.
That may be the issue.
On a side note:
I'm not sure of the current workings of the vacuum and statistics vs
indexing issue, I am new to this list, but I do have a 7.0.2 relevant
observation:
My music database has 50,000 artists and 210,000 albums. Many artists
have only one or two entries in the albums table (for the youngsters, the CD
table ;-). About 34,000 have the integer key for "Various Artists" as
their artist entry, and another few thousand have things like "Movie
Soundtrack" and so on.
When the statistics are computed, these relatively few records with a
huge number of relations distort the statistics and make it impossible
to get postgres to use an index on that table without the -fs switch.
This is bad because it always forces use of an index, even when postgres
would legitimately ignore it.
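For reference, the -fs switch mlw mentions is a backend option that can be
handed to every backend through the postmaster; a sketch, with an illustrative
data directory:

  # -f s forbids sequential-scan plans in all backends; heavy-handed,
  # since it applies even where a seqscan would genuinely be cheaper
  postmaster -o '-fs' -D /usr/local/pgsql/data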
On Wed, 15 Nov 2000, carl garland wrote:
perhaps why, even at 5 clients, the page views he shows never went
significantly above 10/sec?
I think a lot of it has to do with the web server/db setup, not pg. They are
using Apache/PHP, and looking at their code, every page has the additional
overhead of making the db connection. Now if they had used AOLserver with
its persistent db connection pooling scheme they may have fared better ;)
I doubt it. PostgreSQL has a higher connection startup overhead than
MySQL, so if every view required a new database connection, it would have been
quite a detriment to the PostgreSQL scores.
PHP can maintain persistent connections. Unfortunately, this means
that you end up with a database connection per httpd process. That really
isn't a problem for PostgreSQL though, it just requires sufficient memory.
No doubt that is what was being done.
AOLServer isn't the only system that can pool database connections, so
can servlets/JSP, ColdFusion, ASP, etc. No doubt AOLServer would be more
widely accepted if it used something other than TCL.
Tom
On Wed, 15 Nov 2000, carl garland wrote:
# >perhaps why, even at 5 clients, the page views he shows never went
# >significantly above 10/sec?
#
# I think a lot of it has to do with the web server/db setup, not pg. They are
# using Apache/PHP, and looking at their code, every page has the additional
# overhead of making the db connection. Now if they had used AOLserver with
# its persistent db connection pooling scheme they may have fared better ;)
I haven't actually looked at the code they used to test with to
see if they are actually using it, but Apache/PHP has the ability to do
persistent db connections. I would be surprised that someone like Tim
(who seems to have done a fair bit of php with db stuff) would not make use
of such a feature.
If you can point out an example of where they are not using this
feature I will gladly stand corrected.
--------------------------------------------------------------------
| Joseph Scott The Office Of Water Programs |
| joseph@randomnetworks.com joseph.scott@owp.csus.edu |
--------------------------------------------------------------------
PHP 4 was having problems with persistent connections (not sure if with just
pgsql or all dbs). Maybe that is why they didn't use it.
Adam Lang
Systems Engineer
Rutgers Casualty Insurance Company
----- Original Message -----
From: <joseph@randomnetworks.com>
To: "carl garland" <carlhgarland@hotmail.com>
Cc: <martin@math.unl.edu.ar>; <RDNELSON@co.centre.pa.us>;
<davidb@vectormath.com>; <mfork@toledolink.com>; <poulc@cs.auc.dk>;
<pgsql-general@postgresql.org>; <pgsql-hackers@postgresql.org>
Sent: Wednesday, November 15, 2000 12:28 PM
Subject: Re: [HACKERS] Re: [GENERAL] PHPBuilder article -- Postgres vs MySQL
At 09:27 AM 11/15/00 -0800, Tom Samplonius wrote:
AOLServer isn't the only system that can pool database connections, so
can servlets/JSP, ColdFusion, ASP, etc. No doubt AOLServer would be more
widely accepted if it used something other than TCL.
There are two separate modules that support Java in AOLserver: ns_tomcat,
which provides the same interface as Apache Tomcat (and no real
advantages), and ns_java, which is coming out of the OpenACS project. ns_java
exposes AOLserver's pooled, persistent database API to Java.
There's also support available for Python, though there's still a lot of
work to be done to support the full AOLserver API (same's true of ns_java,
actually).
If you use ADP pages, your use of Tcl is typically restricted to snippets of
code anyway, so I've never really understood the complaints about Tcl...
- Don Baccus, Portland OR <dhogaza@pacifier.com>
Nature photos, on-line guides, Pacific Northwest
Rare Bird Alert Service and other goodies at
http://donb.photo.net.
Andrew McMillan wrote:
mlw wrote:
My music database has 50,000 artists and 210,000 albums. Many artists
have only one or two entries in the albums table (for the youngsters, the CD
table ;-). About 34,000 have the integer key for "Various Artists" as
their artist entry, and another few thousand have things like "Movie
Soundtrack" and so on.
When the statistics are computed, these relatively few records with a
huge number of relations distort the statistics and make it impossible
to get postgres to use an index on that table without the -fs switch.
This is bad because it always forces use of an index, even when postgres
would legitimately ignore it.
What about doing:
SET enable_seqscan TO 'Off';
just before the query in question? That way you'd only affect the single
query. Possibly you could even code to spot the two aberrant situations
and not do it in those ones.
I'd rather not pollute the application's SQL with postgres-isms. Not that I
don't love postgres, but there are always critics looking for a reason to use
Oracle or (gasp) MS-SQL.
As for "code to spot...": I am fairly new to hacking postgres (though I have
been using it in various projects since ~1995), but I am an excellent C/C++
guy; give me a pointer to where (a) statistics are calculated and (b) where
they are interpreted, and I would do that.
Just a question, however: what is the feeling about the way statistics are
currently being calculated? My feeling is that some sort of windowing
algorithm should be used to normalize the statistics to the majority of the
entries in a table. It could be as simple as discarding the upper and lower
10% of the record stats, and using the remaining 80% for statistics. That
would certainly take care of my problem (and others I am sure), and I'd be
glad to write it. ;-)
Regards,
Andrew.
--
_____________________________________________________________________
Andrew McMillan, e-mail: Andrew@cat-it.co.nz
Catalyst IT Ltd, PO Box 10-225, Level 22, 105 The Terrace, Wellington
Me: +64 (21) 635 694, Fax: +64 (4) 499 5596, Office: +64 (4) 499 2267
At 01:53 PM 11/15/00 -0500, markw wrote:
I'd rather not pollute the application's SQL with postgres-isms. Not that I
don't love postgres, but there are always critics looking for a reason to use
Oracle or (gasp) MS-SQL.
Define some global variable with the name of the database being run (currently
only Postgres) and guard the SET statement with a conditional...
In the OpenACS project we've got little functions that return query snippets
called things like "db_nextval" that return either "sequence_name.nextval"
or "nextval('sequence_name')" depending on whether the code's running
under Oracle or Postgres. That helps us minimize differences in the source.
- Don Baccus, Portland OR <dhogaza@pacifier.com>
Nature photos, on-line guides, Pacific Northwest
Rare Bird Alert Service and other goodies at
http://donb.photo.net.
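A hypothetical PHP rendering of the kind of helper Don describes; the function
name, global setting, and sequence name are invented for illustration, and
this is not OpenACS's actual (Tcl) API:

  <?php
  // Return the RDBMS-specific "next sequence value" expression so the
  // surrounding SQL stays database-neutral.
  function db_nextval($sequence) {
      if ($GLOBALS['db_type'] == 'oracle') {
          return "$sequence.nextval";
      }
      return "nextval('$sequence')";   // PostgreSQL spelling
  }

  // The same INSERT then works under either database:
  $sql = "INSERT INTO orders (id) VALUES (" . db_nextval('orders_seq') . ")";
  ?>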
markw <markw@mohawksoft.com> writes:
Just a question, however, what is the feeling about the way statistics are
currently being calculated?
They suck, no question about it ;-)
My feeling is that some sort of windowing
algorithm be used to normalize the statistics to the majority of the entries
in a table. It could be as simple as discarding the upper and lower 10% of
the record stats, and use the remaining 80% for statistics.
I think what most of the discussion has focused on is building
histograms. The current upper-and-lower-bounds-only approach just
plain isn't enough data, even if you discard outliers so that the
data isn't actively misleading.
As far as the most-common-value issue goes, if you have one value that
is vastly more common than any other, I think it would be a real mistake
to throw away that information --- that would mean the planner would do
the wrong thing for queries that do involve that value. What we need
is to save info about several top-frequency values, maybe three or so,
not just one. Also the finding of those values needs to be much more
robust than it is currently.
See past discussions in pghackers --- there have been plenty...
regards, tom lane
Just a note, I've been using Postgres 7.02 and PHP 4.02 or 4.03 for about a
month in a couple sites, and haven't experienced any problems with
persistent connections. Problem might have been fixed in one of the point
releases, or maybe I just don't have enough different db connections to
trigger it.
At 02:05 PM 11/15/00, Adam Lang wrote:
PHP 4 was having problems with persistent connections (not sure if with just
pgsql or all dbs). Maybe that is why they didn't use it.
Adam Lang
Systems Engineer
Rutgers Casualty Insurance Company
It is possible. I think the problems were arising before the current patch.
The problems involved not properly releasing connections, so they
eventually filled up as they were used.
Adam Lang
Systems Engineer
Rutgers Casualty Insurance Company
----- Original Message -----
From: "Charles Tassell" <ctassell@isn.net>
To: "Adam Lang" <aalang@rutgersinsurance.com>
Cc: <pgsql-general@postgresql.org>
Sent: Wednesday, November 15, 2000 4:54 PM
Subject: Re: [HACKERS] Re: [GENERAL] PHPBuilder article -- Postgres vs MySQL
Just a note, I've been using Postgres 7.02 and PHP 4.02 or 4.03 for about a
month in a couple sites, and haven't experienced any problems with
persistent connections. Problem might have been fixed in one of the point
releases, or maybe I just don't have enough different db connections to
trigger it.
At 02:05 PM 11/15/00, Adam Lang wrote:
PHP 4 was having problems with persistent connections (not sure if with just
pgsql or all dbs). Maybe that is why they didn't use it.
Adam Lang
Systems Engineer
Rutgers Casualty Insurance Company
On Wed, 15 Nov 2000, Charles Tassell wrote:
Just a note, I've been using Postgres 7.02 and PHP 4.02 or 4.03 for about a
month in a couple sites, and haven't experienced any problems with
persistent connections. Problem might have been fixed in one of the point
releases, or maybe I just don't have enough different db connections to
trigger it.
I run PHP4 and IMP (http://www.horde.org) and we've gotten them to remove
the use of pg_pconnect() since it is broken. Broken how, you might
ask? Well, I ran on a standalone machine, no other web users but myself,
to test, and each time I hit the database with IMP, it opened a new
backend, but it never reused old, idle ones ... eventually, you run out of
the ability to connect since you've locked up all connections ...
Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
The Hermit Hacker wrote:
On Wed, 15 Nov 2000, Charles Tassell wrote:
Just a note, I've been using Postgres 7.02 and PHP 4.02 or 4.03 for about a
month in a couple sites, and haven't experienced any problems with
persistent connections. Problem might have been fixed in one of the point
releases, or maybe I just don't have enough different db connections to
trigger it.
I run PHP4 and IMP (http://www.horde.org) and we've gotten them to remove
the use of pg_pconnect() since it is broken. Broken how, you might
ask? Well, I ran on a standalone machine, no other web users but myself,
to test, and each time I hit the database with IMP, it opened a new
backend, but it never reused old, idle ones ... eventually, you run out of
the ability to connect since you've locked up all connections ...
That sounds pretty evil. Have you also tested PHP3 by any chance? Has this been posted to
php-general? If this is a general issue, it would have to be considered a _very_ nasty bug
indeed. Can you describe how you made the test, i.e. how you saw that it wouldn't reuse
idle ones and kept opening new ones? How would you monitor this?
- Frank
Basically, after enough people connect and disconnect, PHP doesn't
reuse/drop old connections, so your maximum is reached. You'd get to
the point where no one is connected, but another person would try to and
it would come back with an error saying no connections available.
Adam Lang
Systems Engineer
Rutgers Casualty Insurance Company
----- Original Message -----
From: "Frank Joerdens" <frank@joerdens.de>
To: "The Hermit Hacker" <scrappy@hub.org>
Cc: "Charles Tassell" <ctassell@isn.net>; "Adam Lang"
<aalang@rutgersinsurance.com>; <pgsql-general@postgresql.org>;
<scherf@ventasoft.de>
Sent: Thursday, November 16, 2000 1:08 PM
Subject: Re: [HACKERS] Re: [GENERAL] PHPBuilder article -- Postgres vs MySQL
Frank Joerdens wrote:
That sounds pretty evil. Have you also tested PHP3 by any chance? Has this been posted to
php-general? If this is a general issue, it would have to be considered a _very_ nasty bug
indeed. Can you describe how you made the test, i.e. how you saw that it wouldn't reuse
idle ones and kept opening new ones? How would you monitor this?
We have looked into the issue. Our conclusion was that persistent
connections are allocated per httpd process. They are persistent,
but not shared. Thus, if you have hundreds of httpd processes,
you have issues (and you hit them with fewer processes if you haven't
tuned postgres's maximum allowable backends), or if there is more than
one user or DB per httpd process. For example, 100 httpd children each
holding connections for two distinct user/DB pairs would pin 200 backends.
We have seen the same behavior on PHP3 and on PHP4.
Maybe our diagnosis is wrong. I would love to be able to
use persistent connections. But I don't think it is a bug
per se. The docs talk about persistence. They say nothing
about pooling. (Maybe phplib will do this -- I have not
looked.)
--
Karl DeBisschop kdebisschop@alert.infoplease.com
Learning Network Reference http://www.infoplease.com
Netsaint Plugin Developer kdebisschop@users.sourceforge.net
On Thu, Nov 16, 2000 at 01:33:08PM -0400, The Hermit Hacker wrote:
I run PHP4 and IMP (http://www.horde.org) and we've gotten them to remove
the use of pg_pconnect() since it is broken. Broken how, you might
ask? Well, I ran on a standalone machine, no other web users but myself,
to test, and each time I hit the database with IMP, it opened a new
backend, but it never reused old, idle ones ... eventually, you run out
of the ability to connect since you've locked up all connections ...
Maybe you have the apache MaxSpareServers set too high? I run
PHP4/apache (debian-woody) and it works ok. If this is an IMP-only
site maybe you are better off setting
apache.MaxClients = postgres.backendcount
Just a thought.
--
marko
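A concrete sketch of that pairing, assuming Apache 1.3 and PostgreSQL 7.0; the
numbers and data directory are illustrative only:

  # httpd.conf: cap the number of Apache children, and with them the
  # number of persistent connections one web server can hold open
  MaxClients       32
  MinSpareServers   5
  MaxSpareServers  10

  # start the postmaster allowing at least as many backends
  # (-N sets max backends; -B, shared buffers, must be at least 2 x -N)
  postmaster -N 32 -B 64 -D /usr/local/pgsql/data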
Install PgSQL, install PHP4, install IMP ... do it on a standalone machine
that you would never get an external connection on ... open connection to
IMP several times and watch number of postgres backends start and never
die ...
On Thu, 16 Nov 2000, Frank Joerdens wrote:
That sounds pretty evil. Have you also tested PHP3 by any chance? Has this been posted to
php-general? If this is a general issue, it would have to be considered a _very_ nasty bug
indeed. Can you describe how you made the test, i.e. how you saw that it wouldn't reuse
idle ones and kept opening new ones? How would you monitor this?
- Frank
Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
Greetings, The Hermit Hacker!
Well, there's a problem with PHP's [mis]documentation. First of all,
it counts open DB connections not on a per-webserver basis, but on a
per-process/thread basis.
The default PHP config file has the limits of persistent and
non-persistent connections set to -1 (no limit)... Setting it to
some (supposedly) reasonable value (like, 50) accomplishes nothing: you
should multiply 50 by the number of webserver processes/threads. There
can be lots of them... :[
And then there comes PHP's "logic": if I can just open a new
connection, why bother reusing the old one? And thus Postgres backends
start multiplying like rabbits, eventually reaching the limit... :[
You should set a reasonable limit on the number of open persistent
connections (like 1, maybe 2 or 3); only then will PHP actually reuse
them. My webserver now works with such a setup and there are no more
problems with pg_pconnect().
Hell, I never thought I'd give advice to one of the PgGurus. ;]
At 16.11.2000, 11:21, you wrote:
THH> I run PHP4 and IMP (http://www.horde.org) and we've gotten them to remove
THH> the use of pg_pconnect() since it is broken. Broken how, you might
THH> ask? Well, I ran on a standalone machine, no other web users but myself,
THH> to test, and each time I hit the database with IMP, it opened a new
THH> backend, but it never reused old, idle ones ... eventually, you run out of
THH> the ability to connect since you've locked up all connections ...
--
Yours, Alexey V. Borzov, Webmaster of RDW
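For reference, the limits Alexey describes live in php.ini and, as he notes,
apply per httpd process or thread, not per web server; a minimal sketch with
illustrative values:

  ; php.ini
  pgsql.allow_persistent = On
  pgsql.max_persistent   = 2     ; -1 (the default) means no limit
  pgsql.max_links        = 5     ; persistent + regular combined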
On Jue 16 Nov 2000 22:54, Marko Kreen wrote:
On Thu, Nov 16, 2000 at 01:33:08PM -0400, The Hermit Hacker wrote:
I run PHP4 and IMP (http://www.horde.org) and we've gotten them to remove
the use of pg_pconnect() since it is broken. Broken how, you might
ask? Well, I ran on a standalone machine, no other web users but myself,
to test, and each time I hit the database with IMP, it opened a new
backend, but it never reused old, idle ones ... eventually, you run out
of the ability to connect since you've locked up all connections ...
Maybe you have the apache MaxSpareServers set too high? I run
PHP4/apache (debian-woody) and it works ok. If this is an IMP-only
site maybe you are better off setting
apache.MaxClients = postgres.backendcount
What I can say on my behalf is that (as the only user of PHP and Postgres on
the computer) when I log in to IMP webmail, quite a lot of postmasters appear,
and don't disappear after logging off.
But I have no problem. I can be all day fooling around with IMP and the
postmaster backends stay at around 20 live connections.
Any comments?
Saludos.. :-)
--
"And I'm happy, because you make me feel good, about me." - Melvin Udall
-----------------------------------------------------------------
Martín Marqués email: martin@math.unl.edu.ar
Santa Fe - Argentina http://math.unl.edu.ar/~martin/
Administrador de sistemas en math.unl.edu.ar
-----------------------------------------------------------------
Martin A. Marques wrote:
What I can say on my behalf is that (as the only user of PHP and Postgres on
the computer) when I log in to IMP webmail, quite a lot of postmasters appear,
and don't disappear after logging off.
But I have no problem. I can be all day fooling around with IMP and the
postmaster backends stay at around 20 live connections.
Any comments?
What is the value for MinSpareServers in your apache conf? If that's
set to 20, then even if you only hit your site 20 times (one for each),
that many apache processes will continue to live and therefore keep a db
connection open?
What would be interesting to know is, after you're done, how many postmaster
processes are running and how many httpd processes are running...
-philip
On Fri, 17 Nov 2000, Philip Hallstrom wrote:
What is the value for MinSpareServers in your apache conf? If that's
set to 20, then even if you only hit your site 20 times (one for each),
that many apache processes will continue to live and therefore keep a db
connection open?
What would be interesting to know is, after you're done, how many postmaster
processes are running and how many httpd processes are running...
postmaster - 19
httpd - 11
Saludos... ;-)
"And I'm happy, because you make me feel good, about me." - Melvin Udall
-----------------------------------------------------------------
Martín Marqués email: martin@math.unl.edu.ar
Santa Fe - Argentina http://math.unl.edu.ar/~martin/
Administrador de sistemas en math.unl.edu.ar
-----------------------------------------------------------------
On Friday 17 November 2000 18:15, Philip Hallstrom wrote:
What is the value for MinSpareServers in your apache conf? If that's
set to 20, then even if you only hit your site 20 times (one for each),
that many apache processes will continue to live and therefore keep a db
connection open?
I have 10 as MaxSpareServers and 5 as MinSpareServers.
What would be interesting to know is, after you're done, how many postmaster
processes are running and how many httpd processes are running...
My question would be, when do those connections die?
Saludos... :-)
--
"And I'm happy, because you make me feel good, about me." - Melvin Udall
-----------------------------------------------------------------
Martín Marqués email: martin@math.unl.edu.ar
Santa Fe - Argentina http://math.unl.edu.ar/~martin/
Administrador de sistemas en math.unl.edu.ar
-----------------------------------------------------------------
Speaking of MySQL, has anyone looked at www.mysql.org recently?
They have a big news article:
MySQL wins Linux Journal Readers Choice Award again!
For the third year in a row MySQL won the Readers Choice Award in Linux
Journal. Considering that MySQL earlier this fall won the Linux Magazine
Editors Choice Award, reading magazines on the whole has been a very
rewarding experience for MySQL fans lately.
If you follow their link to www.linuxjournal.com, all I can find is an
article about how _PostgreSQL_ won the Linux Magazine Editors Choice award!
What's with that???
Chris
At 11:22 AM 11/13/00 -0500, Robert D. Nelson wrote:
Still...Regardless of what database they're running, either their
abstraction layer is shit or their queries really need optimized. Is that
perhaps why, even at 5 clients, the page views he shows never went
significantly above 10/sec?
They don't appear to do any client-side query caching, which is understandable
from one point of view (you need some sort of handle on which queries are
hit frequently enough to warrant devoting RAM to caching the result, or else
you risk caching things that don't gain you much and cut down on the amount
of the DB cached in RAM which hits you on other queries). On the other hand,
you'd think they'd do some analysis...
Still, the table-locking of MySQL just gets in the way. If you can cache
everything, then you can send a postal worker to the mailbox to retrieve
uncached data without significantly impacting throughput (in other words,
the MySQL argument that simple select benchmarks are all you need is
not relevant). If you can't cache anything but have few users, then perhaps
low levels of concurrency don't hurt. If you don't cache anything but have
lots of users, scaling well under high levels of load rules.
My thinking is that intelligent caching coupled with a highly-scalable
database wins. That's the world I'm used to...
- Don Baccus, Portland OR <dhogaza@pacifier.com>
Nature photos, on-line guides, Pacific Northwest
Rare Bird Alert Service and other goodies at
http://donb.photo.net.
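A toy sketch of the server-side caching Don describes, in the thread's PHP
idiom: keep the rows from an expensive query for a short time instead of
re-running the dozen-table join on every view. The helper name, cache path,
and TTL are invented for illustration:

  <?php
  function cached_query($conn, $sql, $ttl = 60) {
      $file = "/tmp/qcache_" . md5($sql);
      // Serve from the file cache while it is fresh enough.
      if (file_exists($file) && (time() - filemtime($file)) < $ttl) {
          return unserialize(implode("", file($file)));
      }
      // Otherwise run the query and refresh the cache.
      $result = pg_exec($conn, $sql);
      $rows = array();
      $n = pg_numrows($result);
      for ($i = 0; $i < $n; $i++) {
          $rows[] = pg_fetch_row($result, $i);
      }
      $fp = fopen($file, "w");
      fwrite($fp, serialize($rows));
      fclose($fp);
      return $rows;
  }
  ?>

The trade-off Don raises still applies: every byte spent on cached results is
a byte not available for caching the database itself in RAM.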
At 06:16 PM 11/13/00 +0100, fabrizio.ermini@sysdat.it wrote:
Still...Regardless of what database they're running, either their
abstraction layer is shit or their queries really need optimized. Is that
perhaps why, even at 5 clients, the page views he shows never went
significantly above 10/sec?
I think this could be because they used real killer pages in the test,
and maybe that is also the reason PgSQL fared this well (I've always
been and I'm still a postgres fan, but looking at those results I've
been quite astonished!!). Have you looked at the spec? If I remember
well, Tim was talking about concurrently executing a page that
joined a dozen tables and another that was doing
update/select/insert on the same tables. Under these conditions, 10
pages/sec seems lightning-fast to me!!!!
But much of this could still be cached. I visit my homepage at sourceforge
rarely, because my project uses sourceforge only for its cvs repository.
So all those joins are mostly a waste. I never have new postings in my
project forums, blah blah. Some level of caching could help (not for me
personally, I visit too rarely for a system to want to cache my query
returns involved in building my home page, but I'm sure there are many
cases where caching would help).
Again, you have to balance query cache RAM consumption against the benefits
of extra RAM availability to the RDBMS (assuming you have one, which
MySQL isn't :)
- Don Baccus, Portland OR <dhogaza@pacifier.com>
Nature photos, on-line guides, Pacific Northwest
Rare Bird Alert Service and other goodies at
http://donb.photo.net.
At 09:43 AM 11/13/00 -0600, davidb@vectormath.com wrote:
I made it all the way through the article. I'll summarize it for you:
Postgres - hooray!
MySQL - boo!
Since this is an open source database article linked off of slashdot, I
imagine they're getting pounded.
Why is all this e-mail showing up so late?
(I'm curious because there have been complaints about the mail server here,
and the article is old hat).
- Don Baccus, Portland OR <dhogaza@pacifier.com>
Nature photos, on-line guides, Pacific Northwest
Rare Bird Alert Service and other goodies at
http://donb.photo.net.
On Mon, 20 Nov 2000, Don Baccus wrote:
At 09:43 AM 11/13/00 -0600, davidb@vectormath.com wrote:
I made it all the way through the article. I'll summarize it for you:
Postgres - hooray!
MySQL - boo!
Since this is an open source database article linked off of slashdot, I
imagine they're getting pounded.Why is all this e-mail showing up so late?
(I'm curious because there have been complaints about the mail server here,
and the article is old hat).
I was away for the past week at Comdex and didn't keep up very well on
approving postings from ppl that don't subscribe to the list before
posting ... for anti-spam purposes, you have to be subscribed before
posting, or else it has to be approved :(
* Christopher Kings-Lynne in "RE: [HACKERS] RE: [GENERAL] PHPBuilder
* article -- Postgres vs MySQL" dated 2000/11/21 11:05 wrote:
MySQL wins Linux Journal Readers Choice Award again! For the third
year in a row MySQL won the Readers Choice Award in Linux Journal.
Considering that MySQL earlier this fall won the Linux Magazine
Editors Choice Award, reading magazines on the whole has been a very
rewarding experience for MySQL fans lately.
If you follow their link to www.linuxjournal.com, all I can find is
an article about how _PostgreSQL_ won the Linux Magazine Editors
Choice award! What's with that???
Well, it would seem that Postgres won the _Editors'_ [1] Choice Awards,
while MySQL won the _Readers'_ [2] Choice Awards.
--
hackers ally
1. http://www2.linuxjournal.com/advertising/Press/2000ec_winners.html
2. http://www2.linuxjournal.com/articles/misc/0029.html
I've wondered and am still wondering what a lot of these benchmark tests
are out to prove. I'm not sure that any PostgreSQL advocate has ever said or
implied that PostgreSQL is faster than anything else, much less MySQL. While I'm
sure it's faster than some, I've just never heard the argument for using
PostgreSQL as "It's faster than anything else". I chose PostgreSQL for what
it could do, not how fast it can SELECT... No benchmark between MySQL and
PostgreSQL (or any other RDBMS) is ever going to be truly accurate, since
there are so many things MySQL simply can't do that PostgreSQL (and others)
can..
As Don often and accurately points out, MySQL is not an RDBMS;
I'm not sure what it really is beyond a semi-fancy SQL interface to a file
system.. Is it fast? Yes, it is pretty fast. Fast at the expense of
functionality and stability -- two things that just aren't optional when
you're talking about a good database for anything more complicated than
click-through tracking...
I don't dislike MySQL for any other reason except that it can't do what
I need it to do, period... I'm sure it's good for some things and some
people, I've tried MySQL, tested MySQL and then tossed MySQL into the
garbage can...
I found some very educational conversation here :
http://openacs.org/philosophy/why-not-mysql.html courtesy of Don and others.
-Mitch
----- Original Message -----
From: "Don Baccus" <dhogaza@pacifier.com>
To: "Robert D. Nelson" <RDNELSON@co.centre.pa.us>; <davidb@vectormath.com>;
"Michael Fork" <mfork@toledolink.com>; "Poul L.Christiansen"
<poulc@cs.auc.dk>
Cc: "pgsql-general" <pgsql-general@postgresql.org>; "pgsql-hackers"
<pgsql-hackers@postgresql.org>
Sent: Monday, November 20, 2000 8:48 PM
Subject: Re: [HACKERS] RE: [GENERAL] PHPBuilder article -- Postgres vs MySQL
I've wondered and am still wondering what a lot of these benchmark tests
are out to prove.
In this case, the "benchmark test" was not out to prove anything. It was
a good-faith result of a porting effort with a surprising (to the
tester) result.
I'm not sure that any PostgreSQL advocate has ever said or
implied that PostgreSQL is faster than anything else, much less MySQL. While I'm
sure it's faster than some, I've just never heard the argument for using
PostgreSQL as "It's faster than anything else".
Very true. But it turns out that in at least some real-world tests, in
this case a real application *built for MySQL*, PostgreSQL was
substantially faster when the number of users climbed above a very small
number. These results are consistent with and supported by Great Bridge's
initial published benchmark results.
Two separate styles of comparison with consistent results might help
someone choose the right solution for their application. No harm in
that, eh?
I chose PostgreSQL for what
it could do, not how fast it can SELECT... No benchmark between MySQL and
PostgreSQL (or any other RDBMS) is ever going to be truly accurate, since
there are so many things MySQL simply can't do that PostgreSQL (and others)
can..
Well, that is another dimension to the evaluation/comparison. But the
testing results stand on their own: you *can* choose PostgreSQL for its
performance, and you *will* have made the right choice. This is
especially gratifying for all of us who have contributed to PostgreSQL,
because we *didn't* benchmark it, and *assumed* that MySQL's claims of
superior speed under all circumstances were accurate. It turns out that may
be true for single-user mode, but that we've built a darn fast RDBMS for
real-world applications.
One *very unfair* part of these benchmarks and comparisons is that both
MySQL and PostgreSQL can be identified by name for the comparisons, so
they tend to be talked about the most. But the Great Bridge benchmarks could lead
one to conclude that if SourceForge had been built with another database
product it would also have seen a performance improvement when tested
with PostgreSQL.
- Thomas
My understanding is that with your configs below you will always have 10
httpd processes running as long as your machine is up... i.e., they will
never die.
On Mon, 20 Nov 2000, Martin A. Marques wrote:
On Friday 17 November 2000 18:15, Philip Hallstrom wrote:
What is the value for MinSpareServers in your apache conf? If that's
set to 20, then even if you only hit your site 20 times (one for each),
that many apache processes will continue to live and therefore keep a db
connection open?
I have 10 as MaxSpareServers and 5 as MinSpareServers.
What would be interesting to know is, after you're done, how many postmaster
processes are running and how many httpd processes are running...
My question would be, when do those connections die?
Saludos... :-)
--
"And I'm happy, because you make me feel good, about me." - Melvin Udall
-----------------------------------------------------------------
Martín Marqués email: martin@math.unl.edu.ar
Santa Fe - Argentina http://math.unl.edu.ar/~martin/
Administrador de sistemas en math.unl.edu.ar
-----------------------------------------------------------------
AOLServer isn't the only system that can pool database connections, so
can servlets/JSP, ColdFusion, ASP, etc. No doubt AOLServer would be more
widely accepted if it used something other than TCL.
There's also support available for Python, though there's still a lot of
work to be done to support the full AOLserver API (same's true of ns_java,
actually).
If you use ADP pages, your use of Tcl is typically restricted to snippets of
code anyway, so I've never really understood the complaints about Tcl...
What about the php module? Does it take advantage of the API?
It seems to me your choice of web/application server is kind of dependent
on the language you like. If you like perl/php, use apache; if you like tcl,
use aolserver; if you like java, use tomcat, enhydra, or orion (or whatever);
if you like python, use zope. I guess for the few people who like VB there is
IIS/ASP.
:wq
Tim Uckun
Due Diligence Inc. http://www.diligence.com/ Americas Background
Investigation Expert.
If your company isn't doing background checks, maybe you haven't considered
the risks of a bad hire.
At 09:44 AM 11/21/00 -0700, Tim Uckun wrote:
What about the php module? Does it take advantage of API?
I don't know. If not, though, there wouldn't be much point in using
AOLserver, since the simple and efficient database API is the main
attraction. So I think there's a pretty good chance it does.
- Don Baccus, Portland OR <dhogaza@pacifier.com>
Nature photos, on-line guides, Pacific Northwest
Rare Bird Alert Service and other goodies at
http://donb.photo.net.
I'm building a new geo type and would like to index it. I have heard about
RTREE and bounding boxes but I'm clueless for the moment about the
implementation....
I have tried to look into the PG source code to find where the indexing
of the current line objects is done, but couldn't pinpoint where
the code is or how it works.
I would greatly appreciate if someone could guide me through the methodology to
build an index for a custom type or point me to some readings where the
algorithm is explained (web, book, etc...). I plan to use 3D geographical
objects...
Take this request as a newbie request, because I have never done database
indexing, not because I haven't done programming.
Please reply directly to my e-mail address
Thanks a lot.
franck@sopac.org
Franck Martin <franck@sopac.org> writes:
I would greatly appreciate if someone could guide me through the
methodology to build an index for a custom type or point me to some
readings where the algorithm is explained (web, book, etc...).
The Programmer's Guide chapter "Interfacing Extensions To Indices"
outlines the procedure for making a new datatype indexable. It
only discusses the case of adding btree support for a new type,
though. For other index classes such as R-tree there are different
sets of required operators, which are not as well documented but
you can find out by looking at code for the already-supported
datatypes.
I plan to use 3D geographical objects...
That's a bit hard since we don't have any indexes suitable for 3-D
coordinates --- the existing R-tree type is for 2-D objects. What's
more it assumes that coordinates are Euclidean, which is probably
not the model you want for geographical coordinates.
In theory you could build a new index type suitable for indexing
3-D points, using the R-tree code as a starting point. I wouldn't
class it as a project suitable for a newbie however :-(.
Depending on what your needs are, you might be able to get by with
projecting your objects into a flat 2-D coordinate system and using
an R-tree index in that space. It'd just be approximate but that
might be close enough for index purposes.
regards, tom lane
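A rough sketch of Tom's projection idea in the SQL of the day, wrapped in PHP
like the rest of this thread; the table, column, and coordinates are invented
for illustration:

  <?php
  // Store a 2-D bounding box derived from each 3-D object and let the
  // built-in R-tree prune candidates; exact 3-D tests happen afterwards
  // in the application.
  $conn = pg_connect("dbname=gis");
  pg_exec($conn, "CREATE TABLE features (id int4, name text, bbox box)");
  pg_exec($conn,
      "CREATE INDEX features_bbox_idx ON features USING rtree (bbox)");

  // && is the box-overlap operator, which the R-tree index can service.
  $result = pg_exec($conn,
      "SELECT id, name FROM features WHERE bbox && box '((178,-19),(180,-17))'");
  ?>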
Hi,
Remember also that the GiST library has been integrated into PG (my brother
is doing some thesis work on that at the moment), and you can create new
index types relatively quickly (assuming that you understand the indexing
theory ;-) using this mechanism. Run a web search on GiST for more info.
Currently GiST has support for btree and rtree indexes, and possibly r+ or *
trees, I can't remember which, if any, and IIRC, at least a couple more.
However, if you have a requirement for 3d indexing, and you have the
knowledge available, you should be able to hack a few 3d indexes together
quite quickly.
Cheers...
-----Original Message-----
From: Tom Lane
To: Franck Martin
Cc: pgsql-general; pgsql-hackers
Sent: 11-26-00 4:35 AM
Subject: Re: [HACKERS] Indexing for geographic objects?
**********************************************************************
This email and any files transmitted with it are confidential and
intended solely for the use of the individual or entity to whom they
are addressed. If you have received this email in error please notify
the system manager.
This footnote also confirms that this email message has been swept by
MIMEsweeper for the presence of computer viruses.
www.mimesweeper.com
**********************************************************************
Import Notes
Resolved by subject fallback
I'm also interested in GiST and would be happy if somebody could provide a
workable example. I have an idea to use GiST indices for our full-text
search system.
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
Oleg Bartunov <oleg@sai.msu.su> wrote:
I'm also interested in GiST and would be happy if somebody could provide a
workable example. I have an idea to use GiST indices for our full-text
search system.
I have recently replied to Franck Martin in regards to this indexing
question, but I didn't think the subject was popular enough for me to
contaminate the list(s). You prove me wrong. Here goes:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To: Franck Martin <franck@sopac.org>
From: selkovjr@mcs.anl.gov
Subject: Re: [HACKERS] Indexing for geographic objects?
Date: Sat, 25 Nov 2000 02:56:03 -0600
It is probably possible to hook up an extension directly with the
R-tree methods available in postgres -- if you stare at the code long
enough and figure out how to use the correct strategies. I chose an easier
path years ago and I am still satisfied with the results. Check out
the GiST -- a general access method built on top of R-tree to provide
a user-friendly interface to it and to allow indexing of more abstract
types, for which straight R-tree is not directly applicable.
I have a small set of complete data types, of which a couple
illustrate the use of GiST indexing with the geometrical objects, in:
http://wit.mcs.anl.gov/~selkovjr/pg_extensions/
If you are using a pre-7.0 postgres, grab the file contrib.tgz,
otherwise take contrib-7.0.tgz. The difference is insignificant, but
the pre-7.0 version will not fit the current schema. Unpack the source
into postgresql-*/contrib and follow the instructions in the README
files. The types of interest for you will be seg and cube. You will
find pointers to the original sources and docs in the CREDITS section
of the README file. I also have a version of the original example code
in pggist-patched.tgz, but I did not check if it works with current
postgres. It should not be difficult to fix it if it doesn't -- the
recent development in the optimizer area made certain things
unnecessary.
You might want to check out a working example of the segment data type at:
http://wit.mcs.anl.gov/EMP/indexing.html
(search the page for 'KM')
I will be glad to help, but I would also recommend sending more
sophisticated questions to Joe Hellerstein, the leader of the original
postgres team that developed GiST. He was very helpful whenever I
turned to him during the early stages of my data type project.
--Gene
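(To give a flavor of the result, a hedged sketch of the seg type in use once
the contrib module is built and installed; the table, the index and the '@'
containment operator here follow Gene's scripts and the test session quoted
later in this thread:)
CREATE TABLE test (s seg);
CREATE INDEX test_seg_ix ON test USING gist (s);
-- '@' and '~' are the seg containment operators from the module's script
SELECT * FROM test WHERE s @ '1.05 .. 3.95';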
It seems that your code is exactly what I want.
I have already created geographical objects which contain an MBR (Minimum
Bounding Rectangle) in their structure, so it is a question of rewriting
your code to access the MBR structure inside my geoobject instead of the
cube structure. (cf http://fmaps.sourceforge.net/ -- look in the CVS for the
latest.) I have been slack lately on the project, but I'm not forgetting it.
Quickly, I ran through the code, and I think your cube is strictly speaking
a box, which is also an MBR.
However, I didn't see the case of intersection, which is the main question
when you want to display objects that are visible inside a box.
I suppose your code is under GPL, and that you have no problem with me using
it, provided I put your name and credits somewhere.
Cheers.
Franck Martin
Database Development Officer
SOPAC South Pacific Applied Geoscience Commission
Fiji
E-mail: franck@sopac.org
Web site: http://www.sopac.org/
Michael Ansley <Michael.Ansley@intec-telecom-systems.com> writes:
Remember also that the GiST library has been integrated into PG (my brother
is doing some thesis work on that at the moment),
Yeah? Does it still work?
Since the GIST code is not tested by any standard regress test, and is
so poorly documented that hardly anyone can be using it, I've always
assumed that it is probably suffering from a severe case of bit-rot.
I'd love to see someone contribute documentation and regression test
cases for it --- it's a great feature, if it works.
regards, tom lane
Franck Martin wrote:
I suppose your code is under GPL, and that you have no problem with me using
it, provided I put your name and credits somewhere.
It would be much better if it were under the standard PostgreSQL license
and if it were included in the standard distribution.
As Tom said, working Gist would be a great feature.
Now if only someone would write the regression tests ;)
BTW, the regression tests for pl/pgsql seem to be somewhat sparse as well,
missing at least some types of loops, possibly more.
To be honest, Tom, I've always seen GiST not just as a great feature, but as
an essential feature. Using Stonebraker's definition of an
object-relational database (which I tend to do, as it's the only one that
I've read about in depth), we really need to be able to properly index
complex data, and using GiST, we can. Besides, it's just plain useful ;-)
MikeA
Tom Lane wrote:
Michael Ansley <Michael.Ansley@intec-telecom-systems.com> writes:
Remember also that the GiST library has been integrated into PG (my brother
is doing some thesis work on that at the moment),
Yeah? Does it still work?
You bet. One would otherwise be hearing from me. I depend on it quite
heavily and I am checking with almost every release. I am now current
with 7.0.2 -- this time it required some change, although not in the C
code. And that's pretty amazing: I was only screwed once since
postgres95 -- by a beta version I don't remember now; then I
complained and the problem was fixed. I don't even know whom I owe
thanks for that.
Since the GIST code is not tested by any standard regress test, and is
so poorly documented that hardly anyone can be using it,
I've always
assumed that it is probably suffering from a severe case of bit-rot.
I'd love to see someone contribute documentation and regression test
cases for it --- it's a great feature, if it works.
The bit rot fortunately did not happen, but the documentation I
promised Bruce many months ago is still "in the works" -- meaning,
something interfered and I haven't had a chance to start. Like a
friend of mine muses all the time, "Promise doesn't mean
marriage". Boy, do I feel guilty.
It's a bit better with the testing. I am not sure how to test the
GiST directly, but I have adapted the current version of regression
tests for the data types that depend on it. One can find them in my
contrib directory, under test/ (again, it's
http://wit.mcs.anl.gov/~selkovjr/pg_extensions/contrib.tgz)
It would be nice if at least one of the GiST types became a built-in
(that would provide for a more intensive testing), but I can also
think of the contrib code being (optionally) included into the main
build and regression test trees. The top-level makefile can have a
couple of special targets to build and test the contribs. I believe my
version of the tests can be a useful example to other contributors
whose code is already in the source tree.
--Gene
selkovjr@mcs.anl.gov writes:
Tom Lane wrote:
Michael Ansley <Michael.Ansley@intec-telecom-systems.com> writes:
Remember also that the GiST library has been integrated into PG (my brother
is doing some thesis work on that at the moment),
Yeah? Does it still work?
You bet. One would otherwise be hearing from me. I depend on it quite
heavily and I am checking with almost every release.
That's very good to hear! I was not aware that anyone was banging on it.
It seems like it would be a fine idea to adopt your stuff at least into
the contrib part of the distribution, maybe even (or eventually) into
the main release. I think we could probably make it part of the regress
tests even if it's contrib --- there's precedent, as regress already
uses some contrib stuff.
Do you have any problem with releasing your stuff under the Postgres
distribution terms (BSD license)?
regards, tom lane
Franck Martin wrote:
I have already created geographical objects which contain an MBR (Minimum
Bounding Rectangle) in their structure, so it is a question of rewriting
your code to access the MBR structure inside my geoobject instead of the
cube structure. (cf http://fmaps.sourceforge.net/ -- look in the CVS for the
latest.) I have been slack lately on the project, but I'm not forgetting it.
I see where you are aiming. I definitely want to be around when it
starts working.
Quickly, I ran through the code, and I think your cube is strictly speaking
a box, which is also an MBR.
Yes, cube is definitely a misnomer -- it suggests things are
equihedral, which they aren't. I am still looking for a short name or
an acronym that would indicate it is a box with an arbitrary number of
dimensions. With your application, you will surely benefit from a
smaller and faster code geared specifically for 3D.
However, I didn't see the case of intersection, which is the main question
when you want to display objects that are visible inside a box.
The procedure is there, it is called cube_inter, but there is no
operator for it.
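(Wiring the existing procedure up as an operator is a one-liner; a sketch,
with the operator name chosen arbitrarily. Note that cube_inter returns the
intersection box itself, so a boolean overlap procedure, declared the same
way, is what would actually drive an index search for visible objects:)
CREATE OPERATOR # (
    LEFTARG = cube, RIGHTARG = cube,
    PROCEDURE = cube_inter
);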
I suppose your code is under GPL, and that you have no problem with me using
it, provided I put your name and credits somewhere.
No problem at all -- I will be honored if you use it. Was I careless
enough not to include a license? It's not exactly a GPL -- it's
completely unrestricted. I should have said that somewhere.
Good luck,
--Gene
Tom Lane wrote:
Do you have any problem with releasing your stuff under the Postgres
distribution terms (BSD license)?
No, I don't see any problem with the BSD license, or any other
license, for that matter. I just had some reservations about releasing
stuff that was far from perfect, and it took me a while to realize
it could be useful as it is for some, and serve as a good starting
point for others. Now I wonder, what does it take to be in contrib?
there's precedent, as regress already
uses some contrib stuff.
I'd be curious to find out what that stuff is and how it's done.
--Gene
On Mon, Nov 27, 2000 at 06:03:33PM -0600, selkovjr@mcs.anl.gov wrote:
Franck Martin wrote:
I suppose your code is under GPL, and that you have no problem with me
using it, provided I put your name and credits somewhere.
No problem at all -- I will be honored if you use it. Was I careless
enough not to include a license? It's not exactly a GPL -- it's
completely unrestricted. I should have said that somewhere.
Note that (AIUI) placing code in the public domain leaves you liable
for damages from somebody misusing it. You have to retain copyright
just to be able to disclaim liability, in the license -- but then you
need to actually have a license. That's why you don't see much public
domain software. (I am not a lawyer.)
Nathan Myers
ncm@zembu.com
selkovjr@mcs.anl.gov writes:
Tom Lane wrote:
Do you have any problem with releasing your stuff under the Postgres
distribution terms (BSD license)?
No, I don't see any problem with the BSD license, or any other
license, for that matter. I just had some reservations about releasing
stuff that was far from perfect, and it took me a while to realize
it could be useful as it is for some, and serve as a good starting
point for others. Now I wonder, what does it take to be in contrib?
Just contributing it ;-), which I take the above as permission to do.
When I come up for air from the IPC-hacking I'm doing, I'll grab your
tarball and see about adding it as a contrib module.
Many thanks!
regards, tom lane
On Thu, Nov 23, 2000 at 07:58:29AM -0800, some SMTP stream spewed forth:
At 09:44 AM 11/21/00 -0700, Tim Uckun wrote:
What about the php module? Does it take advantage of API?
I don't know. If not, though, there wouldn't be much point in using
AOLserver, since the simple and efficient database API is the main
attraction. So I think there's a pretty good chance it does.
Through the course of another thread on the lists we have concluded that
PHP does not support the AOLServer (or any other similar) database API.
The "blockage" is that PHP includes its own database functions, albeit
they are based on the Postgres, MySQL, etc. APIs individually.
I am considering looking into urging an integration of PHP and
AOLServer's connection pooling (for lack of a better word) stuff.
*shrug*
gh
At 07:50 PM 11/30/00 -0600, GH wrote:
I am considering looking into urging an integration of PHP and
AOLServer's connection pooling (for lack of a better word) stuff.
Well, meanwhile I've gotten confirmation from folks in the PHP world
(via an openacs forum) that it still isn't threadsafe, though there's
an effort underway to track down the problems. I don't know how close
to solving this they are.
- Don Baccus, Portland OR <dhogaza@pacifier.com>
Nature photos, on-line guides, Pacific Northwest
Rare Bird Alert Service and other goodies at
http://donb.photo.net.
Don Baccus wrote:
[ . . . ]
Well, meanwhile I've gotten confirmation from folks in the PHP world
(via an openacs forum)
Which forum is that? I'm asking because I am currently trying to migrate
from PHP to Openacs, or integrate the two somehow (lurking on this list
has convinced me that openacs deserves a hard look indeed), and thus I'd
like to find a forum where relevant stuff is being discussed.
Ta,
Frank
"Frank" == Frank Joerdens <frank@joerdens.de> writes:
Frank> Which forum is that? I'm asking because I am currently
Frank> trying to migrate from PHP to Openacs, or integrate the two
Frank> somehow (lurking on this list has convinced me that openacs
Frank> deserves a hard look indeed), and thus I'd like to find a
Frank> forum where relevant stuff is being discussed.
Check out the bboard forums at http://openacs.org/
-Dan
At 12:39 PM 12/1/00 +0100, Frank Joerdens wrote:
Don Baccus wrote:
[ . . . ]
Well, meanwhile I've gotten confirmation from folks in the PHP world
(via an openacs forum)
Which forum is that? I'm asking because I am currently trying to migrate
from PHP to Openacs, or integrate the two somehow (lurking on this list
has convinced me that openacs deserves a hard look indeed), and thus I'd
like to find a forum where relevant stuff is being discussed.
Here's the URL to the thread discussing PHP and AOLserver (and other
stuff, as naturally happens to threads!):
http://openacs.org/bboard/q-and-a-fetch-msg.tcl?msg_id=0000lY&topic_id=11&topic=OpenACS
- Don Baccus, Portland OR <dhogaza@pacifier.com>
Nature photos, on-line guides, Pacific Northwest
Rare Bird Alert Service and other goodies at
http://donb.photo.net.
It seems that R-trees become inefficient when the number of dimensions
increases. Has anyone thought of a transparent way to use Peano codes
(hhcode in Oracle lingo) and use B-tree indexes instead?
Also, I've read that R-trees sometimes suffer a lot when an update overflows
a node in the index.
The only initial problem I see with Peano codes is that the index is made on
real cubes (all dimensions are equal, due to the recursive decomposition of
space). To overcome that, people have talked about using
multiple-entry indexes. That is, an object is decomposed into a number of
cubes (not too many), which are then indexed.
In this case, there should be a way to make intersection searches
transparent. Oracle does it using tables and merge-joins. I have thought of
using GiST to do that, but it seemed too advanced for me yet.
So I thought of using the Oracle technique (make tables and use joins).
Problem: I would need a C function to make the cubes describing a spatial
object, but currently C functions cannot return more than one value (I have
thought of returning an array, but have not tried it). And making inserts
directly from a C function has been described as magic stuff in the
documentation.
Yours sincerely,
Edmar Wiggers
BRASMAP Information Systems
+55 48 9960 2752
Edmar Wiggers wrote:
It seems that R-trees become inefficient when the number of dimensions
increases. Has anyone thought of a transparent way to use Peano codes
(hhcode in Oracle lingo) and use B-tree indexes instead?
Do you have a reference, or more information on what a Peano code is?
Bernie
Do you have a reference, or more information on what a Peano code is?
Check this out http://www.statkart.no/nlhdb/iveher/hhtext.htm
In short, this technique relies on a space-filling curve: a one-dimensional
curve that, on a given plane, covers every single point, and covers it only
once. Since the curve is 1-dimensional, one can use B-tree indexes on it.
There are a number of curves of this type, e.g. Hilbert's and Peano's. The
Peano curve yields easier calculations, hence it is the one Oracle used to
make their Spatial Data Option.
Moreover, the Peano curve describes a point in a helical kind of way,
recursively dividing space. That's why the Norwegian Hydrographic Service
decided to call it "Helical Hyperspatial Codes" (hhcodes). It was from their
research that Oracle Spatial Data Option was born, back in 1995.
I'm not sure about the exact applicability of hhcodes to index multimedia
stuff yet (images, sound), because those are VERY high-dimensional spaces.
But I've done quite some reading/research, and hhcodes have two very nice
advantages over R-trees:
- it is easy (and not costly in performance), to index things in 3D or 4D
(including time too);
- concurrency is much better, because one does not suffer from costly R-tree
updates (B-trees are much better in that). When dealing with 3D or 4D, this
becomes even more important.
By the way, are you Brazilian, Bernard? Oddly enough, maybe we live in the
very same city: Florianopolis, SC, Brazil. It's a small world, eh? :))
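(To make the bit-interleaving concrete, a toy sketch of the idea with
invented names -- this is not Oracle's actual hhcode: a Z-order/Morton key
built by interleaving the bits of two quantized, non-negative coordinates,
which a plain B-tree can then index. A range search maps to a set of key
intervals plus a refinement step, which is the decomposition into multiple
entries Edmar mentions above:)
CREATE FUNCTION morton_key(integer, integer) RETURNS integer AS '
DECLARE
    x     ALIAS FOR $1;   -- quantized, non-negative coordinate
    y     ALIAS FOR $2;
    xi    integer := x;
    yi    integer := y;
    key   integer := 0;
    place integer := 1;
    i     integer;
BEGIN
    -- interleave the low 15 bits of x and y into a 30-bit key
    FOR i IN 0..14 LOOP
        key := key + (xi % 2) * place;  xi := xi / 2;  place := place * 2;
        key := key + (yi % 2) * place;  yi := yi / 2;  place := place * 2;
    END LOOP;
    RETURN key;
END;
' LANGUAGE 'plpgsql';
CREATE TABLE points (x integer, y integer, zkey integer);
CREATE INDEX points_zkey_ix ON points (zkey);   -- an ordinary B-tree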
Hi,
We've done some work with GiST indices and found a little problem
with the optimizer. The problem can be reproduced with Gene's code
(the link is in Gene's message earlier in the thread). I could send the
test data and SQL - it's just a 52Kb gzipped file. What is the reason for
the optimizer to decide that a sequential scan is better? (See the numbers
below.) Explicitly disabling the seq scan gave much better timings.
Regards,
Oleg
test=# explain select * from test where s @ '1.05 .. 3.95';
NOTICE: QUERY PLAN:
Seq Scan on test (cost=0.00..184.01 rows=5000 width=12)
EXPLAIN
% ./bench.pl -d test -b 100
total: 3.19 sec; number: 100; for one: 0.032 sec; found 18 docs
test=# set enable_seqscan = off;
SET VARIABLE
test=# explain select * from test where s @ '1.05 .. 3.95';
NOTICE: QUERY PLAN:
Index Scan using test_seg_ix on test (cost=0.00..369.42 rows=5000 width=12)
EXPLAIN
% ./bench.pl -d test -b 100 -i
total: 1.71 sec; number: 100; for one: 0.017 sec; found 18 docs
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
Oleg Bartunov <oleg@sai.msu.su> writes:
We've done some work with GiST indices and found a little problem
with optimizer.
test=# set enable_seqscan = off;
SET VARIABLE
test=# explain select * from test where s @ '1.05 .. 3.95';
NOTICE: QUERY PLAN:
Index Scan using test_seg_ix on test (cost=0.00..369.42 rows=5000 width=12)
EXPLAIN
% ./bench.pl -d test -b 100 -i
total: 1.71 sec; number: 100; for one: 0.017 sec; found 18 docs
I'd venture that the major problem here is bogus estimated selectivities
for rtree/gist operators. Note the discrepancy between the estimated
row count and the actual (I assume the "found 18 docs" is the true
number of rows output by the query). With an estimated row count even
half that (ie, merely two orders of magnitude away from reality ;-))
the thing would've correctly chosen the index scan over sequential.
5000 looks like a suspiciously round number ... how many rows are in
the table? Have you done a vacuum analyze on it?
regards, tom lane
just a note here ... recently, we had a client with similar problems with
using index scan, where turning off seqscan did the trick ... we took his
tables, loaded them into a v7.1beta1 server and it correctly comes up with
the index scan ...
Oleg, have you tried this with v7.1 yet?
Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
On Fri, 8 Dec 2000, Tom Lane wrote:
I'd venture that the major problem here is bogus estimated selectivities
for rtree/gist operators. Note the discrepancy between the estimated
row count and the actual (I assume the "found 18 docs" is the true
number of rows output by the query).
yes, 18 docs is the true number
5000 looks like a suspiciously round number ... how many rows are in
the table? Have you done a vacuum analyze on it?
park-lane:~/app/pgsql/gist_problem$ wc SQL
10009 10049 157987 SQL
about 10,000 rows,
The relevant part of the script is:
.....skipped...
1.9039...3.5139
1.8716...3.9317
\.
CREATE INDEX test_seg_ix ON test USING gist (s);
vacuum analyze;
^^^^^^^^^^^^^^
explain select * from test where s @ '1.05 .. 3.95';
set enable_seqscan = off;
explain select * from test where s @ '1.05 .. 3.95';
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
On Fri, 8 Dec 2000, The Hermit Hacker wrote:
just a note here ... recently, we had a client with similar problems with
using index scan, where turning off seqscan did the trick ... we took his
tables, loaded them into a v7.1beta1 server and it correctly comes up with
the index scan ...
Oleg, have you tried this with v7.1 yet?
Not yet. Just a plain 7.0.3 release. Will play with 7.1beta.
But we're working in real life and need things to work in production :-)
regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
Oleg Bartunov <oleg@sai.msu.su> writes:
5000 looks like a suspiciously round number ... how many rows are in
the table? Have you done a vacuum analyze on it?
about 10,000 rows,
So the thing is estimating 0.5 selectivity, which is a fallback for
operators it knows nothing whatever about.
[ ... digs in Selkov's scripts ... ]
CREATE OPERATOR @ (
LEFTARG = seg, RIGHTARG = seg, PROCEDURE = seg_contains,
COMMUTATOR = '~'
);
CREATE OPERATOR ~ (
LEFTARG = seg, RIGHTARG = seg, PROCEDURE = seg_contained,
COMMUTATOR = '@'
);
Sure 'nuff, no selectivity info attached to these declarations.
Try adding
RESTRICT = contsel, JOIN = contjoinsel
to them. That's still an entirely bogus estimate, but at least
it's a smaller bogus estimate ... small enough to select an indexscan,
one hopes (see utils/adt/geo_selfuncs.c).
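(Spelled out, the amended declaration just merges those two estimator
clauses into Gene's declaration quoted above; its twin for '~' gets the
same treatment:)
CREATE OPERATOR @ (
    LEFTARG = seg, RIGHTARG = seg, PROCEDURE = seg_contains,
    COMMUTATOR = '~',
    RESTRICT = contsel,     -- restriction selectivity for containment
    JOIN = contjoinsel      -- join selectivity for containment
);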
I have not dug through Gene's stuff to see which other indexable
operators might be missing selectivity estimates, but I'll bet there
are others. If you have the time to look through it and submit a
patch, I can incorporate it into the version that will go into contrib.
regards, tom lane
On Fri, 8 Dec 2000, Oleg Bartunov wrote:
On Fri, 8 Dec 2000, The Hermit Hacker wrote:
Oleg, have you tried this with v7.1 yet?
Not yet. Just a plain 7.0.3 release. Will play with 7.1beta.
But we're working in real life and need things to work in production :-)
Okay, then I believe that what you are experiencing with v7.0.3 is already
fixed in v7.1beta, based on similar results I got with some queries and
then tested under v7.1 ...
Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
On Fri, 8 Dec 2000, Tom Lane wrote:
Sure 'nuff, no selectivity info attached to these declarations.
Try adding
RESTRICT = contsel, JOIN = contjoinsel
to them. That's still an entirely bogus estimate, but at least
it's a smaller bogus estimate ... small enough to select an indexscan,
one hopes (see utils/adt/geo_selfuncs.c).
Great! Now we have a better plan:
test=# explain select * from test where s @ '1.05 .. 3.95';
NOTICE: QUERY PLAN:
Index Scan using test_seg_ix on test (cost=0.00..61.56 rows=100 width=12)
EXPLAIN
I have not dug through Gene's stuff to see which other indexable
operators might be missing selectivity estimates, but I'll bet there
are others. If you have the time to look through it and submit a
patch, I can incorporate it into the version that will go into contrib.
We didn't look at Gene's stuff yet. Maybe Gene could find the time to
check his code.
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
Tom Lane wrote:
Oleg Bartunov <oleg@sai.msu.su> writes:
We've done some work with GiST indices and found a little problem
with the optimizer.
I'd venture that the major problem here is bogus estimated selectivities
for rtree/gist operators.
Yes, the problem is, I didn't have the foggiest idea how to estimate
selectivity, nor did I have any stats when I developed the type. Before
7.0, I had some success using the selectivity estimators of another
datatype (I think that was int, but I am not sure). In 7.0, most of
those estimators were gone and I have probably chosen the wrong ones,
or none at all, just so I could get it to work again. The performance
was good enough for my taste, so I had even forgotten that it was an
issue.
I know, I know: 'good enough' is never good. I apologize.
--Gene
Hi,
we are getting close to adding index support for int arrays using the
GiST interface. This will really drive up the performance of our full-text
search, which is fully based on postgresql. We have a problem with a broken
index and couldn't find the reason. I attached an archive with the sources
for the GiST functions and a test suite to show the problem - the vacuum
analyze at the end of TESTSQL should complain about a broken index.
Here is a short description:
1. untar in contrib 7.0.*
2. cd _intarray
3. edit Makefile for TESTDB (name of db for test)
4. createdb TESTDB
5. gmake
6. gmake install
7. psql TESTDB < TESTSQL
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
Well,
we found the answer ourselves. A patch for 7.0.3 is included below.
Credits to Teodor Sigaev (teodor@stack.net).
Some comments:
From src/backend/access/gist/gist.c
/*
** Take a compressed entry, and install it on a page. Since we now know
** where the entry will live, we decompress it and recompress it using
** that knowledge (some compression routines may want to fish around
** on the page, for example, or do something special for leaf nodes.)
*/
After compression, the index entry is written to disk decompressed (!),
which is the reason we had the problem with the broken index!
It looks like other people just didn't use the index decompression function
(at least in Gene's code the decompression function just does a return), and
that's why this bug was not discovered. We could make a patch for the
upcoming 7.1 if the hackers desire. I consider this patch a bugfix,
not a new feature or improvement. We got very promising results.
Another question about this code is why gistPageAddItem does
compress - decompress - compress; that's not clear from the comment.
Best regards,
Oleg
-------------------------------------------------------------------------
maze% diff -c backend/access/gist/gist.c backend/access/gist/gist.c.orig
*** backend/access/gist/gist.c Fri Dec 15 13:03:40 2000
--- backend/access/gist/gist.c.orig Fri Dec 15 13:00:50 2000
***************
*** 374,380 ****
{
GISTENTRY tmpcentry;
IndexTuple itup = (IndexTuple) item;
- OffsetNumber retval;
/*
* recompress the item given that we now know the exact page and
--- 374,379 ----
***************
*** 386,400 ****
IndexTupleSize(itup) -
sizeof(IndexTupleData), FALSE);
gistcentryinit(giststate, &tmpcentry, dentry->pred, r, page,
offsetNumber, dentry->bytes, FALSE);
! *newtup = gist_tuple_replacekey(r, tmpcentry, itup);
! retval = PageAddItem(page, (Item) *newtup,
IndexTupleSize(*newtup),
! offsetNumber, flags);
/* be tidy */
if (tmpcentry.pred != dentry->pred
&& tmpcentry.pred != (((char *) itup) +
sizeof(IndexTupleData)))
pfree(tmpcentry.pred);
! return (retval);
}
--- 385,398 ----
IndexTupleSize(itup) -
sizeof(IndexTupleData), FALSE);
gistcentryinit(giststate, &tmpcentry, dentry->pred, r, page,
offsetNumber, dentry->bytes, FALSE);
! *newtup = gist_tuple_replacekey(r, *dentry, itup);
/* be tidy */
if (tmpcentry.pred != dentry->pred
&& tmpcentry.pred != (((char *) itup) +
sizeof(IndexTupleData)))
pfree(tmpcentry.pred);
! return (PageAddItem(page, (Item) *newtup,
IndexTupleSize(*newtup),
! offsetNumber, flags));
}
-----------------------------------------------------------------------
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
On Fri, 15 Dec 2000, Thomas Lockhart wrote:
It looks like other people just didn't use the index decompression function
(at least in Gene's code the decompression function just does a return), and
that's why this bug was not discovered. We could make a patch for the
upcoming 7.1 if the hackers desire. I consider this patch a bugfix,
not a new feature or improvement. We got very promising results.
Yes, send patches! Thanks to you and Gene for getting GiST back into
view; it seems like a great feature which was neglected for too long.
- Thomas
We found one more bug with the handling of NULL values, so we continue digging :-)
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83