Postgres-7.0.2 optimization question

Started by Igor V. Rafienkoover 25 years ago9 messagesgeneral
Jump to latest
#1Igor V. Rafienko
igorr@ifi.uio.no

Hi,

I've got a slight optimization problem with postgres and I was hoping
someone could give me a clue as to what could be tweaked.

I have a couple of tables which contain little data (around 500,000 tuples
each), and most operations take insanely long time to complete. The
primary keys in both tables are ints (int8, iirc). When I perform a delete
(with a where clause on a part of a primary key), an strace shows that
postgres reads the entire table sequentially (lseek() and read()). Since
each table is around 200MB, things take time.

I tried vacuumdb --analyze. It did not help. I tried creating an index on
the part of the primary key that is used in the abovementioned delete. It
did not help either.

Has anyone encountered the same kind of problems before? In that case, has
anyone found a solution? (the problem is that the DB can very fast get 20
times larger (i.e. 10,000,000 tuples per table is a moderate size), and
I'd rather not witness a delete that takes around 90 minutes (100,000
tuples were deleted) more than once).

TIA,

ivr
--
Women wearing Wonder bras and low-cut blouses lose their right to
complain about having their boobs stared at.
"Things men wish women knew"

#2Adam Ruth
aruth@intercation.com
In reply to: Igor V. Rafienko (#1)
Re: Postgres-7.0.2 optimization question

Post the query you're using, there may be a way to rewrite it to use the
index. I've found this to be true on all kinds of DBMSs.

--
Adam Ruth
InterCation, Inc.
www.intercation.com

""Igor V. Rafienko"" <igorr@ifi.uio.no> wrote in message
news:Pine.SOL.4.21.0010131345100.23627-100000@vigrid.ifi.uio.no...

Show quoted text

Hi,

I've got a slight optimization problem with postgres and I was hoping
someone could give me a clue as to what could be tweaked.

I have a couple of tables which contain little data (around 500,000 tuples
each), and most operations take insanely long time to complete. The
primary keys in both tables are ints (int8, iirc). When I perform a delete
(with a where clause on a part of a primary key), an strace shows that
postgres reads the entire table sequentially (lseek() and read()). Since
each table is around 200MB, things take time.

I tried vacuumdb --analyze. It did not help. I tried creating an index on
the part of the primary key that is used in the abovementioned delete. It
did not help either.

Has anyone encountered the same kind of problems before? In that case, has
anyone found a solution? (the problem is that the DB can very fast get 20
times larger (i.e. 10,000,000 tuples per table is a moderate size), and
I'd rather not witness a delete that takes around 90 minutes (100,000
tuples were deleted) more than once).

TIA,

ivr
--
Women wearing Wonder bras and low-cut blouses lose their right to
complain about having their boobs stared at.
"Things men wish women knew"

#3Igor V. Rafienko
igorr@ifi.uio.no
In reply to: Adam Ruth (#2)
Re: Postgres-7.0.2 optimization question

on Oct 13, 2000, 07:55, Adam Ruth std::cout'ed:

| Post the query you're using, there may be a way to rewrite it to use the
| index. I've found this to be true on all kinds of DBMSs.

Okidoki (somewhat simplified (there are 5 other columns as well, but
they have nothing to do with delete)):

create table foo (
foo1 int8 not null,
foo2 int8 not null,
data1 varchar check( data_1 in ( 'zot', 'qux' ) ) not null,
data2 varchar null,

primary key (foo1,foo2),
unique (foo2,data1)
);

And then an index on foo2:

create index foobar on foo( foo2 );

And then a vacuumdb --analyze.

The query cannot be made simpler, imvho, but since you've asked:

delete from foo where foo2 = 42;

There is also a foreign key into this table from a primary key of another
table. Yes, there is an index on the other table as well.

ivr
--
"... car il faut convenir que Dieu n'a cr�e les femmes que pour
apprivoiser les hommes"
Voltaire

#4Steve Wolfe
steve@iboats.com
In reply to: Igor V. Rafienko (#1)
Re: Postgres-7.0.2 optimization question

On Fri, Oct 13, 2000 at 02:05:18PM +0200, Igor V. Rafienko wrote:

I tried vacuumdb --analyze. It did not help. I tried creating an index

on

If I parse right this is excactly VACUUM without ANALYZE
because '--' is SQL comment start ... no?

"vacuumdb" is invoked at the shell, not in an SQL editor, so "--analyze"
is just an argument passed to the vacuumdb program, causing it to analyze as
it vacuums. "vacuumdb --help" tells you all of the nifty arguments. : )

steve

#5Igor V. Rafienko
igorr@ifi.uio.no
In reply to: Steve Wolfe (#4)
Re: Postgres-7.0.2 optimization question

on Oct 13, 2000, 17:26, Marko Kreen std::cout'ed:

| On Fri, Oct 13, 2000 at 02:05:18PM +0200, Igor V. Rafienko wrote:
| > I tried vacuumdb --analyze. It did not help. I tried creating an index on
|
| If I parse right this is excactly VACUUM without ANALYZE
| because '--' is SQL comment start ... no?

Ehh... no. Not quite:

$ ./vacuumdb --help | grep analyze
vacuumdb cleans and analyzes a PostgreSQL database.
-z, --analyze Update optimizer hints
$

The analyze was performed from outside the database via vacuumdb command,
not via VACUUM command in the psql.

ivr
--
Everybody has to have a philosophy. Some people believe in laissez
faire economics, others believe in reincarnation. Some people even
believe that COBOL is a real programming language.
S. Meyers, "Effective C++", item 26

#6Marko Kreen
markokr@gmail.com
In reply to: Igor V. Rafienko (#1)
Re: Postgres-7.0.2 optimization question

On Fri, Oct 13, 2000 at 02:05:18PM +0200, Igor V. Rafienko wrote:

I tried vacuumdb --analyze. It did not help. I tried creating an index on

If I parse right this is excactly VACUUM without ANALYZE
because '--' is SQL comment start ... no?

--
marko

#7Alfred Perlstein
bright@wintelcom.net
In reply to: Igor V. Rafienko (#1)
Re: Postgres-7.0.2 optimization question

* Igor V. Rafienko <igorr@ifi.uio.no> [001013 05:09] wrote:

Hi,

I've got a slight optimization problem with postgres and I was hoping
someone could give me a clue as to what could be tweaked.

I have a couple of tables which contain little data (around 500,000 tuples
each), and most operations take insanely long time to complete. The
primary keys in both tables are ints (int8, iirc). When I perform a delete
(with a where clause on a part of a primary key), an strace shows that
postgres reads the entire table sequentially (lseek() and read()). Since
each table is around 200MB, things take time.

Postgresql fails to use the index on several of our tables, an
'EXPLAIN <query>' would probably output a lot of lines about
doing a 'sequential scan'.

The only solution that I've been able to come across is to issue
a 'set enable_seqscan=off;' SQL statement on most of my queries
to force postgresql to use an index.

hope this helps,
-Alfred

#8Mitch Vincent
mitch@venux.net
In reply to: Igor V. Rafienko (#1)
Re: Postgres-7.0.2 optimization question

If you could post the queries in question along with the table structure and
EXPLAIN output of the queries, I'm sure someone might be able to suggest
something..

-Mitch

----- Original Message -----
From: "Alfred Perlstein" <bright@wintelcom.net>
To: "Igor V. Rafienko" <igorr@ifi.uio.no>
Cc: <pgsql-general@postgresql.org>
Sent: Friday, October 13, 2000 10:47 AM
Subject: Re: [GENERAL] Postgres-7.0.2 optimization question

* Igor V. Rafienko <igorr@ifi.uio.no> [001013 05:09] wrote:

Hi,

I've got a slight optimization problem with postgres and I was hoping
someone could give me a clue as to what could be tweaked.

I have a couple of tables which contain little data (around 500,000

tuples

each), and most operations take insanely long time to complete. The
primary keys in both tables are ints (int8, iirc). When I perform a

delete

Show quoted text

(with a where clause on a part of a primary key), an strace shows that
postgres reads the entire table sequentially (lseek() and read()). Since
each table is around 200MB, things take time.

Postgresql fails to use the index on several of our tables, an
'EXPLAIN <query>' would probably output a lot of lines about
doing a 'sequential scan'.

The only solution that I've been able to come across is to issue
a 'set enable_seqscan=off;' SQL statement on most of my queries
to force postgresql to use an index.

hope this helps,
-Alfred

#9Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Igor V. Rafienko (#3)
Re: Postgres-7.0.2 optimization question

You might want to try:
delete from foo where foo2=42::int8

IIRC, there is an issue with int8 indexes and integer constants
(which are treated as int4).

Stephan Szabo
sszabo@bigpanda.com

On Fri, 13 Oct 2000, Igor V. Rafienko wrote:

Show quoted text

on Oct 13, 2000, 07:55, Adam Ruth std::cout'ed:

| Post the query you're using, there may be a way to rewrite it to use the
| index. I've found this to be true on all kinds of DBMSs.

Okidoki (somewhat simplified (there are 5 other columns as well, but
they have nothing to do with delete)):

create table foo (
foo1 int8 not null,
foo2 int8 not null,
data1 varchar check( data_1 in ( 'zot', 'qux' ) ) not null,
data2 varchar null,

primary key (foo1,foo2),
unique (foo2,data1)
);

And then an index on foo2:

create index foobar on foo( foo2 );

And then a vacuumdb --analyze.

The query cannot be made simpler, imvho, but since you've asked:

delete from foo where foo2 = 42;

There is also a foreign key into this table from a primary key of another
table. Yes, there is an index on the other table as well.