Slow Inserts on 1 table?

Started by Dan Armbrust · over 20 years ago · 23 messages · general
#1Dan Armbrust
daniel.armbrust.list@gmail.com

I have one particular insert query that is running orders of magnitude
slower than other insert queries, and I cannot understand why.
For example, inserts into "conceptProperty" (detailed below) are at
least 5 times faster than inserts into "conceptPropertyMultiAttributes".

When I am running the inserts, postmaster shows as pegging one CPU on
the Fedora Core 3 server it is running on at nearly 100%.

Any advice is appreciated. Here is a lot of info that may shed light on
the issue to someone with more experience than me:

Example Insert Query with data:
INSERT INTO conceptPropertyMultiAttributes (codingSchemeName,
conceptCode, propertyId, attributeName, attributeValue) VALUES ('NCI
MetaThesaurus', 'C0000005', 'T-2', 'Source', 'MSH2005_2004_10_12')

EXPLAIN ANALYZE output:
QUERY PLAN
Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.005..0.008
rows=1 loops=1)
Total runtime: 4.032 ms

Table Structure:
CREATE TABLE conceptpropertymultiattributes (
codingschemename character varying(70) NOT NULL,
conceptcode character varying(100) NOT NULL,
propertyid character varying(50) NOT NULL,
attributename character varying(50) NOT NULL,
attributevalue character varying(250) NOT NULL
);

Primary Key:
ALTER TABLE ONLY conceptpropertymultiattributes
ADD CONSTRAINT conceptpropertymultiattributes_pkey PRIMARY KEY
(codingschemename, conceptcode, propertyid, attributename, attributevalue);

Foreign Key:
ALTER TABLE ONLY conceptpropertymultiattributes
ADD CONSTRAINT f FOREIGN KEY (codingschemename, conceptcode,
propertyid) REFERENCES conceptproperty(codingschemename, conceptcode,
propertyid);

Structure of Table Referenced by Foreign Key:
CREATE TABLE conceptproperty (
codingschemename character varying(70) NOT NULL,
conceptcode character varying(100) NOT NULL,
propertyid character varying(50) NOT NULL,
property character varying(250) NOT NULL,
"language" character varying(32),
presentationformat character varying(50),
datatype character varying(50),
ispreferred boolean,
degreeoffidelity character varying(50),
matchifnocontext boolean,
representationalform character varying(50),
propertyvalue text NOT NULL
);

Primary Key:
ALTER TABLE ONLY conceptproperty
ADD CONSTRAINT conceptproperty_pkey PRIMARY KEY (codingschemename,
conceptcode, propertyid);

Thanks,

Dan

#2Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Dan Armbrust (#1)
Re: Slow Inserts on 1 table?

What indexes are defined on both tables? Are there any triggers or
rules?
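
For reference, both questions can be answered from the system catalogs; a sketch, using the table name from the original post (FK constraints show up in pg_trigger as RI triggers):

```sql
-- List indexes defined on the slow table
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'conceptpropertymultiattributes';

-- List triggers on it (foreign keys appear as internal RI triggers)
SELECT tgname
FROM pg_trigger
WHERE tgrelid = 'conceptpropertymultiattributes'::regclass;
```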

On Wed, Jul 20, 2005 at 09:50:54AM -0500, Dan Armbrust wrote:

I have one particular insert query that is running orders of magnitude
slower than other insert queries, and I cannot understand why.
For example, inserts into "conceptProperty" (detailed below) are at
least 5 times faster than inserts into "conceptPropertyMultiAttributes".

[snip]


--
Jim C. Nasby, Database Consultant decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

#3Dan Armbrust
daniel.armbrust.list@gmail.com
In reply to: Dan Armbrust (#1)
Re: Slow Inserts on 1 table?

Dan Armbrust wrote:

I have one particular insert query that is running orders of magnitude
slower than other insert queries, and I cannot understand why.
For example, inserts into "conceptProperty" (detailed below) are at
least 5 times faster than inserts into "conceptPropertyMultiAttributes".

[snip]

Well, I now have a further hunch on why the inserts are so slow on 1
table. Most of the time, when I am doing bulk inserts, I am starting
with an empty database. My insertion program creates the tables,
indexes and foreign keys.

The problem seems to be the foreign key - PostgreSQL is apparently too
stupid to use the indexes while loading and checking the foreign key
between two large tables - my guess is because analyze has not been run
yet, so it thinks all of the tables are size 0. If I let it run for a
while, then kill the load process, run Analyze, empty the tables, and
then restart, things perform fine. But that is kind of a ridiculous
sequence to have to use to load a database.

Why can't postgres compile some rough statistics on tables without
running analyze? Seems that it would be pretty easy to keep track of
the number of inserts/deletions that have taken place since the last
Analyze execution... It may not be the exact right number, but it would
certainly be smarter than continuing to assume that the tables are size
0, even though it has been doing constant inserts on the tables in
question....

I have already had to disable sequential scans, since the planner is
almost _always_ wrong in deciding whether or not to use an index. I put
the indexes on the columns I choose for a reason - it is because I KNOW
the index read will ALWAYS be faster since I designed the indexes for
the queries I am running. But it still must be doing a sequential scan
on these inserts...

--
****************************
Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/

#4Richard Huxton
dev@archonet.com
In reply to: Dan Armbrust (#3)
Re: Slow Inserts on 1 table?

Dan Armbrust wrote:

Dan Armbrust wrote:

I have one particular insert query that is running orders of magnitude
slower than other insert queries, and I cannot understand why.
For example, Inserts into "conceptProperty" (detailed below) are at
least 5 times faster than inserts into "conceptPropertyMultiAttributes".

Well, I now have a further hunch on why the inserts are so slow on 1
table. Most of the time, when I am doing bulk inserts, I am starting
with an empty database. My insertion program creates the tables,
indexes and foreign keys.
The problem seems to be the foreign key - PostgreSQL is apparently too
stupid to use the indexes while loading and checking the foreign key
between two large tables - my guess is because analyze has not been run
yet, so it thinks all of the tables are size 0.

If you haven't analysed them since creation, it should think size=1000,
which is a safety measure to reduce this sort of problem.

If I let it run for a
while, then kill the load process, run Analyze, empty the tables, and
then restart, things perform fine. But that is kind of a ridiculous
sequence to have to use to load a database.
Why can't postgres compile some rough statistics on tables without
running analyze? Seems that it would be pretty easy to keep track of
the number of inserts/deletions that have taken place since the last
Analyze execution... It may not be the exact right number, but it would
certainly be smarter than continuing to assume that the tables are size
0, even though it has been doing constant inserts on the tables in
question....

Yep, but it would have to do it all the time. That's overhead on every
query.

I have already had to disable sequential scans, since the planner is
almost _always_ wrong in deciding whether or not to use an index.

Then either your stats are badly out, or your other configuration
settings are.

I put
the indexes on the columns I choose for a reason - it is because I KNOW
the index read will ALWAYS be faster since I designed the indexes for
the queries I am running. But it still must be doing a sequential scan
on these inserts...

What, ALWAYS faster, even for the first FK check when there's only one
row in the target table and that's cached?

If you're really in a hurry doing your bulk loads:
1. Use COPY.
2. Drop/restore the foreign-key constraints before/after.
That will be hugely faster than INSERTs, although it's not always an
applicable solution.
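
The two suggestions above might look like this in practice - a sketch only, reusing the constraint name "f" from earlier in the thread; the data file path is hypothetical:

```sql
-- 2. Drop the FK before loading ...
ALTER TABLE conceptpropertymultiattributes DROP CONSTRAINT f;

-- 1. Bulk-load with COPY instead of row-by-row INSERTs
COPY conceptpropertymultiattributes
    (codingschemename, conceptcode, propertyid, attributename, attributevalue)
FROM '/path/to/data.txt';

-- ... and restore it afterwards (this re-checks all loaded rows once)
ALTER TABLE conceptpropertymultiattributes
    ADD CONSTRAINT f FOREIGN KEY (codingschemename, conceptcode, propertyid)
    REFERENCES conceptproperty (codingschemename, conceptcode, propertyid);
```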
--
Richard Huxton
Archonet Ltd

#5John D. Burger
john@mitre.org
In reply to: Dan Armbrust (#3)
Re: Slow Inserts on 1 table?

my guess is because analyze has not been run yet, so it thinks all of
the tables are size 0. If I let it run for a while, then kill the
load process, run Analyze, empty the tables, and then restart, things
perform fine. But that is kind of a ridiculous sequence to have to
use to load a database.

So automate it. After discovering exactly this behavior, I've
developed an idiom for load scripts where I (optionally) commit at some
linear interval, and (optionally) analyze at some exponential interval.
I presume this has been invented countless times, but here's my basic
idea in pseudo-code:

commitInterval = 1000
analyzeFactor = 2
whenToCommit = whenToAnalyze = commitInterval
nInserts = 0

loop over input data
    if we decide to insert
        insert
        nInserts++
        if whenToCommit < nInserts
            commit
            whenToCommit += commitInterval
        if whenToAnalyze < nInserts
            analyze
            whenToAnalyze *= analyzeFactor
....

So (with these constants) we commit after 1000 total inserts, then
after 2000, 3000, etc. And we analyze after 1000 inserts, then after
2000, 4000, etc. This is perhaps way too conservative - in particular,
I suspect that it's only the first one or two analyzes that matter -
but it works for me.
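
A runnable version of this idiom, sketched with Python's sqlite3 module standing in for pgdb (sqlite also accepts ANALYZE; the table name, row count, and intervals are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, val TEXT)")

commit_interval = 1000
analyze_factor = 2
when_to_commit = when_to_analyze = commit_interval
n_inserts = 0

for i in range(5000):
    conn.execute("INSERT INTO items (val) VALUES (?)", (f"row-{i}",))
    n_inserts += 1
    if when_to_commit < n_inserts:
        conn.commit()                      # after 1000, 2000, 3000, ... inserts
        when_to_commit += commit_interval
    if when_to_analyze < n_inserts:
        conn.execute("ANALYZE")            # after 1000, 2000, 4000, ... inserts
        when_to_analyze *= analyze_factor

conn.commit()
count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
print(count)  # prints 5000
```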

The only annoyance is that the interface I use most often, Python's
pgdb, runs everything in a transaction, and you can't analyze in a
transaction. I've gotten around this in a variety of ways, some less
principled than others.

- John D. Burger
MITRE

#6Dan Armbrust
daniel.armbrust.list@gmail.com
In reply to: Richard Huxton (#4)
Re: Slow Inserts on 1 table?

What, ALWAYS faster, even for the first FK check when there's only one
row in the target table and that's cached?

If you're really in a hurry doing your bulk loads:
1. Use COPY.
2. Drop/restore the foreign-key constraints before/after.
That will be hugely faster than INSERTs, although it's not always an
applicable solution.
--
Richard Huxton
Archonet Ltd

It seems like the query planner goes to great lengths to avoid using
indexes because it might take 5 ms longer to execute an index lookup on
a table with one row.

But then, when the table has 1 million rows, and a full scan takes 3
minutes, and the index scan takes 3 seconds, it has no problem picking
the 3 minute route.
I'll gladly give up the 5 ms in return for not having to wait 3 minutes,
which is why I disabled the sequential scans. If I have a small table,
where indexes won't speed things up, I won't build an index on it.

The other factor is that most of my tables have at least thousands, and
usually millions, of rows. Sequential scans will never be faster for the
queries that I am doing - like I said, that is why I created the indexes.

My loading is done programmatically, from another format, so COPY is not
an option. Neither is removing foreign keys, as they are required to
guarantee valid data. I don't really have a problem with the insert
speed when it is working properly - it is on par with other DBs that I
have on the same hardware. The problem is when it stops using the
indexes, for no good reason.

Example, last night, I kicked off a load process - this morning, it had
only managed to make it through about 600,000 rows (split across several
tables). After restarting it this morning, it made it through the same
data in 30 minutes.

If that's not bad and buggy behavior, I don't know what is....

Dan

--
****************************
Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/

#7Joshua D. Drake
jd@commandprompt.com
In reply to: Dan Armbrust (#3)
Re: Slow Inserts on 1 table?

Why can't postgres compile some rough statistics on tables without
running analyze?

Why can't you just run analyze? You don't have to empty the tables to do
so and you can alter the statistics on the fly. Heck you can even run
analyze while doing the inserts.

Perhaps the fine manual would be of assistance?

http://www.postgresql.org/docs/8.0/interactive/sql-analyze.html
http://www.postgresql.org/docs/8.0/interactive/sql-altertable.html
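
Concretely, the commands those manual pages describe, applied to the table in this thread (a sketch; the statistics target value is illustrative):

```sql
-- Refresh planner statistics; this can run while inserts are in progress
ANALYZE conceptpropertymultiattributes;

-- Optionally raise the per-column statistics target before re-analyzing
ALTER TABLE conceptpropertymultiattributes
    ALTER COLUMN codingschemename SET STATISTICS 100;
```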

I have already had to disable sequential scans, since the planner is
almost _always_ wrong in deciding whether or not to use an index.

Sounds again like you are not setting up your statistics correctly or
running analyze as and when it should.

Sincerely,

Joshua D. Drake

--
Your PostgreSQL solutions provider, Command Prompt, Inc.
24x7 support - 1.800.492.2240, programming, and consulting
Home of PostgreSQL Replicator, plPHP, plPerlNG and pgPHPToolkit
http://www.commandprompt.com / http://www.postgresql.org

#8Dan Armbrust
daniel.armbrust.list@gmail.com
In reply to: Joshua D. Drake (#7)
Re: Slow Inserts on 1 table?

Joshua D. Drake wrote:

Why can't postgres compile some rough statistics on tables without
running analyze?

Why can't you just run analyze? You don't have to empty the tables to
do so and you can alter the statistics on the fly. Heck you can even
run analyze while doing the inserts.

I shouldn't have to manually run Analyze to make the DB be capable of
handling inserts involving tables with foreign keys correctly. My code
that is doing the inserts is a Java application that works across
multiple DBs - MySQL, PostgreSQL, DB2, MS Access, Sybase, etc.

I shouldn't have to put custom code into it just to make postgres deal
with inserts properly. No other database that I insert data into has
problems like this.

This will look really nice in the instructions for my data loader -

* - If you are using PostgreSQL for your database server, it has a bug
that causes its performance to become abysmal unless you manually run
this "Analyze" command a little while after you start the load process.

I will have users that don't even know what a database is, much less
have to go out and run manual sysadmin level commands on it to make the
thing work.

I have already had to disable sequential scans, since the planner is
almost _always_ wrong in deciding whether or not to use an index.

Sounds again like you are not setting up your statistics correctly or
running analyze as and when it should.

Doesn't matter if the statistics are perfectly up to date. It still
doesn't use the indexes. If the default statistics are so poorly set up
that the planner thinks a 5 million row table scan will be quicker for a
query that is doing exact matches on indexed columns, I would say it is
poorly implemented. So I just disabled that "feature". And it works
fine with sequential scans disabled - I have no problem with it in this
respect, since I can turn it off.

Is there any way that I can disable sequential scans for foreign key checks?

Dan

--
****************************
Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/

#9Richard Huxton
dev@archonet.com
In reply to: Dan Armbrust (#6)
Re: Slow Inserts on 1 table?

Dan Armbrust wrote:

What, ALWAYS faster, even for the first FK check when there's only one
row in the target table and that's cached?

If you're really in a hurry doing your bulk loads:
1. Use COPY.
2. Drop/restore the foreign-key constraints before/after.
That will be hugely faster than INSERTs, although it's not always an
applicable solution.
--
Richard Huxton
Archonet Ltd

It seems like the query planner goes to great lengths to avoid using
indexes because it might take 5 ms longer to execute an index lookup on
a table with one row.
But then, when the table has 1 million rows, and a full scan takes 3
minutes, and the index scan takes 3 seconds, it has no problem picking
the 3 minute route.
I'll gladly give up the 5 ms in return for not having to wait 3 minutes,
which is why I disabled the sequential scans. If I have a small table,
where indexes won't speed things up, I won't build an index on it.

The other factor is that most of my tables have at least thousands, and
usually millions of rows. Sequential scans will never be faster for the
queries that I am doing - like I said, that is why I created the indexes.

The issue is nothing to do with special "small table" handling code.
It's all to do with not having up-to-date stats. Of course, once you've
analysed your table the system knows your index is good.

My loading is done programmatically, from another format, so COPY is not
an option.

Why not? A lot of my bulk-loads are generated from other systems and I
go through a temporary-file/pipe via COPY when I can. When I don't I
block inserts into groups of e.g. 1000 and stick in an analyse/etc as
required.

Neither is removing foreign keys, as they are required to
guarantee valid data.

Ah, but you can still guarantee your data. You can wrap the whole
drop-FK, bulk-load, recreate-FK in a single transaction, and it can
still be faster. Obviously doing this on a high-activity table won't win
though, you'll have to block everyone else doing updates.
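
A minimal sketch of that transaction, reusing the constraint definition from earlier in the thread; the FK is re-validated against all rows before COMMIT succeeds, so validity is still guaranteed:

```sql
BEGIN;
ALTER TABLE conceptpropertymultiattributes DROP CONSTRAINT f;
-- bulk load here (COPY or batched INSERTs)
ALTER TABLE conceptpropertymultiattributes
    ADD CONSTRAINT f FOREIGN KEY (codingschemename, conceptcode, propertyid)
    REFERENCES conceptproperty (codingschemename, conceptcode, propertyid);
COMMIT;
```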

I don't really have a problem with the insert
speed when it is working properly - it is on par with other DBs that I
have on the same hardware. The problem is when it stops using the
indexes, for no good reason.

Example, last night, I kicked off a load process - this morning, it had
only managed to make it through about 600,000 rows (split across several
tables). After restarting it this morning, it made it through the same
data in 30 minutes.
If thats not bad and buggy behavior, I don't know what is....

So run ANALYSE in parallel with your load, or break the bulk-load into
blocks and analyse in-line. I'm not sure ripping out PG's cost-based
query analyser will be a popular solution just to address bulk-loads.

--
Richard Huxton
Archonet Ltd

#10Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Dan Armbrust (#8)
Re: Slow Inserts on 1 table?

On Tue, Aug 02, 2005 at 10:01:50AM -0500, Dan Armbrust wrote:

I shouldn't have to manually run Analyze to make the DB be capable of
handling inserts involving tables with foreign keys correctly. My code
that is doing the inserts is a java application that works across
multiple DBS - MySQL, PostgreSQL, DB2, MS Access, Sybase, etc.

[etc, rant removed]

You don't _have_ to be rude. This is a known limitation, and people
have suggested the usual workarounds. This is an open source project --
if you think you can make it better, please by all means post a patch.

HAND.

--
Alvaro Herrera (<alvherre[a]alvh.no-ip.org>)
"The meaning of things does not come from the things themselves, but from
the intelligences that apply them to their daily problems in
pursuit of progress." (Ernesto Hernández-Novich)

#11Dan Armbrust
daniel.armbrust.list@gmail.com
In reply to: Richard Huxton (#9)
Re: Slow Inserts on 1 table?

My loading is done programmatically, from another format, so COPY is
not an option.

Why not? A lot of my bulk-loads are generated from other systems and I
go through a temporary-file/pipe via COPY when I can. When I don't I
block inserts into groups of e.g. 1000 and stick in an analyse/etc as
required.

I guess I should clarify - my inserts are done by a Java application
running on a client machine. This isn't a bulk load in the usual sense.

I don't have any problem with the speed of the inserts when they are
working correctly. The only problem is that the query analyzer is
making a really poor decision when it is executing insert statements on
tables that have foreign keys.

So run ANALYSE in parallel with your load, or break the bulk-load into
blocks and analyse in-line. I'm not sure ripping out PG's cost-based
query analyser will be a popular solution just to address bulk-loads.

I never suggested that it needed to be ripped out. It just seems that
when it is looking to check foreign keys, and the statistics are not up
to date (or have not yet been created) it should default to using the
indexes, rather than not using the indexes. The time savings of using
indexes when things are big is FAR bigger than the time savings of not
using indexes when things are small.

Dan

--
****************************
Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/

#12Dan Armbrust
daniel.armbrust.list@gmail.com
In reply to: Alvaro Herrera (#10)
Re: Slow Inserts on 1 table?

Alvaro Herrera wrote:

On Tue, Aug 02, 2005 at 10:01:50AM -0500, Dan Armbrust wrote:

I shouldn't have to manually run Analyze to make the DB be capable of
handling inserts involving tables with foreign keys correctly. My code
that is doing the inserts is a Java application that works across
multiple DBs - MySQL, PostgreSQL, DB2, MS Access, Sybase, etc.

[etc, rant removed]

You don't _have_ to be rude. This is a known limitation, and people
have suggested the usual workarounds. This is an open source project --
if you think you can make it better, please by all means post a patch.

HAND.

My apologies, I don't intend to be rude.

But it is rather easy to get into rant mode when the prevailing opinion
is that not being able to insert rows into a table with a foreign key
without running Analyze after X rows is a misuse of the DB, rather than
a bug.

I did not know that this is a known limitation; I have not been able to
find any documentation that talks about how foreign keys, indexes, and
the query planner relate.

My first assumption was that since foreign key creation implicitly
creates the necessary indexes, these indexes would always be used
for foreign key checks. I wouldn't have even guessed that the query
planner was involved in this portion. But these are all (apparently
wrong) guesses - I don't know the internals. The performance probably
is better on small tables without these indexes. But it seems to
me that if you know the statistics are out of date (which I would
think you should know, if analyze hasn't been run since the tables
were created), the safer choice would be to use the indexes rather
than not.

Dan

--
****************************
Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/

#13Dan Armbrust
daniel.armbrust.list@gmail.com
In reply to: Jim Nasby (#2)
Force PostgreSQL to use indexes on foreign key lookups - Was: Slow Inserts on 1 table?

An aha moment:

http://archives.postgresql.org/pgsql-bugs/2005-03/msg00183.php

Some of the ensuing conversation seemed to indicate that a change was
made in the 8.0 branch in March, that was intended to fix this issue.
Any idea if that fix would have made it into the 8.0.3 release?

Or maybe the fix didn't fix the issue.

Dan

--
****************************
Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: John D. Burger (#5)
Re: Slow Inserts on 1 table?

"John D. Burger" <john@mitre.org> writes:

The only annoyance is that the interface I use most often, Python's
pgdb, runs everything in a transaction, and you can't analyze in a
transaction.

Hm? We've allowed ANALYZE inside a transaction for a long time.

The real solution to Dan's problem, of course, is to throw away the
cached plan for the FK check and re-plan it once the table sizes have
changed enough to invalidate the plan. Neil Conway was working on
infrastructure for this, but it didn't get done in time for 8.1 ...
maybe it will be there in 8.2.

In the meantime, though, I don't see any mention in the thread of
exactly which PG version Dan is using. If it's 8.0.0 or 8.0.1,
an update would probably help --- we tweaked the rules for never-yet-
vacuumed tables in 8.0.2.

regards, tom lane

#15Martijn van Oosterhout
kleptog@svana.org
In reply to: Dan Armbrust (#12)
Re: Slow Inserts on 1 table?

On Tue, Aug 02, 2005 at 10:41:01AM -0500, Dan Armbrust wrote:

But it is rather easy to get into rant mode when the prevailing
opinion is that not being able to insert rows into a table with a
foreign key without running Analyze after X rows is a misuse of the
DB, rather than a bug.

I haven't seen it in this thread, but it reminds me of someone who had a
similar problem a long time ago. Basically, he was doing:

TRUNCATE
ANALYZE
<load database>

This screwed everything up, because the ANALYZE set the statistics to
zero size tables. The solution was: *Don't* analyze the table when it's
empty. If he left out the ANALYZE altogether it worked.

It also works because just after a CREATE TABLE it defaults to using
indexes too.

The *only* time it starts worrying about seq scans is if you run ANALYZE
on an empty table. So don't do that.
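
In other words, using the table from this thread (a sketch):

```sql
-- Problematic: statistics now say the table is empty,
-- so subsequent FK checks may be planned as seq scans
TRUNCATE conceptpropertymultiattributes;
ANALYZE conceptpropertymultiattributes;
-- <load data>

-- Better: skip the ANALYZE on the empty table, or run it after loading
TRUNCATE conceptpropertymultiattributes;
-- <load data>
ANALYZE conceptpropertymultiattributes;
```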

Hope this helps,

--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/


Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
tool for doing 5% of the work and then sitting around waiting for someone
else to do the other 95% so you can sue them.

#16Gregory Youngblood
gsyoungblood@mac.com
In reply to: Alvaro Herrera (#10)
Re: Slow Inserts on 1 table?

On Aug 2, 2005, at 8:16 AM, Alvaro Herrera wrote:

On Tue, Aug 02, 2005 at 10:01:50AM -0500, Dan Armbrust wrote:

I shouldn't have to manually run Analyze to make the DB be capable of
handling inserts involving tables with foreign keys correctly. My code
that is doing the inserts is a Java application that works across
multiple DBs - MySQL, PostgreSQL, DB2, MS Access, Sybase, etc.

[etc, rant removed]

You don't _have_ to be rude. This is a known limitation, and people
have suggested the usual workarounds. This is an open source project --
if you think you can make it better, please by all means post a patch.

Not to fan the flames, so to speak, but I do have one question. If
this is a known limitation, I'd expect the behavior to be consistent.
Instead, he is saying that the problem is intermittent. Sometimes it
runs fine, other times it slows down to a snail's pace.

So, does that mean the known problem is indeed intermittent, and not
something that happens every time? Or, is this an issue that can be
mostly eliminated with appropriate tuning?

I ask, because I may soon be facing similar problems, and forewarned
is forearmed. I'd like to anticipate this and tune things accordingly
before it becomes an issue.

Thanks,
Greg

#17John D. Burger
john@mitre.org
In reply to: Tom Lane (#14)
Re: Slow Inserts on 1 table?

The only annoyance is that the interface I use most often, Python's
pgdb, runs everything in a transaction, and you can't analyze in a
transaction.

Hm? We've allowed ANALYZE inside a transaction for a long time.

I'm stuck with using 7.2, for now, and I get this:

ERROR: ANALYZE cannot run inside a BEGIN/END block

I'll be happy to find out that this restriction's been removed in later
versions.

- John D. Burger
MITRE

#18Joshua D. Drake
jd@commandprompt.com
In reply to: John D. Burger (#17)
Re: Slow Inserts on 1 table?

John D. Burger wrote:

The only annoyance is that the interface I use most often, Python's
pgdb, runs everything in a transaction, and you can't analyze in a
transaction.

Hm? We've allowed ANALYZE inside a transaction for a long time.

I'm stuck with using 7.2, for now, and I get this:

ERROR: ANALYZE cannot run inside a BEGIN/END block

I'll be happy to find out that this restriction's been removed in later
versions.

You're running 7.2? That is all kinds of level of... huh? Why?

Sincerely,

Joshua D. Drake

- John D. Burger
MITRE


--
Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240
PostgreSQL Replication, Consulting, Custom Programming, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/

#19John D. Burger
john@mitre.org
In reply to: Joshua D. Drake (#18)
Re: Slow Inserts on 1 table?

You're running 7.2? That is all kinds of level of... huh? Why?

I'm not running it, my organization is. Not sure how to interpret "all
kinds of level of..." Are there any huge suckages that I can use to
leverage an update? I'm familiar with some of the smaller ones.

- John D. Burger
G63

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: John D. Burger (#19)
Re: Slow Inserts on 1 table?

"John D. Burger" <john@mitre.org> writes:

You're running 7.2? That is all kinds of level of... huh? Why?

I'm not running it, my organization is. Not sure how to interpret "all
kinds of level of..." Are there any huge suckages that I can use to
leverage an update? I'm familiar with some of the smaller ones.

Lots, what does it take to get their attention? Feature-wise, there are
such small matters as schemas. Performance-wise, there are quite a lot
of improvements since 7.2. Security-wise, there are unfixable holes
in 7.2 (try "select cash_out(2)").

I hope you're at least on 7.2.8, else you are also vulnerable to a
number of data-loss-grade bugs. I don't recall at the moment whether
there were any data-loss issues that we deemed unfixable in 7.2.*, but
it wouldn't surprise me.

Try perusing the release notes at
http://developer.postgresql.org/docs/postgres/release.html
for ammunition.

regards, tom lane

#21John D. Burger
john@mitre.org
In reply to: Tom Lane (#20)
#22Scott Marlowe
smarlowe@g2switchworks.com
In reply to: John D. Burger (#21)
#23Martijn van Oosterhout
kleptog@svana.org
In reply to: Gregory Youngblood (#16)