Updates, deletes and inserts are very slow. What can I do make them bearable?

Started by Tim Uckunover 15 years ago27 messagesgeneral
Jump to latest
#1Tim Uckun
timuckun@gmail.com

I have a very simple update query.

update cu
set screenshot_file_name = tu.screenshot_file_name,
screenshot_content_type = tu.screenshot_content_type,
screenshot_file_size = tu.screenshot_file_size,
screenshot_status = tu.screenshot_status

from cu
inner join tu on tu.cu_id = cu.id

The join has five to six thousand rows in it.

The only indexed field in the update is screenshot_status which is an integer.

I changed the checkpoint settings to

checkpoint_segments = 256
checkpoint_completion_target = 0.9

but it still does not help.

I am having similar problems with deletes and inserts. Trying to
delete even a few thousand records takes forever. The selects seem to
be just fine.

I am running this on my laptop with no other processes hitting the
database. It's a i5 with lots of RAM and quad core and a IDE drive.

Where is the FAST button for postgres updates? What parameter do I
have to set in order to update 6000 records in under an hour?

#2Gary Chambers
gwchamb@gmail.com
In reply to: Tim Uckun (#1)
Re: Updates, deletes and inserts are very slow. What can I do make them bearable?

Where is the FAST button for postgres updates? What parameter do I
have to set in order to update 6000 records in under an hour?

Which version of Postgres? Have you investigated more than just two
performance tuning parameters? Does your MS Access version of the
query run any faster?

http://tinyurl.com/2fsjmv4

-- Gary Chambers

/* Nothing fancy and nothing Microsoft! */

#3Tim Uckun
timuckun@gmail.com
In reply to: Gary Chambers (#2)
Re: Updates, deletes and inserts are very slow. What can I do make them bearable?

On Thu, Oct 21, 2010 at 3:37 PM, Gary Chambers <gwchamb@gmail.com> wrote:

Where is the FAST button for postgres updates? What parameter do I
have to set in order to update 6000 records in under an hour?

Which version of Postgres?

8.4

 >Have you investigated more than just two

performance tuning parameters?

I followed the advice I found when searching this mailing list.

Does your MS Access version of the
query run any faster?

I don't have access but I suspect it would. How long should it take
to update three fields in 6000 records in your opinion.

http://tinyurl.com/2fsjmv4

Ah yes I was expecting something snarky. Thanks.

BTW I have read many of those links and have adjusted some values but
honestly there are so many buttons to push and knobs to dial that it's
hard to know what will fix what.

Also searching the mailing list shows that this is a pretty commonly
asked question but I wasn't able to find an answer other than
increasing the parameters I listed.

#4Greg Smith
gsmith@gregsmith.com
In reply to: Tim Uckun (#3)
Re: Updates, deletes and inserts are very slow. What can I do make them bearable?

Tim Uckun wrote:

BTW I have read many of those links and have adjusted some values but
honestly there are so many buttons to push and knobs to dial that it's
hard to know what will fix what.

Generally update/delete tuning goes like this:

1) Increase checkpoint_segments (>64, increases beyond that can be
helpful but they eventually level out)
2) Increase shared_buffers (~25% of RAM is normal)
3) Confirm there are no constraints or foreign keys happening at each update
4) Make sure your indexes aren't filled with junk and that VACUUM is
running effectively. REINDEX or CLUSTER tables that haven't been well
maintained in the past.
5) Upgrade to better hardware that has a battery-backed write cache
- or -
Disable synchronous_commit and cheat on individual commits, at the
expense of potential lost transactions after a crash.

Updating rows in PostgreSQL is one of the most intensive things you do
to your disks, and it's hard to get a laptop drive to do a very good job
at that.

--
Greg Smith, 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services and Support www.2ndQuadrant.us
PostgreSQL 9.0 High Performance http://www.2ndquadrant.com/books

#5Tim Uckun
timuckun@gmail.com
In reply to: Greg Smith (#4)
Re: Updates, deletes and inserts are very slow. What can I do make them bearable?

1) Increase checkpoint_segments (>64, increases beyond that can be helpful
but they eventually level out)

Changed it back to 64 from 256

2) Increase shared_buffers (~25% of RAM is normal)

Changed it to one gig (25% of my RAM) obviously this involved changing
the shmmax and shmall settings in the kernel.

3) Confirm there are no constraints or foreign keys happening at each update

There are none.

4) Make sure your indexes aren't filled with junk and that VACUUM is running
effectively.  REINDEX or CLUSTER tables that haven't been well maintained in
the past.

Autovacuum is on. I presume it's doing it's job. Didn't re-index or
recluster because it's a dev database and the data rarely changes.

5) Upgrade to better hardware that has a battery-backed write cache

Not for my laptop.

Disable synchronous_commit and cheat on individual commits, at the expense
of potential lost transactions after a crash.

I will leave this as a last resort.

Updating rows in PostgreSQL is one of the most intensive things you do to
your disks, and it's hard to get a laptop drive to do a very good job at
that.

After making the above changes I re-ran the query. It's been running
for five minutes and it's still running.

This is a database with nothing else hitting it.

So obviously something else is out of kilter.

I'll ask the same question I asked Gary.

Say I just apt-get install postgres and do nothing else. One table has
about 500K records. The other has about 5K records. The joins are on
indexed integer fields (one is the primary key). How long should it
take to update five to six thousand records in your experience? Out
of the box with no tuning. How long should this take on an almost new
laptop, four gigs of RAM, i5 quad core processor?

#6Tim Uckun
timuckun@gmail.com
In reply to: Tim Uckun (#5)
Re: Updates, deletes and inserts are very slow. What can I do make them bearable?

To follow up...

I did a full vacuum analyze on both tables and re-ran the query. Same
story. I ended the query after eight minutes.

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tim Uckun (#1)
Re: Updates, deletes and inserts are very slow. What can I do make them bearable?

Tim Uckun <timuckun@gmail.com> writes:

I have a very simple update query.

update cu
set screenshot_file_name = tu.screenshot_file_name,
screenshot_content_type = tu.screenshot_content_type,
screenshot_file_size = tu.screenshot_file_size,
screenshot_status = tu.screenshot_status
from cu
inner join tu on tu.cu_id = cu.id

That isn't actually the query you're issuing, because if it were
you would get an error "table name "cu" specified more than once".

I suspect that the query you're actually issuing involves an
unconstrained cartesian product self-join between the target table
and another instance of itself. Postgres doesn't consider that
the target table should be named again in FROM. But it's hard to
be sure about that when looking at a redacted query.

regards, tom lane

#8Tim Uckun
timuckun@gmail.com
In reply to: Tom Lane (#7)
Re: Updates, deletes and inserts are very slow. What can I do make them bearable?

On Thu, Oct 21, 2010 at 5:05 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Tim Uckun <timuckun@gmail.com> writes:

I have a very simple update query.

update cu
set screenshot_file_name = tu.screenshot_file_name,
    screenshot_content_type  = tu.screenshot_content_type,
    screenshot_file_size = tu.screenshot_file_size,
    screenshot_status  = tu.screenshot_status
from  cu
inner join tu on tu.cu_id = cu.id

That isn't actually the query you're issuing, because if it were
you would get an error "table name "cu" specified more than once".

Sorry I shortened the names of the tables.

I suspect that the query you're actually issuing involves an
unconstrained cartesian product self-join between the target table
and another instance of itself.  Postgres doesn't consider that
the target table should be named again in FROM.  But it's hard to
be sure about that when looking at a redacted query.

I apologize for the redacted query. I was trying to make it easier to follow.

Here is the actual query.

update consolidated_urls
set screenshot_file_name = tu.screenshot_file_name,
screenshot_content_type = tu.screenshot_content_type,
screenshot_file_size = tu.screenshot_file_size,
screenshot_status = tu.screenshot_status

from consolidated_urls cu
inner join trending_urls tu on tu.consolidated_url_id = cu.id

This is a simple inner join.

select count(cu.id)
from consolidated_urls cu
inner join trending_urls tu on tu.consolidated_url_id = cu.id

yields 3657 records.

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tim Uckun (#8)
Re: Updates, deletes and inserts are very slow. What can I do make them bearable?

Tim Uckun <timuckun@gmail.com> writes:

Here is the actual query.

update consolidated_urls
set screenshot_file_name = tu.screenshot_file_name,
screenshot_content_type = tu.screenshot_content_type,
screenshot_file_size = tu.screenshot_file_size,
screenshot_status = tu.screenshot_status
from consolidated_urls cu
inner join trending_urls tu on tu.consolidated_url_id = cu.id

This is a simple inner join.

No, it isn't. This is a three-way join between consolidated_urls, cu,
and tu --- the fact that cu is the same underlying table as
consolidated_urls doesn't change that. And the join is
underconstrained, causing each row of consolidated_urls to be joined
to every row of the cu/tu join. That's why it's taking such an
unreasonably long time --- you're generating many thousands of redundant
updates to each row of consolidated_urls. You should just write this as

update consolidated_urls
set screenshot_file_name = tu.screenshot_file_name,
screenshot_content_type = tu.screenshot_content_type,
screenshot_file_size = tu.screenshot_file_size,
screenshot_status = tu.screenshot_status
from trending_urls tu where tu.consolidated_url_id = consolidated_urls.id

Postgres is a bit different from some other DBMSes in how it interprets
UPDATE ... FROM syntax.

regards, tom lane

#10Tim Uckun
timuckun@gmail.com
In reply to: Tom Lane (#9)
Re: Updates, deletes and inserts are very slow. What can I do make them bearable?

No, it isn't.  This is a three-way join between consolidated_urls, cu,
and tu --- the fact that cu is the same underlying table as

cu is an alias for consolidated_urls. tu is an alias for trending_urls.

There are only two tables in the query.

consolidated_urls doesn't change that.  And the join is
underconstrained, causing each row of consolidated_urls to be joined
to every row of the cu/tu join.  That's why it's taking such an
unreasonably long time --- you're generating many thousands of redundant
updates to each row of consolidated_urls.  You should just write this as

Once again there are only two tables in the query. The join clause is
inner join trending_urls tu on tu.consolidated_url_id = cu.id

update consolidated_urls
set screenshot_file_name = tu.screenshot_file_name,
  screenshot_content_type  = tu.screenshot_content_type,
  screenshot_file_size = tu.screenshot_file_size,
  screenshot_status  = tu.screenshot_status
from trending_urls tu where tu.consolidated_url_id = consolidated_urls.id

Postgres is a bit different from some other DBMSes in how it interprets
UPDATE ... FROM syntax.

I'll try this too.

Anything to make this query complete in a reasonable amount of time.

#11Tim Uckun
timuckun@gmail.com
In reply to: Tim Uckun (#6)
Re: Updates, deletes and inserts are very slow. What can I do make them bearable?

One more follow up.

Did a vacuum full on both tables and a re-index on both tables.

Changed the wal_buffers to 16MB (increased the kernel param as a
result) as per
http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server

I also turned off fsync but still no joy.

#12Thomas Kellerer
spam_eater@gmx.net
In reply to: Tim Uckun (#10)
Re: Updates, deletes and inserts are very slow. What can I do make them bearable?

Tim Uckun, 21.10.2010 07:05:

No, it isn't. This is a three-way join between consolidated_urls, cu,
and tu --- the fact that cu is the same underlying table as

cu is an alias for consolidated_urls. tu is an alias for trending_urls.

There are only two tables in the query.

Yes, but consolidated_urls is there twice. Which makes it three relations involved in the update
(consolidated_urls, cu and tu)

That's what Tom meant and that's where your cartesian product comes from.

select count(cu.id)
from consolidated_urls cu
inner join trending_urls tu on tu.consolidated_url_id = cu.id

That select is not the same as your UPDATE statement.

If your update statement was re-written to a plain SELECT it would be something like

select count(consolidated_urls.id)
from consolidated_urls, consolidated_urls cu
inner join trending_urls tu on tu.consolidated_url_id = cu.id

See the difference?

Regards
Thomas

#13Scott Marlowe
scott.marlowe@gmail.com
In reply to: Tim Uckun (#10)
Re: Updates, deletes and inserts are very slow. What can I do make them bearable?

On Wed, Oct 20, 2010 at 11:05 PM, Tim Uckun <timuckun@gmail.com> wrote:

No, it isn't.  This is a three-way join between consolidated_urls, cu,
and tu --- the fact that cu is the same underlying table as

cu is an alias for consolidated_urls. tu is an alias for trending_urls.

There are only two tables in the query.

What does

explain update (your query continues) say about it? I'm betting you
get a nice big row count in there somewhere.

#14Richard Broersma
richard.broersma@gmail.com
In reply to: Tim Uckun (#1)
Re: Updates, deletes and inserts are very slow. What can I do make them bearable?

On Wed, Oct 20, 2010 at 7:24 PM, Tim Uckun <timuckun@gmail.com> wrote:

update cu
set screenshot_file_name = tu.screenshot_file_name,
   screenshot_content_type  = tu.screenshot_content_type,
   screenshot_file_size = tu.screenshot_file_size,
   screenshot_status  = tu.screenshot_status

from  cu
inner join tu on tu.cu_id = cu.id

I am having similar problems with deletes and inserts. Trying to
delete even a few thousand records takes forever.  The selects seem to
be just fine.

Where is the FAST button for postgres updates? What parameter do I
have to set in order to update 6000 records in under an hour?

Is this a pass-through query or is it an ordinary query in Access?
If it is an an ordinary query, I'd expect that to be one cause since
MS-Access will re-write this query so that it updates a single tuple
at a time. So your single update statement becomes 6000 single tuple
update statements. This is part of MS-Access's optimistic locking
mechanism.

Also, this might be an ODBC issue (I have the sample problem on one of
my laptop that is memory constrained but I haven't taken the time to
identify the actual problem). What happens when you issue this query
directly from PSQL, does the query run much faster.

--
Regards,
Richard Broersma Jr.

Visit the Los Angeles PostgreSQL Users Group (LAPUG)
http://pugs.postgresql.org/lapug

#15Brian Hirt
bhirt@me.com
In reply to: Tim Uckun (#10)
Re: Updates, deletes and inserts are very slow. What can I do make them bearable?

There are only two tables in the query.

Tim,

No, your query is written incorrectly. I don't understand why you come on to this list all hostile and confrontational. Regardless, people still try to help you and then you still ignore the advice of people that are giving you the solutions to your problems.

--brian

#16Scott Marlowe
scott.marlowe@gmail.com
In reply to: Brian Hirt (#15)
Re: Updates, deletes and inserts are very slow. What can I do make them bearable?

On Thu, Oct 21, 2010 at 9:33 AM, Brian Hirt <bhirt@me.com> wrote:

There are only two tables in the query.

Tim,

No, your query is written incorrectly.  I don't understand why you come on to this list all hostile and confrontational.  Regardless, people still try to help you and then you still ignore the advice of people that are giving you the solutions to your problems.

Maybe he's used to paid commercial support where people are often
quite rude and hostile to the support staff to try and "motivate" them
or something? I've seen it before for sure.

Again, OP, what does EXPLAIN say about this query?

#17Rob Sargent
robjsargent@gmail.com
In reply to: Scott Marlowe (#16)
Re: Updates, deletes and inserts are very slow. What can I do make them bearable?

On 10/21/2010 10:27 AM, Scott Marlowe wrote:

On Thu, Oct 21, 2010 at 9:33 AM, Brian Hirt <bhirt@me.com> wrote:

There are only two tables in the query.

Tim,

No, your query is written incorrectly. I don't understand why you come on to this list all hostile and confrontational. Regardless, people still try to help you and then you still ignore the advice of people that are giving you the solutions to your problems.

Maybe he's used to paid commercial support where people are often
quite rude and hostile to the support staff to try and "motivate" them
or something? I've seen it before for sure.

Again, OP, what does EXPLAIN say about this query?

Maybe I should re-read, but I didn't feel any confrontation.
Frustration for sure. OP has clearly tried pretty hard, on some tricky
bits too, but I'm betting all for naught if (as seems likely) it's just
mistaken sql. "update from" is NOT straight forward.

#18Scott Marlowe
scott.marlowe@gmail.com
In reply to: Rob Sargent (#17)
Re: Updates, deletes and inserts are very slow. What can I do make them bearable?

On Thu, Oct 21, 2010 at 10:37 AM, Rob Sargent <robjsargent@gmail.com> wrote:

On 10/21/2010 10:27 AM, Scott Marlowe wrote:

On Thu, Oct 21, 2010 at 9:33 AM, Brian Hirt <bhirt@me.com> wrote:

There are only two tables in the query.

Tim,

No, your query is written incorrectly.  I don't understand why you come on to this list all hostile and confrontational.  Regardless, people still try to help you and then you still ignore the advice of people that are giving you the solutions to your problems.

Maybe he's used to paid commercial support where people are often
quite rude and hostile to the support staff to try and "motivate" them
or something?  I've seen it before for sure.

Again, OP, what does EXPLAIN say about this query?

Maybe I should re-read, but I didn't feel any confrontation.
Frustration for sure.  OP has clearly tried pretty hard, on some tricky
bits too, but I'm betting all for naught if (as seems likely) it's just
mistaken sql.  "update from" is NOT straight forward.

True. His only real snark was in reponse to the "let me google that
for you" link. OTOH, he's arguing with Tom Lane about whether his SQL
is well formed. There's arguing on the internet is stupid, then
there's arguing with Tom Lane about SQL is stupid.

#19Roberto Scattini
roberto.scattini@gmail.com
In reply to: Rob Sargent (#17)
Re: Updates, deletes and inserts are very slow. What can I do make them bearable?

On Thu, Oct 21, 2010 at 1:37 PM, Rob Sargent <robjsargent@gmail.com> wrote:

On 10/21/2010 10:27 AM, Scott Marlowe wrote:

On Thu, Oct 21, 2010 at 9:33 AM, Brian Hirt <bhirt@me.com> wrote:

There are only two tables in the query.

Tim,

No, your query is written incorrectly. I don't understand why you come

on to this list all hostile and confrontational. Regardless, people still
try to help you and then you still ignore the advice of people that are
giving you the solutions to your problems.

Maybe he's used to paid commercial support where people are often
quite rude and hostile to the support staff to try and "motivate" them
or something? I've seen it before for sure.

Again, OP, what does EXPLAIN say about this query?

Maybe I should re-read, but I didn't feel any confrontation.
Frustration for sure. OP has clearly tried pretty hard, on some tricky
bits too, but I'm betting all for naught if (as seems likely) it's just
mistaken sql. "update from" is NOT straight forward.

i had the same feeling than brian and scott, but i am one of the
"questioners" not the "answerers" on the list, so i didnt said anything...
but the "i-am-touching-many-parameters-and-doesnt-work" and then complain is
not one of the best strategies...
the SQL sentence is far more complicated than it pretended to be...

--
Roberto Scattini
___ _
))_) __ )L __
((__)(('(( ((_)

#20Rob Sargent
robjsargent@gmail.com
In reply to: Scott Marlowe (#18)
Re: Updates, deletes and inserts are very slow. What can I do make them bearable?

On 10/21/2010 10:45 AM, Scott Marlowe wrote:

On Thu, Oct 21, 2010 at 10:37 AM, Rob Sargent <robjsargent@gmail.com> wrote:

On 10/21/2010 10:27 AM, Scott Marlowe wrote:

On Thu, Oct 21, 2010 at 9:33 AM, Brian Hirt <bhirt@me.com> wrote:

There are only two tables in the query.

Tim,

No, your query is written incorrectly. I don't understand why you come on to this list all hostile and confrontational. Regardless, people still try to help you and then you still ignore the advice of people that are giving you the solutions to your problems.

Maybe he's used to paid commercial support where people are often
quite rude and hostile to the support staff to try and "motivate" them
or something? I've seen it before for sure.

Again, OP, what does EXPLAIN say about this query?

Maybe I should re-read, but I didn't feel any confrontation.
Frustration for sure. OP has clearly tried pretty hard, on some tricky
bits too, but I'm betting all for naught if (as seems likely) it's just
mistaken sql. "update from" is NOT straight forward.

True. His only real snark was in reponse to the "let me google that
for you" link. OTOH, he's arguing with Tom Lane about whether his SQL
is well formed. There's arguing on the internet is stupid, then
there's arguing with Tom Lane about SQL is stupid.

Have to admit when I saw that I said to myself OP needs someone to tell
him "whoa, big fella". I've been in similar situations where I was
"sure" of one thing and the problem must be elsewhere, when of course I
was wrong about the one thing...

#21Scott Marlowe
scott.marlowe@gmail.com
In reply to: Rob Sargent (#20)
#22Tim Uckun
timuckun@gmail.com
In reply to: Brian Hirt (#15)
#23Tim Uckun
timuckun@gmail.com
In reply to: Rob Sargent (#20)
#24Scott Marlowe
scott.marlowe@gmail.com
In reply to: Tim Uckun (#22)
#25Tim Uckun
timuckun@gmail.com
In reply to: Scott Marlowe (#24)
#26Greg Smith
gsmith@gregsmith.com
In reply to: Tim Uckun (#22)
#27Tim Uckun
timuckun@gmail.com
In reply to: Greg Smith (#26)