Deleting 173000 records takes forever, blocks async queries for unrelated records
Hello,
in a Facebook game running on
PostgreSQL 8.4.13 and having so many players:
# select count(*) from pref_users;
count
--------
223964
I am trying to get rid of inactive users,
who just visited the canvas page, but
never played (I'm sure, Facebook has
a clever-sounding name for them):
# select count(*) from pref_users
where id not in (select distinct id from pref_money);
count
--------
173936
(1 row)
So I call:
# delete from pref_users
where id not in (select distinct id from pref_money);
but that query lasts forever and
what's more troubling me - it blocks
the async queries of my game daemon
(the Perl function pg_ready starts returning
false all the time and my game accumulates
thousands of yet-to-be-executed SQL queries).
The good news is, that my quad server
doesn't hang - I just see 1 postmaster
process at 90-100% CPU but total load is 20%.
Also my game daemon in Perl recovers
and executes the thousands of queued
up async queries, when I interrupt the
above DELETE query with CTRL-C at
the pgsql prompt - i.e. my game is not buggy.
My question is how handle this?
Why does deleting takes so long,
is it because of CASCADES?
And why does it make the pg_ready
calls of my game daemon return false?
The users I'm deleting aren't active,
they shouldn't "intersect" with the
async queries of my game daemon.
Below are the both SQL tables involved,
thank you for any insights.
Regards
Alex
# \d pref_money
Table "public.pref_money"
Column | Type | Modifiers
--------+-----------------------+-----------------------------------------
id | character varying(32) |
money | integer | not null
yw | character(7) | default to_char(now(), 'IYYY-IW'::text)
Indexes:
"pref_money_money_index" btree (money DESC)
"pref_money_yw_index" btree (yw)
Foreign-key constraints:
"pref_money_id_fkey" FOREIGN KEY (id) REFERENCES pref_users(id) ON
DELETE CASCADE
# \d pref_users
Table "public.pref_users"
Column | Type | Modifiers
------------+-----------------------------+--------------------
id | character varying(32) | not null
first_name | character varying(64) |
last_name | character varying(64) |
female | boolean |
avatar | character varying(128) |
city | character varying(64) |
login | timestamp without time zone | default now()
last_ip | inet |
logout | timestamp without time zone |
vip | timestamp without time zone |
mail | character varying(254) |
medals | integer | not null default 0
Indexes:
"pref_users_pkey" PRIMARY KEY, btree (id)
Referenced by:
TABLE "pref_cards" CONSTRAINT "pref_cards_id_fkey" FOREIGN KEY
(id) REFERENCES pref_users(id) ON DELETE CASCADE
TABLE "pref_catch" CONSTRAINT "pref_catch_id_fkey" FOREIGN KEY
(id) REFERENCES pref_users(id) ON DELETE CASCADE
TABLE "pref_chat" CONSTRAINT "pref_chat_id_fkey" FOREIGN KEY (id)
REFERENCES pref_users(id) ON DELETE CASCADE
TABLE "pref_discuss" CONSTRAINT "pref_discuss_id_fkey" FOREIGN KEY
(id) REFERENCES pref_users(id) ON DELETE CASCADE
TABLE "pref_game" CONSTRAINT "pref_game_id_fkey" FOREIGN KEY (id)
REFERENCES pref_users(id) ON DELETE CASCADE
TABLE "pref_luck" CONSTRAINT "pref_luck_id_fkey" FOREIGN KEY (id)
REFERENCES pref_users(id) ON DELETE CASCADE
TABLE "pref_match" CONSTRAINT "pref_match_id_fkey" FOREIGN KEY
(id) REFERENCES pref_users(id) ON DELETE CASCADE
TABLE "pref_misere" CONSTRAINT "pref_misere_id_fkey" FOREIGN KEY
(id) REFERENCES pref_users(id) ON DELETE CASCADE
TABLE "pref_money" CONSTRAINT "pref_money_id_fkey" FOREIGN KEY
(id) REFERENCES pref_users(id) ON DELETE CASCADE
TABLE "pref_pass" CONSTRAINT "pref_pass_id_fkey" FOREIGN KEY (id)
REFERENCES pref_users(id) ON DELETE CASCADE
TABLE "pref_payment" CONSTRAINT "pref_payment_id_fkey" FOREIGN KEY
(id) REFERENCES pref_users(id) ON DELETE CASCADE
TABLE "pref_rep" CONSTRAINT "pref_rep_author_fkey" FOREIGN KEY
(author) REFERENCES pref_users(id) ON DELETE CASCADE
TABLE "pref_rep" CONSTRAINT "pref_rep_id_fkey" FOREIGN KEY (id)
REFERENCES pref_users(id) ON DELETE CASCADE
TABLE "pref_scores" CONSTRAINT "pref_scores_id_fkey" FOREIGN KEY
(id) REFERENCES pref_users(id) ON DELETE CASCADE
TABLE "pref_status" CONSTRAINT "pref_status_id_fkey" FOREIGN KEY
(id) REFERENCES pref_users(id) ON DELETE CASCADE
TABLE "pref_votes" CONSTRAINT "pref_votes_id_fkey" FOREIGN KEY
(id) REFERENCES pref_users(id) ON DELETE CASCADE
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Alexander Farber wrote:
in a Facebook game running on
PostgreSQL 8.4.13 and having so many players:# select count(*) from pref_users;
count
--------
223964I am trying to get rid of inactive users,
who just visited the canvas page, but
never played (I'm sure, Facebook has
a clever-sounding name for them):# select count(*) from pref_users
where id not in (select distinct id from pref_money);
count
--------
173936
(1 row)So I call:
# delete from pref_users
where id not in (select distinct id from pref_money);but that query lasts forever and
what's more troubling me - it blocks
the async queries of my game daemon
(the Perl function pg_ready starts returning
false all the time and my game accumulates
thousands of yet-to-be-executed SQL queries).The good news is, that my quad server
doesn't hang - I just see 1 postmaster
process at 90-100% CPU but total load is 20%.
I hope it's a backend and not the postmaster.
Also my game daemon in Perl recovers
and executes the thousands of queued
up async queries, when I interrupt the
above DELETE query with CTRL-C at
the pgsql prompt - i.e. my game is not buggy.My question is how handle this?
Why does deleting takes so long,
is it because of CASCADES?And why does it make the pg_ready
calls of my game daemon return false?
The users I'm deleting aren't active,
they shouldn't "intersect" with the
async queries of my game daemon.Below are the both SQL tables involved,
thank you for any insights.
It would help if you send EXPLAIN output for the DELETE statement.
# \d pref_money
Table "public.pref_money"
Column | Type | Modifiers
--------+-----------------------+-----------------------------------------
id | character varying(32) |
money | integer | not null
yw | character(7) | default to_char(now(), 'IYYY-IW'::text)
Indexes:
"pref_money_money_index" btree (money DESC)
"pref_money_yw_index" btree (yw)
Foreign-key constraints:
"pref_money_id_fkey" FOREIGN KEY (id) REFERENCES pref_users(id) ON
DELETE CASCADE
The problem is very likely that you have no index on the "id" column.
That means that for each id deleted from pref_users, the cascading
delete has to perform a full table scan on pref_money to find the
corresponding rows. This is very likely the cause of your problem.
Indeed, this table does not have a primary key. That is usually a bad
idea. How about PRIMARY KEY (id, yw)?
That should speed up the query.
# \d pref_users
Table "public.pref_users"
Column | Type | Modifiers
------------+-----------------------------+--------------------
id | character varying(32) | not null
[...]
Indexes:
"pref_users_pkey" PRIMARY KEY, btree (id)
Referenced by:
[many tables]
While you are at it, check all the other tables referencing pref_users
and make sure that they have an index on the referencing column.
Yours,
Laurenz Albe
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On Fri, 1 Feb 2013 10:38:26 +0100, Alexander Farber wrote:
Hello,
in a Facebook game running on
PostgreSQL 8.4.13 and having so many players:# select count(*) from pref_users;
count
--------
223964I am trying to get rid of inactive users,
who just visited the canvas page, but
never played (I'm sure, Facebook has
a clever-sounding name for them):# select count(*) from pref_users
where id not in (select distinct id from pref_money);
count
--------
173936
(1 row)So I call:
# delete from pref_users
where id not in (select distinct id from pref_money);but that query lasts forever and
what's more troubling me - it blocks
the async queries of my game daemon
(the Perl function pg_ready starts returning
false all the time and my game accumulates
thousands of yet-to-be-executed SQL queries).The good news is, that my quad server
doesn't hang - I just see 1 postmaster
process at 90-100% CPU but total load is 20%.Also my game daemon in Perl recovers
and executes the thousands of queued
up async queries, when I interrupt the
above DELETE query with CTRL-C at
the pgsql prompt - i.e. my game is not buggy.My question is how handle this?
Why does deleting takes so long,
is it because of CASCADES?And why does it make the pg_ready
calls of my game daemon return false?
The users I'm deleting aren't active,
they shouldn't "intersect" with the
async queries of my game daemon.Below are the both SQL tables involved,
thank you for any insights.Regards
Alex# \d pref_money
Table "public.pref_money"
Column | Type | Modifiers
--------+-----------------------+-----------------------------------------
id | character varying(32) |
money | integer | not null
yw | character(7) | default to_char(now(), 'IYYY-IW'::text)
Indexes:
"pref_money_money_index" btree (money DESC)
"pref_money_yw_index" btree (yw)
Foreign-key constraints:
"pref_money_id_fkey" FOREIGN KEY (id) REFERENCES pref_users(id) ON
DELETE CASCADE
I believe an index on pref_money.id could help.
Without it Postgres is, if I understand correctly, making 173936 table scans on pref_money
to try (and fail) to find for each deleted row the referencing row in that table.
# \d pref_users
[...]
--
Michał Politowski
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Hi,
What Albe said about PKs. I'm also not very fond of people using text
in keys (even if it's a varchar or a char field). Test the same thing
with a numeric key and you are likely to see a difference.
Apart from making sure your design is ok, you might want to keep this
stuff well in the background, firing it as a frequent (and small) job.
Sort of "your own private vacuum job".
Something like
delete from pref_users
where
id not in ( select distinct id
from pref_money )
limit 100;
Where 100 may be any number of records you find to be compatible with
a smooth performance. Keep calling this stuff at a suitable interval
(X secs in between each call), and it will silently do the cleaning
without creating giant transactions.
You definitely want to make a good design BEFORE doing this, though.
Cheers
Bèrto
On 1 February 2013 09:38, Alexander Farber <alexander.farber@gmail.com> wrote:
Hello,
in a Facebook game running on
PostgreSQL 8.4.13 and having so many players:# select count(*) from pref_users;
count
--------
223964I am trying to get rid of inactive users,
who just visited the canvas page, but
never played (I'm sure, Facebook has
a clever-sounding name for them):# select count(*) from pref_users
where id not in (select distinct id from pref_money);
count
--------
173936
(1 row)So I call:
# delete from pref_users
where id not in (select distinct id from pref_money);but that query lasts forever and
what's more troubling me - it blocks
the async queries of my game daemon
(the Perl function pg_ready starts returning
false all the time and my game accumulates
thousands of yet-to-be-executed SQL queries).The good news is, that my quad server
doesn't hang - I just see 1 postmaster
process at 90-100% CPU but total load is 20%.Also my game daemon in Perl recovers
and executes the thousands of queued
up async queries, when I interrupt the
above DELETE query with CTRL-C at
the pgsql prompt - i.e. my game is not buggy.My question is how handle this?
Why does deleting takes so long,
is it because of CASCADES?And why does it make the pg_ready
calls of my game daemon return false?
The users I'm deleting aren't active,
they shouldn't "intersect" with the
async queries of my game daemon.Below are the both SQL tables involved,
thank you for any insights.Regards
Alex# \d pref_money
Table "public.pref_money"
Column | Type | Modifiers
--------+-----------------------+-----------------------------------------
id | character varying(32) |
money | integer | not null
yw | character(7) | default to_char(now(), 'IYYY-IW'::text)
Indexes:
"pref_money_money_index" btree (money DESC)
"pref_money_yw_index" btree (yw)
Foreign-key constraints:
"pref_money_id_fkey" FOREIGN KEY (id) REFERENCES pref_users(id) ON
DELETE CASCADE# \d pref_users
Table "public.pref_users"
Column | Type | Modifiers
------------+-----------------------------+--------------------
id | character varying(32) | not null
first_name | character varying(64) |
last_name | character varying(64) |
female | boolean |
avatar | character varying(128) |
city | character varying(64) |
login | timestamp without time zone | default now()
last_ip | inet |
logout | timestamp without time zone |
vip | timestamp without time zone |
mail | character varying(254) |
medals | integer | not null default 0
Indexes:
"pref_users_pkey" PRIMARY KEY, btree (id)
Referenced by:
TABLE "pref_cards" CONSTRAINT "pref_cards_id_fkey" FOREIGN KEY
(id) REFERENCES pref_users(id) ON DELETE CASCADE
TABLE "pref_catch" CONSTRAINT "pref_catch_id_fkey" FOREIGN KEY
(id) REFERENCES pref_users(id) ON DELETE CASCADE
TABLE "pref_chat" CONSTRAINT "pref_chat_id_fkey" FOREIGN KEY (id)
REFERENCES pref_users(id) ON DELETE CASCADE
TABLE "pref_discuss" CONSTRAINT "pref_discuss_id_fkey" FOREIGN KEY
(id) REFERENCES pref_users(id) ON DELETE CASCADE
TABLE "pref_game" CONSTRAINT "pref_game_id_fkey" FOREIGN KEY (id)
REFERENCES pref_users(id) ON DELETE CASCADE
TABLE "pref_luck" CONSTRAINT "pref_luck_id_fkey" FOREIGN KEY (id)
REFERENCES pref_users(id) ON DELETE CASCADE
TABLE "pref_match" CONSTRAINT "pref_match_id_fkey" FOREIGN KEY
(id) REFERENCES pref_users(id) ON DELETE CASCADE
TABLE "pref_misere" CONSTRAINT "pref_misere_id_fkey" FOREIGN KEY
(id) REFERENCES pref_users(id) ON DELETE CASCADE
TABLE "pref_money" CONSTRAINT "pref_money_id_fkey" FOREIGN KEY
(id) REFERENCES pref_users(id) ON DELETE CASCADE
TABLE "pref_pass" CONSTRAINT "pref_pass_id_fkey" FOREIGN KEY (id)
REFERENCES pref_users(id) ON DELETE CASCADE
TABLE "pref_payment" CONSTRAINT "pref_payment_id_fkey" FOREIGN KEY
(id) REFERENCES pref_users(id) ON DELETE CASCADE
TABLE "pref_rep" CONSTRAINT "pref_rep_author_fkey" FOREIGN KEY
(author) REFERENCES pref_users(id) ON DELETE CASCADE
TABLE "pref_rep" CONSTRAINT "pref_rep_id_fkey" FOREIGN KEY (id)
REFERENCES pref_users(id) ON DELETE CASCADE
TABLE "pref_scores" CONSTRAINT "pref_scores_id_fkey" FOREIGN KEY
(id) REFERENCES pref_users(id) ON DELETE CASCADE
TABLE "pref_status" CONSTRAINT "pref_status_id_fkey" FOREIGN KEY
(id) REFERENCES pref_users(id) ON DELETE CASCADE
TABLE "pref_votes" CONSTRAINT "pref_votes_id_fkey" FOREIGN KEY
(id) REFERENCES pref_users(id) ON DELETE CASCADE--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
--
==============================
If Pac-Man had affected us as kids, we'd all be running around in a
darkened room munching pills and listening to repetitive music.
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Thanks, I will add an index(id, yw) and check.
The limit suggestion (yes, I already have a "vacuum"
cronjob to purge spam-users from drupal_users etc.)
doesn't work:
# delete from pref_users
where id not in (select distinct id from pref_money) limit 10;
ERROR: syntax error at or near "limit"
LINE 2: ...ere id not in (select distinct id from pref_money) limit 10;
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
righto. You need a CTE to do that
create table deletable (
id bigint not null primary key);
create table condition ( id bigint not null primary key);
insert into deletable select generate_series(1,500);
insert into condition select generate_series(1,50);
WITH target AS (
select id
from deletable d
where
d.id not in (select id from condition)
limit 10 )
delete
from deletable
where
id in (select id from target);
This will open a hole from 51 to 60, next time from 61 to 70 etc...
Cheers
Bèrto
On 1 February 2013 11:30, Alexander Farber <alexander.farber@gmail.com> wrote:
Thanks, I will add an index(id, yw) and check.
The limit suggestion (yes, I already have a "vacuum"
cronjob to purge spam-users from drupal_users etc.)
doesn't work:# delete from pref_users
where id not in (select distinct id from pref_money) limit 10;
ERROR: syntax error at or near "limit"
LINE 2: ...ere id not in (select distinct id from pref_money) limit 10;--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
--
==============================
If Pac-Man had affected us as kids, we'd all be running around in a
darkened room munching pills and listening to repetitive music.
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
# delete from pref_users
where id not in (select distinct id from pref_money) limit 10;
ERROR: syntax error at or near "limit"
LINE 2: ...ere id not in (select distinct id from pref_money) limit 10;
Or this?
DELETE FROM pref_users
WHERE id IN (
SELECT id
FROM pref_users
WHERE id NOT IN (
SELECT DISTINCT id
FROM pref_money
)
LIMIT 10
);
Saw elsewhere another suggestion with EXCEPT, which is also sexy.
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general