Interesting new bug?

Started by Tim Perdueover 25 years ago4 messages
#1Tim Perdue
tim@sourceforge.net

Allright, I'm running 7.0.2 with Tom Lane's backwards index scan patch
applied.

I'm attempting to select out of a large table (10GB) with about 4
million rows, and it winds up just sitting and doing "nothing" forever.
If I check the process list, I see it using about 9% of the CPU.

This table is vacuum analyzed nightly - here's a description and EXPLAIN
from the query I'm trying to run.

Any ideas? I haven't been able to run the admin pages on Geocrawler ever
since I upgraded to 7.0.2

Tim

db_geocrawler=# \d tbl_mail_archive
Table "tbl_mail_archive"
Attribute | Type |
Modifier
----------------------+----------+----------------------------------------------
fld_mailid | integer | not null default
nextval('seq_mailid'::text)
fld_mail_list | integer |
fld_mail_date | char(14) |
fld_mail_is_followup | integer |
fld_mail_from | text |
fld_mail_subject | text |
fld_mail_body | text |
fld_mail_email | text |
fld_mail_year | integer |
fld_mail_month | integer |
Indices: idx_archive_list,
idx_archive_list_date,
idx_archive_year,
idx_mail_archive_list_yr_mo,
tbl_mail_archive_pkey

I'm manually deleting the rows without knowing what they are - and
that's bad - this query shows that the rows do exist, but for some
reason you can't select them out of the db.

db_geocrawler=# begin;
BEGIN
db_geocrawler=# delete from tbl_mail_archive where fld_mail_list=0;
DELETE 1032
db_geocrawler=# delete from tbl_mail_chunks where fld_mail_list=0;
DELETE 39
db_geocrawler=# commit;
COMMIT

db_geocrawler=# explain SELECT * FROM tbl_mail_archive WHERE
fld_mail_list=0 ORDER BY fld_mailid ASC LIMIT 10 OFFSET 0;
NOTICE: QUERY PLAN:

Index Scan using tbl_mail_archive_pkey on tbl_mail_archive
(cost=0.00..6402391.68 rows=19357 width=80)

EXPLAIN

--
Founder - PHPBuilder.com / Geocrawler.com
Lead Developer - SourceForge
VA Linux Systems
408-542-5723

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tim Perdue (#1)
Re: Interesting new bug?

Tim Perdue <tim@sourceforge.net> writes:

I'm attempting to select out of a large table (10GB) with about 4
million rows, and it winds up just sitting and doing "nothing" forever.

db_geocrawler=# explain SELECT * FROM tbl_mail_archive WHERE
fld_mail_list=0 ORDER BY fld_mailid ASC LIMIT 10 OFFSET 0;
NOTICE: QUERY PLAN:

Index Scan using tbl_mail_archive_pkey on tbl_mail_archive
(cost=0.00..6402391.68 rows=19357 width=80)

Interesting. Since there's no explicit sort in the plan, I infer that
index tbl_mail_archive_pkey is on fld_mailid, meaning that the indexscan
yields data already sorted by fld_mailid --- otherwise a sort step would
be needed. Evidently the optimizer is guessing that "scan in fld_mailid
order until you have 10 rows where fld_mail_list=0" is faster than
"find all rows with fld_mail_list=0 and then sort by fld_mailid".

Since you're complaining, I guess that this is not so :-( ... but I'm
not sure how the optimizer might be taught to guess that. What exactly
are the indexes *on* here; how many rows are in the table; and how many
rows satisfy fld_mail_list=0?

regards, tom lane

#3Tim Perdue
tim@sourceforge.net
In reply to: Tim Perdue (#1)
Re: Interesting new bug?

Tom Lane wrote:

Tim Perdue <tim@sourceforge.net> writes:

I'm attempting to select out of a large table (10GB) with about 4
million rows, and it winds up just sitting and doing "nothing" forever.

db_geocrawler=# explain SELECT * FROM tbl_mail_archive WHERE
fld_mail_list=0 ORDER BY fld_mailid ASC LIMIT 10 OFFSET 0;
NOTICE: QUERY PLAN:

Index Scan using tbl_mail_archive_pkey on tbl_mail_archive
(cost=0.00..6402391.68 rows=19357 width=80)

Interesting. Since there's no explicit sort in the plan, I infer that
index tbl_mail_archive_pkey is on fld_mailid, meaning that the indexscan
yields data already sorted by fld_mailid --- otherwise a sort step would
be needed. Evidently the optimizer is guessing that "scan in fld_mailid
order until you have 10 rows where fld_mail_list=0" is faster than
"find all rows with fld_mail_list=0 and then sort by fld_mailid".

Since you're complaining, I guess that this is not so :-( ... but I'm
not sure how the optimizer might be taught to guess that. What exactly
are the indexes *on* here; how many rows are in the table; and how many
rows satisfy fld_mail_list=0?

There is an index on fld_mail_list and there were 1093 rows that matched
out of about 4.1 million.

I wonder if this is the same problem we had before where I need to order
by fld_mail_list, fld_mailid instead of just on fld_mailid. If so, you
need to get that fixed in the optimizer.

db_geocrawler=# explain
db_geocrawler-# SELECT * FROM tbl_mail_archive WHERE
db_geocrawler-# fld_mail_list=0 ORDER BY fld_mail_list ASC,fld_mailid
ASC LIMIT 10 OFFSET 0;
NOTICE: QUERY PLAN:

Sort (cost=78282.54..78282.54 rows=19357 width=80)
-> Index Scan using idx_archive_list on tbl_mail_archive
(cost=0.00..76904.24 rows=19357 width=80)

EXPLAIN

Notice how it is now using the right index, because I am doing a sort on
fld_mail_list first.

Tim

--
Founder - PHPBuilder.com / Geocrawler.com
Lead Developer - SourceForge
VA Linux Systems
408-542-5723

#4Tim Perdue
tim@sourceforge.net
In reply to: Tim Perdue (#1)
Re: Interesting new bug?

What did you think of this? I fixed my problem by changing my query -
but I shouldn't have had to. This looks like a weakness in your
optimizer, having to first sort on criteria that you don't care about.

Tim

Tim Perdue wrote:

Tom Lane wrote:

Tim Perdue <tim@sourceforge.net> writes:

I'm attempting to select out of a large table (10GB) with about 4
million rows, and it winds up just sitting and doing "nothing" forever.

db_geocrawler=# explain SELECT * FROM tbl_mail_archive WHERE
fld_mail_list=0 ORDER BY fld_mailid ASC LIMIT 10 OFFSET 0;
NOTICE: QUERY PLAN:

Index Scan using tbl_mail_archive_pkey on tbl_mail_archive
(cost=0.00..6402391.68 rows=19357 width=80)

Interesting. Since there's no explicit sort in the plan, I infer that
index tbl_mail_archive_pkey is on fld_mailid, meaning that the indexscan
yields data already sorted by fld_mailid --- otherwise a sort step would
be needed. Evidently the optimizer is guessing that "scan in fld_mailid
order until you have 10 rows where fld_mail_list=0" is faster than
"find all rows with fld_mail_list=0 and then sort by fld_mailid".

Since you're complaining, I guess that this is not so :-( ... but I'm
not sure how the optimizer might be taught to guess that. What exactly
are the indexes *on* here; how many rows are in the table; and how many
rows satisfy fld_mail_list=0?

There is an index on fld_mail_list and there were 1093 rows that matched
out of about 4.1 million.

I wonder if this is the same problem we had before where I need to order
by fld_mail_list, fld_mailid instead of just on fld_mailid. If so, you
need to get that fixed in the optimizer.

db_geocrawler=# explain
db_geocrawler-# SELECT * FROM tbl_mail_archive WHERE
db_geocrawler-# fld_mail_list=0 ORDER BY fld_mail_list ASC,fld_mailid
ASC LIMIT 10 OFFSET 0;
NOTICE: QUERY PLAN:

Sort (cost=78282.54..78282.54 rows=19357 width=80)
-> Index Scan using idx_archive_list on tbl_mail_archive
(cost=0.00..76904.24 rows=19357 width=80)

EXPLAIN

Notice how it is now using the right index, because I am doing a sort on
fld_mail_list first.

--
Founder - PHPBuilder.com / Geocrawler.com
Lead Developer - SourceForge
VA Linux Systems
408-542-5723