Sequential scan faster than index

Started by Arthur Ramseyabout 3 years ago3 messagesgeneral
Jump to latest
#1Arthur Ramsey
arthur.ramsey@code42.com

I'm trying to figure out why a sequential scan is out performing. I've
tried psql 13.7, psql14.6 and REINDEX. The REINDEX didn't help. This is
on an RDS instance that's a db.m5.large (2 * vCPU, 8 GB memory) with 200
storage on io1 with 10000 provisioned IOPS. I restarted the RDS instance
in between each test.

Schema: https://pastebin.com/bZvTLYGa

psql 13.7: https://pastebin.com/QtHXJtxq
psql 14.6: https://pastebin.com/PfVbCPJp
psql 14.6 with enable_seqscan=OFF: https://pastebin.com/i0g2NfUh
psql 14.6 with enable_sort=OFF: https://pastebin.com/9v9SGunw

Thanks,
Arthur

#2Luca Ferrari
fluca1978@gmail.com
In reply to: Arthur Ramsey (#1)
Re: Sequential scan faster than index

On Tue, Feb 21, 2023 at 10:30 AM Arthur Ramsey <arthur.ramsey@code42.com> wrote:

I'm trying to figure out why a sequential scan is out performing. I've tried psql 13.7, psql14.6 and REINDEX. The REINDEX didn't help. This is on an RDS instance that's a db.m5.large (2 * vCPU, 8 GB memory) with 200 storage on io1 with 10000 provisioned IOPS. I restarted the RDS instance in between each test.

At glance, it seems there are very few rows to operate on:

-> Seq Scan on t_status_175000000_176000000 t_status_39
(cost=0.00..13.80 rows=380 width=14) (actual time=0.009..0.010 rows=0
loops=1)

#3David Rowley
dgrowleyml@gmail.com
In reply to: Arthur Ramsey (#1)
Re: Sequential scan faster than index

On Tue, 21 Feb 2023 at 22:30, Arthur Ramsey <arthur.ramsey@code42.com> wrote:

I'm trying to figure out why a sequential scan is out performing. I've tried psql 13.7, psql14.6 and REINDEX. The REINDEX didn't help. This is on an RDS instance that's a db.m5.large (2 * vCPU, 8 GB memory) with 200 storage on io1 with 10000 provisioned IOPS. I restarted the RDS instance in between each test.

You'll probably see something more meaningful if you SET
track_io_timing = on; and EXPLAIN (ANALYZE, BUFFERS). There are
plenty of genuine reasons for seq scans being faster than scanning an
index. Scanning large indexes may require large amounts of random I/O,
especially when the indexes are larger than the RAM in the machine.
I've no idea how large these indexes are, but the row counts in your
plans seem to indicate that you're dealing with something in the order
of 30 million rows. If your disk latency is high then scanning an
entire large index that's mostly not in RAM is going to not perform
great. You'll likely see that I/O reads are taking longer with the
index scan plan than with the seq scan plan. That might change if you
were to do something like CLUSTER the partition by the index you're
using here. However, clustering on that index might make various
other queries slower. It's also important to read the documents about
CLUSTER and understand what it does and what it does not do too.

The fact that the planner wants to use the seq scan and if you disable
enable_seqscan and it becomes slower, then that's an indication that
the planner is likely making the correct choice. The problem might
just be with your expectations of index scan performance.

David