select statment going slow and slow while using IN (xx,xx)

Started by Yan Chunluabout 14 years ago3 messagesgeneral
Jump to latest
#1Yan Chunlu
springrider@gmail.com

I am transforming a db with millions records to anther schema. for
some reason I need to select the records using IN (xx,xx).

the ids in the IN was about 1000 recored every time, but I found the
query was getting slow while the selection moving on.

the shared buffer is 2048M. and the cpu and io usage is as normal.

But while I am processing a smaller db on another machine, which only
has the default 24MB, the selection went very smooth.

here is the logs:

start... 0 limit:1000
selection time: 0.140721082687
sort_options
msgtime
start... 1000 limit:1000
selection time: 0.122759103775
start... 2000 limit:1000
selection time: 0.150802850723
start... 3000 limit:1000
selection time: 0.173918008804
start... 4000 limit:1000
selection time: 0.212812900543
start... 5000 limit:1000
selection time: 0.255054950714
start... 6000 limit:1000
selection time: 0.230540037155
start... 7000 limit:1000
selection time: 0.24426317215
start... 8000 limit:1000
selection time: 0.326669931412
start... 9000 limit:1000
selection time: 0.351358175278
start... 10000 limit:1000
selection time: 0.386382102966
start... 11000 limit:1000
selection time: 0.440491914749
start... 12000 limit:1000
selection time: 0.443608045578
start... 13000 limit:1000
selection time: 0.49751496315
start... 14000 limit:1000
selection time: 12.0050361156
start... 15000 limit:1000
selection time: 26.3596658707

start... 16000 limit:1000
selection time: 43.5269529819

#2Yan Chunlu
springrider@gmail.com
In reply to: Yan Chunlu (#1)
Re: select statment going slow and slow while using IN (xx,xx)

I also tried explain but found nothing special:

explain select * from data_table where thing_id in
(164438,112478,102941,112377,164442,181764,104028);
QUERY PLAN
-------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on data_table (cost=31.75..579.10 rows=141 width=53)
Recheck Cond: (thing_id = ANY
('{164438,112478,102941,112377,164442,181764,104028}'::bigint[]))
-> Bitmap Index Scan on idx_data_table (cost=0.00..31.71 rows=141 width=0)
Index Cond: (thing_id = ANY
('{164438,112478,102941,112377,164442,181764,104028}'::bigint[]))

Show quoted text

On Thu, Jan 12, 2012 at 4:47 PM, Yan Chunlu <springrider@gmail.com> wrote:

I am transforming a db with millions records to anther schema.  for
some reason I need to select the records using IN (xx,xx).

the ids in the IN was about 1000 recored every time, but I found the
query was getting slow while the selection moving on.

the shared buffer is 2048M.   and the cpu and io usage is as normal.

But while I am processing a smaller db on another machine, which only
has the default 24MB, the selection went very smooth.

here is the logs:

start... 0 limit:1000
selection time: 0.140721082687
sort_options
msgtime
start... 1000 limit:1000
selection time: 0.122759103775
start... 2000 limit:1000
selection time: 0.150802850723
start... 3000 limit:1000
selection time: 0.173918008804
start... 4000 limit:1000
selection time: 0.212812900543
start... 5000 limit:1000
selection time: 0.255054950714
start... 6000 limit:1000
selection time: 0.230540037155
start... 7000 limit:1000
selection time: 0.24426317215
start... 8000 limit:1000
selection time: 0.326669931412
start... 9000 limit:1000
selection time: 0.351358175278
start... 10000 limit:1000
selection time: 0.386382102966
start... 11000 limit:1000
selection time: 0.440491914749
start... 12000 limit:1000
selection time: 0.443608045578
start... 13000 limit:1000
selection time: 0.49751496315
start... 14000 limit:1000
selection time: 12.0050361156
start... 15000 limit:1000
selection time: 26.3596658707

start... 16000 limit:1000
selection time: 43.5269529819

#3Yan Chunlu
springrider@gmail.com
In reply to: Yan Chunlu (#2)
Re: select statment going slow and slow while using IN (xx,xx)

seems similar to this problem:
http://stackoverflow.com/questions/5198380/improving-postgres-psycopg2-query-performance-for-python-to-the-same-level-of-ja

but no solution yet.

Show quoted text

On Thu, Jan 12, 2012 at 5:00 PM, Yan Chunlu <springrider@gmail.com> wrote:

I also tried explain but found nothing special:

explain select * from data_table where thing_id in
(164438,112478,102941,112377,164442,181764,104028);
                                             QUERY PLAN
-------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on data_table  (cost=31.75..579.10 rows=141 width=53)
  Recheck Cond: (thing_id = ANY
('{164438,112478,102941,112377,164442,181764,104028}'::bigint[]))
  ->  Bitmap Index Scan on idx_data_table  (cost=0.00..31.71 rows=141 width=0)
        Index Cond: (thing_id = ANY
('{164438,112478,102941,112377,164442,181764,104028}'::bigint[]))

On Thu, Jan 12, 2012 at 4:47 PM, Yan Chunlu <springrider@gmail.com> wrote:

I am transforming a db with millions records to anther schema.  for
some reason I need to select the records using IN (xx,xx).

the ids in the IN was about 1000 recored every time, but I found the
query was getting slow while the selection moving on.

the shared buffer is 2048M.   and the cpu and io usage is as normal.

But while I am processing a smaller db on another machine, which only
has the default 24MB, the selection went very smooth.

here is the logs:

start... 0 limit:1000
selection time: 0.140721082687
sort_options
msgtime
start... 1000 limit:1000
selection time: 0.122759103775
start... 2000 limit:1000
selection time: 0.150802850723
start... 3000 limit:1000
selection time: 0.173918008804
start... 4000 limit:1000
selection time: 0.212812900543
start... 5000 limit:1000
selection time: 0.255054950714
start... 6000 limit:1000
selection time: 0.230540037155
start... 7000 limit:1000
selection time: 0.24426317215
start... 8000 limit:1000
selection time: 0.326669931412
start... 9000 limit:1000
selection time: 0.351358175278
start... 10000 limit:1000
selection time: 0.386382102966
start... 11000 limit:1000
selection time: 0.440491914749
start... 12000 limit:1000
selection time: 0.443608045578
start... 13000 limit:1000
selection time: 0.49751496315
start... 14000 limit:1000
selection time: 12.0050361156
start... 15000 limit:1000
selection time: 26.3596658707

start... 16000 limit:1000
selection time: 43.5269529819