Asynchronous I/O in Postgres
Postgres 8.4 and 9.0 have a parameter named
"effective_io_concurrency". The manual page is very short; it says the
following:
Sets the number of concurrent disk I/O operations that PostgreSQL
expects can be executed simultaneously. Raising this value will increase
the number of I/O operations that any individual PostgreSQL session
attempts to initiate in parallel. The allowed range is 1 to 1000, or
zero to disable issuance of asynchronous I/O requests.
http://www.postgresql.org/docs/current/static/runtime-config-resource.html
My initial understanding was that this was the size of a table of
aiocb pointers, so that PgSQL could launch up to 1000 simultaneous
aio_read or aio_write calls per process. While monitoring the
system, I noticed that there is no asynchronous I/O at all! Nothing,
nada, zilch! Then I noticed that the "postgres" binary is not even
linked with libaio, so aio_read was out of the question:
-bash-3.2$ ldd postgres|grep libaio
-bash-3.2$
The platform is Postgres 9.0.1 on RH EL 5.5 x86-64. My understanding of
"effective_io_concurrency" was apparently very wrong. What is the
"effective concurrency", and what are those "simultaneous I/O requests"
that the manual page is talking about? Can somebody please define in
precise terms what this parameter defines? What kind of "concurrent
I/O" is Postgres doing without asynchronous I/O calls? If this parameter
is just a stub for future use, I'd like to know. Will Postgres
use asynchronous I/O? Is that planned?
--
Mladen Gogala
Sr. Oracle DBA
1500 Broadway
New York, NY 10036
(212) 329-5251
www.vmsinfo.com
Mladen Gogala wrote:
> The platform is Postgres 9.0.1 on RH EL 5.5 x86-64. My understanding of
> the "effective_io_concurrency" was apparently very wrong. What is the
> "effective concurrency" and what are those "simultaneous I/O requests"
> that man page is talking about. Can somebody please define in precise
> terms what is it that this parameter defines? What kind of "concurrent
> I/O" is Postgres doing without asynchronous I/O calls? If this parameter
> is just a stub for the future reference, I'd like to know. Will Postgres
> use asynchronous I/O? Is that planned?
The mystery deepens. I thought that this might be the size of the I/O
vector for the readv and writev routines, but not so. I ran
"ltrace -e readv -p <PID>" on a PID that was doing a large sequential
scan, and not a single "readv" library call was encountered. All calls
were just plain and simple "read" calls. Where is the concurrency? I am
really curious now. The LWN article pompously announced that PostgreSQL
9.0 will use asynchronous I/O, with aio_read and aio_write. What does
effective_io_concurrency define? What kind of "concurrent I/O" is
PostgreSQL doing? This doesn't look very "concurrent":
read(65, "\16\0\0\0\210\254\333\240\1\0\4\0L\0P\0\0 \4
\0\0\0\0000\231\240\r\370\227p\2"..., 8192) = 8192
read(65, "\16\0\0\0000\1\334\240\1\0\4\0008\0 \1\0 \4
\0\0\0\0(\234\260\7`\231\220\5"..., 8192) = 8192
read(65, "\16\0\0\0\20;\334\240\1\0\4\0<\0(\1\0 \4
\0\0\0\0\360\233\36\10\320\232@\2"..., 8192) = 8192
read(65, "\16\0\0\0Pk\334\240\1\0\4\0004\0\300\0\0 \4 \0\0\0\0H\232p\v
\224P\f"..., 8192) = 8192
read(65, "\16\0\0\0\220\273C\241\1\0\4\0D\0p\0\0 \4
\0\0\0\0\230\234\320\6\220\233\16\2"..., 8192) = 8192
read(65, "\16\0\0\0P\311\335\240\1\0\4\0<\0008\1\0 \4
\0\0\0\0\240\231\300\fp\230`\2"..., 8192) = 8192
read(65, "\16\0\0\0\260*\335\240\1\0\4\0008\0\350\0\0 \4
\0\0\0\0\20\230\340\0178\224\256\7"..., 8192) = 8192
read(65, "\16\0\0\0\20\10\337\240\1\0\4\0004\0h\0\0 \4
\0\0\0\0\210\231\356\f\230\225\340\7"..., 8192) = 8192
read(65, "\16\0\0\0\220\310C\241\1\0\4\0@\0\260\0\0 \4
\0\0\0\0H\231p\r0\227.\4"..., 8192) = 8192
read(65, "\16\0\0\0\350-\301\241\1\0\4\0<\0X\0\0 \4
\0\0\0\0H\232p\v\0\226\216\10"..., 8192) = 8192
Descriptor 65 is a DB file:
[root@lpo-postgres-01 ~]# cd /proc/16663/fd
[root@lpo-postgres-01 fd]# ls -l 65
lrwx------ 1 postgres postgres 64 Oct 7 23:26 65 ->
/software/pgsql/m-over/PG_9.0_201008051/16417/1572186.7
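The /proc lookup above can also be scripted when several descriptors need checking. This is only a minimal sketch, assuming a Linux system with /proc mounted and enough privileges to inspect the target process; the helper name fd_target is made up for illustration.

```python
import os

def fd_target(pid, fd):
    """Return the path that descriptor `fd` of process `pid` points to.

    Hypothetical helper: it reads the /proc/<pid>/fd/<fd> symlink,
    which is what "ls -l" displays in the transcript above. Linux only.
    """
    return os.readlink(f"/proc/{pid}/fd/{fd}")
```

For the session above, fd_target(16663, 65) would be expected to return the relation file that the backend is reading.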
So, essentially, the process is reading block by block, in a sequence.
What, exactly, does "effective_io_concurrency" mean?
Mladen Gogala wrote:
> So, essentially, the process is reading block by block, in a sequence.
> What, exactly, does "effective_io_concurrency" mean?
To rephrase my question, can anybody tell me where in the code is it used?
On Fri, Oct 8, 2010 at 8:14 AM, Mladen Gogala <mladen.gogala@vmsinfo.com> wrote:
> Mladen Gogala wrote:
>> So, essentially, the process is reading block by block, in a sequence.
>> What, exactly, does "effective_io_concurrency" mean?
> To rephrase my question, can anybody tell me where in the code is it used?
The docs are a bit sparse here :-(
But it looks to me like effective_io_concurrency only affects bitmap
heap scans. The setting from effective_io_concurrency is stored in
"target_prefetch_pages" in ./src/backend/utils/misc/guc.c, but the
only place that uses that variable is
./src/backend/executor/nodeBitmapHeapscan.c.
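The value stored in target_prefetch_pages is not necessarily the raw setting. As far as I recall from the 9.0-era guc.c, the assign hook sums n/i for i = 1..n (n being effective_io_concurrency) and rounds the result, on the reasoning that more than n outstanding requests are needed to keep n drives busy. Please treat that formula as an assumption to verify against your own source tree; the function name below is made up for illustration.

```python
def target_prefetch_pages(effective_io_concurrency):
    """Sketch of the assumed mapping from the GUC to a prefetch distance.

    Assumption (not confirmed in this thread): the server computes
    n * (1/1 + 1/2 + ... + 1/n) and rounds to the nearest integer.
    """
    n = effective_io_concurrency
    total = 0.0
    for i in range(1, n + 1):
        total += n / i
    return round(total)
```

Under that assumption a setting of 1 gives a prefetch distance of 1 page, 2 gives 3, and 4 gives 8, while 0 disables prefetching entirely.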
The EnterpriseDB docs
<http://www.enterprisedb.com/docs/en/8.3R2/perf/Postgres_Plus_Advanced_Server_Performance_Guide-17.htm>
mention:
"effective_io_concurrency is only used for Bitmap Heap Scans. For
normal sequential scans the operating system should handle read-ahead
internally (On Linux, see the blockdev command, in particular --setra
and --setfra)."
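For the question of what "concurrent I/O" can look like without aio_read: one way is to hint the kernel about blocks that will be needed soon and then read them with ordinary blocking calls. To my knowledge, the bitmap heap scan prefetch does this with posix_fadvise(POSIX_FADV_WILLNEED), which would explain why the binary is not linked with libaio, but take that as an assumption rather than something established in this thread. The sketch below shows the general technique only; read_blocks_with_prefetch, _advise_willneed and prefetch_depth are hypothetical names, and it assumes a Unix-like system.

```python
import os

BLOCK_SIZE = 8192  # matches the 8192-byte reads in the trace above

def _advise_willneed(fd, blkno):
    # Ask the kernel to start fetching the block; this returns immediately.
    # Skipped silently on platforms without posix_fadvise.
    if hasattr(os, "posix_fadvise"):
        os.posix_fadvise(fd, blkno * BLOCK_SIZE, BLOCK_SIZE,
                         os.POSIX_FADV_WILLNEED)

def read_blocks_with_prefetch(path, block_numbers, prefetch_depth=4):
    """Read blocks in the given order, hinting up to prefetch_depth ahead."""
    fd = os.open(path, os.O_RDONLY)
    try:
        pages = []
        next_hint = 0  # index of the next entry that still needs a hint
        for i, blkno in enumerate(block_numbers):
            # Stay at most prefetch_depth blocks ahead of the current read.
            limit = min(len(block_numbers), i + 1 + prefetch_depth)
            next_hint = max(next_hint, i + 1)
            while next_hint < limit:
                _advise_willneed(fd, block_numbers[next_hint])
                next_hint += 1
            # The actual read is still a plain, synchronous call.
            pages.append(os.pread(fd, BLOCK_SIZE, blkno * BLOCK_SIZE))
        return pages
    finally:
        os.close(fd)
```

A trace of such a process still shows only ordinary read calls; the concurrency happens inside the kernel, which can work on several hinted blocks at once. A sequential scan gains little from this because OS read-ahead already covers it.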
Josh
Josh Kupershmidt wrote:
> On Fri, Oct 8, 2010 at 8:14 AM, Mladen Gogala <mladen.gogala@vmsinfo.com> wrote:
>> Mladen Gogala wrote:
>>> So, essentially, the process is reading block by block, in a sequence.
>>> What, exactly, does "effective_io_concurrency" mean?
>> To rephrase my question, can anybody tell me where in the code is it used?
> The docs are a bit sparse here :-(
> But it looks to me like effective_io_concurrency only affects bitmap
> heap scans. The setting from effective_io_concurrency gets put into
> "target_prefetch_pages" in ./src/backend/utils/misc/guc.c . But the
> only place which uses that variable is
> ./src/backend/executor/nodeBitmapHeapscan.c.
> The EnterpriseDB docs
> <http://www.enterprisedb.com/docs/en/8.3R2/perf/Postgres_Plus_Advanced_Server_Performance_Guide-17.htm>
> mention:
> "effective_io_concurrency is only used for Bitmap Heap Scans. For
> normal sequential scans the operating system should handle read-ahead
> internally (On Linux, see the blockdev command, in particular --setra
> and --setfra)."
So, is this also true for community Postgres? Can someone suggest
updated docs?
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
[moving to -docs]
On Wed, Oct 20, 2010 at 10:30 PM, Bruce Momjian <bruce@momjian.us> wrote:
> Josh Kupershmidt wrote:
>> But it looks to me like effective_io_concurrency only affects bitmap
>> heap scans. The setting from effective_io_concurrency gets put into
>> "target_prefetch_pages" in ./src/backend/utils/misc/guc.c . But the
>> only place which uses that variable is
>> ./src/backend/executor/nodeBitmapHeapscan.c.
>> The EnterpriseDB docs
>> <http://www.enterprisedb.com/docs/en/8.3R2/perf/Postgres_Plus_Advanced_Server_Performance_Guide-17.htm>
>> mention:
>> "effective_io_concurrency is only used for Bitmap Heap Scans. For
>> normal sequential scans the operating system should handle read-ahead
>> internally (On Linux, see the blockdev command, in particular --setra
>> and --setfra)."
> So, is this also true for community Postgres? Can someone suggest
> updated docs?
It looks like effective_io_concurrency only has an impact on bitmap
heap scans. I think a brief mention of this fact in the docs for
effective_io_concurrency should suffice, patch attached.
Josh
Attachments:
effective_io_concurrency-doc.patch (application/octet-stream, +3 -3)
On Tue, Oct 26, 2010 at 8:46 PM, Josh Kupershmidt <schmiddy@gmail.com> wrote:
> It looks like effective_io_concurrency only has an impact on bitmap
> heap scans. I think a brief mention of this fact in the docs for
> effective_io_concurrency should suffice, patch attached.
Good idea, committed.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company