Asynchronous I/O in Postgres

Started by Mladen Gogala over 15 years ago | 7 messages | docs
#1 Mladen Gogala
mladen.gogala@vmsinfo.com

Postgres 8.4 and 9.0 have a parameter named
"effective_io_concurrency". The manual page is very short; it says the
following:

Sets the number of concurrent disk I/O operations that PostgreSQL
expects can be executed simultaneously. Raising this value will increase
the number of I/O operations that any individual PostgreSQL session
attempts to initiate in parallel. The allowed range is 1 to 1000, or
zero to disable issuance of asynchronous I/O requests.

http://www.postgresql.org/docs/current/static/runtime-config-resource.html

My initial understanding was that this was the size of the table
containing aiocb pointers, so that PgSQL could launch up to 1000
simultaneous aio_read or aio_write calls per process. While monitoring the
system, I noticed that there is no asynchronous I/O at all! Nothing,
nada, zilch! Then I noticed that the "postgres" binary is not even
linked with libaio, so aio_read was out of the question:

-bash-3.2$ ldd postgres|grep libaio
-bash-3.2$

The platform is Postgres 9.0.1 on RH EL 5.5 x86-64. My understanding of
the "effective_io_concurrency" was apparently very wrong. What is the
"effective concurrency" and what are those "simultaneous I/O requests"
that the manual page is talking about? Can somebody please define in precise
terms what is it that this parameter defines? What kind of "concurrent
I/O" is Postgres doing without asynchronous I/O calls? If this parameter
is just a stub for the future reference, I'd like to know. Will Postgres
use asynchronous I/O? Is that planned?

--
Mladen Gogala
Sr. Oracle DBA
1500 Broadway
New York, NY 10036
(212) 329-5251
www.vmsinfo.com

#2 Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Mladen Gogala (#1)
Re: Asynchronous I/O in Postgres

Mladen Gogala wrote:

The platform is Postgres 9.0.1 on RH EL 5.5 x86-64. My understanding of
the "effective_io_concurrency" was apparently very wrong. What is the
"effective concurrency" and what are those "simultaneous I/O requests"
that the manual page is talking about? Can somebody please define in precise
terms what is it that this parameter defines? What kind of "concurrent
I/O" is Postgres doing without asynchronous I/O calls? If this parameter
is just a stub for the future reference, I'd like to know. Will Postgres
use asynchronous I/O? Is that planned?

The mystery deepens. I thought that this might be the size of the I/O
vector for the readv and writev routines, but not so. I ran
"ltrace -e readv -p <PID>" on a PID that was doing a large sequential
scan and not a single "readv" library call was encountered. All calls
were just plain and simple "read" calls. Where is the concurrency? I am
really curious now. The LWN article pompously announced that PostgreSQL
9.0 will use asynchronous I/O, with aio_read and aio_write. What does
effective_io_concurrency define? What kind of "concurrent I/O" is
PostgreSQL doing? This doesn't look very "concurrent":

read(65, "\16\0\0\0\210\254\333\240\1\0\4\0L\0P\0\0 \4
\0\0\0\0000\231\240\r\370\227p\2"..., 8192) = 8192
read(65, "\16\0\0\0000\1\334\240\1\0\4\0008\0 \1\0 \4
\0\0\0\0(\234\260\7`\231\220\5"..., 8192) = 8192
read(65, "\16\0\0\0\20;\334\240\1\0\4\0<\0(\1\0 \4
\0\0\0\0\360\233\36\10\320\232@\2"..., 8192) = 8192
read(65, "\16\0\0\0Pk\334\240\1\0\4\0004\0\300\0\0 \4 \0\0\0\0H\232p\v
\224P\f"..., 8192) = 8192
read(65, "\16\0\0\0\220\273C\241\1\0\4\0D\0p\0\0 \4
\0\0\0\0\230\234\320\6\220\233\16\2"..., 8192) = 8192
read(65, "\16\0\0\0P\311\335\240\1\0\4\0<\0008\1\0 \4
\0\0\0\0\240\231\300\fp\230`\2"..., 8192) = 8192
read(65, "\16\0\0\0\260*\335\240\1\0\4\0008\0\350\0\0 \4
\0\0\0\0\20\230\340\0178\224\256\7"..., 8192) = 8192
read(65, "\16\0\0\0\20\10\337\240\1\0\4\0004\0h\0\0 \4
\0\0\0\0\210\231\356\f\230\225\340\7"..., 8192) = 8192
read(65, "\16\0\0\0\220\310C\241\1\0\4\0@\0\260\0\0 \4
\0\0\0\0H\231p\r0\227.\4"..., 8192) = 8192
read(65, "\16\0\0\0\350-\301\241\1\0\4\0<\0X\0\0 \4
\0\0\0\0H\232p\v\0\226\216\10"..., 8192) = 8192

Descriptor 65 is a DB file:
[root@lpo-postgres-01 ~]# cd /proc/16663/fd
[root@lpo-postgres-01 fd]# ls -l 65
lrwx------ 1 postgres postgres 64 Oct 7 23:26 65 ->
/software/pgsql/m-over/PG_9.0_201008051/16417/1572186.7

So, essentially, the process is reading block by block, in a sequence.
What, exactly, does "effective_io_concurrency" mean?
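
The descriptor lookup above (listing /proc/<PID>/fd to see which file a
descriptor refers to) can also be scripted. The following is a minimal
Python sketch, assuming a Linux /proc filesystem; the function names
describe_fd and demo are hypothetical and used only for illustration.

```python
import os
import tempfile


def describe_fd(pid, fd):
    """Return the path that descriptor `fd` of process `pid` points at.

    Assumes the Linux /proc filesystem; returns None where it is
    unavailable or the descriptor cannot be inspected.
    """
    link = f"/proc/{pid}/fd/{fd}"
    try:
        return os.readlink(link)
    except OSError:
        return None


def demo():
    # Open a scratch file in this process and resolve its own descriptor,
    # the same way "ls -l /proc/<PID>/fd/65" resolved descriptor 65 above.
    with tempfile.NamedTemporaryFile() as f:
        target = describe_fd(os.getpid(), f.fileno())
        return target, os.path.realpath(f.name)
```

Inspecting another user's process this way normally requires the same
privileges as the "ls -l" shown above (here, root).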


#3 Mladen Gogala
mladen.gogala@vmsinfo.com
In reply to: Mladen Gogala (#2)
Re: Asynchronous I/O in Postgres

Mladen Gogala wrote:

So, essentially, the process is reading block by block, in a sequence.
What, exactly, does "effective_io_concurrency" mean?

To rephrase my question, can anybody tell me where in the code it is used?


#4 Josh Kupershmidt
schmiddy@gmail.com
In reply to: Mladen Gogala (#3)
Re: Asynchronous I/O in Postgres

On Fri, Oct 8, 2010 at 8:14 AM, Mladen Gogala <mladen.gogala@vmsinfo.com> wrote:

Mladen Gogala wrote:

So, essentially, the process is reading block by block, in a sequence.
What, exactly, does "effective_io_concurrency" mean?

To rephrase my question, can anybody tell me where in the code it is used?

The docs are a bit sparse here :-(

But it looks to me like effective_io_concurrency only affects bitmap
heap scans. The setting from effective_io_concurrency gets put into
"target_prefetch_pages" in ./src/backend/utils/misc/guc.c . But the
only place which uses that variable is
./src/backend/executor/nodeBitmapHeapscan.c.

The EnterpriseDB docs
<http://www.enterprisedb.com/docs/en/8.3R2/perf/Postgres_Plus_Advanced_Server_Performance_Guide-17.htm>
mention:
"effective_io_concurrency is only used for Bitmap Heap Scans. For
normal sequential scans the operating system should handle read-ahead
internally (On Linux, see the blockdev command, in particular --setra
and --setfra)."

Josh
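
To make the bitmap-heap-scan prefetching described above more concrete,
here is a minimal Python sketch. It assumes (this is not stated in the
thread) that the prefetch hints are issued with
posix_fadvise(POSIX_FADV_WILLNEED); that would be consistent with the
observations earlier in the thread, since the reads themselves stay
ordinary synchronous calls and no libaio linkage is needed. The names
read_blocks_with_prefetch, prefetch_distance and demo are hypothetical,
and this is an illustration of the idea, not PostgreSQL's code.

```python
import os
import tempfile

BLOCK_SIZE = 8192  # matches the 8192-byte reads in the trace above


def read_blocks_with_prefetch(fd, block_numbers, prefetch_distance):
    """Read the given blocks in order, hinting the kernel about upcoming ones.

    prefetch_distance is a stand-in for the role effective_io_concurrency
    plays conceptually: how many future blocks are advised ahead of the
    block currently being read. Zero means no hints are issued.
    """
    can_advise = hasattr(os, "posix_fadvise")  # not present on every platform
    advised = 0  # index of the next block that has not been hinted yet
    pages = []
    for i, block in enumerate(block_numbers):
        # Keep up to prefetch_distance hints outstanding beyond block i.
        limit = min(len(block_numbers), i + 1 + prefetch_distance)
        advised = max(advised, i + 1)
        while advised < limit:
            if can_advise:
                os.posix_fadvise(fd, block_numbers[advised] * BLOCK_SIZE,
                                 BLOCK_SIZE, os.POSIX_FADV_WILLNEED)
            advised += 1
        # The read itself is still a plain synchronous pread().
        pages.append(os.pread(fd, BLOCK_SIZE, block * BLOCK_SIZE))
    return pages


def demo():
    # Build a scratch file of 64 blocks; block n is filled with byte value n.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        for n in range(64):
            f.write(bytes([n]) * BLOCK_SIZE)
        path = f.name
    fd = os.open(path, os.O_RDONLY)
    try:
        # A sorted but non-contiguous block list, the kind of access
        # pattern a bitmap heap scan would produce.
        wanted = [3, 17, 18, 40, 63]
        pages = read_blocks_with_prefetch(fd, wanted, prefetch_distance=2)
        return [page[0] for page in pages]
    finally:
        os.close(fd)
        os.unlink(path)
```

A plain sequential scan needs no such hints, because the operating
system's own read-ahead already sees the pattern; that matches the
EnterpriseDB note quoted above. With prefetch_distance=0 the sketch
issues no hints at all, mirroring the documented meaning of setting the
parameter to zero.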

#5 Bruce Momjian
bruce@momjian.us
In reply to: Josh Kupershmidt (#4)
Re: Asynchronous I/O in Postgres

Josh Kupershmidt wrote:

On Fri, Oct 8, 2010 at 8:14 AM, Mladen Gogala <mladen.gogala@vmsinfo.com> wrote:

Mladen Gogala wrote:

So, essentially, the process is reading block by block, in a sequence.
What, exactly, does "effective_io_concurrency" mean?

To rephrase my question, can anybody tell me where in the code it is used?

The docs are a bit sparse here :-(

But it looks to me like effective_io_concurrency only affects bitmap
heap scans. The setting from effective_io_concurrency gets put into
"target_prefetch_pages" in ./src/backend/utils/misc/guc.c . But the
only place which uses that variable is
./src/backend/executor/nodeBitmapHeapscan.c.

The EnterpriseDB docs
<http://www.enterprisedb.com/docs/en/8.3R2/perf/Postgres_Plus_Advanced_Server_Performance_Guide-17.htm>
mention:
"effective_io_concurrency is only used for Bitmap Heap Scans. For
normal sequential scans the operating system should handle read-ahead
internally (On Linux, see the blockdev command, in particular --setra
and --setfra)."

So, is this also true for community Postgres? Can someone suggest
updated docs?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

#6 Josh Kupershmidt
schmiddy@gmail.com
In reply to: Bruce Momjian (#5)
Re: [NOVICE] Asynchronous I/O in Postgres

[moving to -docs]

On Wed, Oct 20, 2010 at 10:30 PM, Bruce Momjian <bruce@momjian.us> wrote:

Josh Kupershmidt wrote:

But it looks to me like effective_io_concurrency only affects bitmap
heap scans. The setting from effective_io_concurrency gets put into
"target_prefetch_pages" in ./src/backend/utils/misc/guc.c . But the
only place which uses that variable is
./src/backend/executor/nodeBitmapHeapscan.c.

The EnterpriseDB docs
<http://www.enterprisedb.com/docs/en/8.3R2/perf/Postgres_Plus_Advanced_Server_Performance_Guide-17.htm>
mention:
"effective_io_concurrency is only used for Bitmap Heap Scans. For
normal sequential scans the operating system should handle read-ahead
internally (On Linux, see the blockdev command, in particular --setra
and --setfra)."

So, is this also true for community Postgres?  Can someone suggest
updated docs?

It looks like effective_io_concurrency only has an impact on bitmap
heap scans. I think a brief mention of this fact in the docs for
effective_io_concurrency should suffice, patch attached.

Josh

Attachments:

effective_io_concurrency-doc.patch (application/octet-stream, +3 -3)

#7 Robert Haas
robertmhaas@gmail.com
In reply to: Josh Kupershmidt (#6)
Re: [NOVICE] Asynchronous I/O in Postgres

On Tue, Oct 26, 2010 at 8:46 PM, Josh Kupershmidt <schmiddy@gmail.com> wrote:

It looks like effective_io_concurrency only has an impact on bitmap
heap scans. I think a brief mention of this fact in the docs for
effective_io_concurrency should suffice, patch attached.

Good idea, committed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company