mmap vs read/write

Started by Huw Rogers over 27 years ago (3 messages)
#1 Huw Rogers
count0@fsj.co.jp

Someone posted a (read-only) benchmark of mmap vs
read/write I/O using the following code:

    for (off = 0; 1; off += MMAP_SIZE)
    {
        addr = mmap(0, MMAP_SIZE, PROT_READ, 0, fd, off);
        assert(addr != NULL);

        for (j = 0; j < MMAP_SIZE; j++)
            if (*(addr + j) != ' ')
                spaces++;
        munmap(addr, MMAP_SIZE);
    }

This is unfair to mmap, since mmap is called once
per page. It is better to mmap large regions (many
pages at once), then use msync() to force any
modified pages to be written. Accessing data purely
in memory through mmap'd I/O is _many_ times faster
than read/write under Solaris, or under Linux later
than 2.1.99 (prior to 2.1.99, Linux had
slow mmap performance).
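
As a rough illustration of the "map a large region once" approach, here is a minimal sketch, not the original benchmark. The helper names (map_whole_file, count_spaces_mapped, replace_spaces_mapped) are made up for this example. It assumes a POSIX system and a file small enough to map in one call. Unlike the loop quoted above, it passes MAP_SHARED and checks the result against MAP_FAILED, which is what mmap() returns on error.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map the whole file with a single mmap() call. Returns the mapping and
 * stores its length in *len; returns NULL on error or for an empty file
 * (mmap() rejects a zero length). Hypothetical helper for this sketch. */
static char *map_whole_file(const char *path, int writable, size_t *len)
{
    int fd = open(path, writable ? O_RDWR : O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }

    int prot = writable ? (PROT_READ | PROT_WRITE) : PROT_READ;
    void *addr = mmap(NULL, (size_t) st.st_size, prot, MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping outlives the descriptor */
    if (addr == MAP_FAILED)
        return NULL;

    *len = (size_t) st.st_size;
    return addr;
}

/* Count ' ' bytes through one mapping. Returns -1 on error or empty file. */
long count_spaces_mapped(const char *path)
{
    size_t len = 0;
    char *addr = map_whole_file(path, 0, &len);
    if (addr == NULL)
        return -1;
    assert(len > 0);

    long spaces = 0;
    for (size_t j = 0; j < len; j++)
        if (addr[j] == ' ')
            spaces++;

    munmap(addr, len);
    return spaces;
}

/* Overwrite every ' ' with repl in place, then msync() so the modified
 * pages are written back before returning. Returns the number of bytes
 * changed, or -1 on error. */
long replace_spaces_mapped(const char *path, char repl)
{
    size_t len = 0;
    char *addr = map_whole_file(path, 1, &len);
    if (addr == NULL)
        return -1;

    long changed = 0;
    for (size_t j = 0; j < len; j++)
        if (addr[j] == ' ') {
            addr[j] = repl;
            changed++;
        }

    int rc = msync(addr, len, MS_SYNC);   /* force write of dirty pages */
    munmap(addr, len);
    return rc == 0 ? changed : -1;
}
```

With this shape, the per-call cost of mmap()/munmap() is paid once per file rather than once per page, so a timing comparison against read() measures the access path rather than the system call overhead.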

The main limitation of mmap is that you
can't map more than 2GB of data at once
under most existing operating systems (and
that limit includes the heap and stack), so
simplistic mapping of entire DBMS data files
doesn't scale to large databases, and you
need to cache region mappings to
avoid running out of PTEs.
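
One simple way to stay under an address-space limit is to walk the file through a fixed-size window instead of mapping it whole. The sketch below assumes a POSIX system, and count_spaces_windowed is a hypothetical name. It maps one window at a time and unmaps it before moving on. It does not implement the mapping cache mentioned above, which would keep recently used windows mapped.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Count ' ' bytes by mapping the file one window at a time.
 * The window is rounded up to a multiple of the page size, because the
 * offset passed to mmap() must be page-aligned. Returns -1 on error. */
long count_spaces_windowed(const char *path, size_t window)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || window == 0)
        return -1;
    window = (window + (size_t) page - 1) / (size_t) page * (size_t) page;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }

    long spaces = 0;
    for (off_t off = 0; off < st.st_size; off += (off_t) window) {
        /* The last window may be shorter than the others. */
        size_t len = window;
        if ((off_t) len > st.st_size - off)
            len = (size_t) (st.st_size - off);
        assert(len > 0);

        char *addr = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, off);
        if (addr == MAP_FAILED) {
            close(fd);
            return -1;
        }

        for (size_t j = 0; j < len; j++)
            if (addr[j] == ' ')
                spaces++;

        munmap(addr, len);          /* release this window's address space */
    }

    close(fd);
    return spaces;
}
```

The window size is the tuning knob: larger windows mean fewer mmap()/munmap() calls, smaller windows use less address space at any one time.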

The need to collocate information in
adjacent pages could be why Informix has
clustered indexes, the internal structure
of which I'd like to know more about.

-Huw

#2 Bruce Momjian
maillist@candle.pha.pa.us
In reply to: Huw Rogers (#1)
Re: [HACKERS] mmap vs read/write

> This is unfair to mmap since mmap is called once
> per page. Better to mmap large regions (many
> pages at once), then use msync() to force
> write any modified pages. Access purely in
> memory mmap'd I/O is _many_ times faster than
> read/write under Solaris or Linux later
> than 2.1.99 (prior to 2.1.99, Linux had
> slow mmap performance).

This makes me feel better. Linux is killing BSD/OS in mapping tests.

See my other posting.

-- 
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

#3 Noname
ocie@paracel.com
In reply to: Huw Rogers (#1)
Re: [HACKERS] mmap vs read/write

Huw Rogers wrote:

> Someone posted a (readonly) benchtest of mmap vs
> read/write I/O using the following code:
>
>     for (off = 0; 1; off += MMAP_SIZE)
>     {
>         addr = mmap(0, MMAP_SIZE, PROT_READ, 0, fd, off);
>         assert(addr != NULL);
>
>         for (j = 0; j < MMAP_SIZE; j++)
>             if (*(addr + j) != ' ')
>                 spaces++;
>         munmap(addr,MMAP_SIZE);
>     }
>
> This is unfair to mmap since mmap is called once
> per page. Better to mmap large regions (many
> pages at once), then use msync() to force
> write any modified pages. Access purely in

Better yet, request the pages ahead of time and have another process
map them in "asynchronously". By the time the process is ready to map
the page in for itself, the page will have already been read in from
the disk, and a memory buffer will be allocated for it.

I want to try to implement this in a simple demo program when I get a
chance.
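
A rough sketch of what such a demo might look like, under stated assumptions: this is not Ocie's program, and the names touch_pages and prefetch_in_child are hypothetical. A forked child maps the file and reads one byte per page, so the kernel brings each page in from disk while the parent carries on with other work. It relies on the page cache being shared between processes, so the parent's later accesses should find the pages already resident unless they have been evicted in the meantime.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child body: map the file and read one byte from every page so that
 * each page is faulted in. Returns 0 on success, 1 on error. */
static int touch_pages(const char *path)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return 1;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return 1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return 1;
    }
    if (st.st_size == 0) {          /* nothing to prefetch */
        close(fd);
        return 0;
    }

    size_t len = (size_t) st.st_size;
    void *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);
    if (map == MAP_FAILED)
        return 1;

    /* volatile keeps the compiler from dropping the otherwise unused reads */
    const volatile char *p = map;
    char sink = 0;
    for (size_t off = 0; off < len; off += (size_t) page)
        sink ^= p[off];
    (void) sink;

    munmap(map, len);
    return 0;
}

/* Start prefetching in a child process and return immediately.
 * Returns the child's pid, or -1 if fork() failed. The caller should
 * eventually waitpid() on the returned pid to reap the child. */
pid_t prefetch_in_child(const char *path)
{
    pid_t pid = fork();
    if (pid == 0)
        _exit(touch_pages(path));   /* child: never returns to the caller */
    return pid;
}
```

The parent would call prefetch_in_child() for the next region, keep processing the current one, and reap the child with waitpid() afterwards. On systems that provide it, madvise() with MADV_WILLNEED is another way to ask for read-ahead without a helper process.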

Ocie