O_DIRECT, or madvise and/or posix_fadvise

Started by Noname about 19 years ago, 3 messages
#1Noname
markwkm@gmail.com

I caught this thread about O_DIRECT on kerneltrap.org:
http://kerneltrap.org/node/7563

It sounds like there is much to be gained here in terms of reducing
the number of user/kernel space copies in the operating system. I got
the impression that posix_fadvise in the Linux kernel isn't as good as
it could be. I noticed in xlog.c that the use of posix_fadvise is
disabled. Maybe it's time to do some more experimenting and working
with the Linux kernel developers. Or perhaps there is another OS that
would be better to experiment with?

Not sure where to start but do people think this is worth taking a stab at?

Regards,
Mark

#2Martijn van Oosterhout
kleptog@svana.org
In reply to: Noname (#1)
Re: O_DIRECT, or madvise and/or posix_fadvise

On Thu, Jan 11, 2007 at 02:35:13PM -0800, markwkm@gmail.com wrote:

> I caught this thread about O_DIRECT on kerneltrap.org:
> http://kerneltrap.org/node/7563
>
> It sounds like there is much to be gained here in terms of reducing
> the number of user/kernel space copies in the operating system. I got
> the impression that posix_fadvise in the Linux kernel isn't as good as
> it could be. I noticed in xlog.c that the use of posix_fadvise is
> disabled. Maybe it's time to do some more experimenting and working
> with the Linux kernel developers. Or perhaps there is another OS that
> would be better to experiment with?

Postgres doesn't use O_DIRECT and probably never will. The system is
designed to use the system cache, not bypass it.

What recent discussions have highlighted is the need to control the
flow of data to disk more accurately. Apparently current kernels try to
hold data back much longer than is useful.

Not that I'm volunteering to deal with this.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

From each according to his ability. To each according to his ability to litigate.

#3Noname
markwkm@gmail.com
In reply to: Martijn van Oosterhout (#2)
Re: O_DIRECT, or madvise and/or posix_fadvise

On 1/12/07, Martijn van Oosterhout <kleptog@svana.org> wrote:

> On Thu, Jan 11, 2007 at 02:35:13PM -0800, markwkm@gmail.com wrote:
>
>> I caught this thread about O_DIRECT on kerneltrap.org:
>> http://kerneltrap.org/node/7563
>>
>> It sounds like there is much to be gained here in terms of reducing
>> the number of user/kernel space copies in the operating system. I got
>> the impression that posix_fadvise in the Linux kernel isn't as good as
>> it could be. I noticed in xlog.c that the use of posix_fadvise is
>> disabled. Maybe it's time to do some more experimenting and working
>> with the Linux kernel developers. Or perhaps there is another OS that
>> would be better to experiment with?
>
> Postgres doesn't use O_DIRECT and probably never will. The system is
> designed to use the system cache, not bypass it.
>
> What recent discussions have highlighted is the need to control the
> flow of data to disk more accurately. Apparently current kernels try to
> hold data back much longer than is useful.

Right, so my understanding is that PostgreSQL needs to use
posix_fadvise to tell the OS how it wants the flow of data controlled,
and it sounds like the Linux folks believe their implementation of
posix_fadvise needs some work.

> Not that I'm volunteering to deal with this.
>
> Have a nice day,

Regards,
Mark