md: measure just FileSync() for pgstat_io without FileClose()
Hi,
As per the attachment, shouldn't we exclude the timing of close() from
fsync()'s duration?
-J.
Attachments:
v1-0001-md-measure-just-FileSync-for-pgstat_io-without-Fi.patchtext/x-patch; charset=US-ASCII; name=v1-0001-md-measure-just-FileSync-for-pgstat_io-without-Fi.patchDownload+3-4
Original
From: Jakub Wartak <jakub.wartak@enterprisedb.com>
Date: 2026-04-30 19:14
To: PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Subject: md: measure just FileSync() for pgstat_io without FileClose()Hi,
As?per?the?attachment,?shouldn't?we?exclude?the?timing?of?close()?from
fsync()'s?duration?-J.
Hi Jakub,
Thanks a lot for your patch!
I agree that ideally the call to `pgstat_count_io_op_time(..., IOOP_FSYNC, ...)` inside `mdsyncfiletag()`
should only measure and accurately reflect the actual execution time of `FileSync()`.
However, I haven’t figured out a clean way to capture and retain the required timing metrics within
the existing logic of `mdsyncfiletag()`, so that we can feed them into `pgstat_count_io_op_time()` (or
its variants) for proper IO statistics tracking. Conceptually, the ideal flow would look like this:
```c
io_start = pgstat_prepare_io_time(track_io_timing);
result = FileSync(file, WAIT_EVENT_DATA_FILE_SYNC);
io_end = pgstat_get_io_time???();
if (need_to_close)
FileClose(file);
pgstat_count_io_op_time_v2(..., io_start, io_end, ...);
```
I’ve been thinking through the logic of `PathNameOpenFile()`, the subsequent `FileClose()` call,
and how this sequence impacts the fsync timing measurement inside `mdsyncfiletag()`.
Let’s break down all heavy or complex work done inside `FileClose()` for temporary file handles
opened and closed within `mdsyncfiletag()`:
1. AIO-related logic
Files opened via `PathNameOpenFile()` get flags `O_RDWR | PG_BINARY` (potentially combined
with `PG_O_DIRECT`). When we call `FileClose()`, it invokes `pgaio_closing_fd(vfdP->fd)`.
I’ll admit I haven’t fully wrapped my head around the full AIO subsystem, which is quite complex.
For these data file handles, we only pass `PG_O_DIRECT` at open time and do not leverage AIO
or io_uring for the I/O path. That leads me to suspect `pgaio_closing_fd()` will trigger no meaningful
AIO cleanup work here. This is especially relevant given the `while` loop inside `pgaio_closing_fd()` that
may invoke the potentially slow blocking wait `pgaio_io_wait()`.
My analysis on this part is tentative, and I welcome any corrections or further insights from the community.
2. Logic tied to `vfdP->fdstate`
When opening a file via `PathNameOpenFile()`, `vfdP->fdstate` is initialized to `0x0`.
During `FileClose()`, the checks `(vfdP->fdstate & FD_TEMP_FILE_LIMIT)` and `(vfdP->fdstate & FD_DELETE_AT_CLOSE)`
will both evaluate to false, so none of the associated cleanup routines for files will run.
3. Logic tied to `vfdP->resowner`
`PathNameOpenFile()` sets `vfdP->resowner = NULL`.
In turn, the conditional `if (vfdP->resowner)` inside `FileClose()` will not be satisfied,
meaning no resource owner cleanup logic will execute either.
To sum up: the file handles we temporarily open and close locally within `mdsyncfiletag()`
do not trigger any expensive cleanup paths in `FileClose()`. The overhead introduced by
closing these file descriptors should be negligible compared to the latency of the preceding `FileSync()` call.
I should note I have not yet run comprehensive benchmark tests to quantify this overhead
, so this is only a qualitative assessment.
From a broader system design perspective, closing the file descriptor ahead of emitting
IO statistics via `pgstat_count_io_op_time()` is also a reasonable performance optimization:
we release unused file resources as early as possible.
Happy to discuss this further, and I’m open to any feedback, alternative ideas or corrections you might have.
regards,
--
ZizhuanLiu (X-MAN)
44973863@qq.com