Proposed patch: synchronized_scanning GUC variable
Per today's -hackers discussion, add a GUC variable to allow clients to
disable the new synchronized-scanning behavior, and make pg_dump disable
sync scans so that it will reliably preserve row ordering. This is a
pretty trivial patch, but seeing how late we are in the 8.3 release
cycle, I thought I'd better post it for comment anyway.
regards, tom lane
On Jan 27, 2008 3:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Per today's -hackers discussion, add a GUC variable to allow clients to
disable the new synchronized-scanning behavior, and make pg_dump disable
sync scans so that it will reliably preserve row ordering. This is a
pretty trivial patch, but seeing how late we are in the 8.3 release
cycle, I thought I'd better post it for comment anyway.
+1
--
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
499 Thornall Street, 2nd Floor | jonah.harris@enterprisedb.com
Edison, NJ 08837 | http://www.enterprisedb.com/
"Jonah H. Harris" <jonah.harris@gmail.com> writes:
On Jan 27, 2008 3:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Per today's -hackers discussion, add a GUC variable to allow clients to
disable the new synchronized-scanning behavior, and make pg_dump disable
sync scans so that it will reliably preserve row ordering. This is a
pretty trivial patch, but seeing how late we are in the 8.3 release
cycle, I thought I'd better post it for comment anyway.+1
I liked the "synchronized_sequential_scans" idea myself.
But otherwise, sure.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's RemoteDBA services!
Per today's -hackers discussion, add a GUC variable to
allow clients
to disable the new synchronized-scanning behavior, and
make pg_dump
disable sync scans so that it will reliably preserve row
ordering.
This is a pretty trivial patch, but seeing how late we are
in the 8.3
release cycle, I thought I'd better post it for comment anyway.
+1
I liked the "synchronized_sequential_scans" idea myself.
But otherwise, sure.
this will save some hackwork I have to do for a import/export project I am
doing.
+1, please
Regards,
Gevik Babakhani
------------------------------------------------
PostgreSQL NL http://www.postgresql.nl
TrueSoftware BV http://www.truesoftware.nl
------------------------------------------------
Gregory Stark <stark@enterprisedb.com> writes:
I liked the "synchronized_sequential_scans" idea myself.
The name is still open for discussion --- it's an easy
search-and-replace in the patch ...
regards, tom lane
On Sun, 2008-01-27 at 21:54 +0000, Gregory Stark wrote:
I liked the "synchronized_sequential_scans" idea myself.
I think that's a bit too long. How about "synchronized_scans", or
"synchronized_seqscans"?
-Neil
[ redirecting thread to -hackers ]
Neil Conway <neilc@samurai.com> writes:
On Sun, 2008-01-27 at 21:54 +0000, Gregory Stark wrote:
I liked the "synchronized_sequential_scans" idea myself.
I think that's a bit too long. How about "synchronized_scans", or
"synchronized_seqscans"?
We have enable_seqscan already, so that last choice seems to fit in.
regards, tom lane
On Jan 27, 2008, at 21:04 , Tom Lane wrote:
[ redirecting thread to -hackers ]
Neil Conway <neilc@samurai.com> writes:
On Sun, 2008-01-27 at 21:54 +0000, Gregory Stark wrote:
I liked the "synchronized_sequential_scans" idea myself.
I think that's a bit too long. How about "synchronized_scans", or
"synchronized_seqscans"?We have enable_seqscan already, so that last choice seems to fit in.
Would it make sense to match the plural as well?
Michael Glaesemann
grzm seespotcode net
Michael Glaesemann <grzm@seespotcode.net> writes:
Neil Conway <neilc@samurai.com> writes:
I think that's a bit too long. How about "synchronized_scans", or
"synchronized_seqscans"?
Would it make sense to match the plural as well?
Actually, the best suggestion I've seen so far is Guillaume's
"synchronize_seqscans" --- make it a verb phrase.
regards, tom lane
Tom Lane wrote:
Per today's -hackers discussion, add a GUC variable to allow clients to
disable the new synchronized-scanning behavior, and make pg_dump disable
sync scans so that it will reliably preserve row ordering. This is a
pretty trivial patch, but seeing how late we are in the 8.3 release
cycle, I thought I'd better post it for comment anyway.regards, tom lane
Index: src/bin/pg_dump/pg_dump.c =================================================================== RCS file: /cvsroot/pgsql/src/bin/pg_dump/pg_dump.c,v retrieving revision 1.481 diff -c -r1.481 pg_dump.c *** src/bin/pg_dump/pg_dump.c 1 Jan 2008 19:45:55 -0000 1.481 --- src/bin/pg_dump/pg_dump.c 27 Jan 2008 20:00:18 -0000 *************** *** 553,558 **** --- 553,572 ---- do_sql_command(g_conn, "SET DATESTYLE = ISO");/* + * If supported, set extra_float_digits so that we can dump float data + * exactly (given correctly implemented float I/O code, anyway) + */ + if (g_fout->remoteVersion >= 70400) + do_sql_command(g_conn, "SET extra_float_digits TO 2"); + + /* + * If synchronized scanning is supported, disable it, to prevent + * unpredictable changes in row ordering across a dump and reload. + */ + if (g_fout->remoteVersion >= 80300) + do_sql_command(g_conn, "SET synchronized_scanning TO off"); + + /* * Start serializable transaction to dump consistent data. */ do_sql_command(g_conn, "BEGIN");
Hi,
Can somebody explain why it's important to load with
synchronized_scanning off?
do_sql_command(g_conn, "SET synchronized_scanning TO off");
Thanks
Russell Smith
On Mon, 2008-01-28 at 17:27 +1100, Russell Smith wrote:
Can somebody explain why it's important to load with
synchronized_scanning off?
*Loading* with synchronized scanning off is not important (and is not
implemented by the patch).
*Dumping* with synchronized scanning off is necessary to ensure that the
order of the rows in the pg_dump matches the on-disk order of the rows
in the table, which is important if you want to preserve the clustering
of the table data on restore.
See the -hackers thread:
http://markmail.org/message/qbytsco6oj2qkxsa
-Neil
Hi Russell,
On Jan 28, 2008 7:27 AM, Russell Smith <mr-russ@pws.com.au> wrote:
Can somebody explain why it's important to load with
synchronized_scanning off?do_sql_command(g_conn, "SET synchronized_scanning TO off");
It's the start point of this patch. See this thread [
http://archives.postgresql.org/pgsql-hackers/2008-01/msg00987.php ]
for more information.
--
Guillaume
Guillaume Smet wrote:
Hi Russell,
On Jan 28, 2008 7:27 AM, Russell Smith <mr-russ@pws.com.au> wrote:
Can somebody explain why it's important to load with
synchronized_scanning off?do_sql_command(g_conn, "SET synchronized_scanning TO off");
It's the start point of this patch. See this thread [
http://archives.postgresql.org/pgsql-hackers/2008-01/msg00987.php ]
for more information
Sorry, total brain fade in interpreting the patch.
g_conn is our connection to the database, not the command we are dumping.
Sorry
Russell
I liked the "synchronized_sequential_scans" idea myself.
I think that's a bit too long. How about "synchronized_scans", or
"synchronized_seqscans"?We have enable_seqscan already, so that last choice seems to fit in.
Yes looks good, how about synchronized_seqscan without plural ?
Andreas
On Sun, 2008-01-27 at 21:04 -0500, Tom Lane wrote:
[ redirecting thread to -hackers ]
Neil Conway <neilc@samurai.com> writes:
On Sun, 2008-01-27 at 21:54 +0000, Gregory Stark wrote:
I liked the "synchronized_sequential_scans" idea myself.
I think that's a bit too long. How about "synchronized_scans", or
"synchronized_seqscans"?We have enable_seqscan already, so that last choice seems to fit in.
If we're going to have a GUC, we may as well make it as useful as
possible.
Currently we set synch scan on when the table is larger than 25% of
shared_buffers. So increasing shared_buffers can actually turn this
feature off.
Rather than having a boolean GUC, we should have a number and make the
parameter "synchronised_scan_threshold". This would then be the size of
a table above which we would perform synch scans. If its set to -1, then
this would be the same as "off" in all cases. The default value would be
25% of shared_buffers. (Think we can only do that at initdb time
currently).
If we do that, its clearly different from the enable_* parameters, so
the name is easier to decide ;-)
--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
On Jan 28, 2008, at 6:14 PM, Simon Riggs wrote:
On Sun, 2008-01-27 at 21:04 -0500, Tom Lane wrote:
[ redirecting thread to -hackers ]
Neil Conway <neilc@samurai.com> writes:
On Sun, 2008-01-27 at 21:54 +0000, Gregory Stark wrote:
I liked the "synchronized_sequential_scans" idea myself.
I think that's a bit too long. How about "synchronized_scans", or
"synchronized_seqscans"?We have enable_seqscan already, so that last choice seems to fit in.
If we're going to have a GUC, we may as well make it as useful as
possible.Currently we set synch scan on when the table is larger than 25% of
shared_buffers. So increasing shared_buffers can actually turn this
feature off.Rather than having a boolean GUC, we should have a number and make the
parameter "synchronised_scan_threshold". This would then be the
size of
a table above which we would perform synch scans. If its set to -1,
then
this would be the same as "off" in all cases. The default value
would be
25% of shared_buffers. (Think we can only do that at initdb time
currently).If we do that, its clearly different from the enable_* parameters, so
the name is easier to decide ;-)
+1
This is in fact a lot more flexible and transparent.
It gives us a lot more control over the process and it is easy to
explain / understand.
best regards,
hans
--
Cybertec Schönig & Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at
Simon Riggs <simon@2ndquadrant.com> writes:
Rather than having a boolean GUC, we should have a number and make the
parameter "synchronised_scan_threshold".
This would open up a can of worms I'd prefer not to touch, having to do
with whether the buffer-access-strategy behavior should track that or
not. As the note in heapam.c says,
* If the table is large relative to NBuffers, use a bulk-read access
* strategy and enable synchronized scanning (see syncscan.c). Although
* the thresholds for these features could be different, we make them the
* same so that there are only two behaviors to tune rather than four.
It's a bit late in the cycle to be revisiting that choice. Now we do
already have three behaviors to worry about (BAS on and syncscan off)
but throwing in a randomly settable knob will take it back to four,
and we have no idea how that fourth case will behave. The other tack we
could take (having the one GUC variable control both thresholds) is
not good since it will result in pg_dump trashing the buffer cache.
regards, tom lane
On Mon, 2008-01-28 at 16:21 -0500, Tom Lane wrote:
Simon Riggs <simon@2ndquadrant.com> writes:
Rather than having a boolean GUC, we should have a number and make the
parameter "synchronised_scan_threshold".This would open up a can of worms I'd prefer not to touch, having to do
with whether the buffer-access-strategy behavior should track that or
not. As the note in heapam.c says,* If the table is large relative to NBuffers, use a bulk-read access
* strategy and enable synchronized scanning (see syncscan.c). Although
* the thresholds for these features could be different, we make them the
* same so that there are only two behaviors to tune rather than four.It's a bit late in the cycle to be revisiting that choice. Now we do
already have three behaviors to worry about (BAS on and syncscan off)
but throwing in a randomly settable knob will take it back to four,
and we have no idea how that fourth case will behave. The other tack we
could take (having the one GUC variable control both thresholds) is
not good since it will result in pg_dump trashing the buffer cache.
OK, good points.
I'm still concerned that the thresholds gets higher as we increase
shared_buffers. We may be removing performance features as fast as we
gain performance when we set shared_buffers higher.
Might we agree that the threshold should be fixed at 8MB, rather than
varying upwards as we try to tune?
The objective of having a tuning hook would have been simply to
normalise the effects of increasing shared_buffers anyway, so fixing it
would solve the problem I see.
(8MB is the default, based upon 25% of 32MB).
--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
On Sun, 2008-01-27 at 15:07 -0500, Tom Lane wrote:
Per today's -hackers discussion, add a GUC variable to allow clients to
disable the new synchronized-scanning behavior, and make pg_dump disable
sync scans so that it will reliably preserve row ordering. This is a
pretty trivial patch, but seeing how late we are in the 8.3 release
cycle, I thought I'd better post it for comment anyway.
I apologize for the late reply, but I have one comment I'd like to add.
+ if (g_fout->remoteVersion >= 80300) + do_sql_command(g_conn, "SET synchronized_scanning TO off"); + + /* * Start serializable transaction to dump consistent data. */
I think that pg_dump is a reasonable use case for synchoronized scans
when the table has not been clustered. It could potentially make pg_dump
have much less of a performance impact when run against an active
system.
I think it's worth considering enabling sync scans for non-clustered
tables if it would not interfere with the release. Of course, a painless
8.3 release is the top priority.
Regards,
Jeff Davis
Simon Riggs wrote:
On Mon, 2008-01-28 at 16:21 -0500, Tom Lane wrote:
Simon Riggs <simon@2ndquadrant.com> writes:
Rather than having a boolean GUC, we should have a number and make the
parameter "synchronised_scan_threshold".This would open up a can of worms I'd prefer not to touch, having to do
with whether the buffer-access-strategy behavior should track that or
not. As the note in heapam.c says,* If the table is large relative to NBuffers, use a bulk-read access
* strategy and enable synchronized scanning (see syncscan.c). Although
* the thresholds for these features could be different, we make them the
* same so that there are only two behaviors to tune rather than four.It's a bit late in the cycle to be revisiting that choice. Now we do
already have three behaviors to worry about (BAS on and syncscan off)
but throwing in a randomly settable knob will take it back to four,
and we have no idea how that fourth case will behave. The other tack we
could take (having the one GUC variable control both thresholds) is
not good since it will result in pg_dump trashing the buffer cache.OK, good points.
I'm still concerned that the thresholds gets higher as we increase
shared_buffers. We may be removing performance features as fast as we
gain performance when we set shared_buffers higher.Might we agree that the threshold should be fixed at 8MB, rather than
varying upwards as we try to tune?
Synchronized scans, and the bulk-read strategy, don't help if the table
fits in cache. If it fits in shared buffers, you're better off keeping
it there, than swap pages between the OS cache and shared buffers, or
spend any effort synchronizing scans. That's why we agreed back then
that the threshold should be X% of shared_buffers.
It's a good point that we don't want pg_dump to screw up the cluster
order, but that's the only use case I've seen this far for disabling
sync scans. Even that wouldn't matter much if our estimate for
"clusteredness" didn't get screwed up by a table that looks like this:
"5 6 7 8 9 1 2 3 4"
Now, maybe there's more use cases where you'd want to tune the
threshold, but I'd like to see some before we add more knobs.
To benefit from a lower threshold, you'd need to have a table large
enough that its cache footprint matters, but is still smaller than 25%
of shared_buffers, and have seq scans on it. In that scenario, you might
benefit from a lower threshold, because that would leave some
shared_buffers free for other use. Even that is quite hand-wavey; the
buffer cache LRU algorithm handles that kind of scenarios reasonably
well already, and whether or not
To benefit from a larger threshold, you'd need to have a table larger
than 25% of shared_buffers, but still smaller than shared_buffers, and
seq scan it often enough that you want to keep it in shared buffers. If
you're frequently seq scanning a table of that size, you're most likely
suffering from a bad plan. Even then, the performance difference
shouldn't be that great, the table surely fits in OS cache anyway, with
typical shared_buffers settings.
Tables that are seq scanned are typically very small, like a summary
table with just a few rows, or huge tables in a data warehousing
system. Between the extremes, I don't think the threshold actually has a
very big impact.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com