Tracking last scan time

Started by Dave Pageover 3 years ago43 messages

dpage@pgadmin.org

over 3 years ago

1 attachment(s)

Often it is beneficial to review one's schema with a view to removing
indexes (and sometimes tables) that are no longer required. It's very
difficult to understand when that is the case by looking at the number of
scans of a relation as, for example, an index may be used infrequently but
may be critical in those times when it is used.

The attached patch against HEAD adds optional tracking of the last scan
time for relations. It updates pg_stat_*_tables with new last_seq_scan and
last_idx_scan columns, and pg_stat_*_indexes with a last_idx_scan column to
help with this.

Due to the use of gettimeofday(), those values are only maintained if a new
GUC, track_scans, is set to on. By default, it is off.

I did run a 12 hour test to see what the performance impact is. pgbench was
run with scale factor 10000 and 75 users across 4 identical bare metal
machines running Rocky 8 in parallel which showed roughly a -2% average
performance penalty against HEAD with track_scans enabled. Machines were
PowerEdge R7525's with 128GB RAM, dual 16C/32T AMD 7302 CPUs, with the data
directory on 6 x 800GB 12Gb/s SSD SAS drives in RAID 0. Kernel time source
is tsc.

HEAD track_scans Penalty (%)
box1 19582.49735 19341.8881 -1.22869541
box2 19936.55513 19928.07479 -0.04253664659
box3 19631.78895 18649.64379 -5.002830696
box4 19810.86767 19420.67192 -1.969604525
Average 19740.42728 19335.06965 -2.05343896

Doc and test updates included.

--
Dave Page
Blog: https://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: https://www.enterprisedb.com

Attachments:

last_scan_v1.diffapplication/octet-stream; name=last_scan_v1.diffDownload

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index a5cd4e44c7..9c37023ddd 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7983,6 +7983,38 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-track-scans" xreflabel="track_scans">
+      <term><varname>track_scans</varname> (<type>boolean</type>)
+      <indexterm>
+       <primary><varname>track_scans</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Enables the tracking of the last scan time for tables and indexes.
+        This parameter is off by default, as it will repeatedly query the
+        operating system for the current time, which may cause significant
+        overhead on some platforms.  You can use the
+        <xref linkend="pgtesttiming"/> tool to measure the overhead of
+        timing on your system.
+        Last sequential and index scan times for tables are displayed in
+        <link linkend="pg-stat-all-tables-view">
+        <structname>pg_stat_all_tables</structname></link> system view as
+        well as the <structname>pg_stat_sys_tables</structname> and
+        <structname>pg_stat_user_tables</structname> views.
+        Similarly, the last scan times for indexes are shown in the
+        <link linkend="pg-stat-all-indexes-view">
+        <structname>pg_stat_all_indexes</structname></link> view as well
+        as the <structname>pg_stat_sys_indexes</structname> and
+        <structname>pg_stat_user_indexes</structname> views.
+        The last scan times are useful in determining when tables and indexes
+        were last used. In the case of indexes, this can be particularly
+        useful when reviewing the schema with the aim of removing unnecessary
+        indexes.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-track-io-timing" xreflabel="track_io_timing">
       <term><varname>track_io_timing</varname> (<type>boolean</type>)
       <indexterm>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 1d9509a2f6..d0b9966442 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -4385,6 +4385,16 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>last_seq_scan</structfield> <type>timestamptz</type>
+      </para>
+      <para>
+       The time of the last sequential scan of this table.
+       Only populated if <xref linkend="guc-track-scans"/> is enabled
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>seq_tup_read</structfield> <type>bigint</type>
@@ -4403,6 +4413,16 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>last_idx_scan</structfield> <type>timestamptz</type>
+      </para>
+      <para>
+       The time of the last index scan of this table.
+       Only populated if <xref linkend="guc-track-scans"/> is enabled
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>idx_tup_fetch</structfield> <type>bigint</type>
@@ -4654,6 +4674,16 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>last_idx_scan</structfield> <type>timestamptz</type>
+      </para>
+      <para>
+       The time of the last scan of this index.
+       Only populated if <xref linkend="guc-track-scans"/> is enabled
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>idx_tup_read</structfield> <type>bigint</type>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 5a844b63a1..801ba03ba4 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -657,8 +657,10 @@ CREATE VIEW pg_stat_all_tables AS
             N.nspname AS schemaname,
             C.relname AS relname,
             pg_stat_get_numscans(C.oid) AS seq_scan,
+            pg_stat_get_lastscan(C.oid) AS last_seq_scan,
             pg_stat_get_tuples_returned(C.oid) AS seq_tup_read,
             sum(pg_stat_get_numscans(I.indexrelid))::bigint AS idx_scan,
+            max(pg_stat_get_lastscan(I.indexrelid)) AS last_idx_scan,
             sum(pg_stat_get_tuples_fetched(I.indexrelid))::bigint +
             pg_stat_get_tuples_fetched(C.oid) AS idx_tup_fetch,
             pg_stat_get_tuples_inserted(C.oid) AS n_tup_ins,
@@ -775,6 +777,7 @@ CREATE VIEW pg_stat_all_indexes AS
             C.relname AS relname,
             I.relname AS indexrelname,
             pg_stat_get_numscans(I.oid) AS idx_scan,
+            pg_stat_get_lastscan(I.oid) AS last_idx_scan,
             pg_stat_get_tuples_returned(I.oid) AS idx_tup_read,
             pg_stat_get_tuples_fetched(I.oid) AS idx_tup_fetch
     FROM pg_class C JOIN
diff --git a/src/backend/utils/activity/pgstat_relation.c b/src/backend/utils/activity/pgstat_relation.c
index a846d9ffb6..b1f12d4dc9 100644
--- a/src/backend/utils/activity/pgstat_relation.c
+++ b/src/backend/utils/activity/pgstat_relation.c
@@ -50,6 +50,10 @@ static void save_truncdrop_counters(PgStat_TableXactStatus *trans, bool is_drop)
 static void restore_truncdrop_counters(PgStat_TableXactStatus *trans);
 
 
+/* GUC variables */
+bool		pgstat_track_scans = false;
+
+
 /*
  * Copy stats between relations. This is used for things like REINDEX
  * CONCURRENTLY.
@@ -789,6 +793,8 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
 	tabentry = &shtabstats->stats;
 
 	tabentry->numscans += lstats->t_counts.t_numscans;
+	if (pgstat_track_scans && lstats->t_counts.t_numscans)
+		tabentry->lastscan = GetCurrentTimestamp();
 	tabentry->tuples_returned += lstats->t_counts.t_tuples_returned;
 	tabentry->tuples_fetched += lstats->t_counts.t_tuples_fetched;
 	tabentry->tuples_inserted += lstats->t_counts.t_tuples_inserted;
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index d9e2a79382..7eb8660989 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -52,6 +52,25 @@ pg_stat_get_numscans(PG_FUNCTION_ARGS)
 }
 
 
+Datum
+pg_stat_get_lastscan(PG_FUNCTION_ARGS)
+{
+	Oid			relid = PG_GETARG_OID(0);
+	TimestampTz	result;
+	PgStat_StatTabEntry *tabentry;
+
+	if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)
+		result = 0;
+	else
+		result = tabentry->lastscan;
+
+	if (result == 0)
+		PG_RETURN_NULL();
+	else
+		PG_RETURN_TIMESTAMPTZ(result);
+}
+
+
 Datum
 pg_stat_get_tuples_returned(PG_FUNCTION_ARGS)
 {
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 9fbbfb1be5..371cd597f4 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1600,6 +1600,15 @@ static struct config_bool ConfigureNamesBool[] =
 		false,
 		NULL, NULL, NULL
 	},
+	{
+		{"track_scans", PGC_SUSET, STATS_CUMULATIVE,
+			gettext_noop("Records the last scan time for relations."),
+			NULL
+		},
+		&pgstat_track_scans,
+		false,
+		NULL, NULL, NULL
+	},
 	{
 		{"track_wal_io_timing", PGC_SUSET, STATS_CUMULATIVE,
 			gettext_noop("Collects timing statistics for WAL I/O activity."),
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index be47583122..f5320fb39e 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5249,6 +5249,10 @@
   proname => 'pg_stat_get_numscans', provolatile => 's', proparallel => 'r',
   prorettype => 'int8', proargtypes => 'oid',
   prosrc => 'pg_stat_get_numscans' },
+{ oid => '2173', descr => 'statistics: time of the last scan for table/index',
+  proname => 'pg_stat_get_lastscan', provolatile => 's', proparallel => 'r',
+  prorettype => 'timestamptz', proargtypes => 'oid',
+  prosrc => 'pg_stat_get_lastscan' },
 { oid => '1929', descr => 'statistics: number of tuples read by seqscan',
   proname => 'pg_stat_get_tuples_returned', provolatile => 's',
   proparallel => 'r', prorettype => 'int8', proargtypes => 'oid',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index ac28f813b4..d0882650a7 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -242,7 +242,7 @@ typedef struct PgStat_TableXactStatus
  * ------------------------------------------------------------
  */
 
-#define PGSTAT_FILE_FORMAT_ID	0x01A5BCA7
+#define PGSTAT_FILE_FORMAT_ID	0x01A5BCA8
 
 typedef struct PgStat_ArchiverStats
 {
@@ -355,6 +355,7 @@ typedef struct PgStat_StatSubEntry
 typedef struct PgStat_StatTabEntry
 {
 	PgStat_Counter numscans;
+	TimestampTz lastscan;
 
 	PgStat_Counter tuples_returned;
 	PgStat_Counter tuples_fetched;
@@ -643,6 +644,7 @@ extern PgStat_WalStats *pgstat_fetch_stat_wal(void);
 /* GUC parameters */
 extern PGDLLIMPORT bool pgstat_track_counts;
 extern PGDLLIMPORT int pgstat_track_functions;
+extern PGDLLIMPORT bool pgstat_track_scans;
 extern PGDLLIMPORT int pgstat_fetch_consistency;
 
 
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 7ec3d2688f..d9287ef0c9 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1768,6 +1768,7 @@ pg_stat_all_indexes| SELECT c.oid AS relid,
     c.relname,
     i.relname AS indexrelname,
     pg_stat_get_numscans(i.oid) AS idx_scan,
+    pg_stat_get_lastscan(i.oid) AS last_idx_scan,
     pg_stat_get_tuples_returned(i.oid) AS idx_tup_read,
     pg_stat_get_tuples_fetched(i.oid) AS idx_tup_fetch
    FROM (((pg_class c
@@ -1779,8 +1780,10 @@ pg_stat_all_tables| SELECT c.oid AS relid,
     n.nspname AS schemaname,
     c.relname,
     pg_stat_get_numscans(c.oid) AS seq_scan,
+    pg_stat_get_lastscan(c.oid) AS last_seq_scan,
     pg_stat_get_tuples_returned(c.oid) AS seq_tup_read,
     (sum(pg_stat_get_numscans(i.indexrelid)))::bigint AS idx_scan,
+    max(pg_stat_get_lastscan(i.indexrelid)) AS last_idx_scan,
     ((sum(pg_stat_get_tuples_fetched(i.indexrelid)))::bigint + pg_stat_get_tuples_fetched(c.oid)) AS idx_tup_fetch,
     pg_stat_get_tuples_inserted(c.oid) AS n_tup_ins,
     pg_stat_get_tuples_updated(c.oid) AS n_tup_upd,
@@ -2112,6 +2115,7 @@ pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
     pg_stat_all_indexes.relname,
     pg_stat_all_indexes.indexrelname,
     pg_stat_all_indexes.idx_scan,
+    pg_stat_all_indexes.last_idx_scan,
     pg_stat_all_indexes.idx_tup_read,
     pg_stat_all_indexes.idx_tup_fetch
    FROM pg_stat_all_indexes
@@ -2120,8 +2124,10 @@ pg_stat_sys_tables| SELECT pg_stat_all_tables.relid,
     pg_stat_all_tables.schemaname,
     pg_stat_all_tables.relname,
     pg_stat_all_tables.seq_scan,
+    pg_stat_all_tables.last_seq_scan,
     pg_stat_all_tables.seq_tup_read,
     pg_stat_all_tables.idx_scan,
+    pg_stat_all_tables.last_idx_scan,
     pg_stat_all_tables.idx_tup_fetch,
     pg_stat_all_tables.n_tup_ins,
     pg_stat_all_tables.n_tup_upd,
@@ -2156,6 +2162,7 @@ pg_stat_user_indexes| SELECT pg_stat_all_indexes.relid,
     pg_stat_all_indexes.relname,
     pg_stat_all_indexes.indexrelname,
     pg_stat_all_indexes.idx_scan,
+    pg_stat_all_indexes.last_idx_scan,
     pg_stat_all_indexes.idx_tup_read,
     pg_stat_all_indexes.idx_tup_fetch
    FROM pg_stat_all_indexes
@@ -2164,8 +2171,10 @@ pg_stat_user_tables| SELECT pg_stat_all_tables.relid,
     pg_stat_all_tables.schemaname,
     pg_stat_all_tables.relname,
     pg_stat_all_tables.seq_scan,
+    pg_stat_all_tables.last_seq_scan,
     pg_stat_all_tables.seq_tup_read,
     pg_stat_all_tables.idx_scan,
+    pg_stat_all_tables.last_idx_scan,
     pg_stat_all_tables.idx_tup_fetch,
     pg_stat_all_tables.n_tup_ins,
     pg_stat_all_tables.n_tup_upd,

Greg Stark

stark@mit.edu

over 3 years ago

In reply to: Dave Page (#1)

Re: Tracking last scan time

On Tue, 23 Aug 2022 at 11:00, Dave Page <dpage@pgadmin.org> wrote:

Often it is beneficial to review one's schema with a view to removing indexes (and sometimes tables) that are no longer required. It's very difficult to understand when that is the case by looking at the number of scans of a relation as, for example, an index may be used infrequently but may be critical in those times when it is used.

I think this is easy to answer in a prometheus/datadog/etc world since
you can consult the history of the count to see when it was last
incremented. (Or do effectively that continously).

I guess that just reinforces the idea that it should be optional.
Perhaps there's room for some sort of general feature for controlling
various time series aggregates like max() and min() sum() or, uhm,
timeoflastchange() on whatever stats you want. That would let us
remove a bunch of stuff from pg_stat_statements and let users turn on
just the ones they want. And also let users enable things like time of
last rollback or conflict etc. But that's just something to think
about down the road.

The attached patch against HEAD adds optional tracking of the last scan time for relations. It updates pg_stat_*_tables with new last_seq_scan and last_idx_scan columns, and pg_stat_*_indexes with a last_idx_scan column to help with this.

Due to the use of gettimeofday(), those values are only maintained if a new GUC, track_scans, is set to on. By default, it is off.

Bikeshedding warning -- "track_scans" could equally apply to almost
any stats about scans. I think the really relevant thing here is the
times, not the scans. I think the GUC should be "track_scan_times". Or
could that still be confused with scan durations? Maybe
"track_scan_timestamps"?

You could maybe make the gettimeofday cheaper by doing it less often.
Like, skipping the increment if the old timestamp is newer than 1s
before the transaction start time (I think that's available free if
some other guc is enabled but I don't recall). Or isn't this cb
normally happening after transaction end? So xactStopTimestamp might
be available already?

--
greg

Dave Page

dpage@pgadmin.org

over 3 years ago

In reply to: Greg Stark (#2)

Re: Tracking last scan time

On Tue, 23 Aug 2022 at 13:07, Greg Stark <stark@mit.edu> wrote:

On Tue, 23 Aug 2022 at 11:00, Dave Page <dpage@pgadmin.org> wrote:

Often it is beneficial to review one's schema with a view to removing

indexes (and sometimes tables) that are no longer required. It's very
difficult to understand when that is the case by looking at the number of
scans of a relation as, for example, an index may be used infrequently but
may be critical in those times when it is used.

I think this is easy to answer in a prometheus/datadog/etc world since
you can consult the history of the count to see when it was last
incremented. (Or do effectively that continously).

Yes. But not every PostgreSQL instance is monitored in that way.

I guess that just reinforces the idea that it should be optional.
Perhaps there's room for some sort of general feature for controlling
various time series aggregates like max() and min() sum() or, uhm,
timeoflastchange() on whatever stats you want. That would let us
remove a bunch of stuff from pg_stat_statements and let users turn on
just the ones they want. And also let users enable things like time of
last rollback or conflict etc. But that's just something to think
about down the road.

It's certainly an interesting idea.

The attached patch against HEAD adds optional tracking of the last scan

time for relations. It updates pg_stat_*_tables with new last_seq_scan and
last_idx_scan columns, and pg_stat_*_indexes with a last_idx_scan column to
help with this.

Due to the use of gettimeofday(), those values are only maintained if a

new GUC, track_scans, is set to on. By default, it is off.

Bikeshedding warning -- "track_scans" could equally apply to almost
any stats about scans. I think the really relevant thing here is the
times, not the scans. I think the GUC should be "track_scan_times". Or
could that still be confused with scan durations? Maybe
"track_scan_timestamps"?

The latter seems reasonable.

You could maybe make the gettimeofday cheaper by doing it less often.
Like, skipping the increment if the old timestamp is newer than 1s
before the transaction start time (I think that's available free if
some other guc is enabled but I don't recall). Or isn't this cb
normally happening after transaction end? So xactStopTimestamp might
be available already?

Something like:

if (pgstat_track_scan_timestamps && lstats->t_counts.t_numscans &&
tabentry->lastscan + USECS_PER_SEC <
GetCurrentTransactionStopTimestamp())
tabentry->lastscan = GetCurrentTimestamp();

--
Dave Page
Blog: https://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: https://www.enterprisedb.com

Bruce Momjian

bruce@momjian.us

over 3 years ago

In reply to: Dave Page (#1)

Re: Tracking last scan time

On Tue, Aug 23, 2022 at 10:55:09AM +0100, Dave Page wrote:

Often it is beneficial to review one's schema with a view to removing indexes
(and sometimes tables) that are no longer required. It's very difficult to
understand when that is the case by looking at the number of scans of a
relation as, for example, an index may be used infrequently but may be critical
in those times when it is used.

The attached patch against HEAD adds optional tracking of the last scan time
for relations. It updates pg_stat_*_tables with new last_seq_scan and
last_idx_scan columns, and pg_stat_*_indexes with a last_idx_scan column to
help with this.

Would it be simpler to allow the sequential and index scan columns to be
cleared so you can look later to see if it is non-zero? Should we allow
arbitrary clearing of stat columns?

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Indecision is a decision. Inaction is an action. Mark Batterson

Dave Page

dpage@pgadmin.org

over 3 years ago

In reply to: Bruce Momjian (#4)

Re: Tracking last scan time

On Wed, 24 Aug 2022 at 15:18, Bruce Momjian <bruce@momjian.us> wrote:

On Tue, Aug 23, 2022 at 10:55:09AM +0100, Dave Page wrote:

Often it is beneficial to review one's schema with a view to removing

indexes

(and sometimes tables) that are no longer required. It's very difficult

to

understand when that is the case by looking at the number of scans of a
relation as, for example, an index may be used infrequently but may be

critical

in those times when it is used.

The attached patch against HEAD adds optional tracking of the last scan

time

for relations. It updates pg_stat_*_tables with new last_seq_scan and
last_idx_scan columns, and pg_stat_*_indexes with a last_idx_scan column

to

help with this.

Would it be simpler to allow the sequential and index scan columns to be
cleared so you can look later to see if it is non-zero? Should we allow
arbitrary clearing of stat columns?

I don't think so, because then stat values wouldn't necessarily correlate
with each other, and you wouldn't know when any of them were last reset
unless we started tracking each individual reset. At least now you can see
when they were all reset, and you know they were reset at the same time.

--
Dave Page
Blog: https://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: https://www.enterprisedb.com

Bruce Momjian

bruce@momjian.us

over 3 years ago

In reply to: Dave Page (#5)

Re: Tracking last scan time

On Wed, Aug 24, 2022 at 04:01:21PM +0100, Dave Page wrote:

On Wed, 24 Aug 2022 at 15:18, Bruce Momjian <bruce@momjian.us> wrote:

On Tue, Aug 23, 2022 at 10:55:09AM +0100, Dave Page wrote:

Often it is beneficial to review one's schema with a view to removing

indexes

(and sometimes tables) that are no longer required. It's very difficult

to

understand when that is the case by looking at the number of scans of a
relation as, for example, an index may be used infrequently but may be

critical

in those times when it is used.

The attached patch against HEAD adds optional tracking of the last scan

time

for relations. It updates pg_stat_*_tables with new last_seq_scan and
last_idx_scan columns, and pg_stat_*_indexes with a last_idx_scan column

to

help with this.

Would it be simpler to allow the sequential and index scan columns to be
cleared so you can look later to see if it is non-zero? Should we allow

I don't think so, because then stat values wouldn't necessarily correlate with
each other, and you wouldn't know when any of them were last reset unless we
started tracking each individual reset. At least now you can see when they were
all reset, and you know they were reset at the same time.

Yeah, true. I was more asking if these two columns are in some way
special or if people would want a more general solution, and if so, is
that something we want in core Postgres.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Indecision is a decision. Inaction is an action. Mark Batterson

Dave Page

dpage@pgadmin.org

over 3 years ago

In reply to: Bruce Momjian (#6)

Re: Tracking last scan time

On Wed, 24 Aug 2022 at 16:03, Bruce Momjian <bruce@momjian.us> wrote:

On Wed, Aug 24, 2022 at 04:01:21PM +0100, Dave Page wrote:

On Wed, 24 Aug 2022 at 15:18, Bruce Momjian <bruce@momjian.us> wrote:

On Tue, Aug 23, 2022 at 10:55:09AM +0100, Dave Page wrote:

Often it is beneficial to review one's schema with a view to

removing

indexes

(and sometimes tables) that are no longer required. It's very

difficult

to

understand when that is the case by looking at the number of scans

of a

relation as, for example, an index may be used infrequently but

may be

critical

in those times when it is used.

The attached patch against HEAD adds optional tracking of the last

scan

time

for relations. It updates pg_stat_*_tables with new last_seq_scan

and

last_idx_scan columns, and pg_stat_*_indexes with a last_idx_scan

column

to

help with this.

Would it be simpler to allow the sequential and index scan columns

to be

cleared so you can look later to see if it is non-zero? Should we

allow

I don't think so, because then stat values wouldn't necessarily

correlate with

each other, and you wouldn't know when any of them were last reset

unless we

started tracking each individual reset. At least now you can see when

they were

all reset, and you know they were reset at the same time.

Yeah, true. I was more asking if these two columns are in some way
special or if people would want a more general solution, and if so, is
that something we want in core Postgres.

They're special in the sense that they're the ones you're most likely going
to look at to see how much a relation is used I think (at least, I'd look
at them rather than the tuple counts).

There are certainly other things for which a last usage value may be
useful. Functions/procedures for example, or views. The benefits to
removing unused objects of that type are far, far lower than indexes or
tables of course.

There are other potential use cases for similar timestamps, such as object
creation times (and creating user), but they are more useful for auditing
than monitoring and optimisation.

--
Dave Page
Blog: https://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: https://www.enterprisedb.com

David Rowley

dgrowleyml@gmail.com

over 3 years ago

In reply to: Bruce Momjian (#6)

Re: Tracking last scan time

On Thu, 25 Aug 2022 at 03:03, Bruce Momjian <bruce@momjian.us> wrote:

On Wed, Aug 24, 2022 at 04:01:21PM +0100, Dave Page wrote:

On Wed, 24 Aug 2022 at 15:18, Bruce Momjian <bruce@momjian.us> wrote:
Would it be simpler to allow the sequential and index scan columns to be
cleared so you can look later to see if it is non-zero? Should we allow

I don't think so, because then stat values wouldn't necessarily correlate with
each other, and you wouldn't know when any of them were last reset unless we
started tracking each individual reset. At least now you can see when they were
all reset, and you know they were reset at the same time.

Yeah, true. I was more asking if these two columns are in some way
special or if people would want a more general solution, and if so, is
that something we want in core Postgres.

Back when I used to do a bit of PostgreSQL DBA stuff, I had a nightly
job setup to record the state of pg_stat_all_tables and put that into
another table along with the current date. I then had a view that did
some calculations with col - LAG(col) OVER (PARTITION BY relid ORDER
BY date) to fetch the numerical values for each date. I didn't ever
want to reset the stats because it messes with autovacuum. If you zero
out n_ins_since_vacuum more often than auto-vacuum would trigger, then
bad things happen over time (we should really warn about that in the
docs).

I don't have a particular opinion about the patch, I'm just pointing
out that there are other ways. Even just writing down the numbers on a
post-it note and coming back in a month to see if they've changed is
enough to tell if the table or index has been used.

We do also need to consider now that stats are stored in shared memory
that any fields we add are in RAM.

David

Dave Page

dpage@pgadmin.org

over 3 years ago

In reply to: David Rowley (#8)

1 attachment(s)

Re: Tracking last scan time

On Thu, 25 Aug 2022 at 01:44, David Rowley <dgrowleyml@gmail.com> wrote:

On Thu, 25 Aug 2022 at 03:03, Bruce Momjian <bruce@momjian.us> wrote:

On Wed, Aug 24, 2022 at 04:01:21PM +0100, Dave Page wrote:

On Wed, 24 Aug 2022 at 15:18, Bruce Momjian <bruce@momjian.us> wrote:
Would it be simpler to allow the sequential and index scan columns

to be

cleared so you can look later to see if it is non-zero? Should we

allow

I don't think so, because then stat values wouldn't necessarily

correlate with

each other, and you wouldn't know when any of them were last reset

unless we

started tracking each individual reset. At least now you can see when

they were

all reset, and you know they were reset at the same time.

Yeah, true. I was more asking if these two columns are in some way
special or if people would want a more general solution, and if so, is
that something we want in core Postgres.

Back when I used to do a bit of PostgreSQL DBA stuff, I had a nightly
job setup to record the state of pg_stat_all_tables and put that into
another table along with the current date. I then had a view that did
some calculations with col - LAG(col) OVER (PARTITION BY relid ORDER
BY date) to fetch the numerical values for each date. I didn't ever
want to reset the stats because it messes with autovacuum. If you zero
out n_ins_since_vacuum more often than auto-vacuum would trigger, then
bad things happen over time (we should really warn about that in the
docs).

I don't have a particular opinion about the patch, I'm just pointing
out that there are other ways. Even just writing down the numbers on a
post-it note and coming back in a month to see if they've changed is
enough to tell if the table or index has been used.

There are usually other ways to perform monitoring tasks, but there is
something to be said for the convenience of having functionality built in
and not having to rely on tools, scripts, or post-it notes :-)

We do also need to consider now that stats are stored in shared memory
that any fields we add are in RAM.

That is a fair point. I believe this is both minimal, and useful though.

I've attached a v2 patch that incorporates Greg's suggestions.

--
Dave Page
Blog: https://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: https://www.enterprisedb.com

Attachments:

last_scan_v2.diffapplication/octet-stream; name=last_scan_v2.diffDownload

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index a5cd4e44c7..0753dbb45c 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7983,6 +7983,38 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-track-scan-timestamps" xreflabel="track_scan_timestamps">
+      <term><varname>track_scan_timestamps</varname> (<type>boolean</type>)
+      <indexterm>
+       <primary><varname>track_scan_timestamps</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Enables the tracking of the last scan time for tables and indexes.
+        This parameter is off by default, as it will repeatedly query the
+        operating system for the current time, which may cause significant
+        overhead on some platforms.  You can use the
+        <xref linkend="pgtesttiming"/> tool to measure the overhead of
+        timing on your system.
+        Last sequential and index scan times for tables are displayed in
+        <link linkend="pg-stat-all-tables-view">
+        <structname>pg_stat_all_tables</structname></link> system view as
+        well as the <structname>pg_stat_sys_tables</structname> and
+        <structname>pg_stat_user_tables</structname> views.
+        Similarly, the last scan times for indexes are shown in the
+        <link linkend="pg-stat-all-indexes-view">
+        <structname>pg_stat_all_indexes</structname></link> view as well
+        as the <structname>pg_stat_sys_indexes</structname> and
+        <structname>pg_stat_user_indexes</structname> views.
+        The last scan times are useful in determining when tables and indexes
+        were last used. In the case of indexes, this can be particularly
+        useful when reviewing the schema with the aim of removing unnecessary
+        indexes.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-track-io-timing" xreflabel="track_io_timing">
       <term><varname>track_io_timing</varname> (<type>boolean</type>)
       <indexterm>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 1d9509a2f6..8e5ca789fe 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -4385,6 +4385,16 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>last_seq_scan</structfield> <type>timestamptz</type>
+      </para>
+      <para>
+       The time of the last sequential scan of this table.
+       Only populated if <xref linkend="guc-track-scan-timestamps"/> is enabled
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>seq_tup_read</structfield> <type>bigint</type>
@@ -4403,6 +4413,16 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>last_idx_scan</structfield> <type>timestamptz</type>
+      </para>
+      <para>
+       The time of the last index scan of this table.
+       Only populated if <xref linkend="guc-track-scan-timestamps"/> is enabled
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>idx_tup_fetch</structfield> <type>bigint</type>
@@ -4654,6 +4674,16 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>last_idx_scan</structfield> <type>timestamptz</type>
+      </para>
+      <para>
+       The time of the last scan of this index.
+       Only populated if <xref linkend="guc-track-scan-timestamps"/> is enabled
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>idx_tup_read</structfield> <type>bigint</type>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 5a844b63a1..801ba03ba4 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -657,8 +657,10 @@ CREATE VIEW pg_stat_all_tables AS
             N.nspname AS schemaname,
             C.relname AS relname,
             pg_stat_get_numscans(C.oid) AS seq_scan,
+            pg_stat_get_lastscan(C.oid) AS last_seq_scan,
             pg_stat_get_tuples_returned(C.oid) AS seq_tup_read,
             sum(pg_stat_get_numscans(I.indexrelid))::bigint AS idx_scan,
+            max(pg_stat_get_lastscan(I.indexrelid)) AS last_idx_scan,
             sum(pg_stat_get_tuples_fetched(I.indexrelid))::bigint +
             pg_stat_get_tuples_fetched(C.oid) AS idx_tup_fetch,
             pg_stat_get_tuples_inserted(C.oid) AS n_tup_ins,
@@ -775,6 +777,7 @@ CREATE VIEW pg_stat_all_indexes AS
             C.relname AS relname,
             I.relname AS indexrelname,
             pg_stat_get_numscans(I.oid) AS idx_scan,
+            pg_stat_get_lastscan(I.oid) AS last_idx_scan,
             pg_stat_get_tuples_returned(I.oid) AS idx_tup_read,
             pg_stat_get_tuples_fetched(I.oid) AS idx_tup_fetch
     FROM pg_class C JOIN
diff --git a/src/backend/utils/activity/pgstat_relation.c b/src/backend/utils/activity/pgstat_relation.c
index a846d9ffb6..ea3ceb0792 100644
--- a/src/backend/utils/activity/pgstat_relation.c
+++ b/src/backend/utils/activity/pgstat_relation.c
@@ -50,6 +50,10 @@ static void save_truncdrop_counters(PgStat_TableXactStatus *trans, bool is_drop)
 static void restore_truncdrop_counters(PgStat_TableXactStatus *trans);
 
 
+/* GUC variables */
+bool		pgstat_track_scan_timestamps = false;
+
+
 /*
  * Copy stats between relations. This is used for things like REINDEX
  * CONCURRENTLY.
@@ -789,6 +793,9 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
 	tabentry = &shtabstats->stats;
 
 	tabentry->numscans += lstats->t_counts.t_numscans;
+	if (pgstat_track_scan_timestamps && lstats->t_counts.t_numscans &&
+		tabentry->lastscan + USECS_PER_SEC < GetCurrentTransactionStopTimestamp())
+			tabentry->lastscan = GetCurrentTimestamp();
 	tabentry->tuples_returned += lstats->t_counts.t_tuples_returned;
 	tabentry->tuples_fetched += lstats->t_counts.t_tuples_fetched;
 	tabentry->tuples_inserted += lstats->t_counts.t_tuples_inserted;
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index d9e2a79382..7eb8660989 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -52,6 +52,25 @@ pg_stat_get_numscans(PG_FUNCTION_ARGS)
 }
 
 
+Datum
+pg_stat_get_lastscan(PG_FUNCTION_ARGS)
+{
+	Oid			relid = PG_GETARG_OID(0);
+	TimestampTz	result;
+	PgStat_StatTabEntry *tabentry;
+
+	if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)
+		result = 0;
+	else
+		result = tabentry->lastscan;
+
+	if (result == 0)
+		PG_RETURN_NULL();
+	else
+		PG_RETURN_TIMESTAMPTZ(result);
+}
+
+
 Datum
 pg_stat_get_tuples_returned(PG_FUNCTION_ARGS)
 {
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 9fbbfb1be5..5de7c1e0f9 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1600,6 +1600,15 @@ static struct config_bool ConfigureNamesBool[] =
 		false,
 		NULL, NULL, NULL
 	},
+	{
+		{"track_scan_timestamps", PGC_SUSET, STATS_CUMULATIVE,
+			gettext_noop("Records the last scan time for relations."),
+			NULL
+		},
+		&pgstat_track_scan_timestamps,
+		false,
+		NULL, NULL, NULL
+	},
 	{
 		{"track_wal_io_timing", PGC_SUSET, STATS_CUMULATIVE,
 			gettext_noop("Collects timing statistics for WAL I/O activity."),
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index be47583122..f5320fb39e 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5249,6 +5249,10 @@
   proname => 'pg_stat_get_numscans', provolatile => 's', proparallel => 'r',
   prorettype => 'int8', proargtypes => 'oid',
   prosrc => 'pg_stat_get_numscans' },
+{ oid => '2173', descr => 'statistics: time of the last scan for table/index',
+  proname => 'pg_stat_get_lastscan', provolatile => 's', proparallel => 'r',
+  prorettype => 'timestamptz', proargtypes => 'oid',
+  prosrc => 'pg_stat_get_lastscan' },
 { oid => '1929', descr => 'statistics: number of tuples read by seqscan',
   proname => 'pg_stat_get_tuples_returned', provolatile => 's',
   proparallel => 'r', prorettype => 'int8', proargtypes => 'oid',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index ac28f813b4..18ad9af978 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -242,7 +242,7 @@ typedef struct PgStat_TableXactStatus
  * ------------------------------------------------------------
  */
 
-#define PGSTAT_FILE_FORMAT_ID	0x01A5BCA7
+#define PGSTAT_FILE_FORMAT_ID	0x01A5BCA8
 
 typedef struct PgStat_ArchiverStats
 {
@@ -355,6 +355,7 @@ typedef struct PgStat_StatSubEntry
 typedef struct PgStat_StatTabEntry
 {
 	PgStat_Counter numscans;
+	TimestampTz lastscan;
 
 	PgStat_Counter tuples_returned;
 	PgStat_Counter tuples_fetched;
@@ -643,6 +644,7 @@ extern PgStat_WalStats *pgstat_fetch_stat_wal(void);
 /* GUC parameters */
 extern PGDLLIMPORT bool pgstat_track_counts;
 extern PGDLLIMPORT int pgstat_track_functions;
+extern PGDLLIMPORT bool pgstat_track_scan_timestamps;
 extern PGDLLIMPORT int pgstat_fetch_consistency;
 
 
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 7ec3d2688f..d9287ef0c9 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1768,6 +1768,7 @@ pg_stat_all_indexes| SELECT c.oid AS relid,
     c.relname,
     i.relname AS indexrelname,
     pg_stat_get_numscans(i.oid) AS idx_scan,
+    pg_stat_get_lastscan(i.oid) AS last_idx_scan,
     pg_stat_get_tuples_returned(i.oid) AS idx_tup_read,
     pg_stat_get_tuples_fetched(i.oid) AS idx_tup_fetch
    FROM (((pg_class c
@@ -1779,8 +1780,10 @@ pg_stat_all_tables| SELECT c.oid AS relid,
     n.nspname AS schemaname,
     c.relname,
     pg_stat_get_numscans(c.oid) AS seq_scan,
+    pg_stat_get_lastscan(c.oid) AS last_seq_scan,
     pg_stat_get_tuples_returned(c.oid) AS seq_tup_read,
     (sum(pg_stat_get_numscans(i.indexrelid)))::bigint AS idx_scan,
+    max(pg_stat_get_lastscan(i.indexrelid)) AS last_idx_scan,
     ((sum(pg_stat_get_tuples_fetched(i.indexrelid)))::bigint + pg_stat_get_tuples_fetched(c.oid)) AS idx_tup_fetch,
     pg_stat_get_tuples_inserted(c.oid) AS n_tup_ins,
     pg_stat_get_tuples_updated(c.oid) AS n_tup_upd,
@@ -2112,6 +2115,7 @@ pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
     pg_stat_all_indexes.relname,
     pg_stat_all_indexes.indexrelname,
     pg_stat_all_indexes.idx_scan,
+    pg_stat_all_indexes.last_idx_scan,
     pg_stat_all_indexes.idx_tup_read,
     pg_stat_all_indexes.idx_tup_fetch
    FROM pg_stat_all_indexes
@@ -2120,8 +2124,10 @@ pg_stat_sys_tables| SELECT pg_stat_all_tables.relid,
     pg_stat_all_tables.schemaname,
     pg_stat_all_tables.relname,
     pg_stat_all_tables.seq_scan,
+    pg_stat_all_tables.last_seq_scan,
     pg_stat_all_tables.seq_tup_read,
     pg_stat_all_tables.idx_scan,
+    pg_stat_all_tables.last_idx_scan,
     pg_stat_all_tables.idx_tup_fetch,
     pg_stat_all_tables.n_tup_ins,
     pg_stat_all_tables.n_tup_upd,
@@ -2156,6 +2162,7 @@ pg_stat_user_indexes| SELECT pg_stat_all_indexes.relid,
     pg_stat_all_indexes.relname,
     pg_stat_all_indexes.indexrelname,
     pg_stat_all_indexes.idx_scan,
+    pg_stat_all_indexes.last_idx_scan,
     pg_stat_all_indexes.idx_tup_read,
     pg_stat_all_indexes.idx_tup_fetch
    FROM pg_stat_all_indexes
@@ -2164,8 +2171,10 @@ pg_stat_user_tables| SELECT pg_stat_all_tables.relid,
     pg_stat_all_tables.schemaname,
     pg_stat_all_tables.relname,
     pg_stat_all_tables.seq_scan,
+    pg_stat_all_tables.last_seq_scan,
     pg_stat_all_tables.seq_tup_read,
     pg_stat_all_tables.idx_scan,
+    pg_stat_all_tables.last_idx_scan,
     pg_stat_all_tables.idx_tup_fetch,
     pg_stat_all_tables.n_tup_ins,
     pg_stat_all_tables.n_tup_upd,

#10

Bruce Momjian

bruce@momjian.us

over 3 years ago

In reply to: Dave Page (#9)

Re: Tracking last scan time

On Fri, Aug 26, 2022 at 02:05:36PM +0100, Dave Page wrote:

On Thu, 25 Aug 2022 at 01:44, David Rowley <dgrowleyml@gmail.com> wrote:
I don't have a particular opinion about the patch, I'm just pointing
out that there are other ways. Even just writing down the numbers on a
post-it note and coming back in a month to see if they've changed is
enough to tell if the table or index has been used.

There are usually other ways to perform monitoring tasks, but there is
something to be said for the convenience of having functionality built in and
not having to rely on tools, scripts, or post-it notes :-)

Should we consider using something cheaper like time() so we don't need
a GUC to enable this?

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Indecision is a decision. Inaction is an action. Mark Batterson

#11

Dave Page

dpage@pgadmin.org

over 3 years ago

In reply to: Bruce Momjian (#10)

Re: Tracking last scan time

On Tue, 30 Aug 2022 at 19:46, Bruce Momjian <bruce@momjian.us> wrote:

On Fri, Aug 26, 2022 at 02:05:36PM +0100, Dave Page wrote:

On Thu, 25 Aug 2022 at 01:44, David Rowley <dgrowleyml@gmail.com> wrote:
I don't have a particular opinion about the patch, I'm just pointing
out that there are other ways. Even just writing down the numbers on

a

post-it note and coming back in a month to see if they've changed is
enough to tell if the table or index has been used.

There are usually other ways to perform monitoring tasks, but there is
something to be said for the convenience of having functionality built

in and

not having to rely on tools, scripts, or post-it notes :-)

Should we consider using something cheaper like time() so we don't need
a GUC to enable this?

Interesting idea, but on my mac at least, 100,000,000 gettimeofday() calls
takes about 2 seconds, whilst 100,000,000 time() calls takes 14(!) seconds.

--
Dave Page
Blog: https://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: https://www.enterprisedb.com

#12

Bruce Momjian

bruce@momjian.us

over 3 years ago

In reply to: Dave Page (#11)

Re: Tracking last scan time

On Wed, Aug 31, 2022 at 05:02:33PM +0100, Dave Page wrote:

On Tue, 30 Aug 2022 at 19:46, Bruce Momjian <bruce@momjian.us> wrote:

On Fri, Aug 26, 2022 at 02:05:36PM +0100, Dave Page wrote:

On Thu, 25 Aug 2022 at 01:44, David Rowley <dgrowleyml@gmail.com> wrote:
I don't have a particular opinion about the patch, I'm just pointing
out that there are other ways. Even just writing down the numbers on

a

post-it note and coming back in a month to see if they've changed is
enough to tell if the table or index has been used.

There are usually other ways to perform monitoring tasks, but there is
something to be said for the convenience of having functionality built in

and

not having to rely on tools, scripts, or post-it notes :-)

Should we consider using something cheaper like time() so we don't need
a GUC to enable this?

Interesting idea, but on my mac at least, 100,000,000 gettimeofday() calls
takes about 2 seconds, whilst 100,000,000 time() calls takes 14(!) seconds.

Wow. I was just thinking you need second-level accuracy, which must be
cheap somewhere.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Indecision is a decision. Inaction is an action. Mark Batterson

#13

Andres Freund

andres@anarazel.de

over 3 years ago

In reply to: Dave Page (#1)

Re: Tracking last scan time

Hi,

On 2022-08-23 10:55:09 +0100, Dave Page wrote:

Often it is beneficial to review one's schema with a view to removing
indexes (and sometimes tables) that are no longer required. It's very
difficult to understand when that is the case by looking at the number of
scans of a relation as, for example, an index may be used infrequently but
may be critical in those times when it is used.

The attached patch against HEAD adds optional tracking of the last scan
time for relations. It updates pg_stat_*_tables with new last_seq_scan and
last_idx_scan columns, and pg_stat_*_indexes with a last_idx_scan column to
help with this.

Due to the use of gettimeofday(), those values are only maintained if a new
GUC, track_scans, is set to on. By default, it is off.

I did run a 12 hour test to see what the performance impact is. pgbench was
run with scale factor 10000 and 75 users across 4 identical bare metal
machines running Rocky 8 in parallel which showed roughly a -2% average
performance penalty against HEAD with track_scans enabled. Machines were
PowerEdge R7525's with 128GB RAM, dual 16C/32T AMD 7302 CPUs, with the data
directory on 6 x 800GB 12Gb/s SSD SAS drives in RAID 0. Kernel time source
is tsc.

HEAD track_scans Penalty (%)
box1 19582.49735 19341.8881 -1.22869541
box2 19936.55513 19928.07479 -0.04253664659
box3 19631.78895 18649.64379 -5.002830696
box4 19810.86767 19420.67192 -1.969604525
Average 19740.42728 19335.06965 -2.05343896

Based on the size of those numbers this was a r/w pgbench. If it has this
noticable an impact for r/w, with a pretty low number of scans/sec, how's the
overhead for r/o (which can have 2 orders of magnitude more scans/sec)? It
must be quite bad.

I don't think we should accept this feature with this overhead - but I also
think we can do better, by accepting a bit less accuracy. For this to be
useful we don't need a perfectly accurate timestamp. The statement start time
is probably not accurate enough, but we could just have bgwriter or such
update one in shared memory every time we wake up? Or perhaps we could go to
an even lower granularity, by putting in the current LSN or such?

Greetings,

Andres Freund

#14

Matthias van de Meent

boekewurm+postgres@gmail.com

over 3 years ago

In reply to: Andres Freund (#13)

Re: Tracking last scan time

On Wed, 31 Aug 2022 at 18:21, Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2022-08-23 10:55:09 +0100, Dave Page wrote:

Often it is beneficial to review one's schema with a view to removing
indexes (and sometimes tables) that are no longer required. It's very
difficult to understand when that is the case by looking at the number of
scans of a relation as, for example, an index may be used infrequently but
may be critical in those times when it is used.

The attached patch against HEAD adds optional tracking of the last scan
time for relations. It updates pg_stat_*_tables with new last_seq_scan and
last_idx_scan columns, and pg_stat_*_indexes with a last_idx_scan column to
help with this.

Due to the use of gettimeofday(), those values are only maintained if a new
GUC, track_scans, is set to on. By default, it is off.

I did run a 12 hour test to see what the performance impact is. pgbench was
run with scale factor 10000 and 75 users across 4 identical bare metal
machines running Rocky 8 in parallel which showed roughly a -2% average
performance penalty against HEAD with track_scans enabled. Machines were
PowerEdge R7525's with 128GB RAM, dual 16C/32T AMD 7302 CPUs, with the data
directory on 6 x 800GB 12Gb/s SSD SAS drives in RAID 0. Kernel time source
is tsc.

HEAD track_scans Penalty (%)
box1 19582.49735 19341.8881 -1.22869541
box2 19936.55513 19928.07479 -0.04253664659
box3 19631.78895 18649.64379 -5.002830696
box4 19810.86767 19420.67192 -1.969604525
Average 19740.42728 19335.06965 -2.05343896

Based on the size of those numbers this was a r/w pgbench. If it has this
noticable an impact for r/w, with a pretty low number of scans/sec, how's the
overhead for r/o (which can have 2 orders of magnitude more scans/sec)? It
must be quite bad.

I don't think we should accept this feature with this overhead - but I also
think we can do better, by accepting a bit less accuracy. For this to be
useful we don't need a perfectly accurate timestamp. The statement start time
is probably not accurate enough, but we could just have bgwriter or such
update one in shared memory every time we wake up? Or perhaps we could go to
an even lower granularity, by putting in the current LSN or such?

I don't think that LSN is precise enough. For example, if you're in a
(mostly) read-only system, the system may go long times without any
meaningful records being written.

As for having a lower granularity and preventing the
one-syscall-per-Relation issue, can't we reuse the query_start or
state_change timestamps that appear in pg_stat_activity (potentially
updated immediately before this stat flush), or some other per-backend
timestamp that is already maintained and considered accurate enough
for this use?
Regardless, with this patch as it is we get a new timestamp for each
relation processed, which I think is a waste of time (heh) even in
VDSO-enabled systems.

Apart from the above, I don't have any other meaningful opinion on
this patch - it might be a good addition, but I don't consume stats
often enough to make a good cost / benefit comparison.

Kind regards,

Matthias van de Meent

#15

Bruce Momjian

bruce@momjian.us

over 3 years ago

In reply to: Matthias van de Meent (#14)

Re: Tracking last scan time

On Wed, Aug 31, 2022 at 07:52:49PM +0200, Matthias van de Meent wrote:

As for having a lower granularity and preventing the
one-syscall-per-Relation issue, can't we reuse the query_start or
state_change timestamps that appear in pg_stat_activity (potentially

Yeah, query start should be fine, but not transaction start time.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Indecision is a decision. Inaction is an action. Mark Batterson

#16

Andres Freund

andres@anarazel.de

over 3 years ago

In reply to: Matthias van de Meent (#14)

Re: Tracking last scan time

Hi,

On 2022-08-31 19:52:49 +0200, Matthias van de Meent wrote:

As for having a lower granularity and preventing the
one-syscall-per-Relation issue, can't we reuse the query_start or
state_change timestamps that appear in pg_stat_activity (potentially
updated immediately before this stat flush), or some other per-backend
timestamp that is already maintained and considered accurate enough
for this use?

The problem is that it won't change at all for a query that runs for a week -
and we'll report the timestamp from a week ago when it finally ends.

But given this is done when stats are flushed, which only happens after the
transaction ended, we can just use GetCurrentTransactionStopTimestamp() - if
we got to flushing the transaction stats we'll already have computed that.

tabentry->numscans += lstats->t_counts.t_numscans;
+	if (pgstat_track_scans && lstats->t_counts.t_numscans)
+		tabentry->lastscan = GetCurrentTimestamp();

Besides replacing GetCurrentTimestamp() with
GetCurrentTransactionStopTimestamp(), this should then also check if
tabentry->lastscan is already newer than the new timestamp.

Greetings,

Andres Freund

#17

Bruce Momjian

bruce@momjian.us

over 3 years ago

In reply to: Andres Freund (#16)

Re: Tracking last scan time

On Wed, Aug 31, 2022 at 11:56:29AM -0700, Andres Freund wrote:

Hi,

On 2022-08-31 19:52:49 +0200, Matthias van de Meent wrote:

As for having a lower granularity and preventing the
one-syscall-per-Relation issue, can't we reuse the query_start or
state_change timestamps that appear in pg_stat_activity (potentially
updated immediately before this stat flush), or some other per-backend
timestamp that is already maintained and considered accurate enough
for this use?

The problem is that it won't change at all for a query that runs for a week -
and we'll report the timestamp from a week ago when it finally ends.

But given this is done when stats are flushed, which only happens after the
transaction ended, we can just use GetCurrentTransactionStopTimestamp() - if
we got to flushing the transaction stats we'll already have computed that.

Oh, good point --- it is safer to show a more recent time than a too-old
time.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Indecision is a decision. Inaction is an action. Mark Batterson

#18

Dave Page

dpage@pgadmin.org

over 3 years ago

In reply to: Bruce Momjian (#12)

Re: Tracking last scan time

On Wed, 31 Aug 2022 at 17:13, Bruce Momjian <bruce@momjian.us> wrote:

On Wed, Aug 31, 2022 at 05:02:33PM +0100, Dave Page wrote:

On Tue, 30 Aug 2022 at 19:46, Bruce Momjian <bruce@momjian.us> wrote:

On Fri, Aug 26, 2022 at 02:05:36PM +0100, Dave Page wrote:

On Thu, 25 Aug 2022 at 01:44, David Rowley <dgrowleyml@gmail.com>

wrote:

I don't have a particular opinion about the patch, I'm just

pointing

out that there are other ways. Even just writing down the

numbers on

a

post-it note and coming back in a month to see if they've

changed is

enough to tell if the table or index has been used.

There are usually other ways to perform monitoring tasks, but

there is

something to be said for the convenience of having functionality

built in

and

not having to rely on tools, scripts, or post-it notes :-)

Should we consider using something cheaper like time() so we don't

need

a GUC to enable this?

Interesting idea, but on my mac at least, 100,000,000 gettimeofday()

calls

takes about 2 seconds, whilst 100,000,000 time() calls takes 14(!)

seconds.

Wow. I was just thinking you need second-level accuracy, which must be
cheap somewhere.

Second-level accuracy would indeed be fine for this. Frankly, for my use
case just the date would be enough, but I can imagine people wanting
greater accuracy than that.

And yes, I was very surprised by the timing results I got as well. I guess
it's a quirk of macOS - on a Linux box I get ~4s for gettimeofday() and ~1s
for time().

--
Dave Page
Blog: https://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: https://www.enterprisedb.com

#19

Bruce Momjian

bruce@momjian.us

over 3 years ago

In reply to: Dave Page (#18)

Re: Tracking last scan time

On Thu, Sep 1, 2022 at 09:46:59AM +0100, Dave Page wrote:

On Wed, 31 Aug 2022 at 17:13, Bruce Momjian <bruce@momjian.us> wrote:
Wow. I was just thinking you need second-level accuracy, which must be
cheap somewhere.

Second-level accuracy would indeed be fine for this. Frankly, for my use case
just the date would be enough, but I can imagine people wanting greater
accuracy than that.

And yes, I was very surprised by the timing results I got as well. I guess it's
a quirk of macOS - on a Linux box I get ~4s for gettimeofday() and ~1s for time
().

i think we lose 95% of our users if we require it to be enabled so let's
work to find a way it can be always enabled.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Indecision is a decision. Inaction is an action. Mark Batterson

#20

Dave Page

dpage@pgadmin.org

over 3 years ago

In reply to: Bruce Momjian (#19)

Re: Tracking last scan time

On Thu, 1 Sept 2022 at 13:04, Bruce Momjian <bruce@momjian.us> wrote:

On Thu, Sep 1, 2022 at 09:46:59AM +0100, Dave Page wrote:

On Wed, 31 Aug 2022 at 17:13, Bruce Momjian <bruce@momjian.us> wrote:
Wow. I was just thinking you need second-level accuracy, which must

be

cheap somewhere.

Second-level accuracy would indeed be fine for this. Frankly, for my use

case

just the date would be enough, but I can imagine people wanting greater
accuracy than that.

And yes, I was very surprised by the timing results I got as well. I

guess it's

a quirk of macOS - on a Linux box I get ~4s for gettimeofday() and ~1s

for time

().

i think we lose 95% of our users if we require it to be enabled so let's
work to find a way it can be always enabled.

So based on Andres' suggestion, something like this seems like it might
work:

if (pgstat_track_scan_timestamps && lstats->t_counts.t_numscans)
{
TimestampTz t = GetCurrentTransactionStopTimestamp();
if (t > tabentry->lastscan)
tabentry->lastscan = t;
}

If that seems like a good option, I can run some more benchmarks (and then
remove the GUC if it looks good).

--
Dave Page
Blog: https://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: https://www.enterprisedb.com

#21

Matthias van de Meent

boekewurm+postgres@gmail.com

over 3 years ago

In reply to: Andres Freund (#16)

Re: Tracking last scan time

On Wed, 31 Aug 2022 at 20:56, Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2022-08-31 19:52:49 +0200, Matthias van de Meent wrote:

As for having a lower granularity and preventing the
one-syscall-per-Relation issue, can't we reuse the query_start or
state_change timestamps that appear in pg_stat_activity (potentially
updated immediately before this stat flush), or some other per-backend
timestamp that is already maintained and considered accurate enough
for this use?

The problem is that it won't change at all for a query that runs for a week -
and we'll report the timestamp from a week ago when it finally ends.

This earlier proposal to reuse pg_stat_activity values is also invalid
because those timestamps don't exist when you SET track_activities =
OFF.

But given this is done when stats are flushed, which only happens after the
transaction ended, we can just use GetCurrentTransactionStopTimestamp() - if
we got to flushing the transaction stats we'll already have computed that.

I'm not entirely happy with that, as that would still add function
call overhead, and potentially still call GetCurrentTimestamp() in
this somewhat hot loop.

As an alternative, we could wire the `now` variable in
pgstat_report_stat (generated from
GetCurrentTransactionStopTimestamp() into pgstat_flush_pending_entries
and then into flush_pending_cb (or, store this in a static variable)
so that we can reuse that value, saving any potential function call
overhead.

tabentry->numscans += lstats->t_counts.t_numscans;
+     if (pgstat_track_scans && lstats->t_counts.t_numscans)
+             tabentry->lastscan = GetCurrentTimestamp();
Besides replacing GetCurrentTimestamp() with
GetCurrentTransactionStopTimestamp(), this should then also check if
tabentry->lastscan is already newer than the new timestamp.

I wonder how important that is. This value only gets set in a stats
flush, which may skew the stat update by several seconds (up to
PGSTAT_MAX_INTERVAL). I don't expect concurrent flushes to take so
long that it will set the values to It is possible, but I think it is
extremely unlikely that this is going to be important when you
consider that these stat flushes are not expected to run for more than
1 second.

Kind regards,

Matthias van de Meent

#22

Andres Freund

andres@anarazel.de

over 3 years ago

In reply to: Matthias van de Meent (#21)

Re: Tracking last scan time

Hi,

On 2022-09-01 14:18:42 +0200, Matthias van de Meent wrote:

On Wed, 31 Aug 2022 at 20:56, Andres Freund <andres@anarazel.de> wrote:

But given this is done when stats are flushed, which only happens after the
transaction ended, we can just use GetCurrentTransactionStopTimestamp() - if
we got to flushing the transaction stats we'll already have computed that.

I'm not entirely happy with that, as that would still add function
call overhead, and potentially still call GetCurrentTimestamp() in
this somewhat hot loop.

We already used GetCurrentTransactionStopTimestamp() (as you reference below)
before we get to this point, so I doubt that we'll ever call
GetCurrentTimestamp(). And it's hard to imagine that the function call
overhead of GetCurrentTransactionStopTimestamp() matters compared to acquiring
locks etc.

As an alternative, we could wire the `now` variable in
pgstat_report_stat (generated from
GetCurrentTransactionStopTimestamp() into pgstat_flush_pending_entries
and then into flush_pending_cb (or, store this in a static variable)
so that we can reuse that value, saving any potential function call
overhead.

Passing it in doesn't clearly seem an improvement, but I also don't have a
strong opinion on it. I am strongly against the static variable approach.

tabentry->numscans += lstats->t_counts.t_numscans;
+     if (pgstat_track_scans && lstats->t_counts.t_numscans)
+             tabentry->lastscan = GetCurrentTimestamp();
Besides replacing GetCurrentTimestamp() with
GetCurrentTransactionStopTimestamp(), this should then also check if
tabentry->lastscan is already newer than the new timestamp.
I wonder how important that is. This value only gets set in a stats
flush, which may skew the stat update by several seconds (up to
PGSTAT_MAX_INTERVAL). I don't expect concurrent flushes to take so
long that it will set the values to It is possible, but I think it is
extremely unlikely that this is going to be important when you
consider that these stat flushes are not expected to run for more than
1 second.

I think it'll be confusing if you have values going back and forth, even if
just by a little. And it's cheap to defend against, so why not just do that?

Greetings,

Andres Freund

#23

Dave Page

dpage@pgadmin.org

over 3 years ago

In reply to: Andres Freund (#22)

1 attachment(s)

Re: Tracking last scan time

On Thu, 1 Sept 2022 at 19:35, Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2022-09-01 14:18:42 +0200, Matthias van de Meent wrote:

On Wed, 31 Aug 2022 at 20:56, Andres Freund <andres@anarazel.de> wrote:

But given this is done when stats are flushed, which only happens

after the

transaction ended, we can just use

GetCurrentTransactionStopTimestamp() - if

we got to flushing the transaction stats we'll already have computed

that.

I'm not entirely happy with that, as that would still add function
call overhead, and potentially still call GetCurrentTimestamp() in
this somewhat hot loop.

We already used GetCurrentTransactionStopTimestamp() (as you reference
below)
before we get to this point, so I doubt that we'll ever call
GetCurrentTimestamp(). And it's hard to imagine that the function call
overhead of GetCurrentTransactionStopTimestamp() matters compared to
acquiring
locks etc.

Vik and I looked at this a little, and found that we actually don't have
generally have GetCurrentTransactionStopTimestamp() at this point - a
simple 'select * from pg_class' will result in 9 passes of this code, none
of which have xactStopTimestamp != 0.

After discussing it a little, we came to the conclusion that for the stated
use case, xactStartTimestamp is actually accurate enough, provided that we
only ever update it with a newer value. It would only likely be in extreme
edge-cases where the difference between start and end transaction time
would have any bearing on whether or not one might drop a table/index for
lack of use.

Doing it this way also means we no longer need the GUC to enable the
feature, which as Bruce notes, is likely to lose 95% of users.

Updated patch attached:

- GUC removed.
- The timestamp recorded is xactStartTimestamp.
- Docs updated to make it clear we're recording transaction start time.

--
Dave Page
Blog: https://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: https://www.enterprisedb.com

Attachments:

last_scan_v3.diffapplication/octet-stream; name=last_scan_v3.diffDownload

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 1d9509a2f6..ff5421da46 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -4385,6 +4385,17 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>last_seq_scan</structfield> <type>timestamptz</type>
+      </para>
+      <para>
+       The time of the last sequential scan of this table. The time value
+       is always the transaction start time of the most recent scan of the
+       table to avoid unnecessary overhead in getting the current time
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>seq_tup_read</structfield> <type>bigint</type>
@@ -4403,6 +4414,18 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>last_idx_scan</structfield> <type>timestamptz</type>
+      </para>
+      <para>
+       The time of the last index scan of this table. The time value
+       is always the transaction start time of the most recent index
+       scan of the table to avoid unnecessary overhead in getting the
+       current time
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>idx_tup_fetch</structfield> <type>bigint</type>
@@ -4654,6 +4677,17 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>last_idx_scan</structfield> <type>timestamptz</type>
+      </para>
+      <para>
+       The time of the last scan of this index. The time value
+       is always the transaction start time of the most recent scan of the
+       index to avoid unnecessary overhead in getting the current time
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>idx_tup_read</structfield> <type>bigint</type>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 5a844b63a1..801ba03ba4 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -657,8 +657,10 @@ CREATE VIEW pg_stat_all_tables AS
             N.nspname AS schemaname,
             C.relname AS relname,
             pg_stat_get_numscans(C.oid) AS seq_scan,
+            pg_stat_get_lastscan(C.oid) AS last_seq_scan,
             pg_stat_get_tuples_returned(C.oid) AS seq_tup_read,
             sum(pg_stat_get_numscans(I.indexrelid))::bigint AS idx_scan,
+            max(pg_stat_get_lastscan(I.indexrelid)) AS last_idx_scan,
             sum(pg_stat_get_tuples_fetched(I.indexrelid))::bigint +
             pg_stat_get_tuples_fetched(C.oid) AS idx_tup_fetch,
             pg_stat_get_tuples_inserted(C.oid) AS n_tup_ins,
@@ -775,6 +777,7 @@ CREATE VIEW pg_stat_all_indexes AS
             C.relname AS relname,
             I.relname AS indexrelname,
             pg_stat_get_numscans(I.oid) AS idx_scan,
+            pg_stat_get_lastscan(I.oid) AS last_idx_scan,
             pg_stat_get_tuples_returned(I.oid) AS idx_tup_read,
             pg_stat_get_tuples_fetched(I.oid) AS idx_tup_fetch
     FROM pg_class C JOIN
diff --git a/src/backend/utils/activity/pgstat_relation.c b/src/backend/utils/activity/pgstat_relation.c
index a846d9ffb6..c8c6c29a87 100644
--- a/src/backend/utils/activity/pgstat_relation.c
+++ b/src/backend/utils/activity/pgstat_relation.c
@@ -789,6 +789,12 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
 	tabentry = &shtabstats->stats;
 
 	tabentry->numscans += lstats->t_counts.t_numscans;
+	if (lstats->t_counts.t_numscans)
+	{
+		TimestampTz t = GetCurrentTransactionStartTimestamp();
+		if (t > tabentry->lastscan)
+			tabentry->lastscan = t;
+	}
 	tabentry->tuples_returned += lstats->t_counts.t_tuples_returned;
 	tabentry->tuples_fetched += lstats->t_counts.t_tuples_fetched;
 	tabentry->tuples_inserted += lstats->t_counts.t_tuples_inserted;
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index be15b4b2e5..8c0237c496 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -52,6 +52,25 @@ pg_stat_get_numscans(PG_FUNCTION_ARGS)
 }
 
 
+Datum
+pg_stat_get_lastscan(PG_FUNCTION_ARGS)
+{
+	Oid			relid = PG_GETARG_OID(0);
+	TimestampTz	result;
+	PgStat_StatTabEntry *tabentry;
+
+	if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)
+		result = 0;
+	else
+		result = tabentry->lastscan;
+
+	if (result == 0)
+		PG_RETURN_NULL();
+	else
+		PG_RETURN_TIMESTAMPTZ(result);
+}
+
+
 Datum
 pg_stat_get_tuples_returned(PG_FUNCTION_ARGS)
 {
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index a07e737a33..d9a98e1c2e 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5249,6 +5249,10 @@
   proname => 'pg_stat_get_numscans', provolatile => 's', proparallel => 'r',
   prorettype => 'int8', proargtypes => 'oid',
   prosrc => 'pg_stat_get_numscans' },
+{ oid => '2173', descr => 'statistics: time of the last scan for table/index',
+  proname => 'pg_stat_get_lastscan', provolatile => 's', proparallel => 'r',
+  prorettype => 'timestamptz', proargtypes => 'oid',
+  prosrc => 'pg_stat_get_lastscan' },
 { oid => '1929', descr => 'statistics: number of tuples read by seqscan',
   proname => 'pg_stat_get_tuples_returned', provolatile => 's',
   proparallel => 'r', prorettype => 'int8', proargtypes => 'oid',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index ac28f813b4..597e2f238a 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -242,7 +242,7 @@ typedef struct PgStat_TableXactStatus
  * ------------------------------------------------------------
  */
 
-#define PGSTAT_FILE_FORMAT_ID	0x01A5BCA7
+#define PGSTAT_FILE_FORMAT_ID	0x01A5BCA8
 
 typedef struct PgStat_ArchiverStats
 {
@@ -355,6 +355,7 @@ typedef struct PgStat_StatSubEntry
 typedef struct PgStat_StatTabEntry
 {
 	PgStat_Counter numscans;
+	TimestampTz lastscan;
 
 	PgStat_Counter tuples_returned;
 	PgStat_Counter tuples_fetched;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 7ec3d2688f..d9287ef0c9 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1768,6 +1768,7 @@ pg_stat_all_indexes| SELECT c.oid AS relid,
     c.relname,
     i.relname AS indexrelname,
     pg_stat_get_numscans(i.oid) AS idx_scan,
+    pg_stat_get_lastscan(i.oid) AS last_idx_scan,
     pg_stat_get_tuples_returned(i.oid) AS idx_tup_read,
     pg_stat_get_tuples_fetched(i.oid) AS idx_tup_fetch
    FROM (((pg_class c
@@ -1779,8 +1780,10 @@ pg_stat_all_tables| SELECT c.oid AS relid,
     n.nspname AS schemaname,
     c.relname,
     pg_stat_get_numscans(c.oid) AS seq_scan,
+    pg_stat_get_lastscan(c.oid) AS last_seq_scan,
     pg_stat_get_tuples_returned(c.oid) AS seq_tup_read,
     (sum(pg_stat_get_numscans(i.indexrelid)))::bigint AS idx_scan,
+    max(pg_stat_get_lastscan(i.indexrelid)) AS last_idx_scan,
     ((sum(pg_stat_get_tuples_fetched(i.indexrelid)))::bigint + pg_stat_get_tuples_fetched(c.oid)) AS idx_tup_fetch,
     pg_stat_get_tuples_inserted(c.oid) AS n_tup_ins,
     pg_stat_get_tuples_updated(c.oid) AS n_tup_upd,
@@ -2112,6 +2115,7 @@ pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
     pg_stat_all_indexes.relname,
     pg_stat_all_indexes.indexrelname,
     pg_stat_all_indexes.idx_scan,
+    pg_stat_all_indexes.last_idx_scan,
     pg_stat_all_indexes.idx_tup_read,
     pg_stat_all_indexes.idx_tup_fetch
    FROM pg_stat_all_indexes
@@ -2120,8 +2124,10 @@ pg_stat_sys_tables| SELECT pg_stat_all_tables.relid,
     pg_stat_all_tables.schemaname,
     pg_stat_all_tables.relname,
     pg_stat_all_tables.seq_scan,
+    pg_stat_all_tables.last_seq_scan,
     pg_stat_all_tables.seq_tup_read,
     pg_stat_all_tables.idx_scan,
+    pg_stat_all_tables.last_idx_scan,
     pg_stat_all_tables.idx_tup_fetch,
     pg_stat_all_tables.n_tup_ins,
     pg_stat_all_tables.n_tup_upd,
@@ -2156,6 +2162,7 @@ pg_stat_user_indexes| SELECT pg_stat_all_indexes.relid,
     pg_stat_all_indexes.relname,
     pg_stat_all_indexes.indexrelname,
     pg_stat_all_indexes.idx_scan,
+    pg_stat_all_indexes.last_idx_scan,
     pg_stat_all_indexes.idx_tup_read,
     pg_stat_all_indexes.idx_tup_fetch
    FROM pg_stat_all_indexes
@@ -2164,8 +2171,10 @@ pg_stat_user_tables| SELECT pg_stat_all_tables.relid,
     pg_stat_all_tables.schemaname,
     pg_stat_all_tables.relname,
     pg_stat_all_tables.seq_scan,
+    pg_stat_all_tables.last_seq_scan,
     pg_stat_all_tables.seq_tup_read,
     pg_stat_all_tables.idx_scan,
+    pg_stat_all_tables.last_idx_scan,
     pg_stat_all_tables.idx_tup_fetch,
     pg_stat_all_tables.n_tup_ins,
     pg_stat_all_tables.n_tup_upd,

#24

Andres Freund

andres@anarazel.de

over 3 years ago

In reply to: Dave Page (#23)

Re: Tracking last scan time

Hi,

On 2022-09-06 14:15:56 +0100, Dave Page wrote:

Vik and I looked at this a little, and found that we actually don't have
generally have GetCurrentTransactionStopTimestamp() at this point - a
simple 'select * from pg_class' will result in 9 passes of this code, none
of which have xactStopTimestamp != 0.

Huh, pgstat_report_stat() used GetCurrentTransactionStopTimestamp() has used
for a long time. Wonder when that was broken. Looks like it's set only when a
xid is assigned. We should fix this.

After discussing it a little, we came to the conclusion that for the stated
use case, xactStartTimestamp is actually accurate enough, provided that we
only ever update it with a newer value. It would only likely be in extreme
edge-cases where the difference between start and end transaction time
would have any bearing on whether or not one might drop a table/index for
lack of use.

I don't at all agree with this. Since we already use
GetCurrentTransactionStopTimestamp() in this path we should fix it.

Greetings,

Andres Freund

#25

Kyotaro Horiguchi

horikyota.ntt@gmail.com

over 3 years ago

In reply to: Andres Freund (#24)

Re: Tracking last scan time

At Tue, 6 Sep 2022 08:53:25 -0700, Andres Freund <andres@anarazel.de> wrote in

Hi,

On 2022-09-06 14:15:56 +0100, Dave Page wrote:

Vik and I looked at this a little, and found that we actually don't have
generally have GetCurrentTransactionStopTimestamp() at this point - a
simple 'select * from pg_class' will result in 9 passes of this code, none
of which have xactStopTimestamp != 0.

Huh, pgstat_report_stat() used GetCurrentTransactionStopTimestamp() has used
for a long time. Wonder when that was broken. Looks like it's set only when a
xid is assigned. We should fix this.

/*
* GetCurrentTransactionStopTimestamp
*
* We return current time if the transaction stop time hasn't been set
* (which can happen if we decide we don't need to log an XLOG record).

So, that seems like intentional since 2007 (957d08c81f). It seems to
me that the patch assumes that the only other use of the timstamp is
pgstats and it didn't let GetCurrentTransactionStopTimestamp() set the
variable for future use.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#26

Dave Page

dpage@pgadmin.org

over 3 years ago

In reply to: Andres Freund (#24)

1 attachment(s)

Re: Tracking last scan time

On Tue, 6 Sept 2022 at 16:53, Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2022-09-06 14:15:56 +0100, Dave Page wrote:

Vik and I looked at this a little, and found that we actually don't have
generally have GetCurrentTransactionStopTimestamp() at this point - a
simple 'select * from pg_class' will result in 9 passes of this code,

none

of which have xactStopTimestamp != 0.

Huh, pgstat_report_stat() used GetCurrentTransactionStopTimestamp() has
used
for a long time. Wonder when that was broken. Looks like it's set only
when a
xid is assigned. We should fix this.

After discussing it a little, we came to the conclusion that for the

stated

use case, xactStartTimestamp is actually accurate enough, provided that

we

only ever update it with a newer value. It would only likely be in

extreme

edge-cases where the difference between start and end transaction time
would have any bearing on whether or not one might drop a table/index for
lack of use.

I don't at all agree with this. Since we already use
GetCurrentTransactionStopTimestamp() in this path we should fix it.

I just spent some time looking at this, and as far as I can see, we only
set xactStopTimestamp if the transaction needs to be WAL logged (and in
those cases, it is set before the stats callback runs). As you note though,
we are already calling GetCurrentTransactionStopTimestamp() in the
read-only case anyway, and thus already incurring the cost of
gettimeofday().

Here's a v4 patch. This reverts to using
GetCurrentTransactionStopTimestamp() for the last_scan times, and will
set xactStopTimestamp the first time GetCurrentTransactionStopTimestamp()
is called, thus avoiding multiple gettimeofday() calls.
SetCurrentTransactionStopTimestamp() is removed, as is use
of xactStopTimestamp (except when resetting it to 0).

--
Dave Page
Blog: https://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: https://www.enterprisedb.com

Attachments:

last_scan_v4.diffapplication/octet-stream; name=last_scan_v4.diffDownload

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 1d9509a2f6..646859decf 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -4385,6 +4385,16 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>last_seq_scan</structfield> <type>timestamptz</type>
+      </para>
+      <para>
+       The time of the last sequential scan of this table, based on the
+       most recent transaction stop time
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>seq_tup_read</structfield> <type>bigint</type>
@@ -4403,6 +4413,16 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>last_idx_scan</structfield> <type>timestamptz</type>
+      </para>
+      <para>
+       The time of the last index scan of this table, based on the
+       most recent transaction stop time
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>idx_tup_fetch</structfield> <type>bigint</type>
@@ -4654,6 +4674,16 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>last_idx_scan</structfield> <type>timestamptz</type>
+      </para>
+      <para>
+       The time of the last scan of this index, based on the
+       most recent transaction stop time
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>idx_tup_read</structfield> <type>bigint</type>
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 50f092d7eb..3d79b2848f 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -871,9 +871,10 @@ GetCurrentStatementStartTimestamp(void)
 TimestampTz
 GetCurrentTransactionStopTimestamp(void)
 {
-	if (xactStopTimestamp != 0)
-		return xactStopTimestamp;
-	return GetCurrentTimestamp();
+	if (xactStopTimestamp == 0)
+		xactStopTimestamp = GetCurrentTimestamp();
+
+	return xactStopTimestamp;
 }
 
 /*
@@ -891,15 +892,6 @@ SetCurrentStatementStartTimestamp(void)
 		Assert(stmtStartTimestamp != 0);
 }
 
-/*
- *	SetCurrentTransactionStopTimestamp
- */
-static inline void
-SetCurrentTransactionStopTimestamp(void)
-{
-	xactStopTimestamp = GetCurrentTimestamp();
-}
-
 /*
  *	GetCurrentTransactionNestLevel
  *
@@ -1396,9 +1388,7 @@ RecordTransactionCommit(void)
 		START_CRIT_SECTION();
 		MyProc->delayChkptFlags |= DELAY_CHKPT_START;
 
-		SetCurrentTransactionStopTimestamp();
-
-		XactLogCommitRecord(xactStopTimestamp,
+		XactLogCommitRecord(GetCurrentTransactionStopTimestamp(),
 							nchildren, children, nrels, rels,
 							ndroppedstats, droppedstats,
 							nmsgs, invalMessages,
@@ -1422,7 +1412,7 @@ RecordTransactionCommit(void)
 		 */
 
 		if (!replorigin || replorigin_session_origin_timestamp == 0)
-			replorigin_session_origin_timestamp = xactStopTimestamp;
+			replorigin_session_origin_timestamp = GetCurrentTransactionStopTimestamp();
 
 		TransactionTreeSetCommitTsData(xid, nchildren, children,
 									   replorigin_session_origin_timestamp,
@@ -1754,8 +1744,7 @@ RecordTransactionAbort(bool isSubXact)
 		xact_time = GetCurrentTimestamp();
 	else
 	{
-		SetCurrentTransactionStopTimestamp();
-		xact_time = xactStopTimestamp;
+		xact_time = GetCurrentTransactionStopTimestamp();
 	}
 
 	XactLogAbortRecord(xact_time,
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 5a844b63a1..801ba03ba4 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -657,8 +657,10 @@ CREATE VIEW pg_stat_all_tables AS
             N.nspname AS schemaname,
             C.relname AS relname,
             pg_stat_get_numscans(C.oid) AS seq_scan,
+            pg_stat_get_lastscan(C.oid) AS last_seq_scan,
             pg_stat_get_tuples_returned(C.oid) AS seq_tup_read,
             sum(pg_stat_get_numscans(I.indexrelid))::bigint AS idx_scan,
+            max(pg_stat_get_lastscan(I.indexrelid)) AS last_idx_scan,
             sum(pg_stat_get_tuples_fetched(I.indexrelid))::bigint +
             pg_stat_get_tuples_fetched(C.oid) AS idx_tup_fetch,
             pg_stat_get_tuples_inserted(C.oid) AS n_tup_ins,
@@ -775,6 +777,7 @@ CREATE VIEW pg_stat_all_indexes AS
             C.relname AS relname,
             I.relname AS indexrelname,
             pg_stat_get_numscans(I.oid) AS idx_scan,
+            pg_stat_get_lastscan(I.oid) AS last_idx_scan,
             pg_stat_get_tuples_returned(I.oid) AS idx_tup_read,
             pg_stat_get_tuples_fetched(I.oid) AS idx_tup_fetch
     FROM pg_class C JOIN
diff --git a/src/backend/utils/activity/pgstat_relation.c b/src/backend/utils/activity/pgstat_relation.c
index a846d9ffb6..55a355f583 100644
--- a/src/backend/utils/activity/pgstat_relation.c
+++ b/src/backend/utils/activity/pgstat_relation.c
@@ -789,6 +789,12 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
 	tabentry = &shtabstats->stats;
 
 	tabentry->numscans += lstats->t_counts.t_numscans;
+	if (lstats->t_counts.t_numscans)
+	{
+		TimestampTz t = GetCurrentTransactionStopTimestamp();
+		if (t > tabentry->lastscan)
+			tabentry->lastscan = t;
+	}
 	tabentry->tuples_returned += lstats->t_counts.t_tuples_returned;
 	tabentry->tuples_fetched += lstats->t_counts.t_tuples_fetched;
 	tabentry->tuples_inserted += lstats->t_counts.t_tuples_inserted;
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index be15b4b2e5..8c0237c496 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -52,6 +52,25 @@ pg_stat_get_numscans(PG_FUNCTION_ARGS)
 }
 
 
+Datum
+pg_stat_get_lastscan(PG_FUNCTION_ARGS)
+{
+	Oid			relid = PG_GETARG_OID(0);
+	TimestampTz	result;
+	PgStat_StatTabEntry *tabentry;
+
+	if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)
+		result = 0;
+	else
+		result = tabentry->lastscan;
+
+	if (result == 0)
+		PG_RETURN_NULL();
+	else
+		PG_RETURN_TIMESTAMPTZ(result);
+}
+
+
 Datum
 pg_stat_get_tuples_returned(PG_FUNCTION_ARGS)
 {
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index a07e737a33..d9a98e1c2e 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5249,6 +5249,10 @@
   proname => 'pg_stat_get_numscans', provolatile => 's', proparallel => 'r',
   prorettype => 'int8', proargtypes => 'oid',
   prosrc => 'pg_stat_get_numscans' },
+{ oid => '2173', descr => 'statistics: time of the last scan for table/index',
+  proname => 'pg_stat_get_lastscan', provolatile => 's', proparallel => 'r',
+  prorettype => 'timestamptz', proargtypes => 'oid',
+  prosrc => 'pg_stat_get_lastscan' },
 { oid => '1929', descr => 'statistics: number of tuples read by seqscan',
   proname => 'pg_stat_get_tuples_returned', provolatile => 's',
   proparallel => 'r', prorettype => 'int8', proargtypes => 'oid',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index ac28f813b4..597e2f238a 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -242,7 +242,7 @@ typedef struct PgStat_TableXactStatus
  * ------------------------------------------------------------
  */
 
-#define PGSTAT_FILE_FORMAT_ID	0x01A5BCA7
+#define PGSTAT_FILE_FORMAT_ID	0x01A5BCA8
 
 typedef struct PgStat_ArchiverStats
 {
@@ -355,6 +355,7 @@ typedef struct PgStat_StatSubEntry
 typedef struct PgStat_StatTabEntry
 {
 	PgStat_Counter numscans;
+	TimestampTz lastscan;
 
 	PgStat_Counter tuples_returned;
 	PgStat_Counter tuples_fetched;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 7ec3d2688f..d9287ef0c9 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1768,6 +1768,7 @@ pg_stat_all_indexes| SELECT c.oid AS relid,
     c.relname,
     i.relname AS indexrelname,
     pg_stat_get_numscans(i.oid) AS idx_scan,
+    pg_stat_get_lastscan(i.oid) AS last_idx_scan,
     pg_stat_get_tuples_returned(i.oid) AS idx_tup_read,
     pg_stat_get_tuples_fetched(i.oid) AS idx_tup_fetch
    FROM (((pg_class c
@@ -1779,8 +1780,10 @@ pg_stat_all_tables| SELECT c.oid AS relid,
     n.nspname AS schemaname,
     c.relname,
     pg_stat_get_numscans(c.oid) AS seq_scan,
+    pg_stat_get_lastscan(c.oid) AS last_seq_scan,
     pg_stat_get_tuples_returned(c.oid) AS seq_tup_read,
     (sum(pg_stat_get_numscans(i.indexrelid)))::bigint AS idx_scan,
+    max(pg_stat_get_lastscan(i.indexrelid)) AS last_idx_scan,
     ((sum(pg_stat_get_tuples_fetched(i.indexrelid)))::bigint + pg_stat_get_tuples_fetched(c.oid)) AS idx_tup_fetch,
     pg_stat_get_tuples_inserted(c.oid) AS n_tup_ins,
     pg_stat_get_tuples_updated(c.oid) AS n_tup_upd,
@@ -2112,6 +2115,7 @@ pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
     pg_stat_all_indexes.relname,
     pg_stat_all_indexes.indexrelname,
     pg_stat_all_indexes.idx_scan,
+    pg_stat_all_indexes.last_idx_scan,
     pg_stat_all_indexes.idx_tup_read,
     pg_stat_all_indexes.idx_tup_fetch
    FROM pg_stat_all_indexes
@@ -2120,8 +2124,10 @@ pg_stat_sys_tables| SELECT pg_stat_all_tables.relid,
     pg_stat_all_tables.schemaname,
     pg_stat_all_tables.relname,
     pg_stat_all_tables.seq_scan,
+    pg_stat_all_tables.last_seq_scan,
     pg_stat_all_tables.seq_tup_read,
     pg_stat_all_tables.idx_scan,
+    pg_stat_all_tables.last_idx_scan,
     pg_stat_all_tables.idx_tup_fetch,
     pg_stat_all_tables.n_tup_ins,
     pg_stat_all_tables.n_tup_upd,
@@ -2156,6 +2162,7 @@ pg_stat_user_indexes| SELECT pg_stat_all_indexes.relid,
     pg_stat_all_indexes.relname,
     pg_stat_all_indexes.indexrelname,
     pg_stat_all_indexes.idx_scan,
+    pg_stat_all_indexes.last_idx_scan,
     pg_stat_all_indexes.idx_tup_read,
     pg_stat_all_indexes.idx_tup_fetch
    FROM pg_stat_all_indexes
@@ -2164,8 +2171,10 @@ pg_stat_user_tables| SELECT pg_stat_all_tables.relid,
     pg_stat_all_tables.schemaname,
     pg_stat_all_tables.relname,
     pg_stat_all_tables.seq_scan,
+    pg_stat_all_tables.last_seq_scan,
     pg_stat_all_tables.seq_tup_read,
     pg_stat_all_tables.idx_scan,
+    pg_stat_all_tables.last_idx_scan,
     pg_stat_all_tables.idx_tup_fetch,
     pg_stat_all_tables.n_tup_ins,
     pg_stat_all_tables.n_tup_upd,

#27

Vik Fearing

vik@postgresfriends.org

over 3 years ago

In reply to: Dave Page (#26)

Re: Tracking last scan time

On 9/7/22 12:03, Dave Page wrote:

Here's a v4 patch. This reverts to using
GetCurrentTransactionStopTimestamp() for the last_scan times, and will
set xactStopTimestamp the first time GetCurrentTransactionStopTimestamp()
is called, thus avoiding multiple gettimeofday() calls.
SetCurrentTransactionStopTimestamp() is removed, as is use
of xactStopTimestamp (except when resetting it to 0).

This patch looks good to me and has much saner behavior than what it
replaces.

As a matter of process, the oid for the new function should be in the
8000-9000 range and the catversion should be bumped by the committer.

Marked as Ready for Committer. Thanks for the patch!
--
Vik Fearing

#28

Andres Freund

andres@anarazel.de

over 3 years ago

In reply to: Vik Fearing (#27)

Re: Tracking last scan time

Hi,

On 2022-09-30 17:58:31 +0200, Vik Fearing wrote:

On 9/7/22 12:03, Dave Page wrote:

Here's a v4 patch. This reverts to using
GetCurrentTransactionStopTimestamp() for the last_scan times, and will
set xactStopTimestamp the first time GetCurrentTransactionStopTimestamp()
is called, thus avoiding multiple gettimeofday() calls.
SetCurrentTransactionStopTimestamp() is removed, as is use
of xactStopTimestamp (except when resetting it to 0).

This patch looks good to me and has much saner behavior than what it
replaces.

I agree. However, it seems like a significant enough behavioural change that
I'd rather commit it as a separate patch. I agree with Vik's judgement that
the patch otherwise is otherwise ready. Happy to do that split myself, or you
can do it...

Greetings,

Andres Freund

#29

Dave Page

dpage@pgadmin.org

over 3 years ago

In reply to: Andres Freund (#28)

Re: Tracking last scan time

On Fri, 30 Sept 2022 at 18:58, Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2022-09-30 17:58:31 +0200, Vik Fearing wrote:

On 9/7/22 12:03, Dave Page wrote:

Here's a v4 patch. This reverts to using
GetCurrentTransactionStopTimestamp() for the last_scan times, and will
set xactStopTimestamp the first time

GetCurrentTransactionStopTimestamp()

is called, thus avoiding multiple gettimeofday() calls.
SetCurrentTransactionStopTimestamp() is removed, as is use
of xactStopTimestamp (except when resetting it to 0).

This patch looks good to me and has much saner behavior than what it
replaces.

I agree. However, it seems like a significant enough behavioural change
that
I'd rather commit it as a separate patch. I agree with Vik's judgement
that
the patch otherwise is otherwise ready. Happy to do that split myself, or
you
can do it...

Thanks. It's just the changes in xact.c, so it doesn't seem like it would
cause you any more work either way, in which case, I'll leave it to you :-)

FYI, the OID I chose was simply the closest single value to those used for
the other related functions (e.g. pg_stat_get_numscans). Seemed like a good
way to use up one more random unused value, but I don't care if it gets
changed to the 8000+ range.

--
Dave Page
Blog: https://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: https://www.enterprisedb.com

#30

Michael Paquier

michael@paquier.xyz

over 3 years ago

In reply to: Dave Page (#29)

Re: Tracking last scan time

On Mon, Oct 03, 2022 at 12:55:40PM +0100, Dave Page wrote:

Thanks. It's just the changes in xact.c, so it doesn't seem like it would
cause you any more work either way, in which case, I'll leave it to you :-)

Okay, I have just moved the patch to the next CF then, still marked as
ready for committer. Are you planning to look at that?
--
Michael

#31

Dave Page

dpage@pgadmin.org

over 3 years ago

In reply to: Michael Paquier (#30)

Re: Tracking last scan time

On Wed, 12 Oct 2022 at 07:40, Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Oct 03, 2022 at 12:55:40PM +0100, Dave Page wrote:

Thanks. It's just the changes in xact.c, so it doesn't seem like it would
cause you any more work either way, in which case, I'll leave it to you

:-)

Okay, I have just moved the patch to the next CF then, still marked as
ready for committer. Are you planning to look at that?

Thanks. Was the question directed at me or Andres?

--
Dave Page
Blog: https://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: https://www.enterprisedb.com

#32

Michael Paquier

michael@paquier.xyz

over 3 years ago

In reply to: Dave Page (#31)

Re: Tracking last scan time

On Wed, Oct 12, 2022 at 09:09:46AM +0100, Dave Page wrote:

On Wed, 12 Oct 2022 at 07:40, Michael Paquier <michael@paquier.xyz> wrote:

Okay, I have just moved the patch to the next CF then, still marked as
ready for committer. Are you planning to look at that?

Thanks. Was the question directed at me or Andres?

Apologies for the confusion. This question is addressed to Andres.
--
Michael

#33

Andres Freund

andres@anarazel.de

over 3 years ago

In reply to: Michael Paquier (#30)

Re: Tracking last scan time

Hi,

On 2022-10-12 15:40:21 +0900, Michael Paquier wrote:

On Mon, Oct 03, 2022 at 12:55:40PM +0100, Dave Page wrote:

Thanks. It's just the changes in xact.c, so it doesn't seem like it would
cause you any more work either way, in which case, I'll leave it to you :-)

Okay, I have just moved the patch to the next CF then, still marked as
ready for committer. Are you planning to look at that?

Yep, doing so right now.

I think this should have at a basic test in src/test/regress/sql/stats.sql. If
I can write one in a few minutes I'll go for that, otherwise will reply
detailing difficulties.

+      <para>
+       The time of the last sequential scan of this table, based on the
+       most recent transaction stop time
+      </para></entry>

Related rows seem to say "on this table".

Greetings,

Andres Freund

#34

Andres Freund

andres@anarazel.de

about 3 years ago

In reply to: Andres Freund (#33)

3 attachment(s)

Re: Tracking last scan time

Hi,

On 2022-10-12 12:50:31 -0700, Andres Freund wrote:

I think this should have at a basic test in src/test/regress/sql/stats.sql. If
I can write one in a few minutes I'll go for that, otherwise will reply
detailing difficulties.

Took a bit longer (+lunch). Attached.

In the attached 0001, the patch to make GetCurrentTransactionStopTimestamp()
set xactStopTimestamp, I added a few comment updates and an Assert() to ensure
that CurrentTransactionState->state is TRANS_(DEFAULT|COMMIT|ABORT|PREPARE). I
am worried that otherwise we might end up with someone ending up using it in a
place before the end of the transaction, which'd then end up recording the
wrong timestamp in the commit/abort record.

For 0002, the commit adding lastscan, I added catversion/stats version bumps
(because I was planning to commit it already...), a commit message, and that
minor docs change mentioned earlier.

0003 adds the tests mentioned above. I plan to merge them with 0002, but left
them separate for easier review for now.

To be able to compare timestamps for > not just >= we need to make sure that
two subsequent timestamps differ. The attached achieves this by sleeping for
100ms between those points - we do that in other places already. I'd started
out with 10ms, which I am fairly sure would suffice, but then deciced to copy
the existing 100ms sleeps.

I verified tests pass under valgrind, debug_discard_caches and after I make
pgstat_report_stat() only flush when force is passed in.

Greetings,

Andres Freund

Attachments:

v5-0001-Have-GetCurrentTransactionStopTimestamp-set-xactS.patchtext/x-diff; charset=us-asciiDownload

From 976679bc590560d3c051e30e7576e7d3d03070e6 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Wed, 12 Oct 2022 14:43:41 -0700
Subject: [PATCH v5 1/3] Have GetCurrentTransactionStopTimestamp() set
 xactStopTimestamp if unset

Previously GetCurrentTransactionStopTimestamp() computed a new timestamp
whenever xactStopTimestamp was unset and xactStopTimestamp was only set when a
commit or abort record was written.

An upcoming patch will add additional calls to
GetCurrentTransactionStopTimestamp() from pgstats. To avoid computing
timestamps multiple times, set xactStopTimestamp in
GetCurrentTransactionStopTimestamp() if not already set.

Author: Dave Page <dpage@pgadmin.org>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Vik Fearing <vik@postgresfriends.org>
Discussion: https://postgr.es/m/20220906155325.an3xesq5o3fq36gt%40awork3.anarazel.de
---
 src/backend/access/transam/xact.c | 42 +++++++++++++++----------------
 1 file changed, 21 insertions(+), 21 deletions(-)

diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index c1ffbd89b88..fd5103a78e2 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -263,7 +263,10 @@ static bool currentCommandIdUsed;
 /*
  * xactStartTimestamp is the value of transaction_timestamp().
  * stmtStartTimestamp is the value of statement_timestamp().
- * xactStopTimestamp is the time at which we log a commit or abort WAL record.
+ * xactStopTimestamp is the time at which we log a commit / abort WAL record,
+ * or if that was skipped, the time of the first subsequent
+ * GetCurrentTransactionStopTimestamp() call.
+ *
  * These do not change as we enter and exit subtransactions, so we don't
  * keep them inside the TransactionState stack.
  */
@@ -865,15 +868,24 @@ GetCurrentStatementStartTimestamp(void)
 /*
  *	GetCurrentTransactionStopTimestamp
  *
- * We return current time if the transaction stop time hasn't been set
- * (which can happen if we decide we don't need to log an XLOG record).
+ * If the transaction stop time hasn't already been set, which can happen if
+ * we decided we don't need to log an XLOG record, set xactStopTimestamp.
  */
 TimestampTz
 GetCurrentTransactionStopTimestamp(void)
 {
-	if (xactStopTimestamp != 0)
-		return xactStopTimestamp;
-	return GetCurrentTimestamp();
+	TransactionState s PG_USED_FOR_ASSERTS_ONLY = CurrentTransactionState;
+
+	/* should only be called after commit / abort processing */
+	Assert(s->state == TRANS_DEFAULT ||
+		   s->state == TRANS_COMMIT ||
+		   s->state == TRANS_ABORT ||
+		   s->state == TRANS_PREPARE);
+
+	if (xactStopTimestamp == 0)
+		xactStopTimestamp = GetCurrentTimestamp();
+
+	return xactStopTimestamp;
 }
 
 /*
@@ -891,15 +903,6 @@ SetCurrentStatementStartTimestamp(void)
 		Assert(stmtStartTimestamp != 0);
 }
 
-/*
- *	SetCurrentTransactionStopTimestamp
- */
-static inline void
-SetCurrentTransactionStopTimestamp(void)
-{
-	xactStopTimestamp = GetCurrentTimestamp();
-}
-
 /*
  *	GetCurrentTransactionNestLevel
  *
@@ -1396,9 +1399,7 @@ RecordTransactionCommit(void)
 		START_CRIT_SECTION();
 		MyProc->delayChkptFlags |= DELAY_CHKPT_START;
 
-		SetCurrentTransactionStopTimestamp();
-
-		XactLogCommitRecord(xactStopTimestamp,
+		XactLogCommitRecord(GetCurrentTransactionStopTimestamp(),
 							nchildren, children, nrels, rels,
 							ndroppedstats, droppedstats,
 							nmsgs, invalMessages,
@@ -1422,7 +1423,7 @@ RecordTransactionCommit(void)
 		 */
 
 		if (!replorigin || replorigin_session_origin_timestamp == 0)
-			replorigin_session_origin_timestamp = xactStopTimestamp;
+			replorigin_session_origin_timestamp = GetCurrentTransactionStopTimestamp();
 
 		TransactionTreeSetCommitTsData(xid, nchildren, children,
 									   replorigin_session_origin_timestamp,
@@ -1754,8 +1755,7 @@ RecordTransactionAbort(bool isSubXact)
 		xact_time = GetCurrentTimestamp();
 	else
 	{
-		SetCurrentTransactionStopTimestamp();
-		xact_time = xactStopTimestamp;
+		xact_time = GetCurrentTransactionStopTimestamp();
 	}
 
 	XactLogAbortRecord(xact_time,
-- 
2.38.0

v5-0002-pgstat-Track-time-of-the-last-scan-of-a-relation.patchtext/x-diff; charset=us-asciiDownload

From 5aee38649448e8b1f3c96e64555b59bd02708e3e Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Wed, 12 Oct 2022 14:45:39 -0700
Subject: [PATCH v5 2/3] pgstat: Track time of the last scan of a relation

It can be useful to know when a relation has last been used, e.g., when
evaluating whether an index is still required. It was already possible to
infer the time of the last usage by tracking, e.g.,
pg_stat_all_indexes.idx_scan over time. But far from everybody does so.

To make it easier to detect the last time a relation has been scanned, track
that time in each relation's pgstat entry. To minimize overhead a) the
timestamp is updated only when the backend pending stats entry is flushed to
shared stats b) the last transaction's stop timestamp is used as the
timestamp.

Bumps catalog and stats format versions.

Author: Dave Page <dpage@pgadmin.org>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Bruce Momjian <bruce@momjian.us>
Reviewed-by: Vik Fearing <vik@postgresfriends.org>
Discussion: https://postgr.es/m/CA+OCxozrVHNFVEPkweUHMZje+t1tfY816d9MZYc6eZwOOusOaQ@mail.gmail.com
---
 src/include/catalog/catversion.h             |  2 +-
 src/include/catalog/pg_proc.dat              |  4 +++
 src/include/pgstat.h                         |  3 +-
 src/backend/catalog/system_views.sql         |  3 ++
 src/backend/utils/activity/pgstat_relation.c |  6 ++++
 src/backend/utils/adt/pgstatfuncs.c          | 13 +++++++++
 doc/src/sgml/monitoring.sgml                 | 30 ++++++++++++++++++++
 src/test/regress/expected/rules.out          |  9 ++++++
 8 files changed, 68 insertions(+), 2 deletions(-)

diff --git a/src/include/catalog/catversion.h b/src/include/catalog/catversion.h
index c1af6eaf5ff..e30f0262c59 100644
--- a/src/include/catalog/catversion.h
+++ b/src/include/catalog/catversion.h
@@ -57,6 +57,6 @@
  */
 
 /*							yyyymmddN */
-#define CATALOG_VERSION_NO	202209291
+#define CATALOG_VERSION_NO	202210121
 
 #endif
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 68bb032d3ea..62a5b8e655d 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5252,6 +5252,10 @@
   proname => 'pg_stat_get_numscans', provolatile => 's', proparallel => 'r',
   prorettype => 'int8', proargtypes => 'oid',
   prosrc => 'pg_stat_get_numscans' },
+{ oid => '9976', descr => 'statistics: time of the last scan for table/index',
+  proname => 'pg_stat_get_lastscan', provolatile => 's', proparallel => 'r',
+  prorettype => 'timestamptz', proargtypes => 'oid',
+  prosrc => 'pg_stat_get_lastscan' },
 { oid => '1929', descr => 'statistics: number of tuples read by seqscan',
   proname => 'pg_stat_get_tuples_returned', provolatile => 's',
   proparallel => 'r', prorettype => 'int8', proargtypes => 'oid',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index cc1d1dcb7d2..9e2ce6f0111 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -242,7 +242,7 @@ typedef struct PgStat_TableXactStatus
  * ------------------------------------------------------------
  */
 
-#define PGSTAT_FILE_FORMAT_ID	0x01A5BCA8
+#define PGSTAT_FILE_FORMAT_ID	0x01A5BCA9
 
 typedef struct PgStat_ArchiverStats
 {
@@ -354,6 +354,7 @@ typedef struct PgStat_StatSubEntry
 typedef struct PgStat_StatTabEntry
 {
 	PgStat_Counter numscans;
+	TimestampTz lastscan;
 
 	PgStat_Counter tuples_returned;
 	PgStat_Counter tuples_fetched;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 55f7ec79e05..2d8104b0907 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -656,8 +656,10 @@ CREATE VIEW pg_stat_all_tables AS
             N.nspname AS schemaname,
             C.relname AS relname,
             pg_stat_get_numscans(C.oid) AS seq_scan,
+            pg_stat_get_lastscan(C.oid) AS last_seq_scan,
             pg_stat_get_tuples_returned(C.oid) AS seq_tup_read,
             sum(pg_stat_get_numscans(I.indexrelid))::bigint AS idx_scan,
+            max(pg_stat_get_lastscan(I.indexrelid)) AS last_idx_scan,
             sum(pg_stat_get_tuples_fetched(I.indexrelid))::bigint +
             pg_stat_get_tuples_fetched(C.oid) AS idx_tup_fetch,
             pg_stat_get_tuples_inserted(C.oid) AS n_tup_ins,
@@ -774,6 +776,7 @@ CREATE VIEW pg_stat_all_indexes AS
             C.relname AS relname,
             I.relname AS indexrelname,
             pg_stat_get_numscans(I.oid) AS idx_scan,
+            pg_stat_get_lastscan(I.oid) AS last_idx_scan,
             pg_stat_get_tuples_returned(I.oid) AS idx_tup_read,
             pg_stat_get_tuples_fetched(I.oid) AS idx_tup_fetch
     FROM pg_class C JOIN
diff --git a/src/backend/utils/activity/pgstat_relation.c b/src/backend/utils/activity/pgstat_relation.c
index a846d9ffb65..55a355f583b 100644
--- a/src/backend/utils/activity/pgstat_relation.c
+++ b/src/backend/utils/activity/pgstat_relation.c
@@ -789,6 +789,12 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
 	tabentry = &shtabstats->stats;
 
 	tabentry->numscans += lstats->t_counts.t_numscans;
+	if (lstats->t_counts.t_numscans)
+	{
+		TimestampTz t = GetCurrentTransactionStopTimestamp();
+		if (t > tabentry->lastscan)
+			tabentry->lastscan = t;
+	}
 	tabentry->tuples_returned += lstats->t_counts.t_tuples_returned;
 	tabentry->tuples_fetched += lstats->t_counts.t_tuples_fetched;
 	tabentry->tuples_inserted += lstats->t_counts.t_tuples_inserted;
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index eadd8464ff2..85ac3e3f04f 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -52,6 +52,19 @@ pg_stat_get_numscans(PG_FUNCTION_ARGS)
 }
 
 
+Datum
+pg_stat_get_lastscan(PG_FUNCTION_ARGS)
+{
+	Oid			relid = PG_GETARG_OID(0);
+	PgStat_StatTabEntry *tabentry;
+
+	if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)
+		PG_RETURN_NULL();
+	else
+		PG_RETURN_TIMESTAMPTZ(tabentry->lastscan);
+}
+
+
 Datum
 pg_stat_get_tuples_returned(PG_FUNCTION_ARGS)
 {
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 342b20ebeb0..60a2026b0b0 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -4385,6 +4385,16 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>last_seq_scan</structfield> <type>timestamptz</type>
+      </para>
+      <para>
+       The time of the last sequential scan on this table, based on the
+       most recent transaction stop time
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>seq_tup_read</structfield> <type>bigint</type>
@@ -4403,6 +4413,16 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>last_idx_scan</structfield> <type>timestamptz</type>
+      </para>
+      <para>
+       The time of the last index scan on this table, based on the
+       most recent transaction stop time
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>idx_tup_fetch</structfield> <type>bigint</type>
@@ -4654,6 +4674,16 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>last_idx_scan</structfield> <type>timestamptz</type>
+      </para>
+      <para>
+       The time of the last scan on this index, based on the
+       most recent transaction stop time
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>idx_tup_read</structfield> <type>bigint</type>
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 9dd137415e8..bfcd8ac9a06 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1763,6 +1763,7 @@ pg_stat_all_indexes| SELECT c.oid AS relid,
     c.relname,
     i.relname AS indexrelname,
     pg_stat_get_numscans(i.oid) AS idx_scan,
+    pg_stat_get_lastscan(i.oid) AS last_idx_scan,
     pg_stat_get_tuples_returned(i.oid) AS idx_tup_read,
     pg_stat_get_tuples_fetched(i.oid) AS idx_tup_fetch
    FROM (((pg_class c
@@ -1774,8 +1775,10 @@ pg_stat_all_tables| SELECT c.oid AS relid,
     n.nspname AS schemaname,
     c.relname,
     pg_stat_get_numscans(c.oid) AS seq_scan,
+    pg_stat_get_lastscan(c.oid) AS last_seq_scan,
     pg_stat_get_tuples_returned(c.oid) AS seq_tup_read,
     (sum(pg_stat_get_numscans(i.indexrelid)))::bigint AS idx_scan,
+    max(pg_stat_get_lastscan(i.indexrelid)) AS last_idx_scan,
     ((sum(pg_stat_get_tuples_fetched(i.indexrelid)))::bigint + pg_stat_get_tuples_fetched(c.oid)) AS idx_tup_fetch,
     pg_stat_get_tuples_inserted(c.oid) AS n_tup_ins,
     pg_stat_get_tuples_updated(c.oid) AS n_tup_upd,
@@ -2107,6 +2110,7 @@ pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
     pg_stat_all_indexes.relname,
     pg_stat_all_indexes.indexrelname,
     pg_stat_all_indexes.idx_scan,
+    pg_stat_all_indexes.last_idx_scan,
     pg_stat_all_indexes.idx_tup_read,
     pg_stat_all_indexes.idx_tup_fetch
    FROM pg_stat_all_indexes
@@ -2115,8 +2119,10 @@ pg_stat_sys_tables| SELECT pg_stat_all_tables.relid,
     pg_stat_all_tables.schemaname,
     pg_stat_all_tables.relname,
     pg_stat_all_tables.seq_scan,
+    pg_stat_all_tables.last_seq_scan,
     pg_stat_all_tables.seq_tup_read,
     pg_stat_all_tables.idx_scan,
+    pg_stat_all_tables.last_idx_scan,
     pg_stat_all_tables.idx_tup_fetch,
     pg_stat_all_tables.n_tup_ins,
     pg_stat_all_tables.n_tup_upd,
@@ -2151,6 +2157,7 @@ pg_stat_user_indexes| SELECT pg_stat_all_indexes.relid,
     pg_stat_all_indexes.relname,
     pg_stat_all_indexes.indexrelname,
     pg_stat_all_indexes.idx_scan,
+    pg_stat_all_indexes.last_idx_scan,
     pg_stat_all_indexes.idx_tup_read,
     pg_stat_all_indexes.idx_tup_fetch
    FROM pg_stat_all_indexes
@@ -2159,8 +2166,10 @@ pg_stat_user_tables| SELECT pg_stat_all_tables.relid,
     pg_stat_all_tables.schemaname,
     pg_stat_all_tables.relname,
     pg_stat_all_tables.seq_scan,
+    pg_stat_all_tables.last_seq_scan,
     pg_stat_all_tables.seq_tup_read,
     pg_stat_all_tables.idx_scan,
+    pg_stat_all_tables.last_idx_scan,
     pg_stat_all_tables.idx_tup_fetch,
     pg_stat_all_tables.n_tup_ins,
     pg_stat_all_tables.n_tup_upd,
-- 
2.38.0

v5-0003-pgstat-Add-tests-for-last_seq_scan-last_idx_scan.patchtext/x-diff; charset=us-asciiDownload

From 27569cdd1be0a9fd5d4fb672a459da09180eb9ea Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Wed, 12 Oct 2022 14:48:48 -0700
Subject: [PATCH v5 3/3] pgstat: Add tests for last_seq_scan, last_idx_scan

Author:
Reviewed-by:
Discussion: https://postgr.es/m/
Backpatch:
---
 src/test/regress/expected/stats.out | 202 ++++++++++++++++++++++++++++
 src/test/regress/sql/stats.sql      |  86 ++++++++++++
 2 files changed, 288 insertions(+)

diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index f701da20697..257a6a9da9a 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -556,6 +556,208 @@ SELECT pg_stat_get_live_tuples(:drop_stats_test_subxact_oid);
 
 DROP TABLE trunc_stats_test, trunc_stats_test1, trunc_stats_test2, trunc_stats_test3, trunc_stats_test4;
 DROP TABLE prevstats;
+-----
+-- Test that last_seq_scan, last_idx_scan are correctly maintained
+--
+-- Perform test using a temporary table. That way autovacuum etc won't
+-- interfere. To be able to check that timestamps increase, we sleep for 100ms
+-- between tests, assuming that there aren't systems with a coarser timestamp
+-- granularity.
+-----
+BEGIN;
+CREATE TEMPORARY TABLE test_last_scan(idx_col int primary key, noidx_col int);
+INSERT INTO test_last_scan(idx_col, noidx_col) VALUES(1, 1);
+SELECT pg_stat_force_next_flush();
+ pg_stat_force_next_flush 
+--------------------------
+ 
+(1 row)
+
+COMMIT;
+SELECT pg_stat_reset_single_table_counters('test_last_scan'::regclass);
+ pg_stat_reset_single_table_counters 
+-------------------------------------
+ 
+(1 row)
+
+SELECT seq_scan, idx_scan FROM pg_stat_all_tables WHERE relid = 'test_last_scan'::regclass;
+ seq_scan | idx_scan 
+----------+----------
+        0 |        0
+(1 row)
+
+-- ensure we start out with exactly one index and sequential scan
+BEGIN;
+SET LOCAL enable_seqscan TO on;
+SET LOCAL enable_indexscan TO on;
+SET LOCAL enable_bitmapscan TO off;
+EXPLAIN (COSTS off) SELECT count(*) FROM test_last_scan WHERE noidx_col = 1;
+            QUERY PLAN            
+----------------------------------
+ Aggregate
+   ->  Seq Scan on test_last_scan
+         Filter: (noidx_col = 1)
+(3 rows)
+
+SELECT count(*) FROM test_last_scan WHERE noidx_col = 1;
+ count 
+-------
+     1
+(1 row)
+
+SET LOCAL enable_seqscan TO off;
+EXPLAIN (COSTS off) SELECT count(*) FROM test_last_scan WHERE idx_col = 1;
+                          QUERY PLAN                          
+--------------------------------------------------------------
+ Aggregate
+   ->  Index Scan using test_last_scan_pkey on test_last_scan
+         Index Cond: (idx_col = 1)
+(3 rows)
+
+SELECT count(*) FROM test_last_scan WHERE idx_col = 1;
+ count 
+-------
+     1
+(1 row)
+
+SELECT pg_stat_force_next_flush();
+ pg_stat_force_next_flush 
+--------------------------
+ 
+(1 row)
+
+COMMIT;
+-- fetch timestamps from before the next test
+SELECT last_seq_scan AS test_last_seq, last_idx_scan AS test_last_idx
+FROM pg_stat_all_tables WHERE relid = 'test_last_scan'::regclass \gset
+SELECT pg_sleep(0.1); -- assume a minimum timestamp granularity of 100ms
+ pg_sleep 
+----------
+ 
+(1 row)
+
+-- cause one sequential scan
+BEGIN;
+SET LOCAL enable_seqscan TO on;
+SET LOCAL enable_indexscan TO off;
+SET LOCAL enable_bitmapscan TO off;
+EXPLAIN (COSTS off) SELECT count(*) FROM test_last_scan WHERE noidx_col = 1;
+            QUERY PLAN            
+----------------------------------
+ Aggregate
+   ->  Seq Scan on test_last_scan
+         Filter: (noidx_col = 1)
+(3 rows)
+
+SELECT count(*) FROM test_last_scan WHERE noidx_col = 1;
+ count 
+-------
+     1
+(1 row)
+
+SELECT pg_stat_force_next_flush();
+ pg_stat_force_next_flush 
+--------------------------
+ 
+(1 row)
+
+COMMIT;
+-- check that just sequential scan stats were incremented
+SELECT seq_scan, :'test_last_seq' < last_seq_scan AS seq_ok, idx_scan, :'test_last_idx' = last_idx_scan AS idx_ok
+FROM pg_stat_all_tables WHERE relid = 'test_last_scan'::regclass;
+ seq_scan | seq_ok | idx_scan | idx_ok 
+----------+--------+----------+--------
+        2 | t      |        1 | t
+(1 row)
+
+-- fetch timestamps from before the next test
+SELECT last_seq_scan AS test_last_seq, last_idx_scan AS test_last_idx
+FROM pg_stat_all_tables WHERE relid = 'test_last_scan'::regclass \gset
+SELECT pg_sleep(0.1);
+ pg_sleep 
+----------
+ 
+(1 row)
+
+-- cause one index scan
+BEGIN;
+SET LOCAL enable_seqscan TO off;
+SET LOCAL enable_indexscan TO on;
+SET LOCAL enable_bitmapscan TO off;
+EXPLAIN (COSTS off) SELECT count(*) FROM test_last_scan WHERE idx_col = 1;
+                          QUERY PLAN                          
+--------------------------------------------------------------
+ Aggregate
+   ->  Index Scan using test_last_scan_pkey on test_last_scan
+         Index Cond: (idx_col = 1)
+(3 rows)
+
+SELECT count(*) FROM test_last_scan WHERE idx_col = 1;
+ count 
+-------
+     1
+(1 row)
+
+SELECT pg_stat_force_next_flush();
+ pg_stat_force_next_flush 
+--------------------------
+ 
+(1 row)
+
+COMMIT;
+-- check that just index scan stats were incremented
+SELECT seq_scan, :'test_last_seq' = last_seq_scan AS seq_ok, idx_scan, :'test_last_idx' < last_idx_scan AS idx_ok
+FROM pg_stat_all_tables WHERE relid = 'test_last_scan'::regclass;
+ seq_scan | seq_ok | idx_scan | idx_ok 
+----------+--------+----------+--------
+        2 | t      |        2 | t
+(1 row)
+
+-- fetch timestamps from before the next test
+SELECT last_seq_scan AS test_last_seq, last_idx_scan AS test_last_idx
+FROM pg_stat_all_tables WHERE relid = 'test_last_scan'::regclass \gset
+SELECT pg_sleep(0.1);
+ pg_sleep 
+----------
+ 
+(1 row)
+
+-- cause one bitmap index scan
+BEGIN;
+SET LOCAL enable_seqscan TO off;
+SET LOCAL enable_indexscan TO off;
+SET LOCAL enable_bitmapscan TO on;
+EXPLAIN (COSTS off) SELECT count(*) FROM test_last_scan WHERE idx_col = 1;
+                      QUERY PLAN                      
+------------------------------------------------------
+ Aggregate
+   ->  Bitmap Heap Scan on test_last_scan
+         Recheck Cond: (idx_col = 1)
+         ->  Bitmap Index Scan on test_last_scan_pkey
+               Index Cond: (idx_col = 1)
+(5 rows)
+
+SELECT count(*) FROM test_last_scan WHERE idx_col = 1;
+ count 
+-------
+     1
+(1 row)
+
+SELECT pg_stat_force_next_flush();
+ pg_stat_force_next_flush 
+--------------------------
+ 
+(1 row)
+
+COMMIT;
+-- check that just index scan stats were incremented
+SELECT seq_scan, :'test_last_seq' = last_seq_scan AS seq_ok, idx_scan, :'test_last_idx' < last_idx_scan AS idx_ok
+FROM pg_stat_all_tables WHERE relid = 'test_last_scan'::regclass;
+ seq_scan | seq_ok | idx_scan | idx_ok 
+----------+--------+----------+--------
+        2 | t      |        3 | t
+(1 row)
+
 -----
 -- Test that various stats views are being properly populated
 -----
diff --git a/src/test/regress/sql/stats.sql b/src/test/regress/sql/stats.sql
index eb081f65a42..f6270f7badb 100644
--- a/src/test/regress/sql/stats.sql
+++ b/src/test/regress/sql/stats.sql
@@ -290,6 +290,92 @@ DROP TABLE trunc_stats_test, trunc_stats_test1, trunc_stats_test2, trunc_stats_t
 DROP TABLE prevstats;
 
 
+-----
+-- Test that last_seq_scan, last_idx_scan are correctly maintained
+--
+-- Perform test using a temporary table. That way autovacuum etc won't
+-- interfere. To be able to check that timestamps increase, we sleep for 100ms
+-- between tests, assuming that there aren't systems with a coarser timestamp
+-- granularity.
+-----
+
+BEGIN;
+CREATE TEMPORARY TABLE test_last_scan(idx_col int primary key, noidx_col int);
+INSERT INTO test_last_scan(idx_col, noidx_col) VALUES(1, 1);
+SELECT pg_stat_force_next_flush();
+COMMIT;
+
+SELECT pg_stat_reset_single_table_counters('test_last_scan'::regclass);
+SELECT seq_scan, idx_scan FROM pg_stat_all_tables WHERE relid = 'test_last_scan'::regclass;
+
+-- ensure we start out with exactly one index and sequential scan
+BEGIN;
+SET LOCAL enable_seqscan TO on;
+SET LOCAL enable_indexscan TO on;
+SET LOCAL enable_bitmapscan TO off;
+EXPLAIN (COSTS off) SELECT count(*) FROM test_last_scan WHERE noidx_col = 1;
+SELECT count(*) FROM test_last_scan WHERE noidx_col = 1;
+SET LOCAL enable_seqscan TO off;
+EXPLAIN (COSTS off) SELECT count(*) FROM test_last_scan WHERE idx_col = 1;
+SELECT count(*) FROM test_last_scan WHERE idx_col = 1;
+SELECT pg_stat_force_next_flush();
+COMMIT;
+
+-- fetch timestamps from before the next test
+SELECT last_seq_scan AS test_last_seq, last_idx_scan AS test_last_idx
+FROM pg_stat_all_tables WHERE relid = 'test_last_scan'::regclass \gset
+SELECT pg_sleep(0.1); -- assume a minimum timestamp granularity of 100ms
+
+-- cause one sequential scan
+BEGIN;
+SET LOCAL enable_seqscan TO on;
+SET LOCAL enable_indexscan TO off;
+SET LOCAL enable_bitmapscan TO off;
+EXPLAIN (COSTS off) SELECT count(*) FROM test_last_scan WHERE noidx_col = 1;
+SELECT count(*) FROM test_last_scan WHERE noidx_col = 1;
+SELECT pg_stat_force_next_flush();
+COMMIT;
+-- check that just sequential scan stats were incremented
+SELECT seq_scan, :'test_last_seq' < last_seq_scan AS seq_ok, idx_scan, :'test_last_idx' = last_idx_scan AS idx_ok
+FROM pg_stat_all_tables WHERE relid = 'test_last_scan'::regclass;
+
+-- fetch timestamps from before the next test
+SELECT last_seq_scan AS test_last_seq, last_idx_scan AS test_last_idx
+FROM pg_stat_all_tables WHERE relid = 'test_last_scan'::regclass \gset
+SELECT pg_sleep(0.1);
+
+-- cause one index scan
+BEGIN;
+SET LOCAL enable_seqscan TO off;
+SET LOCAL enable_indexscan TO on;
+SET LOCAL enable_bitmapscan TO off;
+EXPLAIN (COSTS off) SELECT count(*) FROM test_last_scan WHERE idx_col = 1;
+SELECT count(*) FROM test_last_scan WHERE idx_col = 1;
+SELECT pg_stat_force_next_flush();
+COMMIT;
+-- check that just index scan stats were incremented
+SELECT seq_scan, :'test_last_seq' = last_seq_scan AS seq_ok, idx_scan, :'test_last_idx' < last_idx_scan AS idx_ok
+FROM pg_stat_all_tables WHERE relid = 'test_last_scan'::regclass;
+
+-- fetch timestamps from before the next test
+SELECT last_seq_scan AS test_last_seq, last_idx_scan AS test_last_idx
+FROM pg_stat_all_tables WHERE relid = 'test_last_scan'::regclass \gset
+SELECT pg_sleep(0.1);
+
+-- cause one bitmap index scan
+BEGIN;
+SET LOCAL enable_seqscan TO off;
+SET LOCAL enable_indexscan TO off;
+SET LOCAL enable_bitmapscan TO on;
+EXPLAIN (COSTS off) SELECT count(*) FROM test_last_scan WHERE idx_col = 1;
+SELECT count(*) FROM test_last_scan WHERE idx_col = 1;
+SELECT pg_stat_force_next_flush();
+COMMIT;
+-- check that just index scan stats were incremented
+SELECT seq_scan, :'test_last_seq' = last_seq_scan AS seq_ok, idx_scan, :'test_last_idx' < last_idx_scan AS idx_ok
+FROM pg_stat_all_tables WHERE relid = 'test_last_scan'::regclass;
+
+
 -----
 -- Test that various stats views are being properly populated
 -----
-- 
2.38.0

#35

Dave Page

dpage@pgadmin.org

about 3 years ago

In reply to: Andres Freund (#34)

Re: Tracking last scan time

On Wed, 12 Oct 2022 at 23:52, Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2022-10-12 12:50:31 -0700, Andres Freund wrote:

I think this should have at a basic test in

src/test/regress/sql/stats.sql. If

I can write one in a few minutes I'll go for that, otherwise will reply
detailing difficulties.

Took a bit longer (+lunch). Attached.

In the attached 0001, the patch to make
GetCurrentTransactionStopTimestamp()
set xactStopTimestamp, I added a few comment updates and an Assert() to
ensure
that CurrentTransactionState->state is
TRANS_(DEFAULT|COMMIT|ABORT|PREPARE). I
am worried that otherwise we might end up with someone ending up using it
in a
place before the end of the transaction, which'd then end up recording the
wrong timestamp in the commit/abort record.

For 0002, the commit adding lastscan, I added catversion/stats version
bumps
(because I was planning to commit it already...), a commit message, and
that
minor docs change mentioned earlier.

0003 adds the tests mentioned above. I plan to merge them with 0002, but
left
them separate for easier review for now.

To be able to compare timestamps for > not just >= we need to make sure
that
two subsequent timestamps differ. The attached achieves this by sleeping
for
100ms between those points - we do that in other places already. I'd
started
out with 10ms, which I am fairly sure would suffice, but then deciced to
copy
the existing 100ms sleeps.

I verified tests pass under valgrind, debug_discard_caches and after I make
pgstat_report_stat() only flush when force is passed in.

Thanks for that. It looks good to me, bar one comment (repeated 3 times in
the sql and expected files):

fetch timestamps from before the next test

"from " should be removed.

--
Dave Page
Blog: https://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: https://www.enterprisedb.com

#36

Andres Freund

andres@anarazel.de

about 3 years ago

In reply to: Dave Page (#35)

Re: Tracking last scan time

Hi,

On 2022-10-13 14:38:06 +0100, Dave Page wrote:

Thanks for that. It looks good to me, bar one comment (repeated 3 times in
the sql and expected files):

fetch timestamps from before the next test

"from " should be removed.

I was trying to say something with that from, but clearly it wasn't
understandable :). Removed.

With that I pushed the changes and marked the CF entry as committed.

Thanks for the feature Dave and the reviews everyone.

Greetings,

Andres Freund

#37

Dave Page

dpage@pgadmin.org

about 3 years ago

In reply to: Andres Freund (#36)

Re: Tracking last scan time

On Fri, 14 Oct 2022 at 19:16, Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2022-10-13 14:38:06 +0100, Dave Page wrote:

Thanks for that. It looks good to me, bar one comment (repeated 3 times

in

the sql and expected files):

fetch timestamps from before the next test

"from " should be removed.

I was trying to say something with that from, but clearly it wasn't
understandable :). Removed.

With that I pushed the changes and marked the CF entry as committed.

Thanks!

--

--
Dave Page
https://pgsnake.blogspot.com

EDB Postgres
https://www.enterprisedb.com

#38

Robert Treat

rob@xzilla.net

about 3 years ago

In reply to: Dave Page (#37)

Re: Tracking last scan time

On Fri, Oct 14, 2022 at 2:55 PM Dave Page <dpage@pgadmin.org> wrote:

On Fri, 14 Oct 2022 at 19:16, Andres Freund <andres@anarazel.de> wrote:

On 2022-10-13 14:38:06 +0100, Dave Page wrote:

Thanks for that. It looks good to me, bar one comment (repeated 3 times in
the sql and expected files):

fetch timestamps from before the next test

"from " should be removed.

I was trying to say something with that from, but clearly it wasn't
understandable :). Removed.

With that I pushed the changes and marked the CF entry as committed.

Thanks!

Hey folks,

I was looking at this a bit further (great addition btw) and noticed
the following behavior (this is a mre of the original testing that
uncovered this):

Normally we populate "last" columns with a NULL value when the
corresponding marker is zero, which seems correct in the first query,
but no longer matches in the second. I can see an argument that this
is a necessary exception to that rule (I'm not sure I agree with it,
but I see it) but even in that scenario, ISTM we should avoid
populating the table with a "special value", which generally goes
against observability best practices, and I believe we've been able to
avoid it elsewhere.

Beyond that, I also notice the behavior changes when adding a table
with a PK, though not necessarily better...

pagila=# create table x (xx int primary key) ;
CREATE TABLE

pagila=# insert into x select 1;
INSERT 0 1

This time the create table both populate a sequential scan and
populates the last_seq_scan with a real/correct value. However an
insert into the table neither advances the seq_scan nor the
last_seq_scan values which seems like different behavior from my
original example, with the added negative that the last_idx_scan is
now populated with a special value :-(

I think the simplest fix which should correspond to existing versions
behavior would be to just ensure that we replace any "special value"
timestamps with a real transaction timestamp, and then maybe note that
these fields may be advanced by operations which don't strictly show
up as a sequential or index scan.

Robert Treat
https://xzilla.net

#39

Dave Page

dpage@pgadmin.org

about 3 years ago

In reply to: Robert Treat (#38)

Re: Tracking last scan time

FYI, this is not intentional, and I do plan to look into it, however I've
been somewhat busy with pgconfeu, and am travelling for the rest of this
week as well.

On Sun, 23 Oct 2022 at 21:09, Robert Treat <rob@xzilla.net> wrote:

On Fri, Oct 14, 2022 at 2:55 PM Dave Page <dpage@pgadmin.org> wrote:

On Fri, 14 Oct 2022 at 19:16, Andres Freund <andres@anarazel.de> wrote:

On 2022-10-13 14:38:06 +0100, Dave Page wrote:

Thanks for that. It looks good to me, bar one comment (repeated 3

times in

the sql and expected files):

fetch timestamps from before the next test

"from " should be removed.

I was trying to say something with that from, but clearly it wasn't
understandable :). Removed.

With that I pushed the changes and marked the CF entry as committed.

Thanks!

Hey folks,

I was looking at this a bit further (great addition btw) and noticed
the following behavior (this is a mre of the original testing that
uncovered this):

pagila=# select * from pg_stat_user_tables ;
relid | schemaname | relname | seq_scan | last_seq_scan |
seq_tup_read | idx_scan | last_idx_scan | idx_tup_fetch | n_tup_ins |
n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup | n_dead_tup |
n_mod_since_analyze | n_ins_since_vacuum | last_vacuum |
last_autovacuum | last_analyze | last_autoanalyze | vacuum_count |
autovacuum_count | analyze_count | autoanalyze_count

-------+------------+---------+----------+---------------+--------------+----------+---------------+---------------+-----------+-----------+-----------+---------------+------------+------------+---------------------+--------------------+-------------+-----------------+--------------+------------------+--------------+------------------+---------------+-------------------
(0 rows)

pagila=# create table x (xx int);
CREATE TABLE
Time: 2.145 ms
pagila=# select * from pg_stat_user_tables ;
relid | schemaname | relname | seq_scan | last_seq_scan |
seq_tup_read | idx_scan | last_idx_scan | idx_tup_fetch | n_tup_ins |
n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup | n_dead_tup |
n_mod_since_analyze | n_ins_since_vacuum | last_vacuum |
last_autovacuum | last_analyze | last_autoanalyze | vacuum_count |
autovacuum_count | analyze_count | autoanalyze_count

-------+------------+---------+----------+---------------+--------------+----------+---------------+---------------+-----------+-----------+-----------+---------------+------------+------------+---------------------+--------------------+-------------+-----------------+--------------+------------------+--------------+------------------+---------------+-------------------
16392 | public | x | 0 | [null] |
0 | [null] | [null] | [null] | 0 | 0 |
0 | 0 | 0 | 0 |
0 | 0 | [null] | [null] | [null]
| [null] | 0 | 0 | 0 |
0
(1 row)

pagila=# insert into x select 1;
INSERT 0 1
pagila=# select * from pg_stat_user_tables ;
relid | schemaname | relname | seq_scan | last_seq_scan |
seq_tup_read | idx_scan | last_idx_scan | idx_tup_fetch | n_tup_ins |
n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup | n_dead_tup |
n_mod_since_analyze | n_ins_since_vacuum | last_vacuum |
last_autovacuum | last_analyze | last_autoanalyze | vacuum_count |
autovacuum_count | analyze_count | autoanalyze_count

-------+------------+---------+----------+------------------------+--------------+----------+---------------+---------------+-----------+-----------+-----------+---------------+------------+------------+---------------------+--------------------+-------------+-----------------+--------------+------------------+--------------+------------------+---------------+-------------------
16392 | public | x | 0 | 1999-12-31 19:00:00-05 |
0 | [null] | [null] | [null] | 1 |
0 | 0 | 0 | 1 | 0 |
1 | 1 | [null] | [null] |
[null] | [null] | 0 | 0 |
0 | 0
(1 row)

Normally we populate "last" columns with a NULL value when the
corresponding marker is zero, which seems correct in the first query,
but no longer matches in the second. I can see an argument that this
is a necessary exception to that rule (I'm not sure I agree with it,
but I see it) but even in that scenario, ISTM we should avoid
populating the table with a "special value", which generally goes
against observability best practices, and I believe we've been able to
avoid it elsewhere.

Beyond that, I also notice the behavior changes when adding a table
with a PK, though not necessarily better...

pagila=# drop table x;
DROP TABLE
Time: 2.896 ms
pagila=# select * from pg_stat_user_tables ;
relid | schemaname | relname | seq_scan | last_seq_scan |
seq_tup_read | idx_scan | last_idx_scan | idx_tup_fetch | n_tup_ins |
n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup | n_dead_tup |
n_mod_since_analyze | n_ins_since_vacuum | last_vacuum |
last_autovacuum | last_analyze | last_autoanalyze | vacuum_count |
autovacuum_count | analyze_count | autoanalyze_count

-------+------------+---------+----------+---------------+--------------+----------+---------------+---------------+-----------+-----------+-----------+---------------+------------+------------+---------------------+--------------------+-------------+-----------------+--------------+------------------+--------------+------------------+---------------+-------------------
(0 rows)

pagila=# create table x (xx int primary key) ;
CREATE TABLE

pagila=# select * from pg_stat_user_tables ;
relid | schemaname | relname | seq_scan | last_seq_scan
| seq_tup_read | idx_scan | last_idx_scan | idx_tup_fetch |
n_tup_ins | n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup |
n_dead_tup | n_mod_since_analyze | n_ins_since_vacuum | last_vacuum |
last_autovacuum | last_analyze | last_autoanalyze | vacuum_count |
autovacuum_count | analyze_count | autoanalyze_count

-------+------------+---------+----------+-------------------------------+--------------+----------+---------------+---------------+-----------+-----------+-----------+---------------+------------+------------+---------------------+--------------------+-------------+-----------------+--------------+------------------+--------------+------------------+---------------+-------------------
16400 | public | x | 1 | 2022-10-23
15:53:04.935192-04 | 0 | 0 | [null] |
0 | 0 | 0 | 0 | 0 | 0
| 0 | 0 | 0 | [null]
| [null] | [null] | [null] | 0 |
0 | 0 | 0
(1 row)

pagila=# insert into x select 1;
INSERT 0 1

pagila=# select * from pg_stat_user_tables ;
relid | schemaname | relname | seq_scan | last_seq_scan
| seq_tup_read | idx_scan | last_idx_scan | idx_tup_fetch |
n_tup_ins | n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup |
n_dead_tup | n_mod_since_analyze | n_ins_since_vacuum | last_vacuum |
last_autovacuum | last_analyze | last_autoanalyze | vacuum_count |
autovacuum_count | analyze_count | autoanalyze_count

-------+------------+---------+----------+-------------------------------+--------------+----------+------------------------+---------------+-----------+-----------+-----------+---------------+------------+------------+---------------------+--------------------+-------------+-----------------+--------------+------------------+--------------+------------------+---------------+-------------------
16400 | public | x | 1 | 2022-10-23
15:53:04.935192-04 | 0 | 0 | 1999-12-31 19:00:00-05
| 0 | 1 | 0 | 0 | 0 |
1 | 0 | 1 | 1 |
[null] | [null] | [null] | [null] |
0 | 0 | 0 | 0
(1 row)

This time the create table both populate a sequential scan and
populates the last_seq_scan with a real/correct value. However an
insert into the table neither advances the seq_scan nor the
last_seq_scan values which seems like different behavior from my
original example, with the added negative that the last_idx_scan is
now populated with a special value :-(

I think the simplest fix which should correspond to existing versions
behavior would be to just ensure that we replace any "special value"
timestamps with a real transaction timestamp, and then maybe note that
these fields may be advanced by operations which don't strictly show
up as a sequential or index scan.

Robert Treat
https://xzilla.net

--
Dave Page
Blog: https://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: https://www.enterprisedb.com

#40

Dave Page

dpage@pgadmin.org

about 3 years ago

In reply to: Dave Page (#39)

1 attachment(s)

Re: Tracking last scan time

On Mon, 31 Oct 2022 at 07:36, Dave Page <dpage@pgadmin.org> wrote:

FYI, this is not intentional, and I do plan to look into it, however I've
been somewhat busy with pgconfeu, and am travelling for the rest of this
week as well.

Here's a patch to fix this issue. Many thanks to Peter Eisentraut who
figured it out in a few minutes after I spent far too long looking down
rabbit holes in entirely the wrong place.

Thanks for the bug report.

On Sun, 23 Oct 2022 at 21:09, Robert Treat <rob@xzilla.net> wrote:

On Fri, Oct 14, 2022 at 2:55 PM Dave Page <dpage@pgadmin.org> wrote:

On Fri, 14 Oct 2022 at 19:16, Andres Freund <andres@anarazel.de> wrote:

On 2022-10-13 14:38:06 +0100, Dave Page wrote:

Thanks for that. It looks good to me, bar one comment (repeated 3

times in

the sql and expected files):

fetch timestamps from before the next test

"from " should be removed.

I was trying to say something with that from, but clearly it wasn't
understandable :). Removed.

With that I pushed the changes and marked the CF entry as committed.

Thanks!

Hey folks,

I was looking at this a bit further (great addition btw) and noticed
the following behavior (this is a mre of the original testing that
uncovered this):

pagila=# select * from pg_stat_user_tables ;
relid | schemaname | relname | seq_scan | last_seq_scan |
seq_tup_read | idx_scan | last_idx_scan | idx_tup_fetch | n_tup_ins |
n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup | n_dead_tup |
n_mod_since_analyze | n_ins_since_vacuum | last_vacuum |
last_autovacuum | last_analyze | last_autoanalyze | vacuum_count |
autovacuum_count | analyze_count | autoanalyze_count

-------+------------+---------+----------+---------------+--------------+----------+---------------+---------------+-----------+-----------+-----------+---------------+------------+------------+---------------------+--------------------+-------------+-----------------+--------------+------------------+--------------+------------------+---------------+-------------------
(0 rows)

pagila=# create table x (xx int);
CREATE TABLE
Time: 2.145 ms
pagila=# select * from pg_stat_user_tables ;
relid | schemaname | relname | seq_scan | last_seq_scan |
seq_tup_read | idx_scan | last_idx_scan | idx_tup_fetch | n_tup_ins |
n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup | n_dead_tup |
n_mod_since_analyze | n_ins_since_vacuum | last_vacuum |
last_autovacuum | last_analyze | last_autoanalyze | vacuum_count |
autovacuum_count | analyze_count | autoanalyze_count

-------+------------+---------+----------+---------------+--------------+----------+---------------+---------------+-----------+-----------+-----------+---------------+------------+------------+---------------------+--------------------+-------------+-----------------+--------------+------------------+--------------+------------------+---------------+-------------------
16392 | public | x | 0 | [null] |
0 | [null] | [null] | [null] | 0 | 0 |
0 | 0 | 0 | 0 |
0 | 0 | [null] | [null] | [null]
| [null] | 0 | 0 | 0 |
0
(1 row)

pagila=# insert into x select 1;
INSERT 0 1
pagila=# select * from pg_stat_user_tables ;
relid | schemaname | relname | seq_scan | last_seq_scan |
seq_tup_read | idx_scan | last_idx_scan | idx_tup_fetch | n_tup_ins |
n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup | n_dead_tup |
n_mod_since_analyze | n_ins_since_vacuum | last_vacuum |
last_autovacuum | last_analyze | last_autoanalyze | vacuum_count |
autovacuum_count | analyze_count | autoanalyze_count

-------+------------+---------+----------+------------------------+--------------+----------+---------------+---------------+-----------+-----------+-----------+---------------+------------+------------+---------------------+--------------------+-------------+-----------------+--------------+------------------+--------------+------------------+---------------+-------------------
16392 | public | x | 0 | 1999-12-31 19:00:00-05 |
0 | [null] | [null] | [null] | 1 |
0 | 0 | 0 | 1 | 0 |
1 | 1 | [null] | [null] |
[null] | [null] | 0 | 0 |
0 | 0
(1 row)

Normally we populate "last" columns with a NULL value when the
corresponding marker is zero, which seems correct in the first query,
but no longer matches in the second. I can see an argument that this
is a necessary exception to that rule (I'm not sure I agree with it,
but I see it) but even in that scenario, ISTM we should avoid
populating the table with a "special value", which generally goes
against observability best practices, and I believe we've been able to
avoid it elsewhere.

Beyond that, I also notice the behavior changes when adding a table
with a PK, though not necessarily better...

pagila=# drop table x;
DROP TABLE
Time: 2.896 ms
pagila=# select * from pg_stat_user_tables ;
relid | schemaname | relname | seq_scan | last_seq_scan |
seq_tup_read | idx_scan | last_idx_scan | idx_tup_fetch | n_tup_ins |
n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup | n_dead_tup |
n_mod_since_analyze | n_ins_since_vacuum | last_vacuum |
last_autovacuum | last_analyze | last_autoanalyze | vacuum_count |
autovacuum_count | analyze_count | autoanalyze_count

-------+------------+---------+----------+---------------+--------------+----------+---------------+---------------+-----------+-----------+-----------+---------------+------------+------------+---------------------+--------------------+-------------+-----------------+--------------+------------------+--------------+------------------+---------------+-------------------
(0 rows)

pagila=# create table x (xx int primary key) ;
CREATE TABLE

pagila=# select * from pg_stat_user_tables ;
relid | schemaname | relname | seq_scan | last_seq_scan
| seq_tup_read | idx_scan | last_idx_scan | idx_tup_fetch |
n_tup_ins | n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup |
n_dead_tup | n_mod_since_analyze | n_ins_since_vacuum | last_vacuum |
last_autovacuum | last_analyze | last_autoanalyze | vacuum_count |
autovacuum_count | analyze_count | autoanalyze_count

-------+------------+---------+----------+-------------------------------+--------------+----------+---------------+---------------+-----------+-----------+-----------+---------------+------------+------------+---------------------+--------------------+-------------+-----------------+--------------+------------------+--------------+------------------+---------------+-------------------
16400 | public | x | 1 | 2022-10-23
15:53:04.935192-04 | 0 | 0 | [null] |
0 | 0 | 0 | 0 | 0 | 0
| 0 | 0 | 0 | [null]
| [null] | [null] | [null] | 0 |
0 | 0 | 0
(1 row)

pagila=# insert into x select 1;
INSERT 0 1

pagila=# select * from pg_stat_user_tables ;
relid | schemaname | relname | seq_scan | last_seq_scan
| seq_tup_read | idx_scan | last_idx_scan | idx_tup_fetch |
n_tup_ins | n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup |
n_dead_tup | n_mod_since_analyze | n_ins_since_vacuum | last_vacuum |
last_autovacuum | last_analyze | last_autoanalyze | vacuum_count |
autovacuum_count | analyze_count | autoanalyze_count

-------+------------+---------+----------+-------------------------------+--------------+----------+------------------------+---------------+-----------+-----------+-----------+---------------+------------+------------+---------------------+--------------------+-------------+-----------------+--------------+------------------+--------------+------------------+---------------+-------------------
16400 | public | x | 1 | 2022-10-23
15:53:04.935192-04 | 0 | 0 | 1999-12-31 19:00:00-05
| 0 | 1 | 0 | 0 | 0 |
1 | 0 | 1 | 1 |
[null] | [null] | [null] | [null] |
0 | 0 | 0 | 0
(1 row)

This time the create table both populate a sequential scan and
populates the last_seq_scan with a real/correct value. However an
insert into the table neither advances the seq_scan nor the
last_seq_scan values which seems like different behavior from my
original example, with the added negative that the last_idx_scan is
now populated with a special value :-(

I think the simplest fix which should correspond to existing versions
behavior would be to just ensure that we replace any "special value"
timestamps with a real transaction timestamp, and then maybe note that
these fields may be advanced by operations which don't strictly show
up as a sequential or index scan.

Robert Treat
https://xzilla.net

--
Dave Page
Blog: https://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: https://www.enterprisedb.com

--
Dave Page
Blog: https://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: https://www.enterprisedb.com

Attachments:

fix_no_scan_handling.diffapplication/octet-stream; name=fix_no_scan_handling.diffDownload

diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 96bffc0f2a..63e8f15c59 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -60,6 +60,8 @@ pg_stat_get_lastscan(PG_FUNCTION_ARGS)
 
 	if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)
 		PG_RETURN_NULL();
+	else if (tabentry->lastscan == 0)
+		PG_RETURN_NULL();
 	else
 		PG_RETURN_TIMESTAMPTZ(tabentry->lastscan);
 }

#41

Michael Paquier

michael@paquier.xyz

about 3 years ago

In reply to: Dave Page (#40)

1 attachment(s)

Re: Tracking last scan time

On Thu, Nov 03, 2022 at 04:44:16PM -0400, Dave Page wrote:

Here's a patch to fix this issue. Many thanks to Peter Eisentraut who
figured it out in a few minutes after I spent far too long looking down
rabbit holes in entirely the wrong place.

FWIW, all the other areas of pgstatfuncs.c manipulate timestamptz
fields with a style like the attached. That's a nit, still per the
role of consistency with the surroundings..

Anyway, it seems to me that a regression test is in order before a
scan happens just after the relation creation, and the same problem
shows up with last_idx_scan.
--
Michael

Attachments:

fix_no_scan_handling_2.difftext/plain; charset=us-asciiDownload

diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 96bffc0f2a..ae3365d917 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -56,12 +56,18 @@ Datum
 pg_stat_get_lastscan(PG_FUNCTION_ARGS)
 {
 	Oid			relid = PG_GETARG_OID(0);
+	TimestampTz result;
 	PgStat_StatTabEntry *tabentry;
 
 	if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)
+		result = 0;
+	else
+		result = tabentry->lastscan;
+
+	if (result == 0)
 		PG_RETURN_NULL();
 	else
-		PG_RETURN_TIMESTAMPTZ(tabentry->lastscan);
+		PG_RETURN_TIMESTAMPTZ(result);
 }
 
 
diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index 257a6a9da9..1d84407a03 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -573,6 +573,12 @@ SELECT pg_stat_force_next_flush();
  
 (1 row)
 
+SELECT last_seq_scan, last_idx_scan FROM pg_stat_all_tables WHERE relid = 'test_last_scan'::regclass;
+ last_seq_scan | last_idx_scan 
+---------------+---------------
+               | 
+(1 row)
+
 COMMIT;
 SELECT pg_stat_reset_single_table_counters('test_last_scan'::regclass);
  pg_stat_reset_single_table_counters 
diff --git a/src/test/regress/sql/stats.sql b/src/test/regress/sql/stats.sql
index f6270f7bad..b4d6753c71 100644
--- a/src/test/regress/sql/stats.sql
+++ b/src/test/regress/sql/stats.sql
@@ -303,6 +303,7 @@ BEGIN;
 CREATE TEMPORARY TABLE test_last_scan(idx_col int primary key, noidx_col int);
 INSERT INTO test_last_scan(idx_col, noidx_col) VALUES(1, 1);
 SELECT pg_stat_force_next_flush();
+SELECT last_seq_scan, last_idx_scan FROM pg_stat_all_tables WHERE relid = 'test_last_scan'::regclass;
 COMMIT;
 
 SELECT pg_stat_reset_single_table_counters('test_last_scan'::regclass);

#42

Michael Paquier

michael@paquier.xyz

about 3 years ago

In reply to: Michael Paquier (#41)

Re: Tracking last scan time

On Mon, Nov 07, 2022 at 04:54:07PM +0900, Michael Paquier wrote:

FWIW, all the other areas of pgstatfuncs.c manipulate timestamptz
fields with a style like the attached. That's a nit, still per the
role of consistency with the surroundings..

Anyway, it seems to me that a regression test is in order before a
scan happens just after the relation creation, and the same problem
shows up with last_idx_scan.

Hearing nothing, done this way as of d7744d5. Thanks for the report,
Robert. And thanks for the patch, Dave.
--
Michael

#43

Dave Page

dpage@pgadmin.org

about 3 years ago

In reply to: Michael Paquier (#42)

Re: Tracking last scan time

On Tue, 8 Nov 2022 at 04:10, Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Nov 07, 2022 at 04:54:07PM +0900, Michael Paquier wrote:

FWIW, all the other areas of pgstatfuncs.c manipulate timestamptz
fields with a style like the attached. That's a nit, still per the
role of consistency with the surroundings..

Anyway, it seems to me that a regression test is in order before a
scan happens just after the relation creation, and the same problem
shows up with last_idx_scan.

Hearing nothing, done this way as of d7744d5. Thanks for the report,
Robert. And thanks for the patch, Dave.

Thank you!

--
Dave Page
Blog: https://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: https://www.enterprisedb.com