Changing the autovacuum launcher scheduling; oldest table first algorithm

Started by Masahiko Sawadaalmost 8 years ago13 messages

sawada.mshk@gmail.com

almost 8 years ago

1 attachment(s)

Hi,

I've created the new thread for the changing AV launcher scheduling.
The problem of AV launcher scheduling is described on [1]/messages/by-id/0A3221C70F24FB45833433255569204D1F8A4DC6@G01JPEXMBYT05 but I
summarize it here.

If there is even one database that is at risk of wraparound, currently
AV launcher selects the database having the oldest datfrozenxid until
all tables in that database have been vacuumed. This leads that tables
on other database can bloat and not be frozen because other database
are not selected by AV launcher. It makes things worse if the database
has a large table that takes a long time to be vacuumed.

Attached patch changes the AV launcher scheduling algorithm based on
the proposed idea by Robert so that it selects a database first that
has the oldest table when the database is at risk of wraparound. For
detail of the algorithm please refer to [2]/messages/by-id/CA+TgmobT3m=+dU5HF3VGVqiZ2O+v6P5wN1Gj+Prq+hj7dAm9AQ@mail.gmail.com.

In this algorithm, the running AV workers advertise the next
datfrozenxid on the shared memory that we will have. That way, AV
launcher can select a database that has the oldest table in the
database cluster. However, this algorithm doesn't support the case
where the age of next datfrozenxid gets greater than
autovacum_vacuum_max_age. This case can happen if users set
autovacvuum_vacuum_max_age to small value and vacuuming the whole
database takes a long time. But since it's not a common case and it
doesn't degrade than current behavior even if this happened, I think
it's not a big problem.

Feedback is very welcome.

[1]: /messages/by-id/0A3221C70F24FB45833433255569204D1F8A4DC6@G01JPEXMBYT05
[2]: /messages/by-id/CA+TgmobT3m=+dU5HF3VGVqiZ2O+v6P5wN1Gj+Prq+hj7dAm9AQ@mail.gmail.com

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

autovacuum_scheduling.patchapplication/octet-stream; name=autovacuum_scheduling.patchDownload

diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 21f5e2c..cf6cbb6 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -133,6 +133,12 @@ int			Log_autovacuum_min_duration = -1;
 #define MIN_AUTOVAC_SLEEPTIME 100.0 /* milliseconds */
 #define MAX_AUTOVAC_SLEEPTIME 300	/* seconds */
 
+/*
+ * update datfrozenxid whenever we have become possible to update
+ * it more than this value.
+ */
+#define UPDATE_DATFROZENXID_EVERY_XACT	10000000L	/* 10 million xacts */
+
 /* Flags to tell if we are in an autovacuum process */
 static bool am_autovacuum_launcher = false;
 static bool am_autovacuum_worker = false;
@@ -171,17 +177,30 @@ typedef struct avw_dbase
 	char	   *adw_name;
 	TransactionId adw_frozenxid;
 	MultiXactId adw_minmulti;
+	TransactionId adw_next_frozenxid;	/* have either next higher
+										 * value or adw_frozenxid */
+	MultiXactId adw_next_minmulti;	/* have either next higher
+									 * value or adw_minmulti */
 	PgStat_StatDBEntry *adw_entry;
 } avw_dbase;
 
-/* struct to keep track of tables to vacuum and/or analyze, in 1st pass */
-typedef struct av_relation
+/* hash entry for mapping between relation and toast relation */
+typedef struct av_relation_hash_entry
 {
 	Oid			ar_toastrelid;	/* hash key - must be first */
 	Oid			ar_relid;
 	bool		ar_hasrelopts;
 	AutoVacOpts ar_reloptions;	/* copy of AutoVacOpts from the main table's
 								 * reloptions, or NULL if none */
+} av_relation_hash_entry;
+
+/* struct to keep track of tables to vacuum and/or analyze, in 1st pass */
+typedef struct av_relation
+{
+	Oid				relid;
+	TransactionId	relfrozenxid;
+	MultiXactId		relminmxid;
+	bool			for_anti_wrap;
 } av_relation;
 
 /* struct to keep track of tables to vacuum and/or analyze, after rechecking */
@@ -229,6 +248,9 @@ typedef struct WorkerInfoData
 	int			wi_cost_delay;
 	int			wi_cost_limit;
 	int			wi_cost_limit_base;
+	bool		wi_for_anti_wrap;
+	TransactionId	wi_next_frozenxid;
+	TransactionId	wi_next_minmulti;
 } WorkerInfoData;
 
 typedef struct WorkerInfoData *WorkerInfo;
@@ -318,7 +340,14 @@ static void launch_worker(TimestampTz now);
 static List *get_database_list(void);
 static void rebuild_database_list(Oid newdb);
 static int	db_comparator(const void *a, const void *b);
+static int	av_relation_comparator(const void *a, const void *b);
 static void autovac_balance_cost(void);
+static void get_next_xid_bounds(avw_dbase *avdb,
+								TransactionId *datfrozenxid,
+								MultiXactId	*datminmxid);
+static void get_oldest_xid_bounds(Oid relid,
+								  TransactionId *relfrozenxid,
+								  MultiXactId	*relminmxid);
 
 static void do_autovacuum(void);
 static void FreeWorkerInfo(int code, Datum arg);
@@ -1117,6 +1146,35 @@ db_comparator(const void *a, const void *b)
 }
 
 /*
+ * qsort comparator for sorting av_relation by chronological order of
+ * relfrozenxid and relminmxid. Note that xid wraparound danger are
+ * more priority than multi wraparound danger.
+ */
+static int
+av_relation_comparator(const void *a, const void *b)
+{
+	av_relation *av_a = (av_relation *) lfirst(*(ListCell **) a);
+	av_relation *av_b = (av_relation *) lfirst(*(ListCell **) b);
+	int32	diff = (int32) (av_a->relfrozenxid - av_b->relfrozenxid);
+
+	/* Compare relfrozenxid first */
+	if (diff > 0)
+		return 1;
+	if (diff < 0)
+		return -1;
+
+	/* If relfrozenxid is the same, compare relminmxid */
+	diff = (int32) (av_a->relminmxid - av_b->relminmxid);
+
+	if (diff > 0)
+		return 1;
+	if (diff < 0)
+		return -1;
+
+	return 0;
+}
+
+/*
  * do_start_worker
  *
  * Bare-bones procedure for starting an autovacuum worker from the launcher.
@@ -1194,6 +1252,13 @@ do_start_worker(void)
 	 * if any is in MultiXactId wraparound.  Note that those in Xid wraparound
 	 * danger are given more priority than those in multi wraparound danger.
 	 *
+	 * For Xid wraparound purposes, we choose a database to connect to with
+	 * consideration for updating oldest relfrozenxid and relminmxid by
+	 * concurrent autovacuum workers.  That way, we can choose a database
+	 * that has the table that is at greatest risk of wraparound.  This can
+	 * avoid continuously choosing one database if the database has a big
+	 * table that is at risk of wraparound and takes a long time to vacuum.
+	 *
 	 * Note that a database with no stats entry is not considered, except for
 	 * Xid wraparound purposes.  The theory is that if no one has ever
 	 * connected to it since the stats were last initialized, it doesn't need
@@ -1215,22 +1280,30 @@ do_start_worker(void)
 		avw_dbase  *tmp = lfirst(cell);
 		dlist_iter	iter;
 
+		/*
+		 * Get the next oldest both relfrozenxid and relminmxid value that
+		 * we will have if available. If not available, these value are set
+		 * the current ids.
+		 */
+		get_next_xid_bounds(tmp, &tmp->adw_next_frozenxid, &tmp->adw_next_minmulti);
+
 		/* Check to see if this one is at risk of wraparound */
-		if (TransactionIdPrecedes(tmp->adw_frozenxid, xidForceLimit))
+		if (TransactionIdPrecedes(tmp->adw_next_frozenxid, xidForceLimit))
 		{
 			if (avdb == NULL ||
-				TransactionIdPrecedes(tmp->adw_frozenxid,
-									  avdb->adw_frozenxid))
+				TransactionIdPrecedes(tmp->adw_next_frozenxid,
+									  avdb->adw_next_frozenxid))
 				avdb = tmp;
 			for_xid_wrap = true;
 			continue;
 		}
 		else if (for_xid_wrap)
 			continue;			/* ignore not-at-risk DBs */
-		else if (MultiXactIdPrecedes(tmp->adw_minmulti, multiForceLimit))
+		else if (MultiXactIdPrecedes(tmp->adw_next_frozenxid, multiForceLimit))
 		{
 			if (avdb == NULL ||
-				MultiXactIdPrecedes(tmp->adw_minmulti, avdb->adw_minmulti))
+				MultiXactIdPrecedes(tmp->adw_next_minmulti,
+									avdb->adw_next_minmulti))
 				avdb = tmp;
 			for_multi_wrap = true;
 			continue;
@@ -1307,6 +1380,7 @@ do_start_worker(void)
 		worker->wi_dboid = avdb->adw_datid;
 		worker->wi_proc = NULL;
 		worker->wi_launchtime = GetCurrentTimestamp();
+		worker->wi_for_anti_wrap = for_xid_wrap | for_multi_wrap;
 
 		AutoVacuumShmem->av_startingWorker = worker;
 
@@ -1751,6 +1825,8 @@ FreeWorkerInfo(int code, Datum arg)
 		MyWorkerInfo->wi_cost_delay = 0;
 		MyWorkerInfo->wi_cost_limit = 0;
 		MyWorkerInfo->wi_cost_limit_base = 0;
+		MyWorkerInfo->wi_next_frozenxid = InvalidTransactionId;
+		MyWorkerInfo->wi_next_minmulti = InvalidMultiXactId;
 		dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 						&MyWorkerInfo->wi_links);
 		/* not mine anymore */
@@ -1862,6 +1938,116 @@ autovac_balance_cost(void)
 }
 
 /*
+ * get_next_xid_bounds
+ *
+ * Get both datfrozenxid and datminmxid in a given database with
+ * consideration for updating these values by concurrent running
+ * autovacuum worker. If workers are running on a database, they are
+ * advertising the next xid lower bounds that we will have after the
+ * tables being vacuumed are finished.  If the next xid bounds are
+ * set, we use them rather than actual current xid bounds as a
+ * datfrozenxid and a datminmxid.
+ */
+static void
+get_next_xid_bounds(avw_dbase *avdb, TransactionId *datfrozenxid,
+					MultiXactId	*datminmxid)
+{
+	TransactionId	frozenxid = avdb->adw_frozenxid;
+	MultiXactId		minmulti = avdb->adw_minmulti;
+	dlist_iter 		iter;
+
+	LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
+	{
+		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
+
+		if (worker->wi_dboid != avdb->adw_datid)
+			continue;
+
+		/* Get oldest lower bounds */
+		if (TransactionIdIsValid(worker->wi_next_frozenxid) &&
+			TransactionIdPrecedes(frozenxid, worker->wi_next_frozenxid))
+			frozenxid = worker->wi_next_frozenxid;
+		if (MultiXactIdIsValid(worker->wi_next_minmulti) &&
+			MultiXactIdPrecedes(minmulti, worker->wi_next_minmulti))
+			minmulti = worker->wi_next_minmulti;
+	}
+
+	LWLockRelease(AutovacuumLock);
+
+	Assert(TransactionIdIsVaild(frozenxid));
+	Assert(MultiXactIdIsValid(minmutli));
+
+	*datfrozenxid = frozenxid;
+	*datminmxid = minmulti;
+}
+
+/*
+ * get_oldest_xid_bounds
+ *
+ * Get oldest relfrozenxid and relminmxid in the database excluding a
+ * given relation.
+ */
+static void
+get_oldest_xid_bounds(Oid relid, TransactionId *relfrozenxid,
+					  MultiXactId *relminmxid)
+{
+	Relation		rel;
+	HeapScanDesc	relScan;
+	HeapTuple		tuple;
+	TransactionId	oldest_xid  = InvalidTransactionId;
+	MultiXactId		oldest_mxid = InvalidMultiXactId;
+
+	rel = heap_open(RelationRelationId, AccessShareLock);
+	relScan = heap_beginscan_catalog(rel, 0, NULL);
+
+	while ((tuple = heap_getnext(relScan, ForwardScanDirection)) != NULL)
+	{
+		Form_pg_class classForm = (Form_pg_class) GETSTRUCT(tuple);
+
+		if (classForm->relkind != RELKIND_RELATION &&
+			classForm->relkind != RELKIND_MATVIEW)
+			continue;
+
+		if (classForm->relpersistence == RELPERSISTENCE_TEMP)
+			continue;
+
+		/* Exclude the given relation */
+		if (HeapTupleGetOid(tuple) == relid)
+			continue;
+
+		/* Must have valid both xid and mxid */
+		if (!TransactionIdIsNormal(classForm->relfrozenxid) ||
+			!MultiXactIdIsValid(classForm->relminmxid))
+			continue;
+
+		/* Set the first enties */
+		if (!TransactionIdIsValid(oldest_xid) && !MultiXactIdIsValid(oldest_mxid))
+		{
+			oldest_xid = classForm->relfrozenxid;
+			oldest_mxid = classForm->relminmxid;
+			continue;
+		}
+
+		/* Update oldest xid bounds */
+		if (TransactionIdPrecedes(classForm->relfrozenxid, oldest_xid))
+			oldest_xid = classForm->relfrozenxid;
+		if (MultiXactIdPrecedes(classForm->relminmxid, oldest_mxid))
+			oldest_mxid = classForm->relminmxid;
+	}
+
+	heap_endscan(relScan);
+	heap_close(rel, AccessShareLock);
+
+	Assert(TransactionIdIsValid(oldest_xid));
+	Assert(MultiXactIdIsValid(oldest_mxid));
+
+	*relfrozenxid = oldest_xid;
+	*relminmxid = oldest_mxid;
+}
+
+/*
  * get_database_list
  *		Return a list of all databases found in pg_database.
  *
@@ -1918,6 +2104,9 @@ get_database_list(void)
 		avdb->adw_name = pstrdup(NameStr(pgdatabase->datname));
 		avdb->adw_frozenxid = pgdatabase->datfrozenxid;
 		avdb->adw_minmulti = pgdatabase->datminmxid;
+		/* set the same value for now */
+		avdb->adw_next_frozenxid = pgdatabase->datfrozenxid;
+		avdb->adw_next_minmulti = pgdatabase->datminmxid;
 		/* this gets set later: */
 		avdb->adw_entry = NULL;
 
@@ -1946,7 +2135,7 @@ do_autovacuum(void)
 	HeapTuple	tuple;
 	HeapScanDesc relScan;
 	Form_pg_database dbForm;
-	List	   *table_oids = NIL;
+	List	   *av_relation_list = NIL;
 	List	   *orphan_oids = NIL;
 	HASHCTL		ctl;
 	HTAB	   *table_toast_map;
@@ -1954,6 +2143,8 @@ do_autovacuum(void)
 	PgStat_StatDBEntry *shared;
 	PgStat_StatDBEntry *dbentry;
 	BufferAccessStrategy bstrategy;
+	TransactionId	cur_frozenxid;
+	MultiXactId		cur_minmulti;
 	ScanKeyData key;
 	TupleDesc	pg_class_desc;
 	int			effective_multixact_freeze_max_age;
@@ -2035,7 +2226,7 @@ do_autovacuum(void)
 	/* create hash table for toast <-> main relid mapping */
 	MemSet(&ctl, 0, sizeof(ctl));
 	ctl.keysize = sizeof(Oid);
-	ctl.entrysize = sizeof(av_relation);
+	ctl.entrysize = sizeof(av_relation_hash_entry);
 
 	table_toast_map = hash_create("TOAST to main relid map",
 								  100,
@@ -2115,9 +2306,18 @@ do_autovacuum(void)
 								  effective_multixact_freeze_max_age,
 								  &dovacuum, &doanalyze, &wraparound);
 
-		/* Relations that need work are added to table_oids */
+		/* Relations that need work are added to av_relation_list */
 		if (dovacuum || doanalyze)
-			table_oids = lappend_oid(table_oids, relid);
+		{
+			av_relation *avrel = (av_relation *) palloc(sizeof(av_relation));
+
+			avrel->relid = relid;
+			avrel->relfrozenxid = classForm->relfrozenxid;
+			avrel->relminmxid = classForm->relminmxid;
+			avrel->for_anti_wrap = wraparound;
+
+			av_relation_list = lappend(av_relation_list, avrel);
+		}
 
 		/*
 		 * Remember TOAST associations for the second pass.  Note: we must do
@@ -2126,7 +2326,7 @@ do_autovacuum(void)
 		 */
 		if (OidIsValid(classForm->reltoastrelid))
 		{
-			av_relation *hentry;
+			av_relation_hash_entry *hentry;
 			bool		found;
 
 			hentry = hash_search(table_toast_map,
@@ -2182,7 +2382,7 @@ do_autovacuum(void)
 		relopts = extract_autovac_opts(tuple, pg_class_desc);
 		if (relopts == NULL)
 		{
-			av_relation *hentry;
+			av_relation_hash_entry *hentry;
 			bool		found;
 
 			hentry = hash_search(table_toast_map, &relid, HASH_FIND, &found);
@@ -2200,7 +2400,16 @@ do_autovacuum(void)
 
 		/* ignore analyze for toast tables */
 		if (dovacuum)
-			table_oids = lappend_oid(table_oids, relid);
+		{
+			av_relation *avrel = (av_relation *) palloc(sizeof(av_relation));
+
+			avrel->relid = relid;
+			avrel->relfrozenxid = classForm->relfrozenxid;
+			avrel->relminmxid = classForm->relminmxid;
+			avrel->for_anti_wrap = wraparound;
+
+			av_relation_list = lappend(av_relation_list, avrel);
+		}
 	}
 
 	heap_endscan(relScan);
@@ -2304,6 +2513,24 @@ do_autovacuum(void)
 	bstrategy = GetAccessStrategy(BAS_VACUUM);
 
 	/*
+	 * In anti-wraparound case, the table that has the oldest relfrozenxid and
+	 * relminmxid in the database is vacuumed first. Sort the list by relfrozenxid
+	 * and relminmxid in older order.
+	 */
+	if (MyWorkerInfo->wi_for_anti_wrap)
+	{
+		av_relation	*avrel;
+
+		/* Sort */
+		av_relation_list = list_qsort(av_relation_list, av_relation_comparator);
+
+		/* Get current head(oldest) xids */
+		avrel = (av_relation *) linitial(av_relation_list);
+		cur_frozenxid = avrel->relfrozenxid;
+		cur_minmulti = avrel->relminmxid;
+	}
+
+	/*
 	 * create a memory context to act as fake PortalContext, so that the
 	 * contexts created in the vacuum code are cleaned up for each table.
 	 */
@@ -2314,9 +2541,10 @@ do_autovacuum(void)
 	/*
 	 * Perform operations on collected tables.
 	 */
-	foreach(cell, table_oids)
+	foreach(cell, av_relation_list)
 	{
-		Oid			relid = lfirst_oid(cell);
+		av_relation *avrel = (av_relation *) lfirst(cell);
+		Oid			relid = avrel->relid;
 		autovac_table *tab;
 		bool		skipit;
 		int			stdVacuumCostDelay;
@@ -2342,6 +2570,52 @@ do_autovacuum(void)
 		}
 
 		/*
+		 * For anti-wraparound purposes, the frequently updating datfrozenxid
+		 * is effective for increasing lower bounds in the database cluster.
+		 */
+		if (MyWorkerInfo->wi_for_anti_wrap &&
+			(TransactionIdPrecedes(cur_frozenxid + UPDATE_DATFROZENXID_EVERY_XACT,
+								   avrel->relfrozenxid) ||
+			 MultiXactIdPrecedes(cur_minmulti + UPDATE_DATFROZENXID_EVERY_XACT,
+								 avrel->relminmxid)))
+		{
+				vac_update_datfrozenxid();
+				cur_frozenxid = avrel->relfrozenxid;
+				cur_minmulti = avrel->relminmxid;
+		}
+
+		/*
+		 * Calculate the next higher lower bound that we will have after the
+		 * avrel being vacuumed are finished, and advertise them on shmem.
+		 */
+		if (lnext(cell) != NULL)
+		{
+			/*
+			 * If the current list cell is not the last cell, we use values
+			 * of the next cell.
+			 */
+			av_relation *next_avrel = (av_relation *) lfirst(lnext(cell));
+
+			MyWorkerInfo->wi_next_frozenxid = next_avrel->relfrozenxid;
+			MyWorkerInfo->wi_next_minmulti = next_avrel->relminmxid;
+		}
+		else
+		{
+			/*
+			 * The last table in the av_relation_list can be null if all tables
+			 * on the database are listed. We find the next-higher xid bounds
+			 * excluding the current relation.
+			 */
+			TransactionId next_frozenxid;
+			MultiXactId	next_minmxid;
+
+			get_oldest_xid_bounds(avrel->relid, &next_frozenxid, &next_minmxid);
+			MyWorkerInfo->wi_next_frozenxid = next_frozenxid;
+			MyWorkerInfo->wi_next_minmulti = next_minmxid;
+		}
+
+
+		/*
 		 * hold schedule lock from here until we're sure that this table still
 		 * needs vacuuming.  We also need the AutovacuumLock to walk the
 		 * worker array, but we'll let go of that one quickly.
@@ -2395,8 +2669,18 @@ do_autovacuum(void)
 									effective_multixact_freeze_max_age);
 		if (tab == NULL)
 		{
-			/* someone else vacuumed the table, or it went away */
 			LWLockRelease(AutovacuumScheduleLock);
+
+			/*
+			 * If we are focusing on anti-wraparound, the av_relation_list
+			 * should have been chronological ordered by relfrozenxid and
+			 * relminmxid. Since we are not interested in tables that don't
+			 * require vacuum we no longer vacuum the rest.
+			 */
+			if (MyWorkerInfo->wi_for_anti_wrap && !avrel->for_anti_wrap)
+				break;
+
+			/* someone else vacuumed the table, or it went away */
 			continue;
 		}
 
@@ -2801,7 +3085,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 	if (classForm->relkind == RELKIND_TOASTVALUE &&
 		avopts == NULL && table_toast_map != NULL)
 	{
-		av_relation *hentry;
+		av_relation_hash_entry *hentry;
 		bool		found;
 
 		hentry = hash_search(table_toast_map, &relid, HASH_FIND, &found);

Grigory Smolkin

g.smolkin@postgrespro.ru

almost 8 years ago

In reply to: Masahiko Sawada (#1)

Re: Changing the autovacuum launcher scheduling; oldest table first algorithm

On 02/28/2018 12:04 PM, Masahiko Sawada wrote:

Hi,

I've created the new thread for the changing AV launcher scheduling.
The problem of AV launcher scheduling is described on [1] but I
summarize it here.

If there is even one database that is at risk of wraparound, currently
AV launcher selects the database having the oldest datfrozenxid until
all tables in that database have been vacuumed. This leads that tables
on other database can bloat and not be frozen because other database
are not selected by AV launcher. It makes things worse if the database
has a large table that takes a long time to be vacuumed.

Attached patch changes the AV launcher scheduling algorithm based on
the proposed idea by Robert so that it selects a database first that
has the oldest table when the database is at risk of wraparound. For
detail of the algorithm please refer to [2].

In this algorithm, the running AV workers advertise the next
datfrozenxid on the shared memory that we will have. That way, AV
launcher can select a database that has the oldest table in the
database cluster. However, this algorithm doesn't support the case
where the age of next datfrozenxid gets greater than
autovacum_vacuum_max_age. This case can happen if users set
autovacvuum_vacuum_max_age to small value and vacuuming the whole
database takes a long time. But since it's not a common case and it
doesn't degrade than current behavior even if this happened, I think
it's not a big problem.

Feedback is very welcome.

[1] /messages/by-id/0A3221C70F24FB45833433255569204D1F8A4DC6@G01JPEXMBYT05
[2] /messages/by-id/CA+TgmobT3m=+dU5HF3VGVqiZ2O+v6P5wN1Gj+Prq+hj7dAm9AQ@mail.gmail.com

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Hello!
I`ve tried to compile your patch on Fedora24 with gcc 6.3.1 20161221:

gcc -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute
-Wformat-security -fno-strict-aliasing -fwrapv
-fexcess-precision=standard -g -g3 -O0 -I../../../src/include
-D_GNU_SOURCE -c -o autovacuum.o autovacuum.c
In file included from ../../../src/include/postgres.h:46:0,
from autovacuum.c:62:
autovacuum.c: In function ‘get_next_xid_bounds’:
autovacuum.c:1979:9: warning: implicit declaration of function
‘TransactionIdIsVaild’ [-Wimplicit-function-declaration]
Assert(TransactionIdIsVaild(frozenxid));
^
../../../src/include/c.h:713:7: note: in definition of macro ‘Trap’
if (condition) \
^~~~~~~~~
autovacuum.c:1979:2: note: in expansion of macro ‘Assert’
Assert(TransactionIdIsVaild(frozenxid));
^~~~~~
autovacuum.c:1980:28: error: ‘minmutli’ undeclared (first use in this
function)
Assert(MultiXactIdIsValid(minmutli));
^
../../../src/include/c.h:713:7: note: in definition of macro ‘Trap’
if (condition) \
^~~~~~~~~
autovacuum.c:1980:2: note: in expansion of macro ‘Assert’
Assert(MultiXactIdIsValid(minmutli));
^~~~~~
autovacuum.c:1980:9: note: in expansion of macro ‘MultiXactIdIsValid’
Assert(MultiXactIdIsValid(minmutli));
^~~~~~~~~~~~~~~~~~
autovacuum.c:1980:28: note: each undeclared identifier is reported only
once for each function it appears in
Assert(MultiXactIdIsValid(minmutli));
^
../../../src/include/c.h:713:7: note: in definition of macro ‘Trap’
if (condition) \
^~~~~~~~~
autovacuum.c:1980:2: note: in expansion of macro ‘Assert’
Assert(MultiXactIdIsValid(minmutli));
^~~~~~
autovacuum.c:1980:9: note: in expansion of macro ‘MultiXactIdIsValid’
Assert(MultiXactIdIsValid(minmutli));
^~~~~~~~~~~~~~~~~~
<builtin>: recipe for target 'autovacuum.o' failed
make[3]: *** [autovacuum.o] Error 1

--
Grigory Smolkin
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Andres Freund

andres@anarazel.de

almost 8 years ago

In reply to: Masahiko Sawada (#1)

Re: Changing the autovacuum launcher scheduling; oldest table first algorithm

Hi,

On 2018-02-28 18:04:27 +0900, Masahiko Sawada wrote:

I've created the new thread for the changing AV launcher scheduling.
The problem of AV launcher scheduling is described on [1] but I
summarize it here.

This is a new patch submitted to CF 2018-03. As that's the last CF for
v11, and this patch isn't trivial, I propose to instead move this to the
next CF.

Greetings,

Andres Freund

Masahiko Sawada

sawada.mshk@gmail.com

almost 8 years ago

In reply to: Andres Freund (#3)

Re: Changing the autovacuum launcher scheduling; oldest table first algorithm

On Fri, Mar 2, 2018 at 4:51 PM, Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2018-02-28 18:04:27 +0900, Masahiko Sawada wrote:

I've created the new thread for the changing AV launcher scheduling.
The problem of AV launcher scheduling is described on [1] but I
summarize it here.

This is a new patch submitted to CF 2018-03. As that's the last CF for
v11, and this patch isn't trivial, I propose to instead move this to the
next CF.

Thank you for comment.
I recently added this patch to the current CF but discussion for
design has started from the end of Jan. So I don't think this patch is
design phase but should I move it to the next CF?

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Masahiko Sawada

sawada.mshk@gmail.com

almost 8 years ago

In reply to: Masahiko Sawada (#4)

Re: Changing the autovacuum launcher scheduling; oldest table first algorithm

On Fri, Mar 2, 2018 at 5:00 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Fri, Mar 2, 2018 at 4:51 PM, Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2018-02-28 18:04:27 +0900, Masahiko Sawada wrote:

I've created the new thread for the changing AV launcher scheduling.
The problem of AV launcher scheduling is described on [1] but I
summarize it here.

This is a new patch submitted to CF 2018-03. As that's the last CF for
v11, and this patch isn't trivial, I propose to instead move this to the
next CF.

Thank you for comment.
I recently added this patch to the current CF but discussion for
design has started from the end of Jan. So I don't think this patch is
design phase but should I move it to the next CF?

I now see situation of the current CF, and agreed with you. Will move
this to the next CF.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Alvaro Herrera

alvherre@alvh.no-ip.org

almost 8 years ago

In reply to: Masahiko Sawada (#1)

Re: Changing the autovacuum launcher scheduling; oldest table first algorithm

Hello

I haven't read your respective patches yet, but both these threads
brought to memory a patch I proposed a few years ago that I never
completed:

/messages/by-id/20130124215715.GE4528@alvh.no-ip.org

In that thread I posted a patch to implement a prioritisation scheme for
autovacuum, based on an equation which was still under discussion when
I abandoned it. Chris Browne proposed a crazy equation to mix in both
XID age and fraction of dead tuples; probably that idea is worth
studying further. I tried to implement that in my patch but I probably
didn't do it correctly (because, as I recall, it failed to work as
expected). Nowadays I think we would also consider the multixact freeze
age, too.

Maybe that's worth giving a quick look in case some of the ideas there
are useful for the patches now being proposed.

--
ï¿½lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Masahiko Sawada

sawada.mshk@gmail.com

almost 8 years ago

In reply to: Alvaro Herrera (#6)

Re: Changing the autovacuum launcher scheduling; oldest table first algorithm

On Tue, Mar 6, 2018 at 11:27 PM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

Hello

I haven't read your respective patches yet, but both these threads
brought to memory a patch I proposed a few years ago that I never
completed:

/messages/by-id/20130124215715.GE4528@alvh.no-ip.org

Thank you for sharing the thread.

In that thread I posted a patch to implement a prioritisation scheme for
autovacuum, based on an equation which was still under discussion when
I abandoned it. Chris Browne proposed a crazy equation to mix in both
XID age and fraction of dead tuples; probably that idea is worth
studying further. I tried to implement that in my patch but I probably
didn't do it correctly (because, as I recall, it failed to work as
expected). Nowadays I think we would also consider the multixact freeze
age, too.

Maybe that's worth giving a quick look in case some of the ideas there
are useful for the patches now being proposed.

Yeah, that's definitely useful for the patches. I'll look at this and
will discuss that here.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Masahiko Sawada

sawada.mshk@gmail.com

over 7 years ago

In reply to: Masahiko Sawada (#7)

Re: Changing the autovacuum launcher scheduling; oldest table first algorithm

On Wed, Mar 14, 2018 at 11:29 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Tue, Mar 6, 2018 at 11:27 PM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

Hello

I haven't read your respective patches yet, but both these threads
brought to memory a patch I proposed a few years ago that I never
completed:

/messages/by-id/20130124215715.GE4528@alvh.no-ip.org

Thank you for sharing the thread.

In that thread I posted a patch to implement a prioritisation scheme for
autovacuum, based on an equation which was still under discussion when
I abandoned it. Chris Browne proposed a crazy equation to mix in both
XID age and fraction of dead tuples; probably that idea is worth
studying further. I tried to implement that in my patch but I probably
didn't do it correctly (because, as I recall, it failed to work as
expected). Nowadays I think we would also consider the multixact freeze
age, too.

Maybe that's worth giving a quick look in case some of the ideas there
are useful for the patches now being proposed.

Yeah, that's definitely useful for the patches. I'll look at this and
will discuss that here.

I read that discussion. If I understand correctly, the patch proposed
in that discussion mainly focuses on table-selection algorithm for
each AV worker. But the patch I proposed also changes it but the main
part of this patch is to change the database-selection algorithm by AV
launcher (sorry, maybe the subject leads misleading). The problem I'd
like to deal with is, when there is at least one database needing
anti-wraparound vacuum it concentrates launching new AV workers on one
database.

If we change the table-selection algorithm to the "oldest table first"
algorithm in that case, we can deal with that problem by this patch.
However, I think it's worthwhile to consider another table-selection
algorithm such as the proposal on that old thread. So I think it would
be better to think these issue separately. That is, we can change the
database-selection algorithm in order to prevent the concentration
problem while not changing table-selection algorithm.

This brought me another idea; having AV workers report its autovacuum
results at its AV worker slot so that they can teach autovacuum status
to AV launcher. For example, each AV worker can report the following
information.

* Database oid
* The number of tables needing vacuum.
* The number of vacuumed tables.
* The number of skipped tables due to being processed by other AV workers.
* Timestamp of last update

If there is an up-to-date information meaning either that there is no
tables needing vacuum or that there is only table needing vacuum but
being vacuumed by other worker, AV launcher can launches new one to
other database.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Michael Paquier

michael@paquier.xyz

over 7 years ago

In reply to: Masahiko Sawada (#8)

Re: Changing the autovacuum launcher scheduling; oldest table first algorithm

On Thu, Jun 28, 2018 at 04:20:53PM +0900, Masahiko Sawada wrote:

If there is an up-to-date information meaning either that there is no
tables needing vacuum or that there is only table needing vacuum but
being vacuumed by other worker, AV launcher can launches new one to
other database.

I am not completely sure what we want to do with this patch in
particular as there are many approaches and things which can be
discussed. For now, the latest patch proposed does not apply, so I am
moving it to next CF, waiting for its author.
--
Michael

#10

Dmitry Dolgov

9erthalion6@gmail.com

about 7 years ago

In reply to: Michael Paquier (#9)

Re: Changing the autovacuum launcher scheduling; oldest table first algorithm

On Tue, Oct 2, 2018 at 4:42 AM Michael Paquier <michael@paquier.xyz> wrote:

On Thu, Jun 28, 2018 at 04:20:53PM +0900, Masahiko Sawada wrote:

If there is an up-to-date information meaning either that there is no
tables needing vacuum or that there is only table needing vacuum but
being vacuumed by other worker, AV launcher can launches new one to
other database.

I am not completely sure what we want to do with this patch in
particular as there are many approaches and things which can be
discussed. For now, the latest patch proposed does not apply, so I am
moving it to next CF, waiting for its author.

Nothing changed since then, but also the patch got not enough review to say
that there was substantial feedback. I'll move it to the next CF.

#11

Michael Paquier

michael@paquier.xyz

about 7 years ago

In reply to: Dmitry Dolgov (#10)

Re: Changing the autovacuum launcher scheduling; oldest table first algorithm

On Thu, Nov 29, 2018 at 06:21:34PM +0100, Dmitry Dolgov wrote:

Nothing changed since then, but also the patch got not enough review to say
that there was substantial feedback. I'll move it to the next CF.

I would have suggested to mark the patch as returned with feedback
instead as the thing does not apply for some time now, and the author
has been notified about a rebase.
--
Michael

#12

Masahiko Sawada

sawada.mshk@gmail.com

about 7 years ago

In reply to: Michael Paquier (#11)

Re: Changing the autovacuum launcher scheduling; oldest table first algorithm

On Fri, Nov 30, 2018 at 10:48 AM Michael Paquier <michael@paquier.xyz> wrote:

On Thu, Nov 29, 2018 at 06:21:34PM +0100, Dmitry Dolgov wrote:

Nothing changed since then, but also the patch got not enough review to say
that there was substantial feedback. I'll move it to the next CF.

I would have suggested to mark the patch as returned with feedback
instead as the thing does not apply for some time now, and the author
has been notified about a rebase.

Agreed and sorry for the late reply.

The design of this patch would need to be reconsidered based on
suggestions and discussions we had before. I'll propose it again after
considerations.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#13

Dmitry Dolgov

9erthalion6@gmail.com

about 7 years ago

In reply to: Masahiko Sawada (#12)

Re: Changing the autovacuum launcher scheduling; oldest table first algorithm

On Fri, Nov 30, 2018 at 3:05 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Fri, Nov 30, 2018 at 10:48 AM Michael Paquier <michael@paquier.xyz> wrote:

On Thu, Nov 29, 2018 at 06:21:34PM +0100, Dmitry Dolgov wrote:

Nothing changed since then, but also the patch got not enough review to say
that there was substantial feedback. I'll move it to the next CF.

I would have suggested to mark the patch as returned with feedback
instead as the thing does not apply for some time now, and the author
has been notified about a rebase.

Agreed and sorry for the late reply.

The design of this patch would need to be reconsidered based on
suggestions and discussions we had before. I'll propose it again after
considerations.

Well, taking into account this information, then yes, it makes sense and I'll
mark it as "Returned with feedback", thanks!