pgbench filler columns

Started by Pavan Deolasee over 12 years ago, 5 messages
#1 Pavan Deolasee
pavan.deolasee@gmail.com

While looking at the compressibility of WAL files generated by pgbench,
which is close to 90%, I first thought it was because of the "filler" column
in the accounts table. But a comment in pgbench.c says:

/*
 * Note: TPC-B requires at least 100 bytes per row, and the "filler"
 * fields in these table declarations were intended to comply with that.
 * But because they default to NULLs, they don't actually take any space.
 * We could fix that by giving them non-null default values. However, that
 * would completely break comparability of pgbench results with prior
 * versions. Since pgbench has never pretended to be fully TPC-B
 * compliant anyway, we stick with the historical behavior.
 */

The comment about them being NULL, and hence not taking up any space, was
added by commit b7a67c2840f193f in response to this bug report:
/messages/by-id/200710170810.l9H8A76I080568@wwwmaster.postgresql.org

But I find otherwise. On my machine, the accounts table fits only 62
tuples in a page with the default fillfactor. The following queries show that
the filler column is not NULL, but set to an empty string. I have tested on
8.2, 8.4 and master, and they all show the same behavior. So I don't know
whether the bug report itself was wrong or I am misreading the comment.

postgres=# select count(*) from pgbench_accounts where filler IS NULL;
 count
-------
     0
(1 row)

postgres=# select count(*) from pgbench_accounts where filler = '';
 count
--------
 100000
(1 row)

Thanks,
Pavan

--
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee

#2 Pavan Deolasee
pavan.deolasee@gmail.com
In reply to: Pavan Deolasee (#1)
Re: pgbench filler columns

On a more careful look, it seems the original bug report complained about
all tables except accounts. And all other tables indeed have "filler" as
NULL. But the way the comment is written, it reads as if it applies to all
the DDLs. Should we just fix the comment and say it applies to all tables
except accounts?

Thanks,
Pavan

#3 Noah Misch
noah@leadboat.com
In reply to: Pavan Deolasee (#2)
Re: pgbench filler columns

On Thu, Sep 26, 2013 at 03:23:30PM +0530, Pavan Deolasee wrote:

> On a more careful look, it seems the original bug report complained about
> all tables except accounts. And all other tables indeed have "filler" as
> NULL. But the way the comment is written it seems as if it applies to all DDLs.

Agreed.

> Should we just fix the comment and say it's applicable for all tables except
> accounts?

Please do.

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4 Pavan Deolasee
pavan.deolasee@gmail.com
In reply to: Noah Misch (#3)
1 attachment(s)
Re: pgbench filler columns

On Thu, Sep 26, 2013 at 7:20 PM, Noah Misch <noah@leadboat.com> wrote:

> On Thu, Sep 26, 2013 at 03:23:30PM +0530, Pavan Deolasee wrote:
>> Should we just fix the comment and say it's applicable for all tables
>> except accounts?
>
> Please do.

How about something like this? Patch attached.

Thanks,
Pavan


Attachments:

pgbench_filler_column_notes.patch (application/octet-stream)
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 8c202bf..6a5171d 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -1455,11 +1455,13 @@ init(bool is_no_vacuum)
 	/*
 	 * Note: TPC-B requires at least 100 bytes per row, and the "filler"
 	 * fields in these table declarations were intended to comply with that.
-	 * But because they default to NULLs, they don't actually take any space.
-	 * We could fix that by giving them non-null default values. However, that
-	 * would completely break comparability of pgbench results with prior
-	 * versions.  Since pgbench has never pretended to be fully TPC-B
-	 * compliant anyway, we stick with the historical behavior.
+	 * The pgbench_accounts table complies with that because the "filler"
+	 * column is set to blank-padded empty string. But for all other tables the
+	 * column defaults to NULL and so don't actually take any space.  We could
+	 * fix that by giving them non-null default values.  However, that would
+	 * completely break comparability of pgbench results with prior versions.
+	 * Since pgbench has never pretended to be fully TPC-B compliant anyway, we
+	 * stick with the historical behavior.
 	 */
 	struct ddlinfo
 	{
@@ -1558,12 +1560,14 @@ init(bool is_no_vacuum)
 
 	for (i = 0; i < nbranches * scale; i++)
 	{
+		/* "filler" column defaults to NULL */
 		snprintf(sql, 256, "insert into pgbench_branches(bid,bbalance) values(%d,0)", i + 1);
 		executeStatement(con, sql);
 	}
 
 	for (i = 0; i < ntellers * scale; i++)
 	{
+		/* "filler" column defaults to NULL */
 		snprintf(sql, 256, "insert into pgbench_tellers(tid,bid,tbalance) values (%d,%d,0)",
 				 i + 1, i / ntellers + 1);
 		executeStatement(con, sql);
@@ -1593,6 +1597,7 @@ init(bool is_no_vacuum)
 	{
 		int64		j = k + 1;
 
+		/* "filler" column defaults to blank padded empty string */
 		snprintf(sql, 256, INT64_FORMAT "\t" INT64_FORMAT "\t%d\t\n", j, k / naccounts + 1, 0);
 		if (PQputline(con, sql))
 		{
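
The patch annotates two different loading paths, and the difference between
them is what makes "filler" an empty string in pgbench_accounts but NULL
elsewhere. The sketch below is not pgbench code. It is a minimal illustration
that assumes the default COPY text format, where the NULL marker is the two
characters \N and a field with nothing between its delimiters is an empty
string. The names classify_copy_field and format_account_line are
hypothetical helpers invented for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical classification of one field in a COPY text-format line. */
typedef enum { FIELD_NULL, FIELD_EMPTY, FIELD_VALUE } field_kind;

/*
 * Classify field number `index` (0-based) of one tab-separated line.
 * Assumes the default COPY text format: \N marks NULL, and a field with
 * no characters between its delimiters is an empty string.
 * Returns -1 if the line has fewer fields than requested.
 */
int classify_copy_field(const char *line, int index)
{
    assert(line != NULL && index >= 0);

    const char *start = line;
    for (int i = 0; i < index; i++) {
        const char *tab = strchr(start, '\t');
        if (tab == NULL)
            return -1;          /* ran out of fields */
        start = tab + 1;
    }

    /* The field ends at the next tab, the newline, or end of string. */
    size_t len = strcspn(start, "\t\n");
    if (len == 0)
        return FIELD_EMPTY;
    if (len == 2 && start[0] == '\\' && start[1] == 'N')
        return FIELD_NULL;
    return FIELD_VALUE;
}

/*
 * Build a line with the same shape as the one the accounts loader in the
 * patch sends: three values, then a trailing tab that opens a fourth,
 * empty field before the newline.
 */
int format_account_line(char *buf, size_t size, long aid, long bid)
{
    return snprintf(buf, size, "%ld\t%ld\t%d\t\n", aid, bid, 0);
}
```

Calling classify_copy_field on a line built by format_account_line with
index 3 yields FIELD_EMPTY, which is why the server stores an empty string
that the fixed-width column then blank-pads. The INSERT statements for
pgbench_branches and pgbench_tellers, by contrast, leave "filler" out of the
column list entirely, so the column default (NULL) applies there.
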
#5 Fujii Masao
masao.fujii@gmail.com
In reply to: Pavan Deolasee (#4)
Re: pgbench filler columns

On Fri, Sep 27, 2013 at 4:03 PM, Pavan Deolasee
<pavan.deolasee@gmail.com> wrote:

> On Thu, Sep 26, 2013 at 7:20 PM, Noah Misch <noah@leadboat.com> wrote:
>> On Thu, Sep 26, 2013 at 03:23:30PM +0530, Pavan Deolasee wrote:
>>> Should we just fix the comment and say it's applicable for all tables
>>> except accounts?
>>
>> Please do.
>
> How about something like this? Patch attached.

Thanks! Committed.

Regards,

--
Fujii Masao
