Recommended TPC-DS tools/setup for PostgreSQL benchmarking?
Hi,
I'm planning to run TPC-DS benchmarks on PostgreSQL and wanted to ask the community about the current recommended approach.
Some background: I've been running TPC-DS on Greenplum-based databases for a long time using adapted tooling (modified queries, load scripts, etc. for the MPP environment).
Now I'd like to benchmark on PostgreSQL as well, and I'm wondering whether the community has converged on a standard tool or workflow, or whether everyone is still downloading the official TPC-DS kit and doing their own PostgreSQL adaptations.
A few specific questions:
1. Is there a commonly used, well-maintained TPC-DS toolset for PostgreSQL? I know of some TPC-DS tools for PostgreSQL, such as gregrahn/tpcds-kit on GitHub (it targets TPC-DS 2.x, while the newest TPC-DS version is 4.x), but I'm curious whether there's anything more "official" or widely adopted in the community: something that handles data generation, PostgreSQL-compatible DDL, query adaptation, and result collection out of the box.
2. TPC-DS is now at version 4.0. Which version of the specification are people currently using for PostgreSQL benchmarking? Is there a practical reason to prefer an older version (e.g., v2.x or v3.x) over the latest?
Any pointers to repos, scripts, wiki pages, or past threads would be greatly appreciated.
--
Zhang Mingli
HashData
On Sun, Feb 15, 2026 at 12:04:18PM +0800, Zhang Mingli wrote:
Hi,
I'm planning to run TPC-DS benchmarks on PostgreSQL and wanted to ask the community about the current recommended approach.
Some background: I've been running TPC-DS on Greenplum-based databases for a long time using adapted tooling (modified queries, load scripts, etc. for the MPP environment).
Now I'd like to benchmark on PostgreSQL as well, and I'm wondering whether the community has converged on a standard tool or workflow, or whether everyone is still downloading the official TPC-DS kit and doing their own PostgreSQL adaptations.
A few specific questions:
1. Is there a commonly used, well-maintained TPC-DS toolset for PostgreSQL? I know of some TPC-DS tools for PostgreSQL, such as gregrahn/tpcds-kit on GitHub (it targets TPC-DS 2.x, while the newest TPC-DS version is 4.x), but I'm curious whether there's anything more "official" or widely adopted in the community: something that handles data generation, PostgreSQL-compatible DDL, query adaptation, and result collection out of the box.
2. TPC-DS is now at version 4.0. Which version of the specification are people currently using for PostgreSQL benchmarking? Is there a practical reason to prefer an older version (e.g., v2.x or v3.x) over the latest?
Any pointers to repos, scripts, wiki pages, or past threads would be greatly appreciated.
I try to keep this one maintained and as compliant to the current (4.0)
specification as I can: https://github.com/osdldbt/dbt7
I'm not sure how widely used it is, but I believe the OSDL kits are not known
to be the easiest to use. :)
Feel free to contact me directly if you have more questions about the kit.
Regards,
Mark
--
Mark Wong <markwkm@gmail.com>
EDB https://enterprisedb.com
On Wed, Feb 18, 2026 at 00:45, Mark Wong <markwkm@gmail.com> wrote:
I try to keep this one maintained and as compliant to the current (4.0)
specification as I can: https://github.com/osdldbt/dbt7
I'm not sure how widely used it is, but I believe the OSDL kits are not known
to be the easiest to use. :)
Feel free to contact me directly if you have more questions about the kit.
Hi, Mark
Good to know, thank you very much for sharing.