Easiest way to compare the results of two queries row by row and column by column
Can someone suggest the easiest way to compare the results from two
queries to make sure they are identical?
I am rewriting a large number of views and I want to make sure that
nothing has changed in the results.
Something like
select compare_results('select * from v_old', 'select * from v_new');
I would want this to check that the row count and each row matched
column by column.
I am hoping someone has already written something for this...
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Jason Long-2 wrote
Can someone suggest the easiest way to compare the results from two
queries to make sure they are identical?
First thing that comes to mind:
WITH
before_qry (col1, col2, col3) AS ( VALUES (1,1,1),(2,2,2),(3,3,3) )
, after_qry (col1, col2, col3) AS ( VALUES (1,1,1),(2,2,2),(3,3,3) )
, before_array AS (SELECT array_agg(before_qry) AS before_agg_array
FROM before_qry)
, after_array AS (SELECT array_agg(after_qry) AS after_agg_array
FROM after_qry)
SELECT *, before_agg_array = after_agg_array AS arrays_match
FROM before_array CROSS JOIN after_array;
Basically turn the resultsets into arrays (of composites) and then see if
the arrays are the same. This has issues with respect to column names and
comparable datatypes (i.e., if one column is bigint and the other is integer
they still compare equally).
One thought would be to only allow a view name (and possibly, separately, the
ORDER BY clause). Catalog lookups can be used to check for identical view
output types.
No idea if something like this exists and is readily available.
David J.
--
View this message in context: http://postgresql.1045698.n5.nabble.com/Easiest-way-to-compare-the-results-of-two-queries-row-by-row-and-column-by-column-tp5760209p5760215.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.
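The catalog lookup mentioned above could look roughly like the following. This is only a sketch: it uses information_schema.columns, takes the view names v_old and v_new from the original question, and assumes those names are unique across schemas. The data_type column is fairly coarse (arrays show up as ARRAY, for example), so udt_name may be needed for a stricter check.

```sql
-- Lists every column position where the two views differ in name or type.
-- An empty result means the output columns line up.
-- Assumption: v_old and v_new are not schema-qualified here.
SELECT ordinal_position,
       o.column_name AS old_name, o.data_type AS old_type,
       n.column_name AS new_name, n.data_type AS new_type
FROM (SELECT * FROM information_schema.columns
      WHERE table_name = 'v_old') AS o
FULL JOIN
     (SELECT * FROM information_schema.columns
      WHERE table_name = 'v_new') AS n
     USING (ordinal_position)
WHERE o.column_name IS DISTINCT FROM n.column_name
   OR o.data_type   IS DISTINCT FROM n.data_type;
```

The FULL JOIN keeps columns that exist in only one of the views, so an added or dropped column shows up with NULLs on the other side.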
On Thu, Jun 20, 2013 at 3:18 PM, Jason Long
<mailing.lists@octgsoftware.com> wrote:
Can someone suggest the easiest way to compare the results from two
queries to make sure they are identical?
I am rewriting a large number of views and I want to make sure that
nothing has changed in the results.
Something like
select compare_results('select * from v_old', 'select * from v_new');
I'd run:
select * from v_old
except
select * from v_new ;
And then
select * from v_new
except
select * from v_old ;
Both should return no rows.
However, if the queries can contain duplicate rows this will not detect
differences in the number of times a row is replicated, i.e. if one query
has a row 2 times and the other has it 3 times. If you need to detect such
cases, I'd probably \copy out each query to a file, then use system tools
to sort and diff the files.
Cheers,
Jeff
In reply to David Johnston (Thu, 2013-06-20 at 16:05 -0700):
Thank you. I will give it a try. I have never used WITH before.
Thank you for the tips.
On 06/21/2013 01:07 AM, Jeff Janes wrote:
On Thu, Jun 20, 2013 at 3:18 PM, Jason Long
<mailing.lists@octgsoftware.com> wrote:
Can someone suggest the easiest way to compare the results from two
queries to make sure they are identical?
I am rewriting a large number of views and I want to make sure that
nothing has changed in the results.
Something like
select compare_results('select * from v_old', 'select * from v_new');
I'd run:
select * from v_old
except
select * from v_new ;
And then
select * from v_new
except
select * from v_old ;
Both should return no rows.
This is my solution as well.
However, if the queries can contain duplicate rows this will not
detect differences in the number of times a row is replicated, i.e. if
one query has a row 2 times and the other has it 3 times. If you need
to detect such cases, I'd probably \copy out each query to a file,
then use system tools to sort and diff the files.
No need, just use EXCEPT ALL.
Basically, the following query should return nothing:
(TABLE v_old EXCEPT ALL TABLE v_new)
UNION ALL
(TABLE v_new EXCEPT ALL TABLE v_old);
--
Vik
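To see the difference between EXCEPT and EXCEPT ALL on duplicate rows, here is a small self-contained illustration. The CTEs old_rows and new_rows are made-up stand-ins for the two views: the row (1, 'x') appears twice on the old side and once on the new side.

```sql
-- Hypothetical data: old_rows has (1, 'x') twice, new_rows has it once.
WITH old_rows (id, label) AS (VALUES (1, 'x'), (1, 'x'), (2, 'y')),
     new_rows (id, label) AS (VALUES (1, 'x'), (2, 'y'))
SELECT 'EXCEPT' AS variant, count(*) AS differing_rows
FROM (SELECT * FROM old_rows EXCEPT SELECT * FROM new_rows) AS d
UNION ALL
SELECT 'EXCEPT ALL', count(*)
FROM (SELECT * FROM old_rows EXCEPT ALL SELECT * FROM new_rows) AS d;
```

Plain EXCEPT removes duplicates first and reports 0 differing rows, while EXCEPT ALL subtracts row counts and reports 1, the extra copy of (1, 'x').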
Vik, note that your new suggestion for a query might be more
readable/efficient than Jeff's, but IT STILL DOESN'T SOLVE THE PROBLEM OF
DETECTING DUPLICATE ROWS.
--
View this message in context: http://postgresql.nabble.com/Easiest-way-to-compare-the-results-of-two-queries-row-by-row-and-column-by-column-tp5760209p5916761.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.
On Wed, Aug 17, 2016 at 4:58 AM, gilad905 <gilad905@gmail.com> wrote:
Vik, note that your new suggestion for a query might be more
readable/efficient than Jeff's, but IT STILL DOESN'T SOLVE THE PROBLEM OF
DETECTING DUPLICATE ROWS.
I've done this in the past, and in my use case it was easy enough to
export the rows to CSV with a sort and then use a diff tool to ensure
they're identical. My data had hundreds of thousands of rows, but
with massive tables this may not be practical.
If you want more details on this I can find the exact psql client
query form I used to get a CSV file of the results.
Hope that helps.
Branden
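For reference, a sorted CSV export from psql might look something like the sketch below. It is not Branden's exact form, which was never posted. The file names are placeholders, and ORDER BY 1, 2, 3 assumes the views have three columns; every column position should be listed so the sort order is fully determined.

```sql
-- psql meta-commands; each \copy has to stay on a single line.
-- Assumptions: three output columns, placeholder file names.
\copy (SELECT * FROM v_old ORDER BY 1, 2, 3) TO 'v_old.csv' WITH (FORMAT csv)
\copy (SELECT * FROM v_new ORDER BY 1, 2, 3) TO 'v_new.csv' WITH (FORMAT csv)
```

The two files can then be compared with any diff tool, for example diff v_old.csv v_new.csv on a Unix-like system.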
On 17/08/16 10:58, gilad905 wrote:
Vik, note that your new suggestion for a query might be more
readable/efficient than Jeff's, but IT STILL DOESN'T SOLVE THE PROBLEM OF
DETECTING DUPLICATE ROWS.
Yes, it does.
And please don't shout at me.
--
Vik Fearing +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Didn't mean to give the impression I'm 'shouting' :) I just wanted to make sure
other users would notice my reply.
But you're right! It does. It even disregards the order of the rows. For
some reason, in a previous test it looked to me like it didn't.
--
View this message in context: http://postgresql.nabble.com/Easiest-way-to-compare-the-results-of-two-queries-row-by-row-and-column-by-column-tp5760209p5917059.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.
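Putting the thread together, the compare_results function wished for in the original post could be sketched in PL/pgSQL around the EXCEPT ALL query. This is a hypothetical helper, not an existing PostgreSQL function; the parameter names are invented here. It assumes both queries return the same number of columns with comparable types (otherwise it raises an error), and it pastes the query text into dynamic SQL unchanged, so it should only be given trusted queries without a trailing semicolon.

```sql
-- Hypothetical helper: returns true when both queries yield the same rows
-- with the same multiplicities, regardless of row order.
CREATE OR REPLACE FUNCTION compare_results(old_query text, new_query text)
RETURNS boolean
LANGUAGE plpgsql
AS $func$
DECLARE
    has_difference boolean;
BEGIN
    -- %1$s and %2$s insert the query texts as-is (no quoting).
    EXECUTE format(
        'SELECT EXISTS (
             (SELECT * FROM (%1$s) AS o EXCEPT ALL SELECT * FROM (%2$s) AS n)
             UNION ALL
             (SELECT * FROM (%2$s) AS n EXCEPT ALL SELECT * FROM (%1$s) AS o)
         )',
        old_query, new_query)
    INTO has_difference;
    RETURN NOT has_difference;
END;
$func$;
```

It would then be called as in the original question: select compare_results('select * from v_old', 'select * from v_new');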
select * from (select id, my_age as myage, ur_age as urage from datatable where
agegp != 'S') m where m.myage <> m.urage;
--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html