1 of 1 people found this helpful
Welcome to the forum!
This seems best solved outside Tableau or with a Custom SQL although I am curious to see if some table calculation or data blending expert has a good solution inside Tableau. I can picture this is possible within Tableau with 2x sheets using data blending on ID and a calculated field comparing the columns, but don't have time right now to check it out.
One way that might work is to make a union of Tableau A and B and thereafter a simple table with all columns and a filter on Number of Records = 1.
FULL OUTER JOIN
What's your data source? I ask, because Jet SQL (Excel, Text File, Access) doesn't support FULL OUTER JOINS which would make it possible to this with 1x FULL JOIN query. If Jet SQL, it can be done with a UNION ALL of 2x LEFT JOIN queries.
I am also curious to see how successfully this can be done with tip 3 (unpivot columns) in The Cross Join Collection and therefore something I might try as soon as I have some free time either later today or another day.
Ps. I like your question.
Thank you very much for your welcome, and sharing insights!
My data resides in excel sheets, and unfortunately not in any DB. Thank you for your suggestions, I will see if I can employe one of them, unless someone has a solution on how to work the solution.
Here is a UNION ALL solution where 2 (Number of Records) means that everything is equal and 1 that something is different. If 1 row without subrow then it is a row (ID) only found in one table. If 1 with subrow, then 1 or more columns are different. Hovering [Number of Rows] shows table name (when 1). One thing to notice is that I changed Analysis > Table Layout > Advanced from 6 to 16 rows/columns. See more in screenshot and attached files.
Ps. I will share a CROSS JOIN solution later. and possibly also some of the other suggestions in my first comment.
I have a feeling (and hope) there will come suggestions from several helpers, because this question (sought solution) is attractive in itself.
This is definitly very helpful!
Say that you have a dataset of 10K rows - a table will not be the best way of displaying. So Im thinking that It would be best to display
- Total row count of changes
- Total row count of changes per supplier
- Changes pr column
- Total impact of changes - for columns where it makes sense - e.g. with price
What is your view on this?
Hope you don't mind me asking
I am glad you are asking
A common issue on this forum is that questioners provide too little information making it almost impossible to help without asking for more info. The opposite happens also, that no one touches a question because it has too much information.
I will take a closer look at this later ... hopefully this evening.
The attached workbook contains all the SQL techniques I was thinking to share. Hopefully some of them is good enough to earn a correct answer. I have made a reference to this thread under Tip 3 in The Cross Join Collection.
FULL JOIN SQL
SELECT a.[Avg hrs] AS [a Avg hrs], a.[ID] AS [a ID], a.[Mean time] AS [a Mean time], a.[Price] AS [a Price], a.[Reoccurence] AS [a Reoccurence], a.[Supplier] AS [a Supplier], a.[Time] AS [a Time], b.[Avg hrs] AS [b Avg hrs], b.[ID] AS [b ID], b.[Mean time] AS [b Mean time], b.[Price] AS [b Price], b.[Reoccurence] AS [b Reoccurence], b.[Supplier] AS [b Supplier], b.[Time] AS [b Time] FROM ['table A$'] a LEFT JOIN ['table B$'] b ON a.ID = b.ID UNION ALL SELECT a.[Avg hrs] AS [a Avg hrs], a.[ID] AS [a ID], a.[Mean time] AS [a Mean time], a.[Price] AS [a Price], a.[Reoccurence] AS [a Reoccurence], a.[Supplier] AS [a Supplier], a.[Time] AS [a Time], b.[Avg hrs] AS [b Avg hrs], b.[ID] AS [b ID], b.[Mean time] AS [b Mean time], b.[Price] AS [b Price], b.[Reoccurence] AS [b Reoccurence], b.[Supplier] AS [b Supplier], b.[Time] AS [b Time] FROM ['table B$'] b LEFT JOIN ['table A$'] a ON b.ID = a.ID WHERE a.ID is null
CROSS JOIN SQL
SELECT a.[ID] , l.[Column Name] , CStr( IIF(l.[Column Name] = 'Time', a.[Time], IIF(l.[Column Name] = 'Price', a.[Price], IIF(l.[Column Name] = 'Reoccurence', a.[Reoccurence], IIF(l.[Column Name] = 'Supplier', a.[Supplier], IIF(l.[Column Name] = 'Mean time', a.[Mean time], IIF(l.[Column Name] = 'Avg hrs', a.[Avg hrs] ))))))) AS [a Value] , Cstr( IIF(l.[Column Name] = 'Time', b.[Time], IIF(l.[Column Name] = 'Price', b.[Price], IIF(l.[Column Name] = 'Reoccurence', b.[Reoccurence], IIF(l.[Column Name] = 'Supplier', b.[Supplier], IIF(l.[Column Name] = 'Mean time', b.[Mean time], IIF(l.[Column Name] = 'Avg hrs', b.[Avg hrs] ))))))) AS [b Value] FROM (['table A$'] a INNER JOIN ['table B$'] b ON a.ID = b.ID), [lookup$] l
UNION ALL SQL
SELECT 'Table A' AS [Table Name], ['table A$'].[Avg hrs], ['table A$'].[ID], ['table A$'].[Mean time], ['table A$'].[Price], ['table A$'].[Reoccurence], ['table A$'].[Supplier], ['table A$'].[Time] FROM ['table A$'] UNION ALL SELECT 'Table B' AS [Table Name], ['table B$'].[Avg hrs], ['table B$'].[ID], ['table B$'].[Mean time], ['table B$'].[Price], ['table B$'].[Reoccurence], ['table B$'].[Supplier], ['table B$'].[Time] FROM ['table B$']
Thank you Kettan - you have been very helpful!
The "Cross Join SQL" is definitely the best solution for the kind of comparison that Im trying to do. I do, however, have a few problems getting it to work with my column names, even though they work with yours.
I have been trying too workout how your code was written, but cannot find any examples to help me understand the code. Is it possible that you could explain it? Im trying to understand the CSTR function, as well as "[lookup$] l" at the end of line 20. When I try to replicate the code, using my own column names, I get this error:
"Database error 0x80040E10: No value given for one or more required parameters."
"Database error 0x80040E14: Syntax error in FROM clause."
"Database error 0x80040E14: Syntax error in query expression 'CStr("
I look up the code, and found this Troubleshooting 80040e10 errors - adOpenStatic.com - which was very helpful, and would be even more so, If I have a better understanding of your code . I thought it might have something to do with illigal characters, but when I try to remove any characters that is not in the alphabet I still get an error, I thought it might be because my column names were to long - but I cant find info on that either.
If you have the time - I would be very greatful you could explain your code
Thank you for your time and help.
It is difficult to correct code without seeing it. If you could share some dummy data and the code that doesn't work, it will probably be quick for me to correct it. The best would be a packaged workbook, because it includes both data (Excel) and code.
CStr changes various data types in the IIF to one common data type, namely text. This is because a column can only have one data type and text is the only one all (text & dates & numbers) have in common.
[lookup$] is a table (a separate sheet in the same Excel workbook) with all column names (except primary key ID) and is what is used to create one row for each column. See more about this in tip 3 in The Cross Join Collection. I would gladly explain this more fully, but have not enough free time to do that now.
Thank you very much! That would be very helpful. I will make some dummy data, so that you can see my problem
Thanks a lot for the explaination too.
I've just been through your work again and I'm very very thankful for your help! It's amazing! Thank you so much! I ended up using the FULL JOIN SQL as I could not get the CROSS JOIN SQL to work in my SQL. Is there any way to add a lookup table to the full join, as with the cross join?
Thank you again for your time and help! Its been absolut great!
1 of 1 people found this helpful
I just clicked on a reference to Stack Overflow in Re: Helper Evolution.... and saw a link to following question at the top: sql server - Compare multiple columns in new table with only three columns
I have modified the cross join query so it works in SQL Server. (SQL Server is my prime SQL tool, because I use it for a data warehouse I am developing. Jet SQL is something I recently learned myself as a necessity to help with data issues on the Tableau forum.)
Hopefully this is a query that works in your work situation:
SELECT org.[ID] , l.[Column Name] , CASE l.[Column Name] WHEN 'Pos' THEN CAST(org.[Pos] AS VARCHAR(11)) WHEN 'Remarks' THEN org.[Remarks] WHEN 'Candidate name' THEN org.[Candidate name] END AS [Q2 Value] , CASE l.[Column Name] WHEN 'Pos' THEN CAST(new.[Pos] AS VARCHAR(11)) WHEN 'Remarks' THEN new.[Remarks] WHEN 'Candidate name' THEN new.[Candidate name] END AS [Q4 Value] FROM [DataOutput_org] org INNER JOIN [DataOutput_new] new ON org.ID = new.ID CROSS JOIN ( SELECT 'Pos' AS [Column Name] UNION ALL SELECT 'Remarks' UNION ALL SELECT 'Candidate name' ) l
The CAST([Pos] AS VARCHAR(11)) is done because I defined [Pos] as INT in SQL Server. You can the same write VARCHAR(100) and thus be on the safe side
I used a table-less lookup table because it was easier to document its content in this way. You can of course change it to a permanent table to shorten the number of code lines and perhaps also slightly better performance.
Thank you so much for your help. I tried Stackoverflow - as you say the answer is a SQL problem and I thought it would be better to ask there then. Your solution is perfect! It was just what I was looking for. Thank you so much for your explanation, it makes it so much easier to understand the answer and be able to apply the new knowledge elsewhere
You are welcome. It was a pleasure to solve this attractive task. There might come a time when I need the solution myself. Without your quality question I would not be ready. Now I am
Thank you so much for your explanation, it makes it so much easier to understand the answer and be able to apply the new knowledge elsewhere
As for using cross joins in different scenarios, I recommend reading The Cross Join Collection.