1 2 Previous Next 28 Replies Latest reply on Oct 3, 2016 4:38 AM by Yuriy Fal

# 1 dimension cross tab & count of common values

Hi,

I once came across an example when the goal was to show a number of common UIDs which occurred in 2 years.

I could produce the count only if there were such records but never managed to replace blanks with 0 and It is still bugging me. Is it possible?

Here is another example with 5 distinct values of X and corresponding Y

 X Y 1 A 2 B 3 C 4 D 5 E 2 A 4 A 2 C 4 C 1 D

the following cross table shows the count of common Y across X and its copy - I filtered 4C pair: How to display zero instead of blank cells?

• ###### 1. Re: 1 dimension cross tab & count of common values

Hi Yuri Fal - maybe you find my question interesting/challenging?

Thanks,

Łukasz

• ###### 2. Re: 1 dimension cross tab & count of common values

Your question is a good one for the next  Open Tableau Table Calculation World Championship  in November 2016 in Austin TX.  To filter out amateurs like me, every challenger must first solve your question to qualify for the championship.

• ###### 3. Re: 1 dimension cross tab & count of common values

Man - yet another thing I did not know existed: the championship

I'm still getting surprised by how vivid the tableau community is!

• ###### 4. Re: 1 dimension cross tab & count of common values

Here are two solutions, both based on an earlier post of mine at Counting Pairwise Similar Votes in Tableau | Drawing with Numbers. I haven't been able to find an unmodified single-source solution to this. The reason is that since densification is adding marks then calcs like SIZE() don't work anymore (because they count densified marks) and regular aggregates like COUNT() and MIN() don't work because they only work on non-aggregated data.

The join solution uses a self-join on Y to effectively pad out the data, then an index calc to trigger domain completion and ZN(SUM(Number of Records))) to get the results.

The blend solution uses a INDEX() table calculation with a compute using on X (copy) & Y to trigger domain completion (this is a simplified setup from what I'd done in the post since I've learned more about densification since then), then a self blend on X & Y and a table calc with the formula IF FIRST()=0 THEN ZN(WINDOW_SUM(SUM([Number of Records]))) END and a compute using on Y to count the number of records.

Workbook is attached, let me know if you have any questions!

Jonathan

5 of 5 people found this helpful
• ###### 5. Re: 1 dimension cross tab & count of common values

Couldn't add anything after Jonathan.

Besides, for the diagonal matrix  to look

as in the original Sheet 4 (with C,4 filtered),

the self-join solution needs a pair of filters.

Yours,

Yuri

2 of 2 people found this helpful
• ###### 6. Re: 1 dimension cross tab & count of common values

Thank you Jonathan and Yuri for your answers.

I'm more convinced now that probably it is not possible with unchanged source and table calculations only. But what still seems vague to me is this:

1. Is densification taking place in my setup (24 marks)? 2. Why the densification is complete across all the cells here (50 marks)? 3. My understanding is the following: The underlying data consists of 10 rows and real data lies on the diagonal only. The first addressing is partitioned by [Y] so it is padding every potential combination of X&X per Y hence marks are displayed only in cells where Y exists in both [X] & [X copy]. The second just does not partition so it densifies to all potential combination of X & (XY pair) even if such XY pair/triple does not exist.
• ###### 7. Re: 1 dimension cross tab & count of common values

@Yuriy - thanks for the catch!

@Lukasz -

You didn't specify the addressing of the table calculations so I can't say exactly what's going on for your #1 and #2, however I can replicate those views with various table calculation settings and describe those results, if yours are different then let me know and I'll explain.

The start is that there are 10 records in the data with 10 unique combinations of X & Y, so when we have X and Y in the view we get 10 marks.

Given the pill layout a compute using for the Index calc on a) X, b) X (copy), c) Table (Across), or d) Table (Down) will all result in the same densification out to 24 marks. No matter what the Index calc is set to it's addressing on one of the two X dimensions and partitioning on the combination of Y and the other X dimension, here's an example: There are 24 marks because Tableau is effectively padding out all combinations of X & X (copy) for each Y, this results in their being values of X, X (copy) being padded in that don't exist in underlying data.

In the second view I can get those results with 50 marks by setting the Compute using to an Advanced compute using on one of the X dimensions *and* Y, so it partitions on the remaining X dimensions: So in this case as you surmised Tableau is padding out all combinations of X & X (copy) & Y, leading to 50 marks.

In both cases the underlying trigger for densification is in two parts:

1) There are discrete dimensions on Rows & Columns.

2) There's a table calculation that is addressing on something other than none/all the dimensions on Rows and Columns.

Joe Mako and I call this "crosstab domain completion" because of the 1st requirement. If we rearrange the pills such that we don't have opposing discrete dimensions on Rows & Columns then this domain completion goes away.

Jonathan

3 of 3 people found this helpful
• ###### 8. Re: 1 dimension cross tab & count of common values

You're right. I think it was the first time I came across such dimension arrangement where data densification is used and it was a little bit surprising to me.

I am not sure if it always was like this but am wondering if it is necessary that all the table calculations share the same domain. If they were independent I could use the addressing resulting with 50 marks for displaying my calculation addressed to 24 marks where size() is what I want. But now the situation is that once the domain completion is triggered - regardless of dependency order in nested calculations or if they are separate - the domain is densified to the highest level prior to evaluating any of the table calculations.

Or maybe I am talking nonsense now... (it is Saturday night )

• ###### 9. Re: 1 dimension cross tab & count of common values

As Tableau currently exists all densification is completed before data blending which is completed before table calculations are computed, so yes, all table calculations share the same domain.

While enabling multiple sets of domains of densification would be one route to get to your desired result, what I've asked Tableau to do is to a) first make what it's doing for densification more obvious and b) enable us to do the densification further upstream in the data source so that way we can use Tableau's already built-in downstream functionality. For example if I could tell Tableau to pad out every combination of X and X (copy) as a record-level value then filling it with 0s could be done using Tableau's Format->(measure)->Special Values->Text functionality or just using COUNT([Y]).

Jonathan

• ###### 10. Re: 1 dimension cross tab & count of common values

<snip>

For example if I could tell Tableau to pad out every combination of X and X (copy) as a record-level value

</snip>

Would it mean "add some rows to the datasource on the fly"?

• ###### 11. Re: 1 dimension cross tab & count of common values

To be explicitly clear, I think of this feature request as “add rows to the Tableau data source on the fly” and not “add rows to the underlying data source on the fly."

Jonathan

• ###### 12. Re: 1 dimension cross tab & count of common values

I was wrong - it can be done Data source is unchanged though method is different. Densification is full (125 marks) and count of common [Y] is looked up as simple size() cannot be used anymore.

4 of 4 people found this helpful
• ###### 13. Re: 1 dimension cross tab & count of common values

I do not like to do that but had to mark my last answer correct as in fact it fulfills the goal. The valuable input from Yuri and Jonathan are very much appreciated!

What is interesting about this solution is that in order to have data densification done to all potential combinations of the dimension values (5x5x5) you have to use two table calculations addressed properly (here 2x index()).

• ###### 14. Re: 1 dimension cross tab & count of common values

Lukasz, you've came with the right answer,

Your solution is a viable alternative to a common (self-join) approach

of counting Bins when starting a Market Basket analysis.

Self-join performance depends greatly on the underlying DBMS of choice,

and it is not an option with extract-based datasources (as of now, but hope that

Tamas Foldi could tell some).

Yours,

Yuri

1 of 1 people found this helpful
1 2 Previous Next