Parameters operate independently of your data connections and your visualizations, so you cannot create a parameter which is dynamically updated based on the results of a query or a calculation. However, you may still be able to solve the problem you described using calculated fields alone.
Your calculation for the first sheet requires a specific level of detail -- in this case 'Identifier' -- to define the window scope used in the WINDOW_MEDIAN function. Is it possible for you to use that same table calculation and level of detail in your second sheet that involves a secondary data source?
If possible, please post a sample workbook that demonstrates your data and sheet configuration so we can try to work through a solution.
Thanks for the quick response. I've attached a sample workbook that illustrates what I'm trying to do. It has two sheets: one for a set of apartments and one for a single apartment. The first sheet shows the average months rented by apartment for the larger set. I would like to take the median of that sheet and bring it onto the second sheet, where I've calculated the average months rented for the single apartment, to use in a formula (I've created the formula but get a data error when I try to bring it onto a view). In addition, is it possible to have both worksheets operate off the same filter? I've linked them based on the "floor" column, and ideally, if I changed the filter for floor on sheet 2, the filter on sheet 1 (which is from a different data source) would also change.
Calculation Demo.twbx 13.5 KB
I've attached a workbook that produces what you want. I'll post separately with an explanation. This kind of analysis required some tricks because:
* the two data sets have no natural join column (the Person/PersonID fields aren't useful for your needs).
* table calculations operate at the level of detail of the worksheet, so trying to compute at a different LOD than you display requires some tricks.
I've attached a detailed workbook that walks through the steps. This was an interesting challenge, but do note that the mixture of very disparate data sources and levels of detail is not a common analytical use case. I thought it would be worth describing these steps so the power users on this forum can get to know some of the deeper power of table calculations.
In this example, we'll focus on the fact that the two data sources are comparing entirely separate apartment units. Furthermore, you want to compare the average occupancy duration of the "Target Building" against its peers, as measured by the median of average occupancy duration across four other apartments.
The steps below are associated with the numbered worksheets in the attached workbook.
1. First, we need to link the two data sources but we have no natural join field. We can fake this by creating a calculated field called "Link" in both data sources, each with the value 1. Customize your data source relationships so that the two Link fields are joined. We must use the 'Peers' data source as the primary because it has a higher level of detail. In general, data blending should incorporate secondary data sources which are at the same LOD or more coarse-grained than the primary data source.
2. Now, we need to incorporate the two fields required in your final calculation: the average number of months rented in the target apartment "E"; and the median of average months rented across the peer apartments "A", "B", "C", and "D". You can see in this sheet how the peer median relates to the average months rented for the peer apartments. From the secondary data source we also see the apartment "E" and its average, which is replicated across each of the peer apartments since the data sources are joined on the Link field.
3. This sheet shows the calculation you wanted to perform: comparing the target average occupancy against a median of peer average occupancy. The calculated field is called [Desired Field] and its formula is:
Avg([Target Building].[Months Rented])/[Peer Median]. Note how it is replicated for each of the peer apartments, again because the target apartment is replicated due to the join. It's also important to note the level of detail used for this table calculation: right-click on the "AGG(Desired Field)" pill in Rows and see that I've set "Compute using" >> Apartment. This means that we're computing the underlying window table calculation for [Peer Median] across the apartments in the peer group; it's a median of average occupancy rates, where the average is computed separately for each apartment.
4. Next, we only care to show one result for the single target apartment, unit "E". To do this, we have to be careful not to filter any of the apartment units out of the peer group, since this would affect the table calculation. Instead we can create a new table calculation that identifies a single row; I arbitrarily chose the last row of the result set. Since the table calculation function LAST() returns an offset of 0 for the last row, the formula for [LastRecord] is
LAST() = 0. I put this field in the color encoding to demonstrate its effect.
5. Finally, let's make use of this filter and only keep the last record. We can hide the header for the primary field [Apartment] since it's not needed in the display, but it must remain part of the sheet's level of detail since that is the level of detail used in the table calculation, as described in step 3. All done!
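For readers who want to sanity-check the arithmetic, here is the logic of steps 1 through 5 sketched in plain Python. The apartment names and rental figures below are invented for illustration; they are not taken from the attached workbook, and this is of course not Tableau syntax, just the equivalent computation.

```python
# Plain-Python sketch of the blended calculation in steps 1-5.
# All numbers are made up for illustration.
from statistics import mean, median

# Peer data source (primary): months rented per lease, by apartment.
peers = {
    "A": [10, 14, 12],
    "B": [8, 9],
    "C": [11, 13, 12, 10],
    "D": [7, 9, 8],
}

# Target data source (secondary): months rented for apartment "E".
target = [12, 15, 13]

# Step 1: the "Link = 1" field is effectively a cross join, so the single
# target average replicates across every peer row.
target_avg = mean(target)

# Steps 2-3: WINDOW_MEDIAN(AVG([Months Rented])) computed across the
# apartments in the peer group ("Compute using" -> Apartment).
peer_avgs = [mean(months) for months in peers.values()]
peer_median = median(peer_avgs)

# [Desired Field] = Avg([Target Building].[Months Rented]) / [Peer Median]
desired = target_avg / peer_median

# Steps 4-5: the ratio is identical on every peer row; LAST() = 0 just
# keeps one of those identical rows for display.
rows = [(apartment, desired) for apartment in peers]
last_row = rows[-1]
print(peer_median, round(desired, 3))
```

The key point the sketch makes explicit: the median is taken over per-apartment averages, not over the raw lease rows, which is exactly what the "Compute using" setting controls in the workbook.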
Great explanation, Robert, this will be a very useful resource.
> note that the mixture of very disparate data sources and levels of detail is not a common analytical use case
Isn't it? I seem to be hitting examples like this quite often (hence finding your explanation so useful).
By disparate I mean the data sources have no relation to each other other than the type of data they contain. Are you finding many use cases for data blending where there is no meaningful join column? If so, this is interesting and unexpected to me and I'd love to hear more about these examples.
The case I hit very frequently, which is perhaps conceptually slightly different from the case in this thread, is trying to find visual correlations between data sources where the only common dimension is time. But time is a continuous dimension, and the measurements in my disparate sources are at very different levels of granularity (down to the microsecond in the most extreme cases), frequently not synchronised, and fairly often with significant drift to adjust for, so it's no use just joining on the raw time dimensions. Some of the data sources list discrete events; others are inherently aggregated already (though often at a different, and perhaps incompatible, granularity of aggregation).
Jock showed me some tricks that give a reasonable approximation using data blending in some cases by joining on some form of bucketed view of time, retaining finer-grained detail using LoD. The other thing I sometimes do is combine the disparate data sources into a single combined data source (e.g. with a UNION and lots of NULL columns) to force a common time dimension.
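To make the "bucketed view of time" trick concrete, here is a rough Python sketch of the idea under stated assumptions: two hypothetical streams, one of discrete events at second precision and one of pre-aggregated metrics at minute precision, joined on a shared fixed-width time bucket. All timestamps and values are invented, and the bucketing function plays the role that a truncated date field would play in the blend.

```python
# Sketch of joining two differently-grained time series on a common
# time bucket. Data is invented for illustration.
from datetime import datetime, timedelta

def bucket(ts: datetime, minutes: int = 5) -> datetime:
    """Truncate a timestamp to a fixed-width bucket."""
    discard = timedelta(minutes=ts.minute % minutes,
                        seconds=ts.second,
                        microseconds=ts.microsecond)
    return ts - discard

# Source 1: discrete events, second-level precision.
events = [
    (datetime(2024, 1, 1, 9, 1, 17), "login"),
    (datetime(2024, 1, 1, 9, 8, 42), "error"),
]

# Source 2: already-aggregated metrics, minute-level precision.
metrics = [
    (datetime(2024, 1, 1, 9, 3, 0), 0.72),
    (datetime(2024, 1, 1, 9, 9, 0), 0.95),
]

# Join the two sources on the shared 5-minute bucket.
by_bucket = {}
for ts, name in events:
    by_bucket.setdefault(bucket(ts), {}).setdefault("events", []).append(name)
for ts, value in metrics:
    by_bucket.setdefault(bucket(ts), {}).setdefault("metrics", []).append(value)

for ts in sorted(by_bucket):
    print(ts, by_bucket[ts])
```

The bucket width is the approximation knob: wide enough to absorb clock drift and asynchrony, narrow enough to preserve the correlation you are looking for.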
The more I think about it, the more I think this sort of case isn't really a good fit for data blending. Really all I'm trying to do here is combine disparate data sources on a common drawing surface with a synchronised time axis. Sort of halfway between a combo chart and a dashboard.
Happy to provide detailed examples of this (on- or off-line), but I suspect this may be a little way away from your current data-blending focus.
Looking more closely at the original question in this thread and your solution I had a couple of thoughts about this case, too.
It seems to me that really the two data sources are just subsets of the same data, with the objective being to compare data for one instance (in this case one apartment) against a benchmark derived from all the others. In one form or another, that's a very common type of analysis.
It seems to me that it would be really useful to be able to do that analysis off a single data source, selecting the target and comparing with its peers dynamically.
There are a couple of things about the way this example is framed which make this tricky. It is comparing an aggregate (average) for the chosen apartment against an aggregate (median) of aggregates (average) for all other apartments (i.e. excluding the target from the benchmark).
I initially thought that might be possible without even using data blending at all, but I haven't managed to figure out a way to achieve that if I exclude the target from the benchmark. If I include the target apartment in the benchmark it doesn't need multiple data sources, just window functions and some calculated field tricks.
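The distinction between the two benchmark variants is easy to miss, so here is a small Python sketch of both, with invented per-apartment averages and "E" standing in for the target. This just illustrates the arithmetic; it says nothing about how Tableau reaches either number.

```python
# Sketch of the two benchmark variants: target included vs. excluded.
# Numbers are invented for illustration.
from statistics import median

avg_months = {"A": 12.0, "B": 8.5, "C": 11.5, "D": 8.0, "E": 13.0}
target = "E"

# Variant 1: target included in the benchmark (achievable with window
# functions over a single data source).
bench_incl = median(avg_months.values())

# Variant 2: target excluded from the benchmark (what the original
# example does, and what self-blending makes possible).
bench_excl = median(v for k, v in avg_months.items() if k != target)

print(avg_months[target] / bench_incl, avg_months[target] / bench_excl)
```

Notice the two benchmarks differ, so the choice of variant changes the final ratio even though both are "median of peer averages" at first glance.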
The attached workbook is driven off a single table combining the target and peer tables from the original. It has three sheets, which all use a parameter to select which is the target apartment.
The first sheet uses a single data source and includes the target in the benchmark. This may not be what is wanted in this case, but is a very common analytical case, so I thought it was worth seeing how easily that could be done.
The second one produces the same result, but this time joining the data source back to another copy of itself.
The third also uses data blending to join back to a copy of itself, but this time excludes the target from the benchmark, so compares the target with its peers in exactly the same way as the original example. It even gives the same answer if you select apartment E.
Window functions are definitely very powerful - but they still do my head in. I'm going to have to have a lie-down now. ;-)
DataBlendingRL.twbx 12.3 KB
Thanks Robert - this looks like a great solution and I'm going to see if it will work for implementation with my actual data model (it looks like it should). I was looking at it from the wrong point of view (trying to bring the peer group onto the target rather than bring the target onto the peer group). One last question - would it be possible to set a filter that works for both data sets if they have a common dimension with the same members (in this case floor)?
@Richard - In this case your solution would be perfect. However, my real use case is unfortunately more complicated (I'll have a bunch of "apartments" that need to be compared, with more coming in hourly/daily). My ideal setup will be a SSAS cube for the "targets" and a Tableau extract for my peer data set (also, this makes updating the peer benchmarks over time really easy).
You can filter a common dimension in both data sources if that dimension is used as the join key. The other possibility (which I haven't tried) is to use a Parameter, which is global to the workbook, and create a calculated field in each data source that tests whether the floor number equals the Parameter value... then filter each data source on that field being true.
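The parameter approach above can be pictured like this in Python: one global value shared by every source, and a per-source boolean test, analogous to a calculated field comparing [Floor] to the Parameter. The field names and rows are invented for illustration.

```python
# Sketch of a global parameter driving filters in two separate sources.
# Rows and field names are made up for illustration.
floor_param = 3  # the global "parameter", shared across data sources

peers = [{"apartment": "A", "floor": 3}, {"apartment": "B", "floor": 2}]
targets = [{"apartment": "E", "floor": 3}]

def on_selected_floor(row):
    """Calculated-field analogue: does [Floor] equal the parameter?"""
    return row["floor"] == floor_param

# Each source is filtered independently, but on the same parameter value,
# so changing floor_param updates both at once.
peers_filtered = [r for r in peers if on_selected_floor(r)]
targets_filtered = [r for r in targets if on_selected_floor(r)]
print(peers_filtered, targets_filtered)
```

The design point: the parameter lives outside both sources, so no join between them is needed for the filters to stay in sync.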
I have two sheets and two table calcs. Is there any way I can create a year parameter to control these two, so that I can use a common year filter for both worksheets?
Sheet 1: table calc (Year L)
Sheet 2: table calc (Year L) at a different dimension level
Please suggest; this would be of great help.