If weight is just a single value in each instance you could use sum([weight]) or max([weight]) - it doesn't matter, but will allow the equation to be valid.
No, the weight will be the same for each strate, but the sales are different for each case.
I must multiply each individual Sales by the weight it has.
Any other suggestion? Is there any way to "disaggregate" a measure?
can you please provide a sample workbook of what you have so far, or some sample data?
Ok, you are rigth. With an example you could understand me better.
Here you have an Excel File where I compute Weighted Sales as the Sales by Weigth. And Wheigth is computed as expected sample divided by real sample (number of different customers I have in the sample). The Excel's computations begin by E_ to distinguish them.
In the workbook I can compute Expected Sample and Weigth. But I couldn't compute Weigthed Sales because I have to multiply sales (a non aggregated measure) by weigth (an aggregate measure).
What can I do?
So if you change your calc field "Weigthed Sales" from
[Sales] * [Weigth]
SUM([Sales]) * [Weigth]
Does that provide the result you are looking for?
No, that do not provide the result I am looking for. It works just in the case I analyze Sales By Region, but if I would like to analize another dimension it doesn't work. Remenber, Weigth is constant by region, but it is not a constant for any other dimension you would like to analize.
Has Tableau a function that says:
For each case
WHEN Region="North" THEN SUM[Wheigth] END
or something similar.
I think there should be a formula to do that, don't you? It will simplify our life and give much more opportunities to Tableau computing.
This is a surprisingly subtle problem. I'm fairly sure that it can be done with table calculations, by pulling all rows in the sample back to Tableau and performing all of the calculations locally, but I can't see any way to achieve it with standard calculations (for which the aggregate functions happen in the back-end database).
I haven't got time to try to put an example together now, but I'll just outline what I think is needed. If that doesn't give you enough clue (and Joe doesn't leap in first with a video explaining it) I'll have a look when I get a chance. (I actually think this would be a nice example for one of your video tutorials, Joe).
Briefly, this is (roughly) what I think you need to do.
1) Add a unique identifier to your original data source - this is needed so that Tableau can actually retrieve all rows from the database.
2) Put the new ID column on the Level of Detail shelf (so that you have all rows in the sample returned for your table calculations to work with.
3) Define a Table Calculation to calculate the weights, partitioned by Region. That will avoid the weights changing as you slice and dice by different dimensions.
I think this is a bit harder than it sounds - the count of distinct customers by region needs to happen in a table calculation and there is no WINDOW_COUNTD() function. I *think* that should be possible with some careful use of LOOKUP() and PREVIOUS_VALUE() and by sorting the partitions by IdCustomer. But I haven't tried. ;-)
4) You should now be able to define your weighted measures in terms of this weight to give the answer you are after. I think. ;-)
5) You may also have to resort to some additional table calculation trickery to summarise down to the aggregates you want - as you will have one row (i.e. one mark) for each row in the sample, rather than one per Region, or whatever. So you probably need to suppress all but the first row in each partition.
Hope that helps. No guarantees, but I think that should work.
This is really too much harder than I expected. I only want to weight a sample. I think weighted data is very interesting if we would like to use Tableau to analize Surveys either because the sample is not proportional to the population or because you would like to elevate your sample size to the population size.
Obviosly I can compute weight before to read data in Tableau. But if Weigths are different from one year to another, for example, I could use Tableau filters to select the year of analysis and don't worry about these changes. I loose some Tableau potential if I could not weigth data.
So, please, could you consider the idea to introduce Weights sometimes in Tableau or just introduce computations "For each case" as I suggested in my anterior post (is it so difficult?)
Thank you very much for your comments.
Meanwhile if Joe find an easy way to do it with Tableau, I will be grateful.
Can I request a few more rows be added to your data set so I can see how the formula that I had proposed earlier breaks down with additional rows?
Additionally, since I am not fully understanding the concepts of weights and how you are applying it, can you please provide a before and after example, maybe in Excel so I can trace the formulas? These additional fields with the "E_" prefix may be confusing me, because I am still not quite sure what you are looking for.
I've got 4 hours sitting on an inter-island ferry tomorrow - I'll have a play with the data then and post a sample workbook if I come up with anything.
Sorry for my delay, but I would like to give you a complete example of the problem.
In the attached Excel file, you will find 3 sheets with data simulate a sample, the aggregate computations I need, and the analysis I would like to perform.
Let's go step by step.
1) We make a survey asking some customers how many they buy by brand. Also we know the year of study (we repeat the survey every year) and the customer's region. To perform the computations, I add a Boolean variable that say if every case below to a distinct customer or not.
2) As we can see at the Pivot Table in Aggregates' sheet, we have got a sample of 17 distinct custoners at East, 8 at North and 15 at South at 2009 and 20, 8 and 22 customers at East, North and South at 2010.
3) On the other hand, we know by foreing data that 45% of total customer are in Region East, 15% in North and 40% in South at 2009 and respectively 50%, 20% and 30% at 2010.
4) With these data we can compute how many customer we need if we want to make a proportional's population sample. In 2009 as we get a total of 40 customer, we expected to have 45% of them in East region. That is 18 customers. In North we expected to have 40 * 15% = 6 customers and in South 40 * 40% = 16 customers. We can do the same calculations for 2010 by taking account that the total sample we have this year is 50 and the percent of customers in each region has change.
5) Then we can compute the weight I need in each strate (Region by Year) to give the sample we have the same proportion that we have in population. So If I weigth my 17 2009East's customers by 1.059 they will represent the 18 custoners we should have in this strate.
6) Now is time to weigth each case. By the strate it belongs, I can obtain the weight I have to use in each case and I can multiply my Sales by this weight. THIS IS THE STEP I CAN NOT DO BY TABLEAU.
7) To show you the effect of weighted Sales I do two Pivot Tables at Analysis' sheet. The fisrt one compute the Sales and Weighted Sales by Year and Region. As you can see in Ratio column, between them there are exactly the weigth magnitude.
But if we would like to compute the Weighted Sales by Year and Brand we can see how the ratio of Weighted Sales by Sales doesn't much the weights.
This is the reason we can not use the formula SUM([Sales]) * [Weigth] to compute Weigthed Sales.
I hope this exemple clarify my problem.
Thank you very much for your time and effort,
weigth.xlsx 32.1 KB
i put an example together with your original data yesteeday showing how to do it - I just haven't been on the internet since (except from my phone). I'll update with your new data and post when I get a chance -which may be a day or two.
BTW - the COUNT DISTINCT using table calcs will make a nice little calculation for the TCRL.
Here's your sample data with all the calculations done in Tableau.
I actually created a new spreadsheet with just your raw sales data and expected proportions by year and region as two sheets, so I've attached that as well.
I pretty much stuck with the 5 steps I described the other day.
The calculation of the count of distinct customers is probably just worth a few comments. There are two versions of that - one partitioned just by year to get the total customers in that year's sample and one partitioned by year and region. The calculation is the same in each case - just partitioned differently, so I don't actually need two calculated fields, I could just use it twice with the partitioning set differently - but I think it's clear to include the partitioning in the field name.
The calculation uses a WINDOW_SUM(). This won't currently work very quickly on large data sets (i.e. thousands of rows or more) - there's a nasty performance issue which makes the window functions slow right down on large data sets at the moment. There is a workaround for that which I've described elsewhere, but I haven't bothered to include that in the calculations here - the workaround makes it much harder to see what it's actually doing.
I think I'll probably write up that distinct count function for the Tableau Calculation Reference Library sometime - so I'll included the quick version in there when I do.
The answers come out exactly the same as in your sample Excel file with one exception. The Grand Total rows give the wrong answer for the weighted averages (they work fine for the simple sums). I haven't had a chance to try to figure out what's happening there. I suspect it might be a Tableau bug - but it could just as easily be me missing something - but I thought I'd just post it like this to give Joe something to think about. ;-)
Let me know if you need more explanation. I'm sure if you ask nicely Joe would be happy to turn it into a video.