
1. Re: R integration: basic calculations & graphs (Part2)
Rollie Parrish Dec 22, 2013 10:00 AM (in response to Rollie Parrish)
OK, there appears to be something different about how Tableau sends and receives data when working with bar graphs vs. line graphs.
In the attached workbook, a simple debugging function has been added to the calculated fields that use R. It writes the data coming in from Tableau, and the result going back to Tableau, to a log file in the R working directory. The output shows that very different values are being sent and received for bar graphs vs. line graphs.
In this example, the values for LOS are being sent to R to calculate the mean for each Group.
These are the expected results for one point on a bar graph:
[1] "######## START ####"
[1] "######## Arithmetic LOS"
[1] "Sun Dec 22 09:05:11 2013"
[1] "incoming from Tableau"
[1] 1 2 3 10
[1] "back to Tableau"
num 4
[1] "######## END ####"
These are the results for the same point using a line graph:
[1] "######## START ####"
[1] "######## Arithmetic LOS"
[1] "Sun Dec 22 09:07:52 2013"
[1] "incoming from Tableau"
[1] 1.000000e+00 2.000000e+00 3.000000e+00 1.000000e+01 9.654043e-321
[6] 9.654043e-321 9.654043e-321 9.654043e-321 9.654043e-321 9.654043e-321
[11] 9.654043e-321 9.654043e-321 9.654043e-321 9.654043e-321 9.654043e-321
[16] 9.654043e-321
[1] "back to Tableau"
num 1
[1] "######## END ####"
Any ideas what's going on here?

TableauR Basic Calculations.twbx 36.6 KB


2. Re: R integration: basic calculations & graphs (Part2)
Jonathan Drummey Dec 23, 2013 10:30 AM (in response to Rollie Parrish)
Hi Rollie,
I love your R logging script, that's brilliant! There's supposed to be some way to get Rserve to log to the console, but I haven't been able to get that working in my setup, so being able to turn it on & off per function is really helpful!
The problem you're experiencing is due to what Joe Mako calls "Mark Type Filling" (we don't know the Tableau term for it): for Line, Area, & Polygon mark types, when there are 2 or more dimensions in the view and a table calc, Tableau will complete the domain of the combinations of dimensions. That's why in the calls to R you're seeing 16 arguments instead of just 4 for each Group, and why the bar charts work and the lines don't.
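To see how the padding changes the result, here is a minimal Python sketch (Python rather than R, purely for illustration) using the values from the log output in this thread. It assumes the 12 padded values are the tiny number 9.654043e-321 and that the R calculation is a plain arithmetic mean; both are assumptions, not something confirmed from the workbook.

```python
# Values copied from the log output in this thread.
real_los = [1.0, 2.0, 3.0, 10.0]

# Assumption: the padding Tableau adds for line marks is this tiny value.
PAD = 9.654043e-321
padded_los = real_los + [PAD] * 12  # 4 real values + 12 padded = 16 arguments


def mean(values):
    """Plain arithmetic mean, standing in for the R calculation."""
    return sum(values) / len(values)


bar_result = mean(real_los)      # what the bar graph call would compute
line_result = mean(padded_los)   # what the line graph call would compute

print(bar_result)
print(line_result)
```

Under these assumptions the padded values add essentially nothing to the sum but still count toward the length, so the line graph call works out to 16 / 16, which would account for the 1 that R sent back.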
In the attached I set up a walkthrough of the problem. I don't have a good idea of how to make Tableau do the "right thing" (i.e. not do the mark type filling). I do have an idea of how to get R to do the right thing with the extra data that Tableau is sending: theoretically you could add a couple of arguments to explain the padding and current index, so that the padded data could be filtered out in R, but that would be a lot of extra work in R. However, I do know how to get Tableau to compute the arithmetic mean LOS and geometric mean LOS without using R, so I set those up in the attached as well, in the "All in Tableau crosstab," "Bar," and "Line" charts.
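As a rough sketch of the "filter out the padding" idea, the following Python (not R, and not the calculation in the attached workbook) assumes a hypothetical extra argument, is_real, that flags which incoming values are real rather than padding. The function names and the padding value are also made up for illustration.

```python
import math


def drop_padding(values, is_real):
    """Keep only the values flagged as real (hypothetical extra argument)."""
    return [v for v, keep in zip(values, is_real) if keep]


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    """exp of the mean of the logs; only valid for strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# 4 real LOS values followed by 12 padded ones (padding value is an assumption).
incoming = [1.0, 2.0, 3.0, 10.0] + [9.654043e-321] * 12
is_real = [True] * 4 + [False] * 12

los = drop_padding(incoming, is_real)
am_los = arithmetic_mean(los)   # 4.0, matching the bar graph log
gm_los = geometric_mean(los)    # fourth root of 1*2*3*10, roughly 2.78

print(am_los, gm_los)
```

The mask has to come from somewhere, which is the "lot of extra work" part: something on the Tableau side would need to tell R which positions are real.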
I'm going to send this one in to Tableau tech support; it's an interesting (and painful) side effect of mark type filling that is new to me.
Jonathan

TableauR Basic Calculations jtd.twbx 106.4 KB


3. Re: R integration: basic calculations & graphs (Part2)
Rollie Parrish Dec 23, 2013 2:37 PM (in response to Jonathan Drummey)
Thanks Jonathan. I probably will end up using the Tableau calculations for GMLOS.
This exercise has been more about learning how Tableau works with R. I wanted to start with something a little more basic than the predictive model examples that are currently online.

4. Re: R integration: basic calculations & graphs (Part2)
Jonathan Drummey Dec 23, 2013 6:11 PM (in response to Rollie Parrish)
Hi Rollie,
I figured out (remembered, actually) a solution for the Mark Type Filling. I'll type it up and get it to you tomorrow.
Jonathan

5. Re: Re: R integration: basic calculations & graphs (Part2)
Jonathan Drummey Dec 24, 2013 5:20 AM (in response to Rollie Parrish)
All it took was some pre-Christmas dinner with family to get my brain working again and remember Joe's solution for unwanted densification. To take care of that and finish getting the lines to draw in your workbook, there are two major steps, seen in Line Fixed part 1 and Line fixed part 2 in the attached workbook:
1) We can turn off the Mark Type Filling by only having aggregate measures on Columns and Rows. Where we need a discrete (blue) pill to generate headers, we can use a discrete aggregate like MIN(), MAX(), or ATTR(). So I put Group on the Level of Detail and MIN(Group) on Columns. (For performance, when I'm sure about the level of detail in a view I'll use MIN() or MAX(), because ATTR() is somewhat slower.)
However, this leaves 4 individual marks that are not connected into a line.
There are 2 related issues with getting the marks to draw a line:
a) We have 4 plotted marks and 12 Nulls. Ordinarily we can filter out those Nulls and Tableau will connect the 4 marks, for example by putting a copy of the Geometric LOS on the Filters Shelf and filtering for non-Null values, but that won't work without some other changes, because...
b) There are two dimensions (ID and Group) on the Level of Detail Shelf. Tableau can generally connect marks into a line across one dimension, but not when there are multiple dimensions on the same Shelf, because that creates panes/partitions. The Path Shelf can be an option to connect marks, but it also won't cross the multiple dimension boundary (as far as I know).
The solution is to create a Tableau Combined Field of ID & Group, put that on the Level of Detail, and apply the Geometric LOS filter from a). We could instead generate a sort of the Combined Field, but using the Geometric LOS filter is faster. Tableau Combined Fields can still address & partition on the individual dimensions in the Combined Field, while being treated as one dimension for the line.
Rollie, I don't know your volumes. If you have tens of thousands of records or more (ideally hundreds of thousands of records), I'd be curious as to which is faster, the Tableau & R calcs or the Tableau-only calcs. My hunch is that with a decently fast data source, such as a Tableau data extract or a tuned database, the Tableau-only solution would be faster, because less data has to be pushed across the wire from the data source to Tableau and from Tableau into R & back again.
Also, I personally think this is more complicated than it needs to be. Even though we have a workable solution to get a line, it requires knowing about multiple undocumented features of the software.