12 Replies Latest reply on Sep 22, 2016 6:34 AM by Toby Erkson

    Calculate normal distribution

    Steve Martin

      Hi guys,

       

      I know this has been asked before and answered very well in most cases.

       

      I'm trying to work-out the calculation required to match the Norm.Dist function provided in Excel which takes the z-score, mean, standard deviation and cumulative as its input values to give an example:

       

      =Norm.Dist(1.07,0,1,1) = 0.857690 or 85.77%
      
      

       

      Scott Tennican provided a very good calculation in To achieve T.Dist() function of Excel in Tableau to get the p-value of a T-Test

       

      But when I use the calc it is very slightly out .

      .5+(.1*1.07)*(4.4-1.07) = 0.8563100 or 85.63%

       

      Is the Excel function wrong or do I need to alter the calculation Scott provided? My stats skill in this area is very weak

       

      Any ideas?

       

      Steve

        • 1. Re: Calculate normal distribution
          Szymon Urbański

          The second formula you are using is an approximation.

          Most probably you will see that the difference is getting greater or smaller depending on where you are along the curve (or z-score).

          Besides what you are referring to is a cumulative distribution. The actual formula is using an integral so there is no good solution.

          One way around would be calculating norm.dist in excel for all z-scores (say in 0.01 increments) and using it as a data source in Tableau (which would be equivalent to generating a statistical table like http://homes.cs.washington.edu/~jrl/normal_cdf.pdf).

          The other solution would be using R integration to do the calculation: Tableau and R Integration

           

          Cheers

           

          S

          1 of 1 people found this helpful
          • 2. Re: Calculate normal distribution
            Steve Martin

            Thanks Szymon,

             

            We already have this calculated in Excel and using it as a data-source for lookup has been bandied around though my preference was to calculate this instead largely as it is only used for lookup - the value itself is to show in a tool-tip rather than to be used for charting.

            Not sure on the best method to use to lookup the resultant value across the different sources, as no relationship can be defined as the z-score is calculated

             

            Steve

            • 3. Re: Calculate normal distribution
              Scott Tennican

              There are pretty serious downsides to any of the currently possible solutions.

              A longer term option would be to create an idea in the Ideas forum that Tableau provide a Local Calculation function which provides the common theoretical distributions used to test hypothesis such as the normal, student-t and chi-squared and Snedecor's f distributions in at least cumulative density function (CDF) and perhaps probability density function (PDF) forms. There are currently only ten of the around 500 ideas with at least 13 votes which might be considered statistical features. Most of these ideas request more integration with statistical packages. However, providing this simple functionality within Tableau would be easy, secure and much faster performing.

              • 4. Re: Calculate normal distribution
                Szymon Urbański

                Hi,

                 

                1. Install R.

                2. Install package Rserve

                3. Load package Rserve

                4. Start Rserve

                
                install.packages("Rserve")
                library(Rserve)
                Rserve()
                
                

                5. Go to Tableau -> Help -> Settings and Performance -> Manage R connection

                6. Put the desired parameters. If it is running on the same machine it will simply be localhost:6311

                More about that if you search for "R and Tableau integration"

                7. I'm not sure how your data is structured, BUT:

                8. You could create a parameter that is your z-score. Let's call it zscore.

                9. Create a calculated field

                SCRIPT_REAL("pnorm(.arg1)",[zscore] )
                
                

                10. Move the calculated field to the worksheet.

                 

                What we have done is we have setup the R running as a server, with which Tableau is communicating.

                It is passing the .arg1 value to R and using pnorm (which is Normal CDF) and returning the result data back.

                We set that .arg1 should be fed with the value from the parameter ([zscore]).

                 

                I attach the simplest workbook.

                And it does return .8577 for 1.07 zscore.

                Obviously it will work AFTER properly setting the R connection.

                zscore.png

                Well I guess this kind of things are why we need R and Tableau integration.

                Also vote for ability to pass data frames between R and Tableau. Life will be easier.

                 

                Cheers,

                 

                Simon

                1 of 1 people found this helpful
                • 5. Re: Calculate normal distribution
                  Steve Martin

                  Thanks for this Simon, this I think would solve most people's problems however whilst R would be available for desktop work, we would struggle (the sun would probably burn-out first) to get R server installed; and we are already calculating z-score.

                   

                  I am in the process of manually creating a cdf calculation (taking a while) so I may have something to show but more to the point, I shall be raising this as an idea per Scott's comment above.

                   

                  On a side note, it is obviously frustrating that Tableau seriously lacks any functionality beyond basic maths (a colleague and I had to create a bitwise function earlier today for other work), especially when Windows calculator does have many statistical functions built in and has done for quite some time.

                   

                  Steve

                  • 6. Re: Calculate normal distribution
                    Scott Tennican

                    The fluidity and flexibility with which you can visualize simple statistics like sum in Tableau has advantages. Slightly more complex computations may be difficult or impossible to fully weave into the matrix of features which composes Tableau. Little used functions can reduce the discoverability of feature people use all the time. That said, providing local functions which avoid you having to call pnorm, pt, pf and pchisq remotely seems to fit cleanly into the current Tableau feature set.

                    • 7. Re: Calculate normal distribution
                      Szymon Urbański

                      Steve,

                       

                      You can try a different solution.

                      If you are able to run ane exe file, then use R Portable | SourceForge.net

                      It will extract wherever you want (including Documents folder).

                      All the packages shall be downloaded to the same folder.

                       

                      Now the only thing is the connection - if you have the possibility to extract the files wherever on the same machine Tableau Server is running - no problem. If it is a different machine you would need to test if the firewalls are blocking the connection.

                       

                      You could also try some different approximation - the one you posted seems very straigh-forward. I would think of polynomial or log-polynomials that would have a much better fit. For example see page 119 here with comparisons of different approximations and errors they are giving: http://www.jiem.org/index.php/jiem/article/viewFile/60/27

                       

                      Cheers,

                       

                      Simon

                      • 8. Re: Calculate normal distribution
                        Steve Martin

                        Thanks again Simon but still we have the problem of not having server access or permissions to extract to the server.

                         

                        I should perhaps have been a little clearer in that this is an enterprise system so we are dealing with a tightly regulated system; anything that cannot be handled direct in Tableau / Tableau Server cannot be used and this too means R despite us as a company and BI team having both R and sas installed.

                         

                        I have marked your second answer as helpful as I'm sure others in a similar position would be able to use this though I have gone out and created the function (with the help of our in-house mathematician) in the attached workbook.

                         

                        I am using just 4 terms in my calculation of T, which for our purposes, and given that this is still an approximation given that Tableau has no such function, this is within our acceptance criteria.

                         

                        I should here as mentioned above, this is for display in a tooltip rather than for plotting a chart.

                         

                        I am using zscore in a parameter here but for the actual work, this is calculated.

                        • 10. Re: Calculate normal distribution
                          Toby Erkson

                          Steve Martin, did you create this function in Tableau as a calculation?  If so, please share!

                          Steve Martin wrote:

                          ...

                          On a side note, it is obviously frustrating that Tableau seriously lacks any functionality beyond basic maths (a colleague and I had to create a bitwise function earlier today for other work), especially when Windows calculator does have many statistical functions built in and has done for quite some time...

                          • 11. Re: Calculate normal distribution
                            Steve Martin

                            Hi Toby Erkson,

                             

                            Yes, I uploaded the to a post in the calculation library: Cumulative Distribution Function (Normal Distribution)

                             

                            Steve

                            • 12. Re: Calculate normal distribution
                              Toby Erkson

                              Nice work!  It wasn't the type of bitwise functionality I was looking for (binary AND, OR, NOT) but, still, good job