1 Reply Latest reply on Apr 14, 2018 7:44 PM by patrick.byrne.0

    Querying on Standard Deviations across 2 Variables

    Ethan Elias

      Hi,

       

      I have a dataset that shows Products and their signal strengths in different Countries. 

       

      I want to be able to surface anomalies in the data to investigate for potential sources of error.  For example, if Product X has a trait that leads to an abnormally high signal strength in Bolivia/Canada/etc, I want to know about it.

       

      I'm working on an approach and need some help with STDEV.  I have one chart that shows Product signal strength per Country, and the other shows Country signal strength by Product.  My current idea is as follows:

      - Some products will rightfully have much higher counts than others, in all countries (example: Wordpress vs a newly developed niche CRM software)

      - Some countries will rightfully have much higher counts than others, for all products (example: United States vs Antarctica)

      - If a Product or Country (like Wordpress/US) is consistently more than X standard deviations from the mean on BOTH the Country and Product sheets, this is fine because it is to be expected.

      However, if a Product or Country is more than X standard deviations from the mean for ONLY ONE Country or Product, this is a true outlier and needs to be investigated.
           - example: "tableau" means "table" in French, so Tableau would be well above the mean in France, and France would be well above the mean for Tableau.  But Tableau is not an exceptionally popular product in other countries (except perhaps the US), and France is not an exceptionally common install country for other products.
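
      To make the idea concrete, here is a rough sketch of the logic in Python (not Tableau syntax). It is only one possible reading of the rule above: the function names, the (product, country, strength) row layout, and the `max_hits` cutoff are all made up for illustration, and it assumes the sample standard deviation and only looks at values ABOVE the mean.

```python
from collections import Counter, defaultdict
from statistics import mean, stdev


def flag_high(rows, group_index, threshold):
    """Flag (product, country) pairs sitting more than `threshold` sample
    standard deviations above the mean of their group.

    rows: iterable of (product, country, strength) tuples (hypothetical layout).
    group_index: 0 to compare countries within each product,
                 1 to compare products within each country.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_index]].append(row)

    flagged = set()
    for members in groups.values():
        values = [strength for _, _, strength in members]
        if len(values) < 2:
            continue  # a standard deviation needs at least two points
        mu, sigma = mean(values), stdev(values)
        if sigma == 0:
            continue  # all values identical, nothing stands out
        for product, country, strength in members:
            if (strength - mu) / sigma > threshold:
                flagged.add((product, country))
    return flagged


def isolated_outliers(rows, threshold=2.0, max_hits=1):
    """Return pairs flagged on both views whose product and country are
    each flagged at most `max_hits` times (assumed cutoff, default 1)."""
    rows = list(rows)
    within_country = flag_high(rows, 1, threshold)  # the "Country" sheet
    within_product = flag_high(rows, 0, threshold)  # the "Product" sheet

    # How many countries flag each product, and how many products flag each country.
    product_hits = Counter(product for product, _ in within_country)
    country_hits = Counter(country for _, country in within_product)

    return sorted(
        pair
        for pair in within_country & within_product
        if product_hits[pair[0]] <= max_hits and country_hits[pair[1]] <= max_hits
    )
```

      With this reading, Wordpress/US is flagged on both views but dropped because Wordpress is flagged in many countries and the US is flagged for many products, while Tableau/France would be kept. One caveat: with the sample standard deviation, a group of five or fewer values can never contain a point more than 2 standard deviations from its mean, so small groups need a lower X.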

       

      How can I accomplish this in Tableau?  I haven't had success using the STDEV function.  I've had success using the Analytics pane and applying a distribution band at 2 standard deviations, but I don't know how to query the data points that fall outside the band.  My sheets are below, and the gray area is the 2-standard-deviation band.  I want to be able to isolate and query the blue dots ABOVE the gray area.
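
      For a single sheet, what I'm after amounts to something like the following Python sketch (again not Tableau syntax). The function name and the label-to-strength dictionary are hypothetical, and it assumes the band is drawn at the mean plus 2 sample standard deviations.

```python
from statistics import mean, stdev


def above_upper_band(strengths, k=2.0):
    """Return the labels whose value lies above mean + k * sample std dev.

    strengths: dict mapping a label (product or country) to its signal
    strength; needs at least two entries for stdev to be defined.
    """
    values = list(strengths.values())
    upper = mean(values) + k * stdev(values)  # top edge of the gray band
    return sorted(label for label, value in strengths.items() if value > upper)
```

      Anything this returns corresponds to a blue dot above the gray area.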

       

      Thanks in advance!

       

      [Attached screenshots: country.PNG, prod.PNG]