14 Replies Latest reply on May 1, 2014 9:15 AM by Noah Salvaterra

    multi-dimensional data

    eli.goldner

      Hello,

       

      Brief overview,

      We have data for sales leads which can be viewed in a tabular format, as seen below. The columns represent the age of the lead, and the rows represent the source of the leads (site, meaning website). This entire table is a snapshot in time, and I would like a way to compare this data over time.

      The issue seems to be that I have data with 3 facets, which can be hard to represent on a two-dimensional screen:

      1. Lead age
      2. Lead source
      3. Number of leads

       

       

      Source \ Age   Leads < 10 Days   Leads 11-30 Days   Leads 31-60 Days   Leads > 60 Days   Totals
      Site 1                      58                 57                 16                15      146
      Site 2                       2                  2                  1                 8       13
      Site 3                       4                  2                  2                 0        8
      Site 4                      19                162                  5                 0      186
      Totals                      83                223                 24                23      353

       

       

      Any help or examples of three-dimensional data in a viz would be greatly appreciated!

       

      Eli

        • 1. Re: multi-dimensional data
          Matt Lutton

          Do you have some raw sample data or a packaged workbook you can upload--is the data shown in your question representative of the raw data in your data source, or what you see inside Tableau?

           

          This type of analysis should be possible in Tableau, but a lot depends on your raw data structure.

           

          I'm not sure what problem you are having, exactly--can you explain?  Posting a sample packaged workbook, and some details/screenshots about where you are having trouble might help a bit.  Cheers!

           

          As a side note, I recommend this blog post for some examples of "slicing by aggregate" in Tableau. Your situation seems to apply, as you want to compare aggregated time values with aggregated categorical "bins": Slicing by Aggregate | VizPainter

          • 2. Re: multi-dimensional data
            eli.goldner

            Matthew,

             

            Thanks for your response.

            The table which you see is a representation of data in our system; however, it has been pivoted for easy human readability. The data structure on our end (SQL Server) has all of these data points in a single row (20 columns, as we store some of the totals as well) per day.

             

            I had no issue importing this data into Tableau, though I can only show two of the three dimensions without creating a mess! My troubles revolve around trying to find a presentable format.

             

            I can either post some raw data (CSV or otherwise) or post a packaged workbook. Is it safe to assume that the latter would be easier for you to work with? If so, I'll post it later today.

             

            Thanks for your quick reply, and please let me know if I missed clarifying anything.

             

            Eli

            • 3. Re: multi-dimensional data
              Matt Lutton

              Yes, a packaged workbook with extracted data would be best, as I can view the data and any work you've done inside Tableau.

               

              Typically, Tableau works best with non-aggregate data stored in a "Taller" format. Tableau can then compute your totals and other aggregations you may need.   Have you read through:

              Preparing Data for analysis with Tableau | Tableau Public

               

              When you say a single row, do you mean a row per site, or literally one row?  Best of luck.

              • 4. Re: multi-dimensional data
                Noah Salvaterra

                Agreed with Matthew: for concrete help, a dataset is important. Tableau is data driven, so how things get set up is a function of how the underlying data is structured.

                 

                That said, this may be an academic discussion. How about shape for site, color for age, x-axis (column) for time and y-axis (row) for number of leads? When showing this much at once, it is a good idea to add some ability to drill down and/or filter. I attached a workbook that shows a couple ways of doing this. Clicking on the legend will highlight a particular property, the stacked bar charts will formally filter the data.

                 

                N.

                • 5. Re: multi-dimensional data
                  eli.goldner

                  Matthew,

                   

                  I'll post a packaged workbook later on today.

                   

                  I understand that non-aggregated data would work best, and that is how it is stored in our system. I just read through the link which you provided, and it proved to be helpful; note, I've already created several visualizations in Tableau, and I've often had to "unpivot" data which I've received, as I understood that it worked better in the raw form.

                   

                  I understand where your question is headed... and the answer is literally one row.

                  Explanation: these numbers represent leads which have neither resulted in a sale nor been labeled as closed. By definition, the data which is gathered daily is aggregated data. However, the data stored is historical data, and we therefore cannot import the raw data for this purpose. Each morning we have a job run which inserts one row into our table with the counts of all open leads across all of these factors (by age and by site). We also store the totals of the open leads by age (we don't need to import those totals into Tableau, as they can be computed). Essentially, the table you see above is a representation of one row of data in our system.

                   

                  Thanks!

                   

                  Eli

                  • 6. Re: multi-dimensional data
                    eli.goldner

                    Hi Guys,

                     

                    Attached is a workbook with the data of two days.

                    I omitted the totals, as this will allow for the calculations to be done in Tableau.

                     

                    The worksheet looked pretty pathetic, so I cleared it and left you with a blank slate to work with.

                     

                    Thanks so much!

                     

                    Eli

                    • 7. Re: multi-dimensional data
                      Noah Salvaterra

                      Hmmm... So it looks like all the data for each day has been crammed into a single row for some reason. That will make it hard to use. Not impossible, but compared to the effort of working around it, I think reshaping is the way to go. Shouldn't be too bad, but if you're not handy with SQL you may want someone to sit with you who is. The new query will be something like:

                       

                      -- One SELECT per site/age column, stacked with UNION ALL
                      (SELECT create_date,
                              'Site1' AS Site,
                              '<10' AS Days,
                              Open_tix_by_tenants_10_Days_Or_Less AS Leads
                       FROM CASA_Open_Tickets_By_Type)

                      UNION ALL

                      (SELECT create_date,
                              'Site1' AS Site,
                              '11-30' AS Days,
                              Open_tix_by_tenants_11_TO_30_Days AS Leads
                       FROM CASA_Open_Tickets_By_Type)

                      UNION ALL
                      ...

                       

                      There will be 16 chunks like this, but they will all be similar. Each may also need a WHERE clause, but I assumed that was for cutting the data down to a sharable size. Nix the ORDER BY; it is just slowing you down.
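Since the chunks differ only in the site label, the age label, and the source column, the query can be generated rather than typed out. The following is a minimal sketch of that idea, run against an in-memory SQLite table instead of SQL Server. The table name and the two `Open_tix_by_tenants_...` column names come from the query above; the `site2` column names, the date, and the helper names `build_unpivot_query` and `demo` are assumptions made up for illustration.

```python
import sqlite3

TABLE = "CASA_Open_Tickets_By_Type"  # table name taken from the post

# (site label, age label, wide column name). The first two column names
# appear in the post; the "site2" names are hypothetical placeholders.
COLUMN_MAP = [
    ("Site1", "<10", "Open_tix_by_tenants_10_Days_Or_Less"),
    ("Site1", "11-30", "Open_tix_by_tenants_11_TO_30_Days"),
    ("Site2", "<10", "Open_tix_by_site2_10_Days_Or_Less"),
    ("Site2", "11-30", "Open_tix_by_site2_11_TO_30_Days"),
]


def build_unpivot_query(table, column_map):
    """Return one SELECT per wide column, joined with UNION ALL."""
    chunks = [
        f"SELECT create_date, '{site}' AS Site, '{age}' AS Days, {col} AS Leads "
        f"FROM {table}"
        for site, age, col in column_map
    ]
    return "\nUNION ALL\n".join(chunks)


def demo():
    """Load one wide daily row and return it reshaped into tall rows."""
    conn = sqlite3.connect(":memory:")
    cols = ", ".join(f"{col} INTEGER" for _, _, col in COLUMN_MAP)
    conn.execute(f"CREATE TABLE {TABLE} (create_date TEXT, {cols})")
    # The date is made up; the counts echo the sample table in the first post.
    conn.execute(f"INSERT INTO {TABLE} VALUES ('2014-04-30', 58, 57, 2, 2)")
    rows = conn.execute(build_unpivot_query(TABLE, COLUMN_MAP)).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in demo():
        print(row)
```

Each wide row becomes one tall row per entry in `COLUMN_MAP`, so covering all 16 site/age combinations is a matter of extending the list with the real column names.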

                       

                      N.

                      • 8. Re: multi-dimensional data
                        eli.goldner

                        Noah,

                         

                        Like your idea; however, I'm quite handy at SQL, and was thinking of going the route of unpivoting (which will include breaking the column name in half). I'll dump the new data into another table and try to work with that. I will still need help making 3 dimensions look presentable.

                         

                        Thanks for looking into it

                         

                        Eli

                        • 9. Re: multi-dimensional data
                          eli.goldner

                          Noah/Matthew,

                           

                          Please check out the attached worksheet.

                          I unpivoted the data, and broke the column header into two parts (site and age) programmatically. I'll admit, it looks much better, and it is quite manageable!
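For readers following along, the unpivot-and-split step might look roughly like the sketch below. It is not the code used for the attached workbook: the `Site__Age` header pattern, the date, and the function name `unpivot_row` are assumptions chosen for illustration, and the counts are borrowed from the sample table in the first post.

```python
def unpivot_row(wide_row, date_key="create_date", sep="__"):
    """Turn one wide daily row into tall (date, site, age, leads) records."""
    records = []
    for column, leads in wide_row.items():
        if column == date_key:
            continue
        # Break the column header in half: site on the left, age on the right.
        site, age = column.split(sep, 1)
        records.append(
            {"date": wide_row[date_key], "site": site, "age": age, "leads": leads}
        )
    return records


# Hypothetical wide row; header names and the date are made up.
wide = {
    "create_date": "2014-04-30",
    "Site1__<10": 58,
    "Site1__11-30": 57,
    "Site4__11-30": 162,
}
long_rows = unpivot_row(wide)

if __name__ == "__main__":
    for record in long_rows:
        print(record)
```

With site and age as separate fields, each can go on its own shelf (rows, columns, color) instead of being locked inside a column name.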

                           

                          I played around with different layouts, and I think I've arrived at the best solution for the moment (sans adding labels, and touching up the tooltip). I'm not crazy over the looks of the view, though I do think that this is a good start.

                           

                          Please keep in mind that we'll need the ability to view 2 or more days' worth of data in a way which will allow a user to visually compare them. Have a better layout? Please show me how!

                           

                          Thanks,

                           

                          Eli

                          • 10. Re: multi-dimensional data
                            Noah Salvaterra

                            The scales of the sites differ by an order of magnitude, which makes it hard to compare age mix. I might move site to rows and edit the axes to allow them to scale independently, or better yet consider using percent of total (using age). Another worksheet could be used to represent volume by site. Volume could also be represented by a single line overlaid on top of the percentage view, but my experience is that people will find this confusing.
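To make the percent-of-total idea concrete, here is a small sketch of the arithmetic outside Tableau, using the counts from the sample table in the first post. The names `AGE_BUCKETS`, `LEADS`, and `age_mix` are illustrative; inside Tableau a percent-of-total table calculation would normally do this for you.

```python
AGE_BUCKETS = ["<10", "11-30", "31-60", ">60"]

# Open leads per site and age bucket, read from the sample table.
LEADS = {
    "Site 1": [58, 57, 16, 15],
    "Site 2": [2, 2, 1, 8],
    "Site 3": [4, 2, 2, 0],
    "Site 4": [19, 162, 5, 0],
}


def age_mix(leads_by_site):
    """Return each site's age buckets as percentages of that site's total."""
    mix = {}
    for site, counts in leads_by_site.items():
        total = sum(counts)
        mix[site] = [100.0 * c / total if total else 0.0 for c in counts]
    return mix


if __name__ == "__main__":
    for site, shares in age_mix(LEADS).items():
        parts = ", ".join(f"{b}: {s:.1f}%" for b, s in zip(AGE_BUCKETS, shares))
        print(f"{site}: {parts}")
```

Because every site sums to 100%, a site with 13 leads and a site with 186 leads can be compared on the same axis, which is the point of separating mix from volume.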

                             

                            There is a temptation to put everything in one chart, because it isn't hard to do so, but the best visualizations are the ones that effectively communicate the underlying data. Often this is better accomplished through several simpler charts. My process is typically to explore the data and the business need well and to decide on a viz (or a collection of them) afterwards. What are the questions that might be answered by this viz? Does your viz answer those questions? Does it distract from these? Will the end users understand what they are looking at? Sometimes after spending time with a dataset I come to understand an underlying story, and I build a viz to tell it. Sometimes I'm working to discover what that story might be. I may provide a platform for others to explore as well, but I am still making judgement calls on what they are likely to find (or there is a discussion about it). I try to be flexible and iterate, iterate, iterate.

                             

                            Some of us on the forum can comment on the effectiveness of various aspects of a viz, but it can come down to lipstick on a pig if there isn't a thoughtful setup to begin with.

                             

                            N.

                             

                            P.S. Sorry if I insulted your SQL skills. I didn't mean to presume that they were poor.

                            • 11. Re: multi-dimensional data
                              eli.goldner

                              Noah,

                               

                              "P.S. Sorry if I insulted your SQL skills. I didn't mean to presume that they were poor."

                              Don't worry, I didn't perceive it as such and didn't take offense (in fact, it didn't occur to me that there was potential for offense until I saw your note). No need to apologize.

                               

                              "The scales of the sites differ by an order of magnitude, which makes it hard to compare age mix. I might move site to rows and edit the axes to allow them to scale independently, or better yet consider using percent of total (using age). Another worksheet could be used to represent volume by site. Volume could also be represented by a single line overlaid on top of the percentage view, but my experience is that people will find this confusing."

                              Wouldn't the different values on the opposing axes distort the image? The only difference between the axes would be the age of the lead.

                               

                              As for the story to be told, it's quite simple. I'd like to show the users what the counts are for the current day, and compare/contrast the numbers to data from the past. This way users can see how current performance stacks up to past performance.

                              I have an idea, and would like to hear your take on it... how about if I create a table showing the current day's performance in tabular format, and I color code the cells to reflect the change from the previous day (red for got worse, and green for got better)? I can put the previous day's numbers into a tooltip.
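The logic behind such a color-coded table can be sketched as follows. This is only an illustration under two assumptions: that fewer open leads counts as "better" (the `lower_is_better` flag flips that), and that the previous day's numbers are available per site and age. The `yesterday` values are made up, and `classify_change` and `color_table` are hypothetical helper names.

```python
def classify_change(today, previous, lower_is_better=True):
    """Return 'green', 'red', or 'neutral' for one cell of the table."""
    delta = today - previous
    if delta == 0:
        return "neutral"
    improved = delta < 0 if lower_is_better else delta > 0
    return "green" if improved else "red"


def color_table(today, previous):
    """Map each (site, age) cell to (count, change, color)."""
    return {
        key: (count, count - previous[key], classify_change(count, previous[key]))
        for key, count in today.items()
    }


# Today's counts echo the sample table; yesterday's counts are invented.
today = {("Site 1", "<10"): 58, ("Site 4", "11-30"): 162}
yesterday = {("Site 1", "<10"): 61, ("Site 4", "11-30"): 150}
cells = color_table(today, yesterday)

if __name__ == "__main__":
    for (site, age), (count, change, color) in cells.items():
        print(f"{site} {age}: {count} ({change:+d}) -> {color}")
```

The change value is what would drive the cell color, while the previous day's count is the natural candidate for the tooltip.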

                               

                              Your thoughts?

                               

                              Eli

                              • 12. Re: multi-dimensional data
                                Noah Salvaterra

                                I suppose an argument could be made that differently scaled axes distort the overall image, if that was all that you presented. But if it is part of a larger context I don't think it is an issue. The first thing that struck me looking at even just 2 days of data was that the sites differed greatly in terms of lead volume. I'm not suggesting that fact be hidden, but rather that it could be separated from the rest of the story.

                                • Chapter 1. Sites 1 and 4 generate a lot more volume than Sites 2 and 3.
                                • Chapter 2. What else is different between sites? Sites 1 & 2 have a large >60 segment which is absent from 3 & 4. This fact is clearer from a percentage standpoint. At one point, more than 87% of the leads from Site 4 are in the 11-30 age group. These anomalies are interesting...
                                • Chapter 3. You're on your own. But with more days there should be more interesting patterns.

                                 

                                Tables are OK, worth a try I guess. I tend to avoid tables unless they are specifically requested. Some people can get a lot from a grid of numbers given enough time. I'm not one of those people... plus it seems like an awful waste of Tableau.

                                 

                                N.

                                • 13. Re: multi-dimensional data
                                  eli.goldner

                                  Noah,

                                   

                                  Thanks for your help and input; it helped me clarify things.

                                   

                                  Eli

                                  • 14. Re: multi-dimensional data
                                    Noah Salvaterra

                                    No problem. Feel free to ping me later on when you have more of your presentation together.

                                     

                                    N.