10 Replies Latest reply on Feb 5, 2013 2:27 PM by Toby Erkson

    BigData analysis, is Tableau suitable ?

    Sékine Coulibaly

      I'm in the process of evaluating BI analysis tools. I came across Tableau, which looks promising.


      I can't see any Big Data connectors/drivers, so I'm not confident that it would suit my needs.


      I expect my database to grow quickly beyond 10 TB, so I need something that scales easily. The data consists of log event documents of around 400 bytes each.
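
For a sense of scale, the two figures above can be turned into a rough event count. This is only a back-of-the-envelope sketch: it assumes decimal units (1 TB = 10^12 bytes) and ignores indexes, compression and storage overhead. The function name is made up for illustration.

```python
def estimated_event_count(total_bytes: int, bytes_per_event: int) -> int:
    """Rough number of events that fit in total_bytes, ignoring overhead."""
    return total_bytes // bytes_per_event


TB = 10**12  # assumption: decimal terabytes

events = estimated_event_count(10 * TB, 400)
print(f"{events:,} events")  # 25,000,000,000 events
```

So 10 TB of 400-byte events is on the order of 25 billion rows, which is the volume any candidate database would have to aggregate.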


      In the past I used Postgres for 10 GB databases, but this time I expect to go with a Big Data solution (Amazon SimpleDB, Cassandra, Hadoop, et al.).


      I found out that Tableau can connect to Vertica or Teradata, but I guess these products are expensive (BTW, if anyone has a rough idea of the license cost, I'm interested).


      My real question is: is Tableau suited to high-volume databases?


      Thank you !

        • 1. Re: BigData analysis, is Tableau suitable ?
          Robert Morton

          Hi Sékine,


          It sounds like you're asking specifically about cheap Big Data systems -- more on that in a moment. Tableau is very well suited for Big Data, including support for powerful systems such as Teradata, Aster Data, Vertica and ParAccel. Because Tableau compiles your visual canvas into SQL queries, Tableau can leverage the power of your Big Data system by pushing the computation close to the data.
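
To illustrate what "pushing the computation close to the data" means, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for a real Big Data system. The table name, columns and query are invented for illustration; this is not the SQL that Tableau actually generates. The point is only that an aggregate query returns one small row per group instead of shipping every raw row to the client.

```python
import sqlite3

# In-memory database standing in for the remote data source (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log_events (event_type TEXT, bytes INTEGER)")
conn.executemany(
    "INSERT INTO log_events VALUES (?, ?)",
    [("login", 400), ("login", 380), ("click", 410), ("click", 390), ("click", 420)],
)

# Pushed-down aggregation: the database does the grouping and summing,
# and only one row per event type is returned to the client.
pushed_down = conn.execute(
    "SELECT event_type, COUNT(*), SUM(bytes) "
    "FROM log_events GROUP BY event_type ORDER BY event_type"
).fetchall()

# Client-side alternative: fetch every raw row, then aggregate locally.
raw_rows = conn.execute("SELECT event_type, bytes FROM log_events").fetchall()
client_side = {}
for event_type, size in raw_rows:
    count, total = client_side.get(event_type, (0, 0))
    client_side[event_type] = (count + 1, total + size)

print(pushed_down)    # [('click', 3, 1220), ('login', 2, 780)]
print(len(raw_rows))  # 5 raw rows fetched, versus 2 aggregated rows
```

With five rows the difference is invisible, but with billions of log events the second approach would move the whole table over the network, while the first moves only the summary.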


          You can find other systems for working with Big Data on a reduced budget, though you won't have the same query responsiveness, ease of administration, etc. as with the vendors I mentioned earlier. One option is Infobright, which is a column-store extension of MySQL and offers a free Community Edition. Recently Vertica also announced a free Community Edition.


          Last, Tableau has a Hadoop connector coming soon! We gave a preview of this recently on our blog: http://www.tableausoftware.com/about/blog/2011/10/sneak-peek-hadoop-and-tableau-demo-13849


          Stay tuned for more info on this soon!


          • 2. Re: BigData analysis, is Tableau suitable ?
            guest contributor

            Hi Robert,


            Thank you for this interesting reply.


            This is not a "low-cost" project, but keeping costs down is a competitive advantage, so I try to avoid overkill in infrastructure and software costs.


            My concern is that products like Vertica and Teradata can be extremely expensive for "small" (under a petabyte) instances like ours. I expect to reach 10 TB within the next few years, which is not that much and might be handled by Greenplum or Postgres-XC, although these are not column-oriented. Have you heard of any Postgres-XC/Tableau setups involving high-volume data?


            You have a Greenplum driver. Do you have any figures or use cases to share?


            From what I've seen, all the major players now offer a "Community Edition" (Greenplum, Teradata, Vertica, and Infobright, as you said), but scaling is limited to 1 or 3 nodes.


            Thank you !

            • 3. Re: BigData analysis, is Tableau suitable ?
              Robert Morton

              Hi Sékine,


              Postgres-XC looks interesting, but I don't know of anyone actively using Tableau with it and it doesn't appear to be ready for full release until March 2012.


              Regarding Greenplum, we do offer a first-class connection. As a software engineer I focus on partner technology integration, not comparative analysis, so I don't have any direct use cases to share about any database vendor. While we have a number of whitepapers on our site, you'll also find that your Sales rep at Tableau can give you a clear picture based on their experience with other customer success stories.


              Let me know if you would like to continue this conversation by email and I'll contact you via your forum email address.



              • 4. Re: BigData analysis, is Tableau suitable ?
                Sekine Coulibaly

                Hi Robert,


                Clearly Postgres-XC is not that mature yet but seems very promising.


                I'm looking at your Hadoop sneak preview, and I'm keen to get involved in the Hadoop beta and the Tableau 7 beta! What is the registration process? We can discuss this further by email if you wish.


                Thank you !

                • 5. Re: BigData analysis, is Tableau suitable ?
                  Robert Morton

                  Hi Sékine,


                  Please check with your Sales rep about being added to the 7.0 Beta. Alternatively you can begin using the Tableau connector for Cloudera Hadoop Hive in our 6.1.4 maintenance release. You will need to contact your Sales rep to get early access to that connector in 6.1.4. Here's the release announcement: http://www.tableausoftware.com/about/blog/2011/10/tableau-61-now-supports-hadoop-13921



                  • 6. Re: BigData analysis, is Tableau suitable ?
                    Sekine Coulibaly

                    Hi Robert,


                    I'll do this, thank you !

                    • 7. Re: BigData analysis, is Tableau suitable ?
                      Sekine Coulibaly



                      After watching your short video about Hadoop, I'm investigating Hadoop further, since my applications typically write logs that I want to analyze with Tableau. From what I've read, this is the perfect use case for Hadoop.


                      But my big concern is latency. My idea is to deploy both Desktop, for analysts to create their dashboards, and Tableau Server, to let marketing/sales people consult the dashboards. I want the dashboards to render quickly and to include fresh data. We can't afford to have users wait 10 seconds to view a rendered dashboard (2 to 5 seconds would be OK).


                      I can't imagine asking my customer's analysts to tune their SQL queries, although optimization on the administration side is something we can do.


                      It is said (and you confirm!) that using Hadoop with Tableau involves high latency. How much? How long do your (4?) nodes take to render the "Flights delays" demo? Any idea how many nodes would be needed to return a result in 2 to 3 seconds for an equivalent 2 TB data set? For data sets of this size, do you still advise using extracts? Extracts (which must be refreshed periodically, right?) will "ignore" the freshest data, won't they?


                      Hadoop is such a monster for a newbie like me. It's difficult to put figures on it (how many nodes, what hardware cost, what latency). Furthermore, setting it up looks complex, so I'll take any information or help you have! It seems to be recommended mainly for 100+ CPU cores and 10 TB+ data sets, doesn't it?


                      Sorry if this question is not 100% Tableau relevant ;)



                      • 8. Re: BigData analysis, is Tableau suitable ?
                        Robert Morton

                        Hi Sekine,

                        I'll try to respond to you later, but I won't be able to for about a week. In the meantime, I encourage you to look at the resources that Cloudera provides on their site and consider speaking to a representative of theirs. They will be the best source of information on how to plan your cluster according to your data analysis expectations and needs.


                        • 9. Re: BigData analysis, is Tableau suitable ?
                          Cristian Vasile

                          Short answer: YES, it is.


                          The viz was created with Tableau Desktop v8 beta 6 over a Google BigQuery connection, using the public natality dataset (https://developers.google.com/bigquery/docs/dataset-natality): 140M records processed in less than 60 seconds over the net on my vintage laptop.

                          natality Sheet 7.png
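
As a rough reading of those numbers, the sketch below converts them into a throughput figure. It uses only the two values quoted in the post; since the time is given as "less than 60 seconds", the result is a lower bound, and it measures the end-to-end round trip rather than the laptop's own processing.

```python
records = 140_000_000  # "140M records" from the post
max_seconds = 60       # "less than 60 seconds", an upper bound on elapsed time

# Dividing by the upper bound on time gives a lower bound on throughput.
min_records_per_second = records / max_seconds
print(f"at least {min_records_per_second:,.0f} records/s")
# at least 2,333,333 records/s
```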




                          • 10. Re: BigData analysis, is Tableau suitable ?
                            Toby Erkson

                            I know that Yahoo! has been using Tableau against their Big Data (terabytes daily). Now that Tableau is over a version newer (holy back from the grave, Batman!) I'm sure its performance is even better than when this thread was originally created.