7 Replies Latest reply on Jan 7, 2013 4:04 AM by Avinash Patel

    Connect to cube vs. extract from underlying DB

    Sean Mullane

      We have an OLAP cube that is updated regularly from a set of databases. Querying large amounts of data from the cube is slow, so I'd like to create an extract. Apparently you can't extract from a cube, so as an alternative I'd like to create an extract of the underlying DB and bypass the cube. My question is: would creating an extract from 1TB+ of data (1 billion + rows) likely result in faster workbooks than if I connected to the OLAP cube? I know there are some variables here, so if it helps let's say the cube is on a fast server and reasonably well-optimized, and I'd be creating similar workbooks with either source.

       

      Also, what kind of server specs would be required to handle such an extract?

        • 1. Re: Connect to cube vs. extract from underlying DB
          Tracy Rodgers

          Hi Sean,

           

          Since there are a lot of different factors to consider, it is hard to say generically whether taking the extract or connecting the to Cube will be more beneficial. However, generally extracts are faster than most live connections. One way to make the extract faster is to hide any unused fields so that only they will be included in the the extract. The following KB article may also be of some use to you:

           

          http://kb.tableausoftware.com/articles/knowledgebase/database-query-performance

           

          Hope this helps a little bit!

           

          -Tracy

          • 2. Re: Connect to cube vs. extract from underlying DB
            Matt Comstock

            Sean,

             

            I am trying to do the same thing.  Did you get any where with this?

             

            Thanks,

             

            Matt

            • 3. Re: Connect to cube vs. extract from underlying DB
              Sean Mullane

              No, I haven't pursued this any further. I'm loathe to go down that road if I can possibly avoid it. Sorry I couldn't help!

              • 4. Re: Connect to cube vs. extract from underlying DB
                Matt Comstock

                Sean,

                 

                Thanks for the reply. 

                 

                We are going to try two options.  1) Use SSIS to extract data from the cube into relational tables (seems backwards, I know).  2) Point tableau to the data views that are used to populate the cube. Once we figure out the best approach we will use data extracts to push the data into tableau.

                 

                Make Sense? Thoughts?

                • 5. Re: Connect to cube vs. extract from underlying DB
                  Sean Mullane

                  My issue with connecting to the underlying DB is that I would have to replicate all the calculations that we do for the cube exactly so that both the extract and the cube show exactly the same data. We do enough ETL that it would be a big pain. Keeping the Tableau calculations updated whenever we change the cube would be a pain too. Depending on how much processing you do for the cube you may have enough resources to do that dev work.

                   

                  SSIS sounds like a more feasible approach, even if it is kind of backwards. I could see that as a next step for us - use a live connection to the cube to develop workbooks and then pull the necessary fields and granularity levels into the relational DB so we can create an extract. Let me know if you go this route and how it works out for you.

                   

                  I'm still working on getting our Salesforce data into Tableau without using Excel, so we won't be doing anything with the cube for a while. It's too bad there's not more documentation on using cubes, because a lot of caveats seem to apply. Any idea whether they'll add the ability to do cube extracts in 8.0?

                  • 6. Re: Connect to cube vs. extract from underlying DB
                    Russell Christopher

                    Writing MDX against the cube (via SSIS) will be a pretty cumbersome to bulk dump data from the cube. Option #2 will be the best way to go unless you're doing some calculations in the cube that you don't want to have to recreate in Tableau.

                     

                    Message was edited by: Russell Christopher Whoops! Replies crossing in the night. If the calcs are difficult to recreate w/o MDX, then method #1 becomes more reasonable (but still a pain!)

                    • 7. Re: Connect to cube vs. extract from underlying DB
                      Avinash Patel

                      HI Sean,

                       

                      i know i m coming little late but anyways,

                      If u want answer for who is faster tableau extract OR Cubes ? The answer is Tableau Extract.

                       

                      I will not say that cube's are not fast, they are also fast but when u have a filter in your workbook which has datasource as cube (processed by billions of rows). Every change of that filter will take at least 5 to 10 seconds to give back the changes(That timing is quite scary when your presenting your report to your boss like Regional Manager, Director, CTO or CEO) where as Tableau extract will works like magic.

                       

                      But if you look at technical aspects, TBWX with cubes are smaller in size as that of tableau extract. Also single cube can use for multiple reports or say workbooks but in case of tableau extract, u have to create specific extract

                      every single time.

                       

                      If you look at business aspect, u definitely will not show all billions of rows in a single workbook. And (sorry i m just assuming too many if's) if its a Management report, u will be showing some month aggregated trend and some top 10 to 100 customer,vendors details. So if you can restrict the cube data to contain only Top 100 customers multiples by different business filters (example - Product Type, Transaction Type). Your data inside the cube will reach max to max few thousand. Now that amount of data is like child's play for Cube + Tableau.

                       

                       

                      Regards,

                      Avi