1 Reply Latest reply on Mar 7, 2013 7:09 AM by Toby Erkson

    How to bring in huge data sets without delay and also make proper relations?

    jake pasini

      We are experimenting with Tableau and have run into a few issues.


      1) We have a huge database, and when we try to pull in "customer ID", for example, it takes over a minute to load.  If we then try something else, it can take literally minutes to display, depending on how large it is.  This is a huge issue, especially since we don't know exactly what we're doing yet, which makes it very difficult to learn.  Any recommendations?


      2) I'm not an IT guy, but we have a standard relational database for all our data.  When linking to the data, it seems none of the dimensions and measures are correct.  For example, color is important to us, but when we drag it from Dimensions down to Measures it becomes a "count" instead of keeping the text ("abc") values such as "black", "red", "blue", etc.  It seems a large amount of our data consists of qualifiers like that, and not just numbers.  We also need all of these to relate across multiple reports with different info, and that doesn't seem possible even with a constant customer ID.


      Any insight is greatly appreciated.  It would seem there have to be other very large companies with this issue as well.

        • 1. Re: How to bring in huge data sets without delay and also make proper relations?
          Toby Erkson

          First, you're gonna need to read the manual.  It talks about dimensions vs. measures and why what it's doing is correct.
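
          To make the dimension vs. measure distinction concrete, here is a minimal sketch in plain Python.  This is not Tableau's code, and the rows and field names (customer_id, color, sales) are made up for illustration.  It only shows why a text field such as color turns into a count when treated as a measure, and why it is normally used as a dimension to group a numeric field instead.

```python
from collections import defaultdict

# Hypothetical rows standing in for a relational table (invented for illustration).
rows = [
    {"customer_id": 1, "color": "black", "sales": 20.0},
    {"customer_id": 2, "color": "red", "sales": 35.5},
    {"customer_id": 3, "color": "black", "sales": 12.0},
    {"customer_id": 4, "color": "blue", "sales": 8.5},
]

# Color used as a measure: text cannot be summed or averaged,
# so the only sensible numeric summary is how many values there are.
color_as_measure = sum(1 for r in rows if r["color"] is not None)

# Color used as a dimension: it labels groups, and a numeric field
# (sales) is aggregated within each group.
totals = defaultdict(float)
for r in rows:
    totals[r["color"]] += r["sales"]
sales_by_color = dict(totals)

print(color_as_measure)
print(sales_by_color)
```

          In other words, the "black", "red", "blue" values are still there; they show up when color stays on the dimension side and something numeric is measured against it.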

          Second, keep your data pulls concise (get only the data needed & nothing more) and work from an extract instead of live data.  There are other data-tuning things that can be done (like aggregating and optimizing the extract) but I'm not experienced enough to explain those varied methods and it's in the manual.  The initial pull of data could take time but once it's in extract form the visualization speeds should dramatically increase.
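
          As a rough sketch of what "get only the data needed" and aggregating can mean at the query level, the following uses Python's built-in sqlite3 module with an in-memory database.  SQLite is only a stand-in for whatever database you actually have, and the orders table and its columns are assumptions made up for the example.  It is not how Tableau builds an extract internally.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real source (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (customer_id INTEGER, color TEXT, sales REAL, notes TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "black", 20.0, "long free text"),
        (1, "red", 35.5, "long free text"),
        (2, "black", 12.0, "long free text"),
        (3, "blue", 8.5, "long free text"),
    ],
)

# Broad pull: every column of every row, whether the view needs it or not.
broad = conn.execute("SELECT * FROM orders").fetchall()

# Concise pull: only the columns the view needs, already aggregated
# to one row per color.
concise = conn.execute(
    """
    SELECT color, COUNT(DISTINCT customer_id) AS customers, SUM(sales) AS total_sales
    FROM orders
    GROUP BY color
    ORDER BY color
    """
).fetchall()

print(len(broad), "rows x", len(broad[0]), "columns in the broad pull")
print(concise)
```

          On a real table with millions of rows, the second query returns far less data to move and render, which is the same idea behind a trimmed, aggregated extract.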

          Finally, don't expect every single visualization to take 1-2 seconds.  That's unrealistic.  Even with a smaller data source, if the dashboard/worksheet is complex (many filters, many conditional expressions, etc.), it's gonna take processing time.


          I've worked (and am working) for companies with very large data sets and they're doing just fine.  So it depends on how well the database is set up, indexed, statistics/optimizer, network status, ODBC vs. native connection, SQL coding, RAM, CPU version, OS, custom fields, number of operations on the dashboard, etc.  Without a specific scenario (with plenty of details!) we can't give you specific solutions.
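
          To illustrate just one item from that list, indexing, here is a small sketch using Python's built-in sqlite3 module.  SQLite is only a stand-in, the orders table and the index name idx_orders_customer are hypothetical, and your own database will have its own tools for inspecting query plans.  It compares the plan for the same lookup before and after an index exists on customer_id.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real source (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, color TEXT)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, color) VALUES (?, ?)",
    [(i % 100, "black") for i in range(1000)],
)


def plan(sql):
    """Return SQLite's query plan; the last column of each row is the description."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


query = "SELECT color FROM orders WHERE customer_id = 42"

before = plan(query)  # no index yet, so the whole table is scanned
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = plan(query)   # the lookup can now go through the index

print("before:", before)
print("after: ", after)
```

          A full table scan on every filter change is one common reason a live connection feels slow, which is why the details of your setup matter so much.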


          Message was edited by: Toby Erkson:  http://community.tableau.com/docs/DOC-1251