3 Replies Latest reply on Aug 7, 2013 12:35 PM by Robin Kennedy

    Calculated dates and speed problem

    Jamal Qureshi

      Hi, I have a problem with calculated dates.  I'm a touble newbie to BI I should start with saying.  My raw data comes in MMMYYYY format (for example, APR2003).  I can't figure out how to read it in directly as a date (I note with dismay that TIBCO Spotfire does but I don't like their user interface as much as Tableau).  So I followed these instructions:

       

      http://kb.tableausoftware.com/articles/knowledgebase/fixing-date-fields

       

      So fine, now I have a calculated date field. The only problem is, my data manipulation now goes unusably slow. Anyone have a solution? I'd really like to use Tableau, but without sufficient speed (I'm going to be handling tens of millions if not hundreds of millions of rows of data) I just can't.  Thanks for any help.

        • 1. Re: Calculated dates and speed problem
          Robin Kennedy

          Jamal,

           

          I would suggest the following, in order of priority

           

          1. Ask for your data to be received in a more standard format and for your database to have that field recognised at a date datatype, rather than a string. How is the data collected? surely it's not being entered like that?? (DBA should be shot if it is)

           

          2. Use Tableau extracts - when you make an extract, some calculated fields are 'materialised' into the extract file so no longer have to be computed. I don't know for sure whether your calc field would be materialised though.

           

          3. Try a different calculation... the one referenced in the article is a bit string heavy, which are expensive to calculate. The trouble is, because it's got text in it we need to convert the month names and can't do pure math...  I would suggest trying

           

          dateadd('month',

          CASE left([Date],3)

          when "JAN" then 1

          when "FEB" then 2

          when "MAR" then 3

          when "APR" then 4

          when "MAY" then 5

          when "JUN" then 6

          when "JUL" then 7

          when "AUG" then 8

          when "SEP" then 9    

          when "OCT" then 10

          when "NOV" then 11

          when "DEC" then 12

          END -1

          , dateadd('year',int(right([Date],4))-2000,#1-1-2000#)

          )

          • 2. Re: Calculated dates and speed problem
            Jamal Qureshi

            Thanks for the reply. To respond in order:

             

            1. Yes the DBA should be shot. Unfortunately getting him shot or made to fix the data is impossible, it's from a giant international inter-governmental agency. Way beyond my pay grade.
            2. No idea how extracts work, is there a good tutorial somewhere?
            3. The code worked, but still too slow. This something an extract could really speed up?
            • 3. Re: Calculated dates and speed problem
              Robin Kennedy

              Using an extract can make a massive difference in performance... queries that take several minutes can end up sub-second once Tableau is running off an extract. Tableau Extracts are columnar based data stores and are optimised for aggregation, rather than row based, traditional databases which are great for row addition / deletion etc but can hit bottlenecks when running aggregating queries. This is particularly true if you're connected to a slow source such as Excel or Access or a database with a lot of data contention problems.

               

              You know when you first connect to your data source and it says do you want to connect live or do an extract? That's where you usually decide, but you can also switch from a live connection to an extract (and vice versa) by right clicking your data connection in the top left and selecting Extract Data.

               

              Disadvantages to using an extract? Well, it can take a little while to first populate it, and once extracted, if your source database changes then your Tableau view will not update until you refresh the extract (right click the data source and select Extract > Refresh)

               

              As mentioned, I don't know whether your calculated field will get materialised into the extract or not -- the algorithm for Tableau to use to work out whether to do this or not takes into account the fact there will be additional disk seek time etc. but I reckon it's worth a go...

               

              You mentioned that Spotfire automatically recognized this field as a date, but I think essentially what Spotfire has done is the same as we are doing here... turning the string into something that's recognised as a date and then the data is loaded into the tool (in tableau, this is the extract). I'm not a Spotfire expert though...

               

              Hope that helps... more here: Tableau Extracts–What / Why / How etc - The Information Lab