1 Reply Latest reply on Mar 30, 2020 7:44 AM by Jonathan Drummey

    Tableau Prep run query elapsed time

    Wafi Wahab

      I am currently running a flow to output a Hyper file to my local computer. The elapsed time is currently at 31 mins 32 secs. Should I give up on this flow? I am not sure why it is taking this long. I don't normally encounter problems, but I am dealing with a kind of data set I have never dealt with before: a survey data set that is very wide, which I have pivoted several times so that Tableau can work with it effectively (as per the Tableau video on YouTube, “putting my data on a diet”).

       

      In my flow, I have 7 joins, which resulted in the number of rows rising sharply from 30K to 2M. So I am guessing the problem could be due to hardware limitations (memory and CPU), and/or my flow needs to be changed as a workaround. So the main question is: should I give up on this flow, which is still running? And does anyone have any thoughts on potential solutions to move forward on this?
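A jump from 30K to 2M rows across joins usually means that at least one join key has duplicates on both sides, so each key contributes (left matches) × (right matches) rows. The sketch below is plain Python, not anything Prep provides; the function name and sample keys are hypothetical. It shows one way to estimate the size of an inner join on a single key before running it:

```python
from collections import Counter

def estimate_inner_join_rows(left_keys, right_keys):
    """Return the row count an inner join on one key column would produce."""
    left_counts = Counter(left_keys)
    right_counts = Counter(right_keys)
    # Each shared key yields (occurrences on left) * (occurrences on right) rows.
    return sum(
        n_left * right_counts[key]
        for key, n_left in left_counts.items()
        if key in right_counts
    )

# Hypothetical respondent IDs from two pivoted branches of a survey flow.
left = ["r1", "r1", "r1", "r2", "r2"]
right = ["r1", "r1", "r2", "r2", "r2", "r3"]
print(estimate_inner_join_rows(left, right))  # 3*2 + 2*3 = 12
```

If the estimate is far larger than either input, the join key is probably not unique on at least one side, and aggregating or deduplicating before the join may keep the row count down.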

       

      Thank you

       

      *update* The flow run completed at 56 mins, but the output file is nowhere to be found.

       

      *update #2* I tried running the flow to publish as a data source to the online server. After 56+ mins of running, an error message shows “System error: Problem with IO: {0}”

        • 1. Re: Tableau Prep run query elapsed time
          Jonathan Drummey

          Hi,

           

          There are a lot of unknowns here, so that's probably why none of the other volunteers have responded. I'll do the best I can to help.

           

          You'd written "very wide"... that's a relative term. Is that tens, hundreds, or thousands of columns? Personally, I find that I start having problems in Prep when the column count gets into the hundreds and I've got multiple steps involved.

           

          Also, you haven't said anything about your hardware (particularly RAM and available disk space). That can also have a big impact.

           

          At this time, Prep attempts to turn the flow into as few queries as possible that can run on the internal Hyper DB. Given the combination of Hyper's and Prep's current (as of 2020.1.4) capabilities, this can cause Prep to create massive queries that consume large amounts of memory and make flows really slow (or throw "query too large" errors, or just plain fail). This is in contrast to other tools that can process in a more step-wise fashion and consume smaller amounts of resources at each step (and therefore run faster). This points at the present workaround for Prep, which is to break down the Prep flow into a series of smaller flows, each writing an output that the next flow uses as its input. This is more complicated to run, but it's a known workaround.
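To illustrate the step-wise idea outside of Prep, here is a minimal Python sketch in which each stage reads a file, does one piece of work, and writes an intermediate file for the next stage. It is only an analogy for chaining smaller flows, not Prep's API; the file names, columns, and the `run_stage` helper are all hypothetical.

```python
import csv
import tempfile
from pathlib import Path

def run_stage(src: Path, dst: Path, transform) -> int:
    """Stream rows from src through transform into dst; return rows written."""
    written = 0
    with src.open(newline="") as fin, dst.open("w", newline="") as fout:
        writer = None
        for row in csv.DictReader(fin):
            out = transform(row)
            if out is None:  # the transform may drop a row
                continue
            if writer is None:
                # Take the output header from the first row that survives.
                writer = csv.DictWriter(fout, fieldnames=list(out))
                writer.writeheader()
            writer.writerow(out)
            written += 1
    return written

# Hypothetical survey extract written to a temporary working directory.
workdir = Path(tempfile.mkdtemp())
raw = workdir / "raw.csv"
raw.write_text("id,q1,q2\n1,5,\n2,3,4\n")
stage1 = workdir / "stage1.csv"
stage2 = workdir / "stage2.csv"

# Stage 1: drop respondents with a missing q2 answer.
n1 = run_stage(raw, stage1, lambda r: r if r["q2"] else None)
# Stage 2: add a derived column, reading only stage 1's output.
n2 = run_stage(stage1, stage2, lambda r: {**r, "total": int(r["q1"]) + int(r["q2"])})
print(n1, n2)
```

Because each stage holds only one row in memory and hands off through a file, no single step has to process the whole pipeline at once, which is the same trade-off as splitting one large flow into several smaller ones.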

           

          If you can, I'd also suggest submitting a support ticket to help give the Prep developers more information about where Prep is having performance issues.

           

          Jonathan