8 Replies Latest reply on Feb 8, 2011 9:51 AM by . Vladm

    boost scatter plot performance

    . Vladm



      I routinely render fairly big scatter plots, panel or matrix, with 4-10 scatter plots and 20-50K marks on each individual plot, so rendering performance becomes an issue. For example, it takes about 1 minute to render a 2x3 matrix without coloring, sizing or shaping. I would appreciate any suggestions on how to speed up rendering. Which hardware part (video card, memory bus, ...) is the bottleneck? Memory size is not the issue, because I have much more than the Tableau processes occupy. I have seen the suggestion to speed up scatter plots by removing overplotted marks. It is not a solution for me, because my primary use of scatter plots is "picking" data points: I select interesting objects on one plot and highlight them (by ID dimension) on the other scatter plots. Essentially, I need to see how the same object behaves across multiple scatter plots.





        • 1. Re: boost scatter plot performance

          Try using Tableau's Data Engine (extracts) instead of a live connection to your data source - it is usually much faster, especially with large datasets.

          Also, Richard Leeke, an advanced Tableau user, (you can find many of his posts on this forum) uses a technique that batches similar data points together into one point to reduce the load for Tableau's rendering engine. Maybe search the forum or ask Richard?
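The batching idea mentioned above can be sketched outside Tableau as a simple grid binning step applied to the data before it is plotted. This is only an illustration of the general approach, not necessarily how Richard implements it; the function name `bin_points`, the `cell_size` parameter and the sample data are all made up for the example.

```python
from collections import defaultdict


def bin_points(points, cell_size):
    """Group (x, y) points into square grid cells of side cell_size.

    Returns one (x_center, y_center, count) tuple per non-empty cell,
    so many overplotted marks collapse into a single summary mark.
    """
    counts = defaultdict(int)
    for x, y in points:
        # Floor division gives the index of the cell along each axis.
        key = (int(x // cell_size), int(y // cell_size))
        counts[key] += 1
    return [
        ((ix + 0.5) * cell_size, (iy + 0.5) * cell_size, n)
        for (ix, iy), n in sorted(counts.items())
    ]


# Hypothetical sample data: five raw marks collapse into three summary marks.
points = [(0.1, 0.2), (0.3, 0.4), (1.2, 0.1), (1.4, 0.3), (1.1, 1.9)]
summary = bin_points(points, cell_size=1.0)
for x_center, y_center, count in summary:
    print(x_center, y_center, count)
```

The count per cell can drive mark size or color, so density is still visible even though far fewer marks are drawn. The trade-off is that individual points can no longer be picked directly.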

          • 2. Re: boost scatter plot performance
            Richard Leeke

            I think the technique Dimitri mentions that I use (discussed here) is probably what you are referring to when you say that you've seen the suggestion about avoiding overplotted marks.


            There may still be ways to pick data points on one plot and highlight on another, though, so I wouldn't dismiss that out of hand.  I often use a filter action on a bucketed summary chart to select the underlying detail on another view (i.e. selecting one or a few summary marks and then filtering to all underlying marks that correspond to the selected summary ones).  That's slightly different to the way you are working, but there might be options.  Explain a bit more or post a simple sample if you like.
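As a rough illustration of selecting summary marks and then filtering to the underlying detail, the logic could look like the sketch below. It is plain Python rather than anything Tableau-specific, and the names (`build_index`, `ids_for_selection`), the bucket size and the sample rows are all assumptions made for the example.

```python
from collections import defaultdict


def bucket_key(x, y, cell_size):
    """Return the grid cell (bucket) that the point (x, y) falls into."""
    return (int(x // cell_size), int(y // cell_size))


def build_index(rows, cell_size):
    """Map each bucket of the summary view to the IDs of the rows inside it."""
    index = defaultdict(set)
    for row_id, x, y in rows:
        index[bucket_key(x, y, cell_size)].add(row_id)
    return index


def ids_for_selection(index, selected_buckets):
    """Collect the IDs behind one or more selected summary marks."""
    ids = set()
    for bucket in selected_buckets:
        ids |= index.get(bucket, set())
    return ids


# Hypothetical data: the same three objects measured on two different plots.
plot_a = [("s1", 0.1, 0.2), ("s2", 0.3, 0.4), ("s3", 1.2, 0.1)]
plot_b = [("s1", 5.0, 2.0), ("s2", 4.0, 1.0), ("s3", 9.0, 7.0)]

index = build_index(plot_a, cell_size=1.0)
selected = ids_for_selection(index, [(0, 0)])  # user picks one summary mark
detail_b = [row for row in plot_b if row[0] in selected]
print(sorted(selected))
print(detail_b)
```

Only the rows behind the selected buckets need to be drawn on the second view, which is why this can be much lighter than rendering every mark on every plot.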


            In terms of your original question, I've done a bit of work on establishing where the time goes and where the constraints are.  As Dimitri says, the first thing is to make sure your data source isn't the constraint - but from your post it sounds as if you have already established that rendering time is the key.  I can't give you a definitive answer on the hardware question, but I do have a few observations.


            I don't know how much difference the video card makes (though it makes sense that it should be important), but main processor speed is definitely important.  One observation is that I don't believe multiple cores help, as the rendering is only done by a single thread, as far as I know.  I didn't see all that much difference when I upgraded my laptop from a 3 year old core 2 duo to an 8 core i7, for example.


            Having enough memory for the Tableau (and, if you are using an extract, the Data Engine) processes to be entirely in memory is important.  On a 32 bit system the Tableau process will never get more than about 1.3 GB, so 2 GB of memory is generally fine if you just want one workbook open.  If you have several workbooks open you can start getting constrained by physical memory, and things can slow down quite dramatically, though.


            The maximum number of marks you can render is limited by virtual memory limits.  I've certainly done more than a million, probably several million, on a 32 bit machine.  On a 64 bit O/S you can get about three times as many marks (even though Tableau is still a 32 bit application).  By the time you are approaching the maximum addressable memory, the sheet refresh time will probably be measured in minutes, though.


            Hope that helps.

            • 3. Re: boost scatter plot performance
              . Vladm



              Thanks for the suggestions. All data are in Tableau extracts, no live connections. I assume that the bottleneck is in the rendering part, not data access, because the "computing view layout" and subsequent "sand clock" steps take most of the time. I have a 4-core machine with 4 GB of memory, and there is plenty of free memory. It is Windows 2003 Server 32 bit. Monitoring via Windows Task Manager, I can see that Tableau occupies ~200 MB and uses only one thread. I might switch to 64 bit Windows 2008, but I don't expect that it will speed things up, since this is not the 32 bit 2 GB memory limit problem. I understand that it is possible to create other views that can be used as filters, but those would be workarounds, not really solutions... I have attached an image that hopefully illustrates what I want to achieve. In this example I highlight subjects anti-correlated between Comparison 3 and 4, and check them on other measurements. I still wonder if a better graphics card or memory bus can help in this situation.




              • 4. Re: boost scatter plot performance
                Richard Leeke

                Nice example - that clarifies what you are doing a lot.  I thought it was something like that.  Off the top of my head I think you're right that you can't use the reduced number of marks trick - but I'll have a think about it in case I can suggest anything else.  I can think of an old workbook of mine with multiple scatter panes like that and lots of marks, so I'll have a play with that and see if I can think of anything.


                At face value your best bet is faster hardware, as you say.


                As you are nowhere near memory constrained I agree that moving to a 64 bit O/S is unlikely to help.  It's possible that it could even make it slower, by increasing the amount of raw data to move around - though I have never done any direct comparisons with Tableau on equivalent hardware.  When I went from 32 bit XP on an old laptop to 64 bit Windows 7 on a new one I got a slight speed improvement, but I can well believe that some of the hardware difference was lost in moving to 64 bit.  It would be good to try it on a dual-boot machine.


                The "computing view layout" and "sand clock" phases are definitely mainly to do with rendering the view - but I'm not sure what happens where.  I know that one substantial task that has to happen in there is the anti-aliasing (that may be what is happening when the "sand-clock" is showing), which I believe is typically done on the graphics card, but that is pure speculation.


                Do report back if you try a faster graphics card - whether or not it makes a big difference.

                • 5. Re: boost scatter plot performance
                  . Vladm

                  Thanks for the "anti-aliasing" hint! The "sand clock" phase actually takes more time in my case than "computing view layout".

                  • 6. Re: boost scatter plot performance
                    Joe Mako

                    If you are going to go the hardware route to improve performance, I recommend an SSD.

                    • 7. Re: boost scatter plot performance
                      Richard Leeke

                      An SSD will certainly help with some classes of performance issues, but I doubt it makes much difference to scatter plot rendering time.  A large scatter plot from a data extract may take a couple of seconds to retrieve the data from disk and a couple of minutes of processing to render all the points.


                      It's worth seeing what you can tell about the split of time by studying the Tableau logs - but from your comments about the time being spent in the "computing view layout" and "sand clock" phases I don't think much of it will be I/O.
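To put rough numbers on that split, here is a small back-of-the-envelope estimate in Python. The figures of 2 seconds of disk I/O and 120 seconds of rendering are placeholders standing in for "a couple of seconds" and "a couple of minutes", and the 10x I/O speedup for an SSD is purely an assumption for illustration.

```python
def best_case_speedup(io_seconds, render_seconds, io_speedup):
    """Overall speedup when only the I/O part of the refresh gets faster.

    The rendering time is left unchanged, so it caps the achievable gain.
    """
    total_before = io_seconds + render_seconds
    total_after = io_seconds / io_speedup + render_seconds
    return total_before / total_after


# Placeholder figures: ~2 s reading the extract, ~120 s rendering the marks.
speedup = best_case_speedup(io_seconds=2.0, render_seconds=120.0, io_speedup=10.0)
print(f"overall speedup: {speedup:.3f}x")
```

With these placeholder numbers the whole refresh gets only about 1.5% faster, and even infinitely fast storage could not gain more than the 2 seconds of I/O out of 122.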

                      • 8. Re: boost scatter plot performance
                        . Vladm

                        I disabled anti-aliasing for the video card in ATI Catalyst Control Center and changed other options (smoothing etc.) to maximum performance vs. quality. It didn't change anything in Tableau. I wonder what "sand clock" really means... Will ask Tableau support.