That sounds like unexpected behavior.
I would first check whether you have any joins that are potentially turning those 40k rows into hundreds of millions, or some similar transformation that may be unintended. But even that should result in sampling behavior, not a system freeze.
I'd definitely open a ticket on this one and see if there is an issue that can be resolved.
I am having the same issue. Below is information I received from Tableau Support. They say Prep has the same memory requirements as Desktop, but I have never hit this issue in Desktop performing identical (or larger) data changes.
In order to properly execute some of the functions, Tableau Prep may require disk space multiple times
larger than the size of the original data source.
The initial solution is to change your temp file location to a drive with more space.
For Mac, open Terminal and enter the following command:
export TMPDIR=<desired path for new temp directory>
Then, launch Tableau Prep from the command prompt/Terminal window.
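For reference, a minimal sketch of that macOS sequence (the temp directory path is just an example; point it at any location on a drive with plenty of free space, and note the app path depends on your installed version):

```shell
# Redirect Tableau Prep's temp files to a location with more free space.
# The path below is an arbitrary example -- substitute a directory on a roomy drive.
export TMPDIR="$HOME/tableau_tmp"
mkdir -p "$TMPDIR"
# Launch Prep from this same Terminal session so the process inherits TMPDIR, e.g.:
#   "/Applications/Tableau Prep Builder.app/Contents/MacOS/Tableau Prep Builder"
```

Setting TMPDIR this way only affects processes launched from that Terminal session; it does not change the system-wide temp location, so you need to start Prep from the same window each time.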
Tableau Desktop has the same requirements, so you may need to make sure the temp folder has been
emptied and close other processes that use a lot of memory.
The TEMP folder might need as much as the square of the final size of the extract. For example, a 32GB extract
may require up to 1024GB (1TB) of disk space to create the extract or publish.
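Taking support's rule of thumb at face value (it squares the extract size measured in GB; whether it actually holds for your data is worth verifying against your own flows), the arithmetic checks out:

```shell
# Support's stated rule of thumb: temp space (GB) ~ (final extract size in GB)^2
extract_gb=32
echo "$((extract_gb * extract_gb)) GB of temp space"   # prints "1024 GB of temp space"
```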
If you are running both Desktop and Prep at the same time, this can easily lead to needing more memory.
Thanks for the question! Is there a specific workflow that results in this problem? Could you give me additional details on what you are doing in the app when you hit this issue?
The Tech Support team is looking into this under Case# 03834004.
Basics... the source file is an Excel spreadsheet (49 MB), approximately 60 columns and 40k rows. The problem arises when I add a step to cleanse the data. It appears Java becomes bogged down trying to render the data into the interface ("running flow" in the upper corner, blue bar cycling above where the columns show up). Once the columns of data finally appear under the cleanse step, I was performing a number of individual tasks: deleting a few columns, removing nulls, or converting nulls to "none". Where I kept hitting the issue was when I tried to perform a few manual grouping tasks to combine values in certain columns. For example, one column is comprised of around 270 unique values that in the end are combined into about 230 unique values. After performing several of these grouping tasks, everything would slow down or stop. As a side note, the interface for grouping data still needs some work; it is not the most user friendly. The method in Tableau Desktop is much friendlier.
While trying to recreate this incident for the Tech Support team, I observed another issue by watching Windows Task Manager. When Tableau Prep is slowing down, locking up, and then closing down from the issue noted above, the processes are not releasing. Java.exe and tabprotserv.exe (and some others) remain open, and worse yet at high memory consumption, even though the application was closed. Re-launching Tableau Prep opens those processes a second time and doubles the memory consumption.
I am looking forward to what the Tech Support team is able to decipher from the log files.
I am working with an Oracle database and I am experiencing the same issues you outlined with the Tableau Prep tool. I union my tables together, go to implement a 'clean' step, and it spikes the CPU as I exclude nulls and other values. Then it freezes up and I have to close it. I have also noticed that java.exe and tabprotserv.exe remain open and consume memory after the app is closed. I am curious what explanation the tech support team has provided, if any. Have you developed/used a workaround in the meantime? I appreciate anything you can provide. Thanks,
I haven't received any explanation from the support team on this. Based on when the system locks up, my guess is there is a Java scripting issue that needs to be fixed.
As far as a workaround, this isn't elegant but it works. I have been running Windows Task Manager in the background and watching the memory use of the processes. When I see Tableau and Java spike and performance begin to drastically slow down in Prep, I save my file and close the application, wait until the processes shut down, then reopen the flow I was working on. When you reopen the file, memory use is normal again. If Prep locks up before you save and close, that is when the processes never shut down; you will have to force them to close.
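If Prep has already locked up and the leftover processes will not exit on their own, the force-close can also be done from a Command Prompt instead of hunting through Task Manager (process names taken from the observations in this thread; double-check in Task Manager first, since java.exe may belong to other applications):

```shell
REM Force-close the Tableau Prep helper processes that survive a crash (Windows).
taskkill /F /IM tabprotserv.exe
taskkill /F /IM java.exe
```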
Not that it is a badge of honor, but using this method I have been able to set up a flow within Prep that imports from 15 sources, 4 GB in total, and still get the flow to output a hyper file. It took a while to set the flow up with all the cleansing because of the number of times I had to close the application, but now that it is set up it works fine.
I am experiencing similar issues while trying to migrate Power BI to Tableau Prep. At this point I believe 8 GB of RAM is part of the problem, and I'm going up to 32 GB.
However, I have also noticed the same zombie processes noted above and have started saving and shutting down periodically. Right now I have 83 columns and will have to scrub this down, but I am only running a sample of 10k rows. I think I read somewhere that the number of columns, not rows, actually dictates memory usage due to how the indexing occurs.
Also, if anyone else is migrating Power BI queries into Tableau Prep: to save on RAM, for now I am going into the Advanced Editor, copying the code out of Power BI, and pasting it into Notepad++ to decipher what was done before and translate it.
Thank you to whoever said that a large temp folder might be needed. I'll keep an eye on that. Looks like the corporate standard laptop might need an upgrade there too.
I have a similar issue. Although Prep isn't crashing, memory usage is very high when I run it: I have 8 GB of RAM, and with only Outlook and Chrome open (as well as Prep), my memory usage is at 6.2 GB. Looks like I'll have to bump up to 16 or 32 GB unless it's a Prep bedding-in issue.
Hi, yes, same issue for me. I did raise a support request and got a response mentioning temporary tables, caching, and system requirements.
I work for a software company, so the spec should be quite adequate for anything like this.
I'm trying to join two tables, but because they come from a database and are quite large, the caching of temporary tables in Tableau is causing high memory usage.
I think we need a way to filter the result set by a value, do your joins, and then remove the filter at the end, exactly as you would when writing and testing custom SQL.
I did try this but found that I couldn't go back and remove the filters. I think the use case for this software is writing custom SQL and then adding cleaning steps, but that's about it if you have a large data set or database.