There are several documents and trainings available related to Tableau Server and scalability, and the InterWorks series "8 tips for deploying Tableau at scale" (Tableau Software) is helpful as well.
I will move this thread to our Server Administration area where other Tableau Server Administrators often reply to threads of this sort.
I hope this helps.
Hey There -
- what would be the best way to manage this data? The database would be Hadoop, with an Impala connector.
In a perfect world, you will partition your Impala tables and do all the performance-tuning work necessary to allow Impala to answer questions quickly. In other words, Tableau data extracts aren't a replacement for solid administration on the Hadoop side of the equation.
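To illustrate the sort of Impala-side tuning meant here, below is a toy sketch of partition pruning in Python. The table layout, names, and numbers are all invented for illustration; in Impala you would declare the table with PARTITIONED BY and let the query planner skip irrelevant partitions.

```python
# Toy model of partition pruning: data is stored per month, so a query
# that filters on one month scans only that month's rows, not the table.
partitions = {
    "2016-01": [("east", 10.0), ("west", 20.0)],
    "2016-02": [("east", 5.0), ("east", 7.5)],
}

def query_month_total(partitions, month):
    """Scan only the requested partition, ignoring all the others."""
    return sum(amount for _region, amount in partitions.get(month, []))

query_month_total(partitions, "2016-02")  # scans 2 rows instead of 4
```

The same pruning idea is why partitioning by a column your vizzes filter on (often a date) is usually the first tuning step.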
- Is it okay to just do a daily incremental refresh into Tableau? If so, how to handle cases where historical data gets updated? (might happen a handful of times in the year)
You probably don't want to simply copy "all rows" from Impala into a TDE. Focus instead on making Impala generally performant, and then use TDEs as "spot solutions" that aggregate lots of leaf-level data into relatively small result sets that can be leveraged by vizzes showing trends and highly summarized answers. Impala is your acceleration layer on Hadoop, not Tableau. If you go this route, you won't be collecting lots of data in your TDEs anyway, so incremental vs. full refresh becomes a non-issue.
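A minimal sketch of that "aggregate first, then extract" idea, using made-up rows and a stand-in for the aggregation; in practice the GROUP BY would run in Impala and only the small summary would land in the TDE:

```python
from collections import defaultdict

# Hypothetical leaf-level rows: (date, region, amount). Imagine millions
# of these living in Hadoop/Impala rather than four literals.
rows = [
    ("2016-01-05", "east", 10.0),
    ("2016-01-05", "west", 20.0),
    ("2016-02-10", "east", 5.0),
    ("2016-02-11", "east", 7.5),
]

def summarize_by_month_region(rows):
    """Roll leaf rows up to one total per (month, region) pair."""
    totals = defaultdict(float)
    for date, region, amount in rows:
        month = date[:7]  # 'YYYY-MM'
        totals[(month, region)] += amount
    return dict(totals)

summary = summarize_by_month_region(rows)
# Only this small summary -- not the leaf data -- would be extracted.
```

The resulting summary has one row per month/region, which is the kind of small, viz-ready result set a TDE handles well.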
Incremental extracts will not pick up changes to rows the extract has already ingested.
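To see why incremental refresh misses updates, here is a toy model of the mechanism: an incremental refresh appends only rows whose increment key exceeds the maximum already in the extract. The function name and row shape are illustrative, not Tableau internals.

```python
def incremental_refresh(extract, source, key=lambda r: r["id"]):
    """Append source rows whose key exceeds the extract's current max."""
    seen = max((key(r) for r in extract), default=0)
    extract.extend(r for r in source if key(r) > seen)
    return extract

extract = [{"id": 1, "value": "a"}, {"id": 2, "value": "b"}]

# Upstream, a historical row changes: id 1 now carries value 'z'.
source = [{"id": 1, "value": "z"},
          {"id": 2, "value": "b"},
          {"id": 3, "value": "c"}]

incremental_refresh(extract, source)
# The new id 3 is appended, but the extract still holds the stale
# value 'a' for id 1 -- only a full refresh would pick up the change.
```

This is why the handful of historical corrections per year mentioned above would require a full refresh (or a rebuild of the affected slice) rather than the daily incremental.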
- would using another tool in front of Tableau, like Alteryx, help?
No. Do your work in the data system, make sure the data system (Impala) is fast, and then all is well.
- is there any way we can segregate a data source into a moving 1-month window that always FULL refreshes, plus another that INCREMENTALLY refreshes daily, and UNION them?
Still sounds like you're thinking about taking an "extract everything" approach, which really isn't appropriate here.