2 Replies Latest reply on Jul 18, 2017 11:00 AM by patrick.byrne.0

    Tableau Server & Hadoop (Hive) Connection Strings (in a multi-tenanted and HA environment!)


      Hi All,


      I'm looking to connect Tableau Server to Hive. The challenge is the setup of the target system: from what I have read, I am unsure how to meet the requirements below when integrating with Tableau Server.


      The environment setup is as follows:


      (1) Hive runs in High Availability (2 Hive Server instances in one cluster). ZooKeeper discovery picks the preferred server running a Hive service instance (this can change during failures or planned downtime).

      Tableau Server should integrate with this setup.


      (2) Hive is used by multiple separate business units, each with different YARN queues that guarantee a percentage of cluster resources to that business unit.

      Tableau Server should obey these conditions.


      (3) Access to data for all Tableau Server connections should be restricted via Ranger policies specific to each business unit's service account (Kerberos); this prevents one business unit from accessing another business unit's data.

      Tableau Server should respect these security controls.


      The native Tableau Server connection GUI is limiting.

      To address (1) and (2), is it possible to create a custom connection string that caters for multiple Hive service instances and different YARN queues, like:  odbc:hive2://hiveserverinstance1:port,hiveserverinstance2:port,hiveserverinstance3:port/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2?tez.queue.name=<queue_name>
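
      To make the shape of that string concrete, here is a minimal Python sketch that assembles it from its parts. It follows the Hive JDBC URL layout (host list, then `;`-separated session variables, then `?`-separated Hive configuration), in which the host list under ZooKeeper discovery names the ZooKeeper ensemble rather than the Hive Server instances themselves. The function name `build_hive_url`, the host names and the queue name are assumptions for illustration only, and whether Tableau Server's connector accepts an equivalent string is exactly the open question.

```python
def build_hive_url(zk_hosts, namespace="hiveserver2", queue=None, scheme="jdbc:hive2"):
    """Assemble a Hive URL that relies on ZooKeeper service discovery.

    zk_hosts  -- list of (host, port) pairs for the ZooKeeper ensemble (hypothetical values)
    namespace -- znode namespace under which the Hive Server instances register
    queue     -- optional Tez/YARN queue name passed as Hive configuration
    """
    if not zk_hosts:
        raise ValueError("at least one host:port entry is required")

    # Comma-separated host list, e.g. "zk1:2181,zk2:2181"
    authority = ",".join(f"{host}:{port}" for host, port in zk_hosts)

    # Session variables follow the path and are separated by ';'
    session = f"serviceDiscoveryMode=zooKeeper;zooKeeperNamespace={namespace}"
    url = f"{scheme}://{authority}/;{session}"

    # Hive configuration properties follow a '?'
    if queue:
        url += f"?tez.queue.name={queue}"
    return url


if __name__ == "__main__":
    # Hypothetical ensemble and queue, for illustration only
    print(build_hive_url([("zk1", 2181), ("zk2", 2181), ("zk3", 2181)], queue="bu_finance"))
```

      Keeping the queue as a separate argument means one ensemble definition could be reused for every business unit, with only the queue changing per connection.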


      Is it possible to set up multiple service accounts for the same Tableau Site? This would allow multiple different connection strings and, importantly, the ability to restrict access to cluster data per service account via Apache Ranger, thus addressing (3) and ensuring a business unit cannot see another unit's data.
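
      As a rough sketch of the layout I have in mind, the Python below pairs each business unit with its own Kerberos service account and YARN queue and derives one connection entry per unit. All names here (`BusinessUnit`, `connection_plan`, the principals, queues and ZooKeeper hosts) are hypothetical, and nothing in it is a Tableau Server API; it only models the mapping that Tableau would need to support.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusinessUnit:
    name: str        # hypothetical business unit label
    principal: str   # hypothetical Kerberos service account for that unit
    queue: str       # hypothetical YARN queue reserved for that unit


def connection_plan(units, zk_quorum, namespace="hiveserver2"):
    """Return one connection entry per business unit, keyed by unit name.

    Raises ValueError if two units share a principal, because per-unit
    Ranger policies only isolate data when each unit has its own account.
    """
    seen_principals = set()
    plan = {}
    for unit in units:
        if unit.principal in seen_principals:
            raise ValueError(f"principal reused across units: {unit.principal}")
        seen_principals.add(unit.principal)
        plan[unit.name] = {
            "principal": unit.principal,
            "url": (
                f"jdbc:hive2://{zk_quorum}/;serviceDiscoveryMode=zooKeeper;"
                f"zooKeeperNamespace={namespace}?tez.queue.name={unit.queue}"
            ),
        }
    return plan


if __name__ == "__main__":
    units = [
        BusinessUnit("finance", "svc_finance@EXAMPLE.COM", "finance"),
        BusinessUnit("marketing", "svc_marketing@EXAMPLE.COM", "marketing"),
    ]
    for name, entry in connection_plan(units, "zk1:2181,zk2:2181,zk3:2181").items():
        print(name, entry["principal"], entry["url"])
```

      Each entry would correspond to one data connection on the Tableau Site, authenticated as that unit's service account so that Ranger can apply the matching policy.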



      Probably not the easiest question!