I have a client that wants to use Tableau with their EMR (Spark) cluster. I have used Tableau for other purposes, but never with this setup. The documentation seems straightforward, but I'm getting errors when I try to connect.
Here is the setup:
1. The EMR cluster's master doesn't have a public IP, but from the Tableau Desktop EC2 instance I am able to ping it and telnet to port 10001, where Thrift is running.
2. I am able to test Thrift with Beeline, and it connects fine.
3. I am not using SSL or authentication, given the limited access the cluster has.
4. I have installed both the DataDirect 8.0 and Simba ODBC drivers, and I'm using
Hadoop distribution: Amazon 2.8.3
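For anyone who wants to repeat the reachability check from item 1 without telnet, here is a minimal Python sketch using only the standard library. The function name `port_is_open` is just something I made up for illustration, and it only confirms that a TCP connection can be opened; it says nothing about whether the Thrift handshake, SSL, or authentication settings match.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        # create_connection resolves the host and attempts a TCP connect
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts
        return False
```

Run it from the Tableau EC2 instance with the master's private IP and the Thrift port, for example `port_is_open("<master-private-ip>", 10001)`.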
The error is:
"Unable to connect to the ODBC Data Source. Check that the necessary drivers are installed and that the connection properties are valid.
[Simba][ThriftExtension] (5) Error occurred while contacting server: No more data to read.. This could be because you are trying to establish a non-SSL connection to an SSL-enabled server.
Unable to connect to the server "IP". Check that the server is running and that you have access privileges to the requested database."
I simply followed the documentation provided by Tableau, which says to install the driver only (without touching the ODBC configuration) and then use it in Tableau. I verified that SSL and authentication were both disabled before trying to connect. I also verified connectivity by running DataGrip on the Tableau EC2 instance and executing a query, which works as expected.
I resolved the issue by ignoring the documentation and setting up the ODBC driver directly, then choosing it as the data source in Tableau instead of Spark SQL.
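If you want to sanity-check the ODBC setup outside Tableau first, something like the sketch below may help. It assumes you configured the driver as a named DSN and that the third-party `pyodbc` package is installed; the DSN name `EMR_Spark` and both function names are hypothetical, so substitute your own.

```python
def build_dsn_connection_string(dsn_name: str) -> str:
    """Build a minimal ODBC connection string that refers to a configured DSN."""
    return f"DSN={dsn_name}"


def list_tables(dsn_name: str):
    """Connect through the DSN and return the rows from SHOW TABLES."""
    import pyodbc  # third-party package, imported lazily (assumed installed)

    # autocommit=True because Spark/Hive ODBC drivers generally lack transactions
    conn = pyodbc.connect(build_dsn_connection_string(dsn_name), autocommit=True)
    try:
        cursor = conn.cursor()
        cursor.execute("SHOW TABLES")
        return cursor.fetchall()
    finally:
        conn.close()
```

Calling `list_tables("EMR_Spark")` should return rows if the DSN works; if it fails with the same Simba error, the problem is in the driver settings rather than in Tableau.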