2 of 2 people found this helpful
Update 2013-01-24: I don't recommend these steps any more, as I've come to understand that the PostgreSQL driver is the one that Amazon expects for connecting to RedShift.
Amazon Redshift appears to be powered by ParAccel, though Amazon suggests using the standard PostgreSQL ODBC driver. To some extent this is reasonable since ParAccel used the original PostgreSQL front end for handling SQL syntax and client/server communication (and several other database companies have done the same). However the first-class Tableau connector to ParAccel is specifically tuned for ParAccel connections and specialized syntax that are not present in PostgreSQL. Likewise, the first-class Tableau connector to PostgreSQL is tuned to PostgreSQL, and may not work for you in all cases since Amazon Redshift does not appear 100% compatible with PostgreSQL syntax. And of course the general-purpose ODBC connector cannot provide a first-class experience, and can only support the SQL and ODBC standards (which few database vendors follow very closely).
I suggest experimenting with the ParAccel connector in Tableau if possible, though I do not know if this will work or has ever been tested (note: you may need to experiment with the default port). In the meantime, we'll consider Amazon Redshift along with many other opportunities we are exploring to help connect our users to their data. I cannot give you a timeframe or discuss our roadmap, but do consider discussing this with your Tableau sales rep if you have a strong business need to use Amazon Redshift.
I hope this helps,
I've not had much luck with the paraccel odbc driver either. I have had success with dbvisualizer connecting to redshift over a basic postgres connection, yet tableau won't connect when I use the postgres connector to connect to redshift.
I tried the ParAccel driver, but get an error that Tableau is "unable to connect to the server. Check that the server is running and that you have access privileges to the requested database".
I also tried the Postgres driver with the port changed from 5432 to 5439. I get a Database error 0x80004005: Unspecified error, error while executing the query. Unable to connect to the server. Check that the server is running and that you have access privileges to the requested database.
I'm reaching out to Amazon to get their advice as well. Thanks.
This problem is not unique to Tableau. I cannot connect to a test instance of Redshift using any tool.
It turns out that indeed the PostgreSQL driver is the only appropriate choice. However I cannot get that connection to work even outside of Tableau using other tools. I can't connect using the JDBC driver and their recommended third-party SQL tool. I suggest that you contact Amazon and try to determine if they have some sort of service or networking outage that would cause these connection failures.
My colleague has been able to connect Tableau with Redshift by using the ParAccel connection (the latest ParAccel driver was installed from the drivers section http://www.tableausoftware.com/support/drivers).
However we have tried to do the same connection on other PCs and most of the time it don't work. We get the error :
" The connection to the data source might have been lost.
Communication with the Tableau Protocol Server process was lost.
Unable to connect to the server "myserver.us-east-1.redshift.amazonaws.com". Check that the server is running and that you have access privileges to the requested database.
Unable to connect to the server. Check that the server is running and that you have access privileges to the requested database."
Note: myserver is not the real address of the server.
I have to mention that it always works if I try to connect using SQLWorkbench and the PostgreSQL driver suggested by Amazon (http://jdbc.postgresql.org/download/postgresql-8.4-703.jdbc4.jar),
A reliable way to connect between Tableau and Redshift would be greatly appreciated on our side.
I have been able to connect to redshift by installing the odbc driver for postgres 8.4.
Download and install psqlodbc_08_04_0200.zip
Create a new ODBC System DSN by running C:\Windows\SysWOW64\odbc32.exe
Select the postgres driver you just installed
Enter in Redshift server information and credentials
Open up Tableau and use the ODBC connector on the bottom left.
Select the ODBC connection you created above.
- If you setup the test tables in the Redshift, it should work great with tableau. In my case we have already imported a decent amount of data into a table on Redshift. When I connect to my redshift table using Tableau, the initial connection can take up to 30 minutes to execute, but my computer normally crashes before it completes.
- To get around this, I have that found it is best to run a custom SQL query which limits the records returned to 10,000 or so records. Obviously this is not ideal for working with large datasets, but at least I can work with Tableau and Redshift in a limited manner.
Thanks Matt. I found that using the default ODBC connection from Tableau and referencing the System DSN I set up for Redshift using the Postgres driver works. I also noticed the same problem with the initial connection taking a really long time.
The solution I've been using is to run the following SQL in my query tool
select pid, query
where status = 'Running';
That will tell me what queries are running on Redshift. Then I can find the PID of the one that is coming from Tableau and kill it using this command.
cancel 1234 -- where 1234 is the PID of the query to kill
Then my tableau workbook becomes immediately responsive again and runs fine from that point forward.
ODBC connection is the way that helped me connect Redshift to Tableau. Along with issues of slow response time, few functionalities related to tableau inbuilt functions like ABS are lost.
Tableau extracts is the way to work around above mentioned issues but this has it's own set of restrictions too.
Thanks Shawn and Anusha
Here are a few more details on my experience:
OK, here are the specifics that I see when creating a new connection to our 900,000,000+ row table named actions in redshift using Tableau’s single table option:
- Image 1.png shows my Tableau settings for this connection.
- Image 2.png shows the running “select * from actions” command on our redshift instance which is instantly called when I click OK on image 1.png
- Select * from actions is not viable due to the table’s size – 900MM+ records.
- I can cancel the query (cancel pid) and that releases Tableau and allows me to create/edit my Tableau worksheet.
- Image 3.png show a Tableau worksheet with a filtered date range. I would expect tableau to send a query reflecting that filtered date range. Instead I continue to see “select * from actions”.
The overall summary is that I can’t get Tableau to stop using the “select * from actions” query. This problem is most obvious when using large tables.
If I query a smaller table, I do see the same “select * from users” but since the users table is much smaller at 50K rows so it is easier to work with this data.
Thanks Mark for the details.
Just curious on what ETL tool you used to load redshift with 900 millions. Howz the performance of load?
We just exported out our data into csv format and uploaded into S3 in 700MB chucks. We then brought those chunks into Redshift. We had a developer devote most of a week scripting this out
I've tried using the default ParAccel connection, and get the following error
"The drivers necessary to connect to this database server are not properly installed. Visit http://www.tableausoftware.com/drivers to download driver setup files."
Unfortunately, there is no link to the latest ParAccel drivers, just text saying "Contact your ParAccel Account team". Since I'm not directly a ParAccel customer, this isn't much help. Are the drivers available elsewhere online?
I've updated my earlier reply -- we should actually be using the PostgreSQL driver / connector, not ParAccel.