We are still attempting to connect to the data via Redshift, but we might have to use the API instead.
We'll update this post when we get connected to our data.
Did either of you find a solution to this? We'd love to know your process. Thanks in advance!
I'm using the Canvas Data CLI tool which has saved me hours of development work trying to go through the API route. I'm happy to give you any help that I can.
That would be helpful.
How did you install the CLI tool (is it the one available on GitHub?), and does it download the data into files or into a database?
The CLI tool is the one available on GitHub. You have to install Node.js too; the installation instructions are on GitHub. Once you have Node.js installed, you use the node package manager (npm) to install the tool. It's pretty easy if you are comfortable running things from the command prompt. The CLI downloads the files to a directory and then unpacks them. We are on Oracle, so I use a combination of tools to load the files into the database: Oracle has SQL*Loader, and SQL Server has a similar bulk-load utility called bcp. Some data doesn't play nicely with the upload tools, so I wrote my own script for a few of the files.
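For anyone following along, the setup described above boils down to `npm install -g canvas-data-cli` and then running the tool's sync command against a small config file (per the project's GitHub README; the command name there is `canvasDataCli`). A sketch of that config, with placeholder paths and credentials to replace with your own:

```javascript
// config.js -- sample Canvas Data CLI configuration; every value here is a
// placeholder to swap for your own paths and Canvas Data portal credentials.
module.exports = {
  saveLocation: "./dataFiles",       // where the downloaded .gz dumps land
  unpackLocation: "./unpackedFiles", // where the unpacked delimited files go
  apiUrl: "https://portal.inshosteddata.com/api",
  key: process.env.CD_API_KEY,       // API key from the Canvas Data portal
  secret: process.env.CD_API_SECRET  // API secret from the portal
};
```

Then `canvasDataCli sync -c ./config.js` downloads the flat files and keeps the local copy current; check the README on GitHub for the exact commands in the version you install.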
I would recommend loading the files into a database instead of pointing Tableau at the raw text files. With a database, you get a very significant performance improvement through indexing.
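To make the indexing point concrete, here is a minimal sketch using SQLite as a stand-in for Oracle or SQL Server (the table and index names are made up for illustration): an index on the filter column lets the engine do an index lookup instead of scanning every row, which is exactly what a flat text file can never give you.

```python
# Why a database beats flat files for Tableau: index lookups vs. full scans.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE requests (id INTEGER, user_id INTEGER, url TEXT)")
conn.executemany(
    "INSERT INTO requests VALUES (?, ?, ?)",
    [(i, i % 100, "/courses/%d" % (i % 50)) for i in range(10_000)],
)
# The index is the part a text file cannot provide
conn.execute("CREATE INDEX idx_requests_user ON requests (user_id)")

plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM requests WHERE user_id = 7"
).fetchone()
print(plan[-1])  # the plan detail names idx_requests_user instead of a scan
```

The same principle applies in Oracle: index the columns your dashboards filter and join on.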
This is very nice! Thank you for your input, Sam. I have four follow-up questions.
1. Right now I am relying on an external team and their Java code to extract these files, and I have my own ETL processes to load the extracted delimited files into an Oracle DB. I keep a backup of those files for 7 days and purge them after that. But I always wonder whether we ever need backups of the extracts at all, since each day we get the entire data set again.
2. I am still not sure how to deal with the requests file. Each day the incremental data keeps growing, and the ETL load time grows with it. We may not have an appropriate use case for that data right now, but I suspect its importance will increase down the line.
3. Some column values (especially where users can key in paragraphs) are LONG datatypes, and Oracle's limitations on LONG prevent loading all of the data. So I either have to trim to 4000 characters or ignore those columns. Do you have any suggestions for such columns?
4. Is it possible to share your table definitions (especially indexes)? I still need to do data profiling and come up with my own strategy.
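On question 3, one detail worth knowing: the standard Oracle VARCHAR2 limit is 4000 bytes, not characters, so trimming to 4000 characters can still overflow on multi-byte text. Moving those columns to CLOB avoids the limit entirely; if you do keep trimming, a byte-safe trim might look like this (a sketch, not anyone's production code; the function name is made up):

```python
def truncate_to_bytes(text: str, max_bytes: int = 4000, encoding: str = "utf-8") -> str:
    """Trim text so its encoded form fits a VARCHAR2(4000 BYTE) column
    without splitting a multi-byte character mid-sequence."""
    encoded = text.encode(encoding)
    if len(encoded) <= max_bytes:
        return text
    # errors="ignore" drops any partial character left at the cut point
    return encoded[:max_bytes].decode(encoding, errors="ignore")
```

Trimming by `text[:4000]` instead would pass Python happily and then fail at load time whenever the slice encodes to more than 4000 bytes.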
Eventually I want to bypass the Java-based API code, so your input on the CLI will be very helpful. Thanks again!
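On question 2 above, one common pattern for keeping the requests load time flat is to load only the increments you have not seen yet, tracking processed file names in a small state file, so the nightly ETL handles one day's traffic instead of the whole history. A sketch under assumptions (the `requests-*.gz` naming and the JSON state file are hypothetical; adjust to how your dumps actually arrive):

```python
import json
from pathlib import Path

def new_request_files(dump_dir: str, state_file: str = "loaded_requests.json") -> list:
    """Return the requests files in dump_dir that have not been loaded yet."""
    state = Path(dump_dir) / state_file
    loaded = set(json.loads(state.read_text())) if state.exists() else set()
    return [p for p in sorted(Path(dump_dir).glob("requests-*.gz"))
            if p.name not in loaded]

def mark_loaded(dump_dir: str, files, state_file: str = "loaded_requests.json") -> None:
    """Record file names after a successful database load, so reruns skip them."""
    state = Path(dump_dir) / state_file
    loaded = set(json.loads(state.read_text())) if state.exists() else set()
    loaded.update(p.name for p in files)
    state.write_text(json.dumps(sorted(loaded)))
```

Call `new_request_files()` at the start of the nightly job, load only those files, and `mark_loaded()` once the database commit succeeds; a crash before the commit simply leaves the files pending for the next run.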