If I left join the book1 with prod_info across prod and product....it cuts off the dates it joins to only october and not oct-dec.
With left join I assume you mean data blending where book1 is primary and prod_info is secondary data source. In the attached I did join in this way by first renaming prod_info prod to product and thereafter joined the "tables" on that field.
Ps. I think the maximum number of CSV tables in a join is 6. Have you tried to join all 5 as one combined source? It should be possible to INNER JOIN the first 4 and LEFT JOIN those with the 5th as custom sql and possibly also in Tableau's interface (I haven't tried the latter).
The link button is not what I need. I need it to JOIN as ONE FILE. Join as in from the "join tab" as shown below. JOIN not link.
an inner join doesn't make sense, I don't think. The three separate csv "book" files are all separate important rows of information. they are added to make one source of data. each books appends onto the first.
once all three books are appended, I want to JOIN the prod_info csv to that file. I want each row in the book csvs to have data appended onto it from the product info csv.
each books appends onto the first
Is it correctly understood if Book1 has 100 rows, Book2 has 100 rows, and Book3 has 100 rows, you first generate a table with 300 rows? And afterwards you want to left join these with prod_info?
correct. is that not what i've done in the example workbook?
Your explanation is clear enough. The reason I asked some clarifying questions is because I could not see what was done by studying the workbook itself. I see only an extract from Book1 and another extract for prod_info.
As for the issue at hand, it is to my knowledge not possible to join extracts. If it is possible for you to share a masked sample of the 5 csv files, I could show how both appending and left joining can be done with custom sql.
Ps. You might like to vote up some of the ideas that request the ability to join extracts:
wow what a bummer! I will vote them up. In the meantime, here's my cvs files. please show me how to append the book csvs and then left join the book csv (with all book files) with the product info. I would really appreciate it!
Desktop.zip 2.7 KB
Here is a custom sql that works on my computer. I had some localization issues, which were solved by creating a schema.ini file in the same folder.
SELECT d.date, d.store, d.qty, d.product, p.height, p.length, p.width, p.[case qty] FROM ( SELECT [date], [store], [qty], [product] FROM [Book1#csv] UNION ALL SELECT [date], [store], [qty], [product] FROM [Book2#csv] UNION ALL SELECT [date], [store], [qty], [product] FROM [Book3#csv] ) d LEFT JOIN [prod_info#csv] p ON d.product = p.prod
schema.ini.zip 326 bytes
1 of 1 people found this helpful
I don't know SQL really, but I know R so I decided to use R to do this as well and post the code here in case someone else needs it.
(there wasn't a syntax highlighting for R, bummer)
data1 <-read.csv("Book1.csv", as.is=TRUE)
data2 <-read.csv("Book2.csv", as.is=TRUE)
data3 <-read.csv("Book3.csv", as.is=TRUE)
info <-read.csv("prod_info.csv", as.is=TRUE)
#not necessary for all data. just removed NA's
#good to check the data you read in
#now bind the data, combine it into 1 file
comp_data <- rbind(data1,data2,data3)
#join the prod info with the combined data
#new name for product column, to match name in comp_data
names(info) <- "product"
#check that it changed
final_data <- merge(comp_data,info,by=("product"))
#write the final csv file
write.csv(final_data, file="final_data.csv", sep=",",col.names=TRUE)
I was thinking to advice you to automatically combine the 3 files to one with scripting, but see you already did that. Seeing that you used R is a big surprise. I didn't know that R is an ETL tool
Thanks for the correct answer, especially in the light that your answer in this context is the better solution being the easier to re-use and automate.