Sounds like you probably need to UNION them (ie not join). When you do this, Tableau will create another column (called SheetName from memory, but I might be wrong) and that will let you filter to one file, multiple files or all files.
But if I union the files I am going to double counting some loans.
I am looking at Loans in June, July and August.
I want to be able to fully compare the months without double counting any data. Some loans will be in June, July and August, some only in June, some only July, etc.. If I union the 3 files i will double counting some loans when comparing everything. Is there a way not to do that?
Appreciate all the help.
That all depends on your data structure and at the moment we're (sorta) guessing a solution based on the information that you provided.
Are you using Tableau 2018.x ? Then you might be able to use Tableau Prep in order to help the situation and shape of the data.
If you're using an older version of Tableau, consider upgrading or providing some DUMMY data along with requirements of what you want to achieve and someone will be able to help you
Thanks Chris. I am currently using Tableau 10.5, which is the offered by company.
So, I have attached 3 dummy files (June, September and December). Keep in mind that these are very very small compared to what i plan to use.
I am trying to compare the state of the portfolio between June, September and December without double counting anything. For example some loans are in the 3 files but their characteristics change over time. Some loans are only on a few filles.
Example of a few comparisons (trying to make it in the same chart)
- number of loans delinquent in each period
- number of loans past due in each period
- avg balance outstanding in each period
- avg APR in each period
- number of loans in each credit category
Just to name a few. But like I said the main thing is having the ability to use the data within double counting and adding different data sources within same worksheet.
Thanks for all the help
Miguel, could you give an example of what the 'periods' are that you may look at? A single month, multiple months, a whole year..
I've attached a file with all three periods as sheets - you don't need to do this to union them, but it makes it easier for us here.
June Data.xlsx 10.2 KB
A period is going to be each month. I cannot add the files in each sheet of excel because they are too big for excel. I just use dummy data. My actual files are in txt.
So i need to be able to add 3 or more files and analyze them completely separate.
Thanks for the help.
So, the fact they are three large text files is no issue - you just use the wildcard union capability (Union Your Data). If there's no date identifier in there, Tableau adds a column anyway with the filename - this is what I've used in the example below.
There's nothing complicated in here - I'm using countd to count the loans - if there's truly only one row per loan per file, then even that isn't necessary if you break out the data by the time periods defined by the files.
If you wanted different time periods, or there are multiple rows per loan per file, that gets more complex (and we'd need to know more - e.g. what defines delinquent if it was only delinquent for part of the time period)
I've also assumed 1 to 30 days as past due, >30 for delinquent.
loan data.twbx 17.1 KB
Thanks Alex. I could not open your table sheet for some reason. Maybe if I see your Tableau that will help!
Each loan is a row all the time. But the problem is that multiple loans will be there in each file and should be analyze differently. So if union the files i am double counting and cannot analyze simultaneously. For example, if a loan is past due at June but current in September.
Appreciate all your help!
I tried this one and still give me the error about created in a newer version.. weird!
Alex - thanks so much. This has helped. I see what you did here.
I found another way using what you did without adding calculated fields as well.
Wonder how quick this work if add about 10M lines.. any idea? Each of my files has about 1.6M lines and a lot of fields. I am going to use Alteryx to skim the data but was curious how Tableau would react with so much data
Miguel Cortez wrote:
I am going to use Alteryx to skim the data but was curious how Tableau would react with so much data
Why didn't you say that in the beginning ?
Alteryx can do the union and everything else. Tableau will easily handle 10M rows, but if you can reduce the number it will just make Tableau faster