-
1. Re: Analysis of Time Series - i have multiple sets of the data that have all the same fields and contain duplicate loans in this case. I want to analyze the data in the same sheet without joining any of the different files.
Chris McClellanJul 3, 2018 5:33 PM (in response to Miguel Cortez)
Sounds like you probably need to UNION them (ie not join). When you do this, Tableau will create another column (called SheetName from memory, but I might be wrong) and that will let you filter to one file, multiple files or all files.
-
2. Re: Analysis of Time Series - i have multiple sets of the data that have all the same fields and contain duplicate loans in this case. I want to analyze the data in the same sheet without joining any of the different files.
Alex Kerin Jul 3, 2018 5:39 PM (in response to Chris McClellan)I think you're right: Union Your Data
-
3. Re: Analysis of Time Series - i have multiple sets of the data that have all the same fields and contain duplicate loans in this case. I want to analyze the data in the same sheet without joining any of the different files.
Miguel Cortez Jul 3, 2018 10:26 PM (in response to Alex Kerin)But if I union the files I am going to double counting some loans.
For example:
I am looking at Loans in June, July and August.
I want to be able to fully compare the months without double counting any data. Some loans will be in June, July and August, some only in June, some only July, etc.. If I union the 3 files i will double counting some loans when comparing everything. Is there a way not to do that?
Appreciate all the help.
-
4. Re: Analysis of Time Series - i have multiple sets of the data that have all the same fields and contain duplicate loans in this case. I want to analyze the data in the same sheet without joining any of the different files.
Chris McClellanJul 3, 2018 10:31 PM (in response to Miguel Cortez)
That all depends on your data structure and at the moment we're (sorta) guessing a solution based on the information that you provided.
Are you using Tableau 2018.x ? Then you might be able to use Tableau Prep in order to help the situation and shape of the data.
If you're using an older version of Tableau, consider upgrading or providing some DUMMY data along with requirements of what you want to achieve and someone will be able to help you
-
5. Re: Analysis of Time Series - i have multiple sets of the data that have all the same fields and contain duplicate loans in this case. I want to analyze the data in the same sheet without joining any of the different files.
Miguel Cortez Jul 4, 2018 9:18 PM (in response to Chris McClellan)Thanks Chris. I am currently using Tableau 10.5, which is the offered by company.
So, I have attached 3 dummy files (June, September and December). Keep in mind that these are very very small compared to what i plan to use.
I am trying to compare the state of the portfolio between June, September and December without double counting anything. For example some loans are in the 3 files but their characteristics change over time. Some loans are only on a few filles.
Example of a few comparisons (trying to make it in the same chart)
- number of loans delinquent in each period
- number of loans past due in each period
- avg balance outstanding in each period
- avg APR in each period
- number of loans in each credit category
Just to name a few. But like I said the main thing is having the ability to use the data within double counting and adding different data sources within same worksheet.
Thanks for all the help
-
June Data.xlsx 8.3 KB
-
December Data.xlsx 8.4 KB
-
September Data.xlsx 8.3 KB
-
-
6. Re: Analysis of Time Series - i have multiple sets of the data that have all the same fields and contain duplicate loans in this case. I want to analyze the data in the same sheet without joining any of the different files.
Alex Kerin Jul 9, 2018 7:03 AM (in response to Miguel Cortez)Miguel, could you give an example of what the 'periods' are that you may look at? A single month, multiple months, a whole year..
I've attached a file with all three periods as sheets - you don't need to do this to union them, but it makes it easier for us here.
-
June Data.xlsx 10.2 KB
-
-
7. Re: Analysis of Time Series - i have multiple sets of the data that have all the same fields and contain duplicate loans in this case. I want to analyze the data in the same sheet without joining any of the different files.
Miguel Cortez Jul 9, 2018 7:28 AM (in response to Alex Kerin)A period is going to be each month. I cannot add the files in each sheet of excel because they are too big for excel. I just use dummy data. My actual files are in txt.
So i need to be able to add 3 or more files and analyze them completely separate.
Thanks for the help.
-
8. Re: Analysis of Time Series - i have multiple sets of the data that have all the same fields and contain duplicate loans in this case. I want to analyze the data in the same sheet without joining any of the different files.
Alex Kerin Jul 9, 2018 9:09 AM (in response to Miguel Cortez)So, the fact they are three large text files is no issue - you just use the wildcard union capability (Union Your Data). If there's no date identifier in there, Tableau adds a column anyway with the filename - this is what I've used in the example below.
There's nothing complicated in here - I'm using countd to count the loans - if there's truly only one row per loan per file, then even that isn't necessary if you break out the data by the time periods defined by the files.
If you wanted different time periods, or there are multiple rows per loan per file, that gets more complex (and we'd need to know more - e.g. what defines delinquent if it was only delinquent for part of the time period)
I've also assumed 1 to 30 days as past due, >30 for delinquent.
-
loan data.twbx 17.1 KB
-
-
9. Re: Analysis of Time Series - i have multiple sets of the data that have all the same fields and contain duplicate loans in this case. I want to analyze the data in the same sheet without joining any of the different files.
Miguel Cortez Jul 9, 2018 4:04 PM (in response to Alex Kerin)Thanks Alex. I could not open your table sheet for some reason. Maybe if I see your Tableau that will help!
Each loan is a row all the time. But the problem is that multiple loans will be there in each file and should be analyze differently. So if union the files i am double counting and cannot analyze simultaneously. For example, if a loan is past due at June but current in September.
Appreciate all your help!
-
10. Re: Analysis of Time Series - i have multiple sets of the data that have all the same fields and contain duplicate loans in this case. I want to analyze the data in the same sheet without joining any of the different files.
Alex Kerin Jul 9, 2018 4:21 PM (in response to Miguel Cortez)Ah yes, I'm on 2018.1. The multiple rows is not an issue as you're splitting by the time period created by each file. This version should work:
-
loan data_v10.5.twbx 18.1 KB
-
-
11. Re: Analysis of Time Series - i have multiple sets of the data that have all the same fields and contain duplicate loans in this case. I want to analyze the data in the same sheet without joining any of the different files.
Miguel Cortez Jul 9, 2018 7:00 PM (in response to Alex Kerin)I tried this one and still give me the error about created in a newer version.. weird!
-
12. Re: Analysis of Time Series - i have multiple sets of the data that have all the same fields and contain duplicate loans in this case. I want to analyze the data in the same sheet without joining any of the different files.
Alex Kerin Jul 9, 2018 7:25 PM (in response to Miguel Cortez)Alrighty, 10.2 it is then...
-
loan data_v10.2.twbx 18.1 KB
-
-
13. Re: Analysis of Time Series - i have multiple sets of the data that have all the same fields and contain duplicate loans in this case. I want to analyze the data in the same sheet without joining any of the different files.
Miguel Cortez Jul 9, 2018 11:01 PM (in response to Alex Kerin)Alex - thanks so much. This has helped. I see what you did here.
I found another way using what you did without adding calculated fields as well.
Wonder how quick this work if add about 10M lines.. any idea? Each of my files has about 1.6M lines and a lot of fields. I am going to use Alteryx to skim the data but was curious how Tableau would react with so much data
-
14. Re: Analysis of Time Series - i have multiple sets of the data that have all the same fields and contain duplicate loans in this case. I want to analyze the data in the same sheet without joining any of the different files.
Chris McClellanJul 10, 2018 2:28 AM (in response to Miguel Cortez)
Miguel Cortez wrote:
I am going to use Alteryx to skim the data but was curious how Tableau would react with so much data
Why didn't you say that in the beginning ?
Alteryx can do the union and everything else. Tableau will easily handle 10M rows, but if you can reduce the number it will just make Tableau faster