Is there a different way I should present this issue that would make more sense, and easier to get some hints or ideas?
I think I understand what you are trying to get to...So I think to do this you need to also create an LoD on the MAX date in the data, so each 'ticket' knows how long it's been open (assuming it doesn't have a corresponding close date)...so we may need this and a couple of IF statements to create the 'number of days open' dimension.
One question....As you want to show this by month, how would you handle this dimension? For example, a ticket was opened on the 25th March 2016 and has no close date...when reporting for March, would this fall into the 6-day bin (open date to end of March) or the 21-day bin (25th March to today)?...Essentially I'm asking whether an open 'ticket' has a single bin for all months, or whether it can be in multiple bins depending on which month we're looking at. I'm pretty sure we can handle both, but just want to get some clarity on this.
If you can (possibly manually) work up for a single (or few) tickets what you'd expect to see, I can ensure my solution does that.
One last Question...are you connecting live to the SQLDB, or are you creating a TDE from the SQLDB?...just want to know if I have the full Tableau calculation arsenal at my disposal!!
Thanks for helping out! In this simplified data, I think I would want (I put ticket ids in parentheses for reference):
Date     # of tickets open   # of tickets open 2+ days
1/1/16   1 (2)               0
1/2/16   3 (2,3,5)           1 (2)
1/3/16   3 (2,3,6)           2 (2,3)
1/4/16   1 (6)               1 (6)
1/5/16   1 (6)               1 (6)
1/6/16   0                   0
Ultimately, this would need to work for cases like 14+ or 90+ days. In terms of the database, this is built from a live SQL connection from our Tableau server.
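For concreteness, here's a quick plain-Python sketch that reproduces the table above. The per-ticket open/close dates are my inference from the expected counts (a ticket counts as open from its open date up to, but not including, its close date):

```python
from datetime import date, timedelta

# Hypothetical open/close dates inferred from the expected table above.
tickets = {
    2: (date(2016, 1, 1), date(2016, 1, 4)),
    3: (date(2016, 1, 2), date(2016, 1, 4)),
    5: (date(2016, 1, 2), date(2016, 1, 3)),
    6: (date(2016, 1, 3), date(2016, 1, 6)),
}

def counts_for(day, min_age_days=1):
    """IDs open on `day` whose age (counting the open day itself as day 1)
    is at least min_age_days."""
    return sorted(tid for tid, (opened, closed) in tickets.items()
                  if opened <= day < closed
                  and (day - opened).days + 1 >= min_age_days)

day = date(2016, 1, 1)
while day <= date(2016, 1, 6):
    print(day, counts_for(day), counts_for(day, min_age_days=2))
    day += timedelta(days=1)
```

The same `min_age_days` knob would cover the 14+ or 90+ day cases.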
Apologies for not getting back sooner on this...a combination of 'real' work (booo!!) and this actually being a very complicated problem!...I think you may love and hate this solution in equal measure (it works, but it's very complicated!)...btw I've learned a lot doing this, so many thanks for the challenge!! (this is why I love the community!)
So the real problem here is not so much one of LoD (as I originally thought), but one of data-densification...Let me (hopefully!!) explain
So the trouble here is that the data is 'sparse': by that I mean we only have the dates when something happens (either an id is opened or closed) and not the intervening dates (and for the 1/5/16 we have no data at all), but we still want to report on these dates, and use them to work out 'days open'. If I was doing this, I would do it in the ETL/DB view, so the data would look like the below (notice I have all the dates for id = 2)
Id   Type    Date
1    open    01/01/2016
1    close   01/01/2016
2    open    01/01/2016
2    open    02/01/2016
2    open    03/01/2016
2    close   04/01/2016
...with data like this, we wouldn't need any densification...In most projects I do, I tend to spend around 80% of the time on the data (getting it in the right shape, densifying if required...etc.) and 20% on the Tableau work...worth considering.
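If it helps, here's a rough sketch of that densification step (in Python rather than SQL, with hypothetical event rows). It's the same transformation a DB view or ETL job would do:

```python
from datetime import date, timedelta

REPORT_DATE = date(2016, 1, 6)  # how far to densify tickets with no close row

# Sparse event rows, one per date where something happened (hypothetical ids).
events = [
    (1, "open",  date(2016, 1, 1)),
    (1, "close", date(2016, 1, 1)),
    (2, "open",  date(2016, 1, 1)),
    (2, "close", date(2016, 1, 4)),
]

def densify(rows):
    """One 'open' row per day from the open date to the day before the close
    date (at least the opening day itself, even for same-day closes), plus
    the original close row -- the dense shape shown above."""
    opens = {tid: d for tid, typ, d in rows if typ == "open"}
    closes = {tid: d for tid, typ, d in rows if typ == "close"}
    dense = []
    for tid, opened in sorted(opens.items()):
        closed = closes.get(tid)
        last = max(opened, (closed - timedelta(days=1)) if closed else REPORT_DATE)
        day = opened
        while day <= last:
            dense.append((tid, "open", day))
            day += timedelta(days=1)
        if closed:
            dense.append((tid, "close", closed))
    return dense
```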
However, if you are stuck with the data as is, I'll try my best to explain how I've got it to work (if you're interested, 'Tableau Request Live - Data Densification' on Vimeo is an excellent video on data densification, which is an un-documented feature). From this video you'll see how we can access the densified dates using Table Calculations.
So the first thing I do is bring in date and, as it's a Range Aware Pill, I'm able to ask Tableau to show me 'missing values' (Range Aware Pills, which are dates or bins, know their Min/Max and increment, so are able to 'fill' any missing values)
however to access these densified values, I have to use Table Calcs (as they don't really exist in the data)
So the first thing I need is a formula to access these densified dates...for this I created
[Date Dense - SR]
I then create a field to sum up the number of days open, using my dense date
[Days Since Open Dense - SR]
DATEDIFF('day',WINDOW_MIN(MIN([Date Opened (LOD)])),[Date Dense - SR])+1
btw the +1 is just so the first day gets counted
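In plain Python the same calculation would be (just to make the off-by-one explicit):

```python
from datetime import date

def days_since_open(opened, dense_date):
    """Plain-Python equivalent of DATEDIFF('day', open, dense) + 1:
    the +1 makes the opening day itself count as day 1."""
    return (dense_date - opened).days + 1
```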
...this causes a problem: the densified dates are generated for every id, so it keeps counting even after the id has been closed...sort of like this
So id=1 [first column] has days open 1-6, even though it closed on the 1st
So we need to fix this...unfortunately we can't use our Closed LoD for this, so we have to create our MAX date using a Table Calc
So I create
[Window Max Date for id - SR]
So now I can create an 'Is Open Flag' calculation off this and my dense date, which I'll reference in other formulas
[Still Open Flag - SR]
IF [Date Dense - SR] >= [Window Max Date for id - SR] THEN 1 ELSE 0 END
So we're nearly ready to start actually counting the >2 Days open!
First I create a calculated dimension to only bring me back a True if the id is Open on that date, and has been open for >=2 days
[Final Count - 2+ Days SR DIM]
[Still Open Flag - SR] = 0 AND [Days Since Open Dense - SR] >=2
and then finally we can count these
WINDOW_SUM(IIF([Final Count - 2+ Days SR DIM],1,0))
however, we have to keep the id in the VizLoD (so the table calcs can access that level), which means although I get the right answer it is repeated multiple times...a little bit like the image above, where I have multiple marks per date. This is easily fixed by wrapping the formula in IF FIRST()=0...END
So our final formula is
[Final Count - 2+ Days SR]
IF FIRST()=0 THEN WINDOW_SUM(IIF([Final Count - 2+ Days SR DIM],1,0)) END
and I also un-tick 'stack marks' in the options (else you get odd column widths: although everything but the first value is NULL, Tableau still needs room for the NULL marks)
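To make the counting step concrete, here's a small stand-in in Python (the flags are hypothetical; in Tableau they'd come from [Final Count - 2+ Days SR DIM]):

```python
# Hypothetical per-mark rows: one (date, id, qualifies) tuple per mark
# in the VizLoD.
marks = [
    ("1/2/16", 2, True), ("1/2/16", 3, False), ("1/2/16", 5, False),
    ("1/3/16", 2, True), ("1/3/16", 3, True),  ("1/3/16", 6, False),
]

def window_sum_per_date(rows):
    """WINDOW_SUM of the 1/0 flags, partitioned by date. Returning one
    entry per date plays the role of the IF FIRST()=0 guard (a single
    value instead of the same value repeated for every id)."""
    totals = {}
    for day, _id, flag in rows:
        totals[day] = totals.get(day, 0) + int(flag)
    return totals
```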
As you also wanted a count of open tickets I've also created the following calculation
[Open Tickets - SR Cumulative]
RUNNING_SUM(zn(COUNTD(IIF([Date] < [Date Closed (LOD)],[Id],NULL))))
RUNNING_SUM(zn(COUNTD(IIF([Date] > [Date Opened (LOD)],[Id],NULL))))
We then need to aggregate this by date, so it runs over all ids, and so we use the final version
[Open Tickets - SR Cumulative TC]
IF FIRST()=0 THEN WINDOW_SUM([Open Tickets - SR Cumulative])END
using similar logic to the above.
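The cumulative idea behind these, stripped of the Tableau syntax, is just a running total of opens minus closes. A sketch with hypothetical daily counts (matching the sample table earlier in the thread):

```python
# Tickets opened/closed per day (hypothetical counts). The number open on a
# date is the running total of opens minus the running total of closes up to
# and including that date.
opens  = {"1/1/16": 1, "1/2/16": 2, "1/3/16": 1}
closes = {"1/3/16": 1, "1/4/16": 2, "1/6/16": 1}

def open_per_day(days):
    total, out = 0, {}
    for day in days:
        total += opens.get(day, 0) - closes.get(day, 0)  # the RUNNING_SUM idea
        out[day] = total
    return out
```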
Now, when we bring these fields onto the canvas, we need to run the different table calculations with different 'compute using' settings (some by date and some by id). By splitting out the table calculation formulas, Tableau gives us this option...I've grabbed a screen shot of a few of them, but you can see how they are all set to run in the attached
...so hopefully this makes sense (of sorts!), and shows you can do (almost) anything in Tableau...but I would advise to get the 'densification' done in the data (a good DBA could do this in an hour, or a bit of VBA script in Excel...depending on how much data you have).
Thank you for one of the (if not the) most complicated problem I've come across on the Community!!...especially one that, at first glance, looks so simple!
ticket logging simple - SR.twbx 101.2 KB
wow thanks, that works! I can follow about half of that before I start to get cross-eyed trying to keep the window and LOD parts straight.
I'll definitely try to get our DBA to give me a row per date, though that will likely explode the data quite a bit (we have thousands of tickets, and the average time to close is on the order of 7 days, with some extreme outliers that end up being several months old). Is there a way to get this with a custom SQL query?
I'll definitely try to get our DBA to give me a row per date, though that will likely explode the data quite a bit.
Yes it will increase the data, in terms of rows, but I've had 20-30 million rows in a Tableau TDE and she still flies (maybe not flies, but certainly usable!)...so the increase in rows is offset by the simplicity of the calculations.
Is there a way to get this with a custom SQL query
Yes you could, but it is no different to writing a view in SQL (Tableau is just passing the CustomSQL as if it was written in a view in the DB)...also (and I'm no DBA...as my DBA will attest to...she describes me as 'having just enough knowledge to be dangerous'!!) this kind of thing might best be done with a Stored Procedure which would need to be written in the DB.
One further thought I had on this, and it's something I do more and more: have 2 data-sources (in fact I sometimes have 3 or 4), where each datasource is at the 'grain' required for the Viz. So in your example most of your Vizes might well only need your current datasource, so build them off that, and then create your densified one and use it just for the above problem.
In one of my Vizes the user can go right down (we work in retail btw) to Store/Product/Day level (around 120M rows), but most of the Vizes don't need this level (so no need for Tableau to process 120M rows every time), so I have one datasource at Store/Category/Day level (around 8M rows), and then (using Actions) they only 'flip' to the 120M row datasource for the Vizes that need that level.
...just a thought...again it's all in the data-prep
I have a similar use case, but I am using Tableau 8.3, so the LoD functionality is not available. How could I achieve this?
Please note upgrading is not an option--my company uses Windows XP!
I would definitely mention to your IT that Microsoft no longer supports XP, so is no longer producing security updates...etc....so there is a potential risk in continuing to use it (btw I have just been doing some work for a client who is still on XP, and they did manage to install T9 desktop).
However, in the solution here the only LoDs we have are the Date Opened LoD and Date Closed LoD...so these could be written into your data view (as part of the ETL)...btw due to the performance issues with using FIXED LoDs (although since Tableau 9.3 this has got much, much better) I often develop (PoC) a model using LoDs, but then get these formulas written back into the data when we move to production. But as we already have both Date and Id in our VizLoD, we can get away with replacing the 2 LoD calculations with
these would then be set to compute using [id], and then the above method would/(should!) still work...although it will add a little more complexity (as we have another set of table calculations)
Thanks a lot for the reply. I have realized the structure of my data is different, though, as I do have the open and close dates in the same record.
I have watched the training videos on table calculations and aggregations, but I am still stuck. I have also tried following this article Show Records That Fall Within a Period of Time | Tableau Software
Attached is a sample of the data. Unfortunately, modifications at the source are not an option.
I am not sure what I am missing, but I suspect it is my lack of understanding of table calculations.
Issues.xlsx 18.0 KB
So this does make things a little trickier, and may not give you perfect results (that will depend on your data). The reason I say this (and I suspect it's the densification that's confusing, rather than the Table Calcs) is that this method relies on densifying the data (within Tableau), and we can only do this (in this situation) using a Range Aware pill (a date or bin that knows its MIN, MAX and increment). In the original example, as we only had one date field, the dates were densified from the MIN to the MAX; but for you we have to use one field (the Open Date), so the last date we can report on is the last Open date. So if you have lots of issues every day, such that the open dates include today's (or yesterday's) date, it'll be fine. If not, you could look at adding a 'dummy' row with today's date in the Open Date column, to allow densification up to today. Hopefully that makes sense?
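A little sketch of what I mean about the axis (Python, hypothetical dates):

```python
from datetime import date, timedelta

def dense_axis(open_dates, pad_to=None):
    """The axis a range-aware pill can densify: every day from the MIN to
    the MAX of the open dates. pad_to mimics the 'dummy row' trick,
    stretching the axis up to today."""
    hi_candidates = list(open_dates) + ([pad_to] if pad_to else [])
    lo, hi = min(open_dates), max(hi_candidates)
    return [lo + timedelta(days=i) for i in range((hi - lo).days + 1)]
```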
With that caveat, the rest is fairly similar to the original solution...btw I've added an extra calc to the Close Date field, so if it's NULL the close date is TODAY()...this just means we can count open issues up to today. The 'Gantt Verification' tab shows how this works. I've also filtered the data to 4 issues, so it's easier to verify the results
So we add the [Open Date] to the Rows (and select 'Show Missing Values'), and also add [issue] to the VizLoD
Like before we can only access the 'densified' dates through table calcs, so I've created the following 3
//gives access to densified dates in calculated fields
[Dense Open Date]
IFNULL(LOOKUP(MIN([Open Date]),0),PREVIOUS_VALUE(MIN([Open Date]))+1)
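If the LOOKUP/PREVIOUS_VALUE pattern is hard to picture, here's the same fill logic in plain Python (None standing in for the NULLs on densified rows):

```python
from datetime import date, timedelta

def dense_open_dates(column):
    """Mimics IFNULL(LOOKUP(MIN([Open Date]),0), PREVIOUS_VALUE(...)+1):
    where a densified row has no underlying Open Date (None here), carry
    the previous date forward by one day."""
    out, prev = [], None
    for value in column:
        prev = value if value is not None else prev + timedelta(days=1)
        out.append(prev)
    return out
```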
//Lets us access the issue opening date, in all densified dates
[Opening Date - Window Max]
//Lets us access the issue closing date, in all densified dates
[Closing Date - Window Max]
WINDOW_MAX(MAX([Close Date (to Use)]))
As before, Tableau populates all the densified cells regardless of whether the issue was open at that point or had closed previously, so we need a formula (using the above calculations) that only equates to true if the issue was open
[Ticket Open Indicator]
[Dense Open Date] >= [Opening Date - Window Max]
AND [Dense Open Date] <= [Closing Date - Window Max]
AND NOT(ISNULL([Dense Open Date]))
We then want to turn this into a 1/0 so we can sum them (btw I could have done this all in one step, but have broken it out to help the explanation)
[Ticket Open Indicator [1/0]]
IF [Ticket Open Indicator] THEN 1 ELSE 0 END
and then finally we want to Window_Sum these
[Ticket Open Window_Sum]
WINDOW_SUM([Ticket Open Indicator [1/0]]])
Now, like the original solution, we need to run the different table calcs at different levels of compute using/partitioning, so we set the [Ticket Open Window_Sum] to compute using Issue (as we want a single return over the issues: i.e. if there were 3 issues open for a particular day, we want all issue marks to show 3)
but all other table calcs are run by Compute using = Open Date.
The results of this are shown in 'how it works 2' tab (I've also put the above calcs into the ToolTips to help you follow it through)
So the final part, is to only return a single mark, so we use the IF FIRST()=0...END, and set Stack Marks to off.
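And just to show the underlying logic is simple once you strip away the table-calc mechanics, here's the whole indicator-and-sum in a few lines of Python (issues and dates are hypothetical; a NULL close is treated as today, as per the [Close Date (to Use)] calc):

```python
from datetime import date

# Hypothetical issues, open/close dates in the same record; None = still open.
issues = {
    "A": (date(2016, 1, 1), date(2016, 1, 3)),
    "B": (date(2016, 1, 2), None),
}
TODAY = date(2016, 1, 4)  # stand-in for TODAY()

def open_count(day):
    """[Ticket Open Indicator] summed across issues: an issue is open on
    `day` when open_date <= day <= close_date, with a NULL close treated
    as today."""
    return sum(opened <= day <= (closed or TODAY)
               for opened, closed in issues.values())
```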
So again, pretty complicated, and you probably wouldn't want to build a suite of dashboards having to use this every time. If that is the case, and you need to build lots of visuals, a little bit of time spent getting the data in the right shape (densifying the dates at source, and normalising the table [so there is only one date field, with a tag for open/close]) will be paid back 10x over!!