I took an initial look at this and have a couple of questions:
- I can see that there is a row with a null date_to for each machine name: what would you want that state to end as? The current date, the end of the day, or something else?
- The earliest date_from for each machine is different, so what's the start date for these analyses?
- What I'm trying to get at is the timeframe that will be used as the denominator on the start/end, so the % of total comes out the way you want.
I've attached another version which has an extract of the original data as tracked in the mysql database and my converted set of dates.
The null date is because that state has not ended yet, i.e. that's its current state, so I guess the to date in that case is now.
I've extended the data back into December; I'd like to start the analysis from 1st Jan.
For each day from 1st Jan to today, what % of machines were in each state? That would be the uptime for the total facility.
Then probably set it to look at the last 3 months or so on a rolling basis.
Does that make sense?
There are a few ways to do this that I can think of:
1) Reshape your data so that there is one row per machine/state per day. Normally I do that with a cross product against a table holding a range of dates, give the query the necessary WHERE clauses to limit the number of rows returned, and add a column for the day from the cross product. For example, machine HS1 has three status changes in early 2013, so there would be two rows for wash required from 3Jan to 4Jan, another 7 for up from 4Jan to 10Jan, and another 7 for planned repair from 10Jan to 16Jan. Then you can have a "Time in Status" calculated field that is something like:
// from and to dates are within the same day, so count the duration in between
IF date <= date_from AND (date+1) >= date_to THEN
    date_to - date_from
// from date is on this day, to date is later than this day
ELSEIF date <= date_from AND (date+1) < date_to THEN
    (date+1) - date_from
// from date is before this day, to date is on this day
ELSEIF date > date_from AND (date+1) >= date_to THEN
    date_to - date
// from date is before this day and to date is later than this day
ELSE
    1
END
Now you have a measure that can be added & subtracted as necessary, TOTAL() wrapped around it for doing % of total, etc.
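To make the reshaping in option 1 concrete, here's a rough Python sketch of the same idea: expand each status row into one row per overlapping day, and compute the fraction of that day spent in the status. The machine name, states, and timestamps are made-up sample values (loosely based on the HS1 example above), not data from the actual workbook.

```python
from datetime import date, datetime, timedelta

# Hypothetical status-change rows: (machine, state, date_from, date_to)
rows = [
    ("HS1", "wash required",  datetime(2013, 1, 3),     datetime(2013, 1, 4, 12)),
    ("HS1", "up",             datetime(2013, 1, 4, 12), datetime(2013, 1, 10)),
    ("HS1", "planned repair", datetime(2013, 1, 10),    datetime(2013, 1, 16)),
]

def days(start, end):
    """Yield each calendar day from start to end inclusive."""
    d = start
    while d <= end:
        yield d
        d += timedelta(days=1)

# Cross product of status rows against the day range, keeping only the
# overlapping days -- the equivalent of the WHERE clauses on the join.
expanded = []
for machine, state, d_from, d_to in rows:
    last_day = (d_to - timedelta(microseconds=1)).date()
    for day in days(d_from.date(), last_day):
        day_start = datetime(day.year, day.month, day.day)
        day_end = day_start + timedelta(days=1)
        # "Time in Status": overlap of [d_from, d_to) with this day, in days
        overlap = min(d_to, day_end) - max(d_from, day_start)
        expanded.append((machine, state, day, overlap / timedelta(days=1)))

for r in expanded:
    print(r)
```

Because the per-day fractions for each machine always sum to 1.0 on any fully covered day, dividing by the count of machines per day gives the % of facility in each state.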
2) Do a union of the data set and add two fields: a "date" and a from/to dimension. Then you can turn on domain padding for the date and have some fun with table calcs, because you'll need table calcs (using LOOKUP() and PREVIOUS_VALUE()) to fill out the from/to dates in the padded data, then additional table calcs for the calculations themselves. For this reason, I recommend #1.
Either way, #1 and #2 will work so long as you're OK with Tableau showing only those states that fall in the date range being filtered on. If you want to show all possible states no matter what, then you'll need to do some padding for that, or maybe get away with a table calc filter.
3) Create a bunch of row-level calculated fields (one per state) and then use a measure names/values kind of view. This is perhaps the easiest to create in Tableau, but requires more work to maintain as new measures are added and can limit what kinds of aggregations you can do.
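As a loose illustration of option 3 (the state names and day-level rows here are invented, not from the actual data), the per-state row-level fields behave like one indicator column per state, which are then aggregated side by side:

```python
from collections import defaultdict

# Hypothetical day-level rows: (machine, day, state)
day_rows = [
    ("HS1", "2013-01-03", "wash required"),
    ("HS1", "2013-01-04", "up"),
    ("HS2", "2013-01-03", "up"),
    ("HS2", "2013-01-04", "up"),
]

# One row-level "calculated field" per state, analogous to
# IF state = 'up' THEN 1 ELSE 0 END in a Tableau calc.
def is_state(target):
    return lambda state: 1 if state == target else 0

states = ["up", "wash required", "planned repair"]
fields = {s: is_state(s) for s in states}

# Summing each indicator field per day gives the per-state machine
# counts, the measure names/values style of view described above.
per_day = defaultdict(lambda: defaultdict(int))
for machine, day, state in day_rows:
    for s, f in fields.items():
        per_day[day][s] += f(state)

print(dict(per_day["2013-01-04"]))
```

The maintenance cost mentioned above shows up here too: adding a new state means adding another entry to the list of fields by hand.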
Here's what you need, mate. We do this every day. The Custom SQL splits every row with a start and end date into two rows while incrementing or decrementing the job counter. Then sort on date and show the running total, at any precision, split or filtered as desired.
I didn't see a second attachment from you, so I had to export the data to Excel and re-import it using Custom SQL. I filtered out all data for jobs still in progress (date_to=null) to simplify things, but you could instead just filter out date=null to keep track of the latest real-time state.
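I'm not reproducing the actual Custom SQL from the workbook here, but the reshaping it describes can be sketched in Python: split each start/end row into a +1 event and a -1 event, sort on date, and accumulate. The job rows below are invented sample data.

```python
from datetime import datetime

# Hypothetical job rows: (machine, date_from, date_to); date_to of None
# marks a job still in progress, filtered out as in the post above.
jobs = [
    ("HS1", datetime(2013, 1, 3), datetime(2013, 1, 10)),
    ("HS2", datetime(2013, 1, 4), datetime(2013, 1, 16)),
    ("HS3", datetime(2013, 1, 5), None),
]

# Split every row into two: +1 at the start, -1 at the end -- the same
# doubling the Custom SQL does with a union and a counter column.
events = []
for machine, d_from, d_to in jobs:
    if d_to is None:
        continue  # still in progress, skipped to simplify things
    events.append((d_from, +1))
    events.append((d_to, -1))

# Sort on date and accumulate: the running total at each event is the
# number of jobs active at that moment, at any precision you like.
events.sort()
running = 0
timeline = []
for when, delta in events:
    running += delta
    timeline.append((when, running))

for t in timeline:
    print(t)
```

This is why the running total has to start at the beginning of time: dropping early events with a plain date filter would throw the counter off, hence the table calc filter mentioned below.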
If this gets you where you need, Jonathan or I can show you how to add a table filter so you can show a subset of this data without losing track of the running total, which has to start at the beginning of time.
Uptime ZL.twbx.zip 115.6 KB
Thanks for that, Jonathan, will have a read and try some of these examples.
Thought you might recognise the data and the problem; thanks for taking a stab at it. Will give it a look and get back to you.