For starters, Jonathan has pointed me to this page:
If Custom SQL is an option, here is one option using Domain Padding (Show Missing turned on for the date pill), and Running Total Quick Table Calc:
SELECT *,"Arrival" AS [Type],[date_arrival] AS [Date] FROM [survey_data#csv] UNION ALL SELECT *,"Departure" AS [Type],[date_departure] AS [Date] FROM [survey_data#csv]
IF [Type]=='Departure' THEN DATE(DATEADD('day',1,[Date])) ELSE [Date] END
IF [Type]=='Arrival' THEN 1 ELSE -1 END
The Date Shifted and Count fields work together to display the correct daily count.
Count.twbx 28.2 K
This Custom SQL (with the legacy connector) creates one row per date between date_arrival and date_departure:
SELECT * FROM [survey_data#csv] d, [date_scaffold#csv] l WHERE l.[Playa Dates] BETWEEN d.[date_arrival] AND d.[date_departure]
Comparison between [A] Joe's and [B] this comment's solution:
1) Both [A] and [B] give the same aggregated result.
2) [A] returns fewer underlying rows (40 2x) than [B] (251 12x) and therefore much better in this regard.
2) Implementation of Date period as multiple marks would remove the need of reshaping underlying data (20 1x).
3) [A] gives wrong result if data start date filter is applied and data start date is older than filter start date.
3) Filtering on data start date <= filter end date AND data end date >= filter start date should work though.
4) [B] shows who underlying persons are via tooltip > view data and therefore better if this is needed.
Tableau has seemingly no locale language option for the legacy connector. Therefore I had to change Windows Region and Language Location = United States and Format = English (United States) before getting correct dates. I cannot open attached workbook after changing my Windows Region and Language back to its original (Faroe Islands + Faroese).
Attached Workbook Version: Tableau 9.0
Thank you, Joe! and Kettan!
While my aim was to solve this without reshaping, what we've found is that there's really no good way to do so. I did manage to make it work on the original data shape, with a scaffold and table calculations. But that was complex, brittle, and would be difficult to maintain.
Because this is common type of problem, as evidenced by Jonathan's catalog of similar scenarios started back in 2012, I've written a blog post to explain the pattern.
Better to see the forest, for the trees!
Is it possible to do it without doing a SQL joint?
Do you have suggestions on how I could go about creating this same type of graph however without generating the missing rows in order to perform the counts? 60mins x 24hrs means 1440 entries per day. Because I am going to be measuring two years at a time, I want to avoid this if possible.
I have attached my example workbook. Any help would be appreciated.
Example Workbook.twbx 10.2 K
You can perform a pivot of the your two date fields in the connection, like:
Then you can create a calculated field like:
IF [Pivot field names]=='Out of Service' THEN 1 ELSE -1 END
You can then use a Running Total Quick Table Calculation for a pill based on that calc field.
When you place the 'Pivot field values' date time field on the Columns shelf, use a DATETRUNC style Minute and turn 'Show Missing Values' on:
This route does add complexity, eg to address Kettan's point 3 above, you would need a filled table calc date field to filter by dates properly (performed in the attached).
For 2 years, this will generate about 1 million marks but if you are looking for a different view, with other pills, that may impact the analysis. Be aware this approach does have fragility.
Help jm edit.twbx 25.1 K
Sorry I should have specified that I am extracting my data from Windows SQL Server. The pivot option will only work for Excel/XML data.