The behavior you're seeing probably doesn't have anything to do with your tabadmin dbpass.
I suspect that you are figuring out the "window of time" your rows cover by eyeballing the _http_requests table. Since it records so many actions, we keep it slim by truncating it to 7 days' worth of data any time you run tabadmin backup or tabadmin ziplogs.
If I had to guess, I bet you backed up this box a few days ago, we truncated to 7 days of rows, and you've since added another 2-3 days' worth of activity.
If you want to keep the contents of these tables permanently, you should consider some sort of an "export job" to move them to a different place, or maybe leverage a Tableau Extract.
Hope this helps!
You are correct - the 1-July data was 7 days prior to the Tableau Server application upgrade.
I have now had some good success creating some vizzes showing who is creating which views, and when, by joining _http_requests, _sites, _users, _workbooks and _views. I have a few concerns that you may be able to guide me on:
It would be great if the _http_requests table would store the viewid against each row - that would involve less complicated calculations to derive the viewname from the controller field. Could you pass that idea on to the dev team?
Many view activities show userid = Null. Update: it looks like those are views accessed with :embed=y, and judging by the 7.0.7 release notes this appears to be resolved in that upgrade.
Speaking of upgrades, my major concern continues to be the truncation of this valuable data every time the app is upgraded (or log files trimmed). We are looking to upgrade in the next month or so, and I do not wish to lose this valued user activity data. You have identified a couple of suggestions:
1. should consider some sort of an "export job" to move them to a different place,
2. or maybe leverage a Tableau Extract.
For Option 2 - Incremental Tableau Extracts are fine up to a point, but when a full extract refresh is required, all of the truncated data will be lost forever. Unless you can suggest a better way?
For Option 1 - can you provide any details on how / what would have to be extracted - and options for places to extract to?
Both techniques have advantages and disadvantages.
The advantage of the "extract" technique is that it's easy. You pointed out the disadvantage: you must continue to use the same incremental extract forever, and there is no way to rebuild it with older data if something goes wrong.
The disadvantage of the first technique is that you need skills with another database or Extract/Transform/Load technology to make full use of the idea. Most people I know who use this technique leverage SQL Server Integration Services (SSIS) packages to copy the data from the Tableau tables into permanent copies on SQL Server. Others write SQL scripts to copy the data from Tableau's tables into copies of the tables in a different PostgreSQL database.
If you don't use these databases yourself, maybe you could accomplish the same thing with a simpler database that allows you to import from an ODBC source - for example, you might be able to use MS Access to grab and store the data?
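To make the "export job" idea concrete, here is a minimal sketch of an incremental copy driven by a timestamp watermark. It uses SQLite in memory for both source and archive so it is self-contained; in a real job the source would be Tableau's PostgreSQL workgroup database (read via ODBC or a PostgreSQL driver) and the archive would be whatever database you keep long-term. The table and column names (http_requests, created_at, etc.) are assumptions for illustration, not the exact repository schema.

```python
import sqlite3

def export_new_rows(src, dst, last_seen):
    """Copy rows newer than the last archived timestamp into the archive.

    `last_seen` is the high-water mark recorded by the previous run, so
    each run only moves rows Tableau added since then - this way nothing
    is lost even though Tableau truncates its own table to 7 days.
    """
    rows = src.execute(
        "SELECT id, created_at, action, user_id FROM http_requests "
        "WHERE created_at > ? ORDER BY created_at", (last_seen,)
    ).fetchall()
    dst.executemany(
        "INSERT INTO http_requests_archive (id, created_at, action, user_id) "
        "VALUES (?, ?, ?, ?)", rows
    )
    dst.commit()
    return len(rows)

# --- demo with fabricated data ---
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE http_requests "
            "(id INTEGER, created_at TEXT, action TEXT, user_id INTEGER)")
src.executemany("INSERT INTO http_requests VALUES (?, ?, ?, ?)", [
    (1, "2012-07-01", "show", 10),
    (2, "2012-07-05", "update_tab", 11),
    (3, "2012-07-08", "show", 10),
])

dst = sqlite3.connect(":memory:")
dst.execute("CREATE TABLE http_requests_archive "
            "(id INTEGER, created_at TEXT, action TEXT, user_id INTEGER)")

# Pretend the previous run archived everything up to 4 July.
copied = export_new_rows(src, dst, "2012-07-04")
print(copied)  # 2 - only the rows newer than the watermark are copied
```

Scheduling this daily (Windows Task Scheduler, cron, or an SSIS job step) keeps the archive ahead of Tableau's 7-day truncation window.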
Did you find a better way to join _http_requests onto _views? I managed to do it by joining the currentsheet field to a concatenation of workbooks.name and views.name, but it seems 'messy', and it gets into trouble when there are multiple copies of the same workbook name...
How did you calculate the number of distinct view 'accesses'? i.e. I log on, click around a view for half an hour, and it generates 500 rows in the _http_requests table. I tried a distinct count of session+user+view, but again it seems 'messy'.
I've had a dig around on the underlying tables (i.e. not just the views starting with an underscore) but didn't quite find a neater solution. I'd be really interested to hear if you worked anything else out or if you want to share notes then please contact me... my email is in my profile or just look me up on linkedin.
If you worked something out then please share here in the forum. Since two people have asked about this, I'm sure there are others.
I ended up getting a pretty good solution:
a) retaining history - we did end up implementing an SSIS solution to essentially take a copy of all the workgroup tables and store them in a SQL Server database. The _http_requests copy updates daily, incrementally.
b) as for joining tables, I ended up only needing _http_requests, _sites and _users. By putting together some tricky text calculations, I ended up with workbook and view dimensions. It doesn't solve the duplicate workbook names issue, but that is pretty rare in my case.
c) as for distinct view events, if you filter 'action' to include only 'show' and 'update_tab' you tend to get a pretty good count of user activity. Also, make sure you filter to just events involving views (isView = TRUE).
I enclose a tds file to assist, and a sample of the sort of dashboard you can build with the dataset.
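The counting approach in (c) can be sketched as a query: filter to the two "real render" actions, then count distinct session/user combinations per view. Again SQLite stands in for the repository database; the 'show' and 'update_tab' action names come from the post above, while the filler action names and column layout are fabricated for the demo.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE http_requests (
    session TEXT, user_id INTEGER, currentsheet TEXT, action TEXT
);
-- one half-hour of clicking: many chatter rows, only a few real view events
INSERT INTO http_requests VALUES
 ('s1', 10, 'Sales/ByRegion', 'show'),
 ('s1', 10, 'Sales/ByRegion', 'bootstrapSession'),
 ('s1', 10, 'Sales/ByRegion', 'update_tab'),
 ('s1', 10, 'Sales/ByRegion', 'sessions'),
 ('s2', 11, 'Sales/ByRegion', 'show');
""")

# Keep only the actions that mark a genuine view render, then count each
# distinct session/user pair once - so 500 clicks in one session still
# register as a single access.
accesses = db.execute("""
SELECT currentsheet, COUNT(DISTINCT session || '|' || user_id) AS visits
FROM http_requests
WHERE action IN ('show', 'update_tab')
GROUP BY currentsheet
""").fetchone()
print(accesses)  # ('Sales/ByRegion', 2)
```

A MAX(created_at) over the same filtered rows would give the "last accessed" date asked about below.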
Hey Craig, great work! Is it too much to ask to let us know what dimensions and measures you used to get those metrics? I am particularly interested in getting the visit counts for each dashboard and, if possible, the last date it was accessed.
Your post seems very relevant to what I am trying to do currently. I am trying to track the count of accesses when a user gets to a workbook that is published with "show as tabs" by following the workbook's email URL (the one available via the "Share" button). I know the event does land in the _http_requests table, but I am running out of ideas for the best way to isolate it.
If anybody else has also tried this, please share your approach.
Thank you in advance!