I am working with Jérôme and I want to give more information hoping this will help anyone to answer.
Each action made by a user on Tableau Server will trigger a call to Tableau Server that can be categorized afterwards as an (action / controller) tuple. All these actions are recorded in the table http_requests into the internal PostgreSQL database.
What we want is to do is to measure the performance for different user stories (loading a view, filtering, etc.), which requires to get more information about what these tuples mean and how they relate to each other.
For example, when a user load a view, Tableau Server makes the following calls:
- (show, null)
- (bootstrapSession, null)
- (get_customized_views, null)
- (performPostLoadOperations, null)
The list of tuples we found in our databases when the field vizql_session in not null are the following (unfortunately, description are incomplete as this is the purpose of this topic and there might be errors at this stage):
(when action = sessions)
When is this happening? Description
Loading a view When a user loads a view, a session is bootstrapped export exportcrosstab get_customized_views
Loading a view
Editing customized view
Tableau gets the list of customized views for the logged user when a view is loaded get_global_filters getautocomplete getfilteritems performPostLoadOperations Loading a view Executed at the end of a dashboard loading region save_customized_view Editing customized view When user save a new customized view search searchwithindex sessions add-manual-items-to-filter sessions categorical-filter Interacting with a view sessions categorical-filter-by-index Interacting with a view sessions drop-nowhere Modifying a workbook sessions drop-on-shelf Modifying a workbook sessions drop-prepare Modifying a workbook sessions ensure-layout-for-sheet sessions get-drag-pres-model sessions get-show-me Modifying a workbook sessions keep-only-or-exclude Interacting with a view sessions level-drill sessions pane-pan-server sessions pane-zoom-server sessions pdf-export-options sessions pdf-export-server sessions png-export-server sessions quick-sort Interacting with a view sessions range-filter Interacting with a view sessions redo Modifying a workbook sessions refresh-data-server sessions relative-date-filter Interacting with a view sessions render-tooltip-server Interacting with a view User hovers a mark that have a tooltip defined with the cursor sessions restore-fixed-axes sessions revert-workbook sessions run-action sessions select sessions select-legend-items sessions select-none sessions select-region-no-return-server sessions set-active-zone sessions set-auto-update-server sessions set-parameter-value Interacting with a view sessions set-pattern-quick-filter-state sessions set-port-size sessions set-quick-filter-domain sessions set-quick-filter-include-exclude sessions sort-from-indicator sessions toggle-legend-server sessions undo Modifying a workbook set_default_customized_view Editing customized view show Loading a view show_metrics_view showadminview Administration Accessing administrative dashboards tempfile tile update_customized_view Editing customized view viewData
Accessing underlying data
Access raw data vud
Some look obvious, other not that much.
At the moment, we have no idea how these actions interact with each other (if they do) except for loading a view.
It's likely that most things we want to measure regarding interaction are when action = 'sessions'.
It sounds like the goal is to improve parallelism. Are you using version 9? The performance improvements in version 9 have been amazing and introduced many new forms of parallel transactions that never happened in version 8.x. If you're not there already, try 9. You might find your work here has already been done for you.
Hi Matt Morgenroth,
The goal is to monitor what are our slowest dashboards without being limited to monitor only the initial loading time.
Wouldn't it be nice if you could see when interacting with a dashboard that the quick filters are not performing well?
Wouldn't it be even better if you could do that at scale for your 1000+ dashboards?
We think so
That being said, we are not sure that filtering is limited to only one record in the http_requests; we are not sure what are the relevant actions; and we are not even sure all the events are recorded in this table.
1 of 1 people found this helpful
easier to look at the aggregate load times and take those that are just overall slow
I'm sure you've see the load time admin view already.
if a view never loads quicker than your desired threshold, the dashboard needs a rethink
you can run performance recordings in both server and desktop which the author can use to further tune their own workbook
this sort of precise recording of individual actions might be possible with the data available, but is likely to have some challenges in directly reprsenting the timings you are looking for
if the min load time is very slow and the average session time for the URL is also quite high, or higher than the rest, you have a solid argument for contacting the author to review the performance. Identifying the performance of a single feature is going to hard. Especially since features can have dependencies on so many other things that could be contributing to the bottleneck.
I can't say you wont' find what you need, it's just my opinion that it's a very tough assignment.
Are you on version 9? If not you might find you can spend this time elsewhere after realising the benefits of the new tech.
We are currently benchmarking Tableau 9 against Tableau 8.
We already look at the loading time but we have seen dashboard slow to load and fast to interact with, or on the contrary, fast to load and slow to interact with.
This actually might not be a bad idea to check the interactive time at a less granular level to just detect slow dashboards, knowing identifying them would already be a good step.
Thanks a lot for the answer.
Diagnosing specific issues within a dashboard is a problem we first started trying to offer tools for with the performance recorder.
Having that capability via the logs or postgres, to be able to pinpoint a problem in a dashboard? Sure that would offer some value to the author when they're asked to reconsider their design for performance.
Whether you can pinpoint the exact feature that appears to be the cause or not, the exercise is the same right? Contact the author and require that they refer to performance recodings and whitepapers to improve their baseline load and interact time before republishing (or before promoting to production).
Giving the author the name of the quick filter, or perhaps even the group of members within that quick filter which particularly slow things down for interactivity would be helpful, but the exercise of improving that interaction will involve several considerations outside of that pinpoint.
A good starting point is to tell authors to benchmark before publishing. Use the performance recording. State up front that load and average interaction times must be under a certain number of seconds or your published work will not be promoted to production.
They can test this in Desktop on their own or in your qa/test instance of Tableua Server for a more 'real world' feel. Server will always be slightly slower than (increasingly so as the workbook's efficiency decreases).
It surely is something we advice our Tableau Desktop users. Knowing we have 50+ people around the world pushing workbooks to Tableau Server, it's hard to enforce the guidelines are followed.
In that regard, having a way to monitor this is still a good thing as it can help us target the Desktop users who might need additional training.
In other words, we don't want this for helping the author, we want this to help us, the team in charge of Tableau Server, to take better actions with our authors.
There is, to my knowledge, no automatic way to know what the average interaction is. If we have to manually check all the workbooks, we will add a delay between conception and deployment of some dashboards, and this is something we are trying to avoid with Tableau.
By the way, this is a bit unrelated, but it seems that the Performance Recording doesn't show the loading of quick filters.
Let me add a bit more context here.
I think there are 2 objectives here for us:
1) Be able to give more visibility to each report designer on how to improve the performance of his reports:
Eg: A report designer could see that a particular filter is responsible to x% of his report's load time.
2) Be able to know if the overall user experience on tableau is getting better or worst.
Eg: Jérôme to his clients: "Median report's speed used to be 10s in January for our users, now it is 3s, so it's getting better"
Our key need here is to adress point 2. Because we cannot simply correlate the user actions on reports with the actions in http request, we are strugling to provide a realistic measurement of user experience. It might very well be that initial loading time is generally fast, like 3s, as seen from "bootstrap sessions" actions, but that the rest of the navigation is generally slow, because of the way reports are designed. And today, we don't know how to measure that.
Idealy, we would love to provide performance KPIs on reports which account for user experience and navigation patterns.
Eg: Jérôme to his clients: "Median navigation speed (changing a filter, changing tab...) used to be 10s in Jan and is now 5s. So report design is improving and best parctices are better applied. And by the way we identified those bunch of reports as being very slow for certain navigation patters, so can you ask your designers to work on them?"
As we have 4000 views each day, on hundreds of reports, we have to have a macro approach and cannot analyze each report individually. We generally classify reports given their destination client (sales, business development...)
As we have no mapping of user actions versus actions in http_request, and as some http_request actions are executed in parallel, while some other are not, it's hard for us to get those macro numbers on navigation speed.
FYI: Please note that the "postLoadOperations" records will not be pushed into http_requests starting in 9.1, so using it as an anchor point for when the viz stopped loading will no longer work. "get customized views" is probably the next-best substitute.
Thanks for the information. We didn't use it actually to monitor the performance as its execution time is not significant.
Getting render times are tricky. The following SQL is to run against your PostGreSQL database. We run it on our 8.2 install. We are adjusting the times for US Central.
SQL for Calculating Workbook Render
q.SITE_ID, w.PROJECT_ID, MAX(w. ID) as Workbook_ID, s.Name as Site,
p.name as Project_Name, w.name as Workbook, q.VIZQL_SESSION,
(DATE_PART('minutes', (q.COMPLETED_AT - MIN(r.CREATED_AT)))*60 )
+ DATE_PART('seconds', (q.COMPLETED_AT - MIN(r.CREATED_AT)))
1 as ViewEvents, MAX(r.user_ID) as SiteUserID,
MAX(u.System_User_ID) as SysUserID, MAX(z.friendly_name) as Username,
date_trunc('day', (q.COMPLETED_AT - interval '6 hours')) as RenderDate
(Select r.SITE_ID, r.VIZQL_SESSION, r.CURRENTSHEET,
MIN(r.COMPLETED_AT) as Completed_AT
From HTTP_REQUESTS r
WHERE r.COMPLETED_AT >= '01-JAN-15'
and r.ACTION = 'performPostLoadOperations'
and r.VIZQL_SESSION is not null
GROUP BY r.SITE_ID, r.VIZQL_SESSION, r.CURRENTSHEET ) q
HTTP_REQUESTS r ON q.VIZQL_SESSION=r.VIZQL_SESSION and
r.COMPLETED_AT <= q.COMPLETED_AT
LEFT OUTER JOIN _sites s ON q.SITE_ID = s.ID
LEFT OUTER JOIN _views v ON q.CURRENTSHEET = v.VIEW_URL and
r.SITE_ID = v.SITE_ID
LEFT OUTER JOIN _workbooks w ON v.WORKBOOK_ID = w.ID
LEFT OUTER JOIN _projects p ON w.PROJECT_ID = p.ID
LEFT OUTER JOIN users u ON r.USER_ID = u.ID
LEFT OUTER JOIN system_users z ON u.SYSTEM_USER_ID = z.ID
GROUP BY q.SITE_ID, w.PROJECT_ID, s.Name, p.Name, w.Name, q.VIZQL_SESSION, q.COMPLETED_AT
I had proposed to present this at TC15, but was not selected.
We've tried using this with some success--but be warned that the "performPostLoadOperations" anchor will no longer exist in 9.1. "get customized views" is the next best thing.
Have there been any updates to this table? We're wrestling with everything the logs are telling us, trying to understand what's being tracked and how.
1 of 1 people found this helpful
Measuring performance through server is important, but you should separate streams a bit here so you have the right person working on the right thing.
Workbook design is a concern if performance is an issue on Tableau Desktop. That needs to get resolved in Tableau Desktop.
A noticeable deviation in performance once published is expected and not in control of the Tableau Desktop author, nor the Tableau Server admin. Something loading in 1-10 sec in desktop could add a few seconds after publish, the larger the load time in Desktop, the greater the natural deviation in Server. That is getting better in 9.3 due to launch soonish, but largely in us managing perception and moving load costs around. There will always be loading overhead that we won't be able to make go away without folding space and time.
If your normal deviation for a 5 second loading dashboard in desktop is about 3 seconds in server, then it's not worth bothering the server admin about performance (or network admins), but if a 5 second desktop load time is troubling then the author should work on that in Desktop and get it down to 2-3 second load and hopefully that translates to a 5 second total load in server, but client network/location would contribute to deviations as well. patterns with IP addresses are possible to suss using the methods mentioned in this thread, but you'll have to blend in the data showing the IP groupings by subnet/office/vlan/etc.
So think about who needs to work on this. If a desktop load issue, don't speak in terms of http requests. If a large deviation in load from desktop to server, then the http requests will help you tell the story.
Does the server load time sometimes get within reasonable limits but frequently takes longer? when does it take longer and what corresponds with that? Higher server load? Higher database load? VM snapshot (bad don't do those)? Higher rate of requests to server? Busy network period? You get the idea.
Hope that's helpful food for thought.