The http_requests table is best viewed as a group of activities for each click. Some records will show the user ID; others will not. All will show the user's remote_ip, which can be used to group individual click activity. Records can further be grouped by
- session_id, movement within the Server structure, or
- vizql_session, opening and moving around within a workbook.
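As a starting point, a rough grouping query along these lines will show each "click group" described above. (This is only a sketch; the column names — remote_ip, session_id, vizql_session, created_at — are what I see on our server, so verify them against your own http_requests table.)

```sql
-- Sketch: group request activity into "clicks" by source IP and session.
-- Column names are assumptions; check them against your repository schema.
SELECT remote_ip,
       session_id,
       vizql_session,
       COUNT(*)        AS request_count,
       MIN(created_at) AS first_request,
       MAX(created_at) AS last_request
FROM   http_requests
GROUP  BY remote_ip, session_id, vizql_session
ORDER  BY first_request;
```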
If you have a QA or DEV environment, try following the entries in the http_requests table as you perform different steps. This will give you a better idea of how the server works, and you can see the individual program steps behind each click.
Hope this gives some insight into the table.
Hi, Matt. Let me begin by saying "I don't know". Now, with that out of the way...
I suspect that some of the http requests are generated "internally" (for lack of a better word), and aren't by default associated with a user.
For instance, the HTTP 302 redirects that happen during auth show as NULL for user_id.
Furthermore, in the http access logs, for each view requested, there is a GET request that ends in "?:embed=y&:from_wg=true", and all the POSTs and GETs to /vizql/* after that have NULL for the user_id in the http_requests table. Again, it's educated speculation, but I suspect that wgserver may be doing the work here. Indeed, looking into the wgserver production logs seems to indicate that it is doing said GETs.
I haven't looked into other occurrences of NULL appearing in the user_id column.
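If you want to survey those other occurrences, something like this might help (a sketch only; http_request_uri is the column name on our server, and split_part is PostgreSQL-specific):

```sql
-- Sketch: count NULL-user requests by URI (query string stripped)
-- to see which request families lack a user_id. Column names assumed.
SELECT split_part(http_request_uri, '?', 1) AS uri,
       COUNT(*)                             AS null_user_requests
FROM   http_requests
WHERE  user_id IS NULL
GROUP  BY uri
ORDER  BY null_user_requests DESC
LIMIT  20;
```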
What is your use for the info? Perhaps there is another approach that will accomplish your goals. The hist tables contain much of the same info, recorded as "Access" events. (They do not contain duration info, however, despite having a column named as such.)
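If the hist tables fit your use case, a minimal sketch of pulling "Access" events would look roughly like this (table and column names — historical_events, historical_event_types, hist_users — match what I see in our repository, but verify on yours):

```sql
-- Sketch: list "Access" events with the acting user's name.
-- LEFT JOIN because some events have no recorded actor.
SELECT he.created_at,
       het.name AS event_type,
       hu.name  AS user_name
FROM   historical_events he
JOIN   historical_event_types het ON het.id = he.historical_event_type_id
LEFT JOIN hist_users hu           ON hu.id  = he.hist_actor_user_id
WHERE  het.name LIKE 'Access%'
ORDER  BY he.created_at DESC;
```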
Thanks for the helpful detail, Chauncey Brandom and Rick Kunkel! The goal I'm trying to accomplish is to answer the question "What has been the experience of User Y with our Tableau Server's performance?". Generally this would be a response to a complaint from User Y, so that I could narrow down the problem (How much slower are we talking? What views might they be complaining about? When did we begin seeing a performance degradation? What data connections exist in the workbook at issue?). But I'd like to use it proactively as well, of course.
The hist tables are helpful as well, and I could get answers to my questions using a combination of the hist data and the perf data from http_requests, but the two sources of data are not easy to correlate and my guess is that I'd have to hop back from one view to another until I'd narrowed down who was looking at what view at what time, and how long it took. But that's very manual and frustrating to do. I could also probably find a way to correlate the historical_events table with the http_requests, but it would make the query complicated and perform even worse than it currently does.
Have you made much progress on this?
Might want to verify on another server, but it looks like most of our requests have everything they need in that http_requests table. Many times, when a user_id is not present, there is an associated vizql_session that appears to tie it to a user_id.
So doing something like the attached query catches ~90ish percent of the queries. It's still pretty slow without adding some additional indexes.
http_requests.sql.zip 454 bytes
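In case the attachment is unavailable, the gist of the technique is something like this sketch (this is my reconstruction, not the attached query itself; column names are assumptions, so adjust to your schema):

```sql
-- Sketch: back-fill missing user_id values from other rows that
-- share the same vizql_session. MAX(user_id) is a crude tiebreaker;
-- a session should normally belong to a single user anyway.
SELECT r.id,
       r.http_request_uri,
       r.created_at,
       COALESCE(r.user_id, s.user_id) AS resolved_user_id
FROM   http_requests r
LEFT JOIN (
    SELECT vizql_session, MAX(user_id) AS user_id
    FROM   http_requests
    WHERE  user_id IS NOT NULL
    GROUP  BY vizql_session
) s ON s.vizql_session = r.vizql_session;
```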
I've given up on associating users with all requests for the time being. The technique you outlined certainly gives me user_id values for sessions where it's been logged at least once, but the problem with my data is that I have sessions with NO user_id at all. So it doesn't give me enough non-NULL user_ids (using a LEFT join) to allow the data to be consistent enough for user analysis. I'm not sure what's going on exactly that's causing this on our server. I could get the right user_id values by parsing the remote_ip column in a self-join or subquery to the http_requests, but again, that's more complex and would hit the server more than I want to.
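For anyone curious, the remote_ip fallback I'm describing would look roughly like this. (A sketch only; it assumes one user per IP, which breaks behind NAT or proxies, so treat the results as approximate.)

```sql
-- Sketch: infer user_id from other requests made from the same
-- remote_ip. One-user-per-IP is a shaky assumption; column names
-- are also assumptions, so verify against your schema.
SELECT r.id,
       r.created_at,
       COALESCE(r.user_id, ip.user_id) AS guessed_user_id
FROM   http_requests r
LEFT JOIN (
    SELECT remote_ip, MAX(user_id) AS user_id
    FROM   http_requests
    WHERE  user_id IS NOT NULL
    GROUP  BY remote_ip
) ip ON ip.remote_ip = r.remote_ip;
```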
Still, thank you--I appreciate the help!
Hey Matthew -
If the goal of this exercise is to understand "What has been the experience of User Y with our Tableau Server's performance?", there is probably an easier and better way for you to approach it.
Trying to use http_requests and associated tables to understand the true amount of time spent rendering a viz is not a trivial exercise. If you look under the covers at "our" Performance History report you'll see that it is a very, very complex thing. You're really trying to recreate this particular wheel, and I wonder if you should even bother.
Here's what I'd suggest instead:
Simply run the Performance History report and then select "everything" on left-hand side:
Then, dump data in the "Underlying" tab after choosing "Show All Columns":
You now essentially have all the same data we use, without YOU needing to hack up some SQL yourself. Note that this raw data still isn't perfect - you can see all the NULL username values for actions we don't track identity for. The View Performance History report has ways of associating these with the correct request...So, you may see where I'm going with this....
Here is a slightly "hacked" version of the same report literally showing the report execution time for two users: Anakin Skywalker and Darth Vader:
This is just the "right hand side" of the same report over a long time period. You can still see lots of (blue) NULL values, but it's pretty easy to see and understand who owns what. For example, at point #3 we see a NULL render in the (upper) "Compute View" pane. However, look at the Load View pane directly below: clear as day, this is Anakin Skywalker doing something...so we can easily infer that #3 is Anakin, too.
I guess what I'm getting at is that it will be nigh impossible to get exactly what you want without expending tons and tons of effort...so why bother - use what is already there and generally "good enough" to create a solid performance baseline history on a per-user basis.