Why are you keying on PostgreSQL for your monitoring efforts? Are you monitoring other processes?
Hi Russell, thanks for your response. We picked PostgreSQL and wgserver because we consider them key processes.
We have automated monitoring via the following two methods. Why don't you do something like this?
Our monitoring app makes an HTTP call to our default Tableau Server portal and checks for a return code of 200. Behind the scenes, I believe this is done with a curl command, but it could probably also be done with wget.
/home/nagios/libexec/check_http -H insights2.xxx --return-code "200"
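For anyone without Nagios, the same check can be sketched in a few lines of Python. This is only a minimal sketch, not what the plugin above does internally: the function names `fetch_status` and `portal_is_up` are made up for illustration, and the 10-second timeout is an assumption.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for a GET on url, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 500 or 503.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout and similar.
        return None


def portal_is_up(url, fetch=fetch_status):
    """True only when the portal answers with HTTP 200."""
    return fetch(url) == 200
```

You would call it as `portal_is_up("https://insights2.xxx/")`, with the placeholder host swapped for your own server (the https scheme is an assumption). The `fetch` parameter is only there so the logic can be exercised without a live server.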
Our monitoring app calls out to insights.xxx/admin/systeminfo.xml. The XML it returns gives a status for each service. You need to whitelist your monitoring app so that it doesn't have to sign in to Tableau Server. Our monitoring app accepts the following service status values and raises an alert on anything else: "OK", "Active", "Passive", "Busy", "ReadOnly", "ActiveSyncing".
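The status check on that XML could look roughly like the sketch below. It assumes each service shows up as an element carrying a `status` attribute; the sample document, its element names and the "Down" value are made up for illustration, so compare them against what your own server actually returns. The healthy list is the one given above.

```python
import xml.etree.ElementTree as ET

# Status values treated as healthy, taken from the list above.
HEALTHY = {"OK", "Active", "Passive", "Busy", "ReadOnly", "ActiveSyncing"}


def unhealthy_services(xml_text, healthy=HEALTHY):
    """Return (tag, status) pairs for every element whose status is not healthy."""
    root = ET.fromstring(xml_text)
    return [
        (elem.tag, elem.get("status"))
        for elem in root.iter()
        if elem.get("status") is not None and elem.get("status") not in healthy
    ]


# Illustrative sample only; the real layout may differ.
SAMPLE = """<systeminfo>
  <machines>
    <machine name="host1">
      <repository status="Active"/>
      <vizportal status="Down"/>
    </machine>
  </machines>
  <service status="Down"/>
</systeminfo>"""

if __name__ == "__main__":
    for tag, status in unhealthy_services(SAMPLE):
        print(f"ALERT: {tag} reports status {status}")
```

Anything returned by `unhealthy_services` is what you would send out as an exception. An empty list means every reported status was on the healthy list.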
So, a couple thoughts -
I'm not a PowerShell guy, but Get-Process simply gets information about a specific process, right? If the process doesn't exist, it returns some sort of error, and that's what you're trying to key on. However, it can't tell you anything about the health of the process. Just seeing whether a process exists isn't particularly valuable, because it could be there, locked up tight and unable to do any meaningful work for you. It is essentially down, but your script would return "OK" simply because the process is there.
I also wouldn't worry too much about wgserver. It used to be more important when it acted as Tableau Server's Application Server, but it hasn't done that since version 9. Now it is more like a human appendix: it only handles REST API calls. If you're interested in the portal / app server, you want vizportal.exe.
I like Jeff's second idea a lot - you're leaning on infrastructure you don't have to build yourself.
Just as Jeff mentioned in the previous post, our organization uses open source alert monitoring software. We have it set to check for HTTP 200 OK from Tableau Server's Apache as well. We also use the same monitor to check the status of the certs and to ping the Windows server that hosts Tableau.
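For the cert part, here is a rough idea of what such a check boils down to. This is not how our monitor does it internally, just a sketch: `days_until_expiry` is a made-up helper name, and the date string is assumed to be in the format Python's `ssl` module reports for a certificate's `notAfter` field.

```python
import ssl
import time


def days_until_expiry(not_after, now=None):
    """Days left before a cert expires, given its notAfter string.

    not_after uses the format found in ssl.SSLSocket.getpeercert()["notAfter"],
    for example "Jan  5 09:34:43 2018 GMT".
    """
    expires = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expires - now) / 86400
```

You would alert when the result drops below whatever threshold suits you, say 30 days (that number is only an example).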
Another thing I experimented with back when I was still running Windows Server was creating a task in Windows Task Scheduler. The task watched Windows Event Viewer for critical events with certain codes or from services related to Tableau (ones that looked suspicious, or that Event Viewer had logged when bad things happened with Tableau), and if one occurred it would send me an email.
I also recently started reading this blog and its links for more ideas: How to Monitor Your Tableau Server – Part 2 – Tableau Server Application Monitoring | Paul Banoub's VizNinja Blog
I am also looking at using the REST API to check under the hood a bit.
Following on from others' posts.
Here is a link to a PowerShell script which can be used to monitor the Server Health by reading in the systemhealth.xml data and then emailing an alert message.
(Similar to what Jeff Strauss is saying)
All the best