1 Reply Latest reply on Nov 12, 2018 12:35 PM by Mark McGhee

    Can I take one worker node offline?

    Jack May

      I'm having trouble with services stopping on one worker in a three-node cluster (Tableau Server v10.5). I've opened a ticket with Tableau support, but in the meantime I'd like to take that one worker offline. I've asked our network operations team to take it out of the hunt group (the cluster is behind an F5 load balancer), but background tasks are still being performed by the degraded worker. My question is about load balancing and request traffic. Pointing the load balancer to the two remaining nodes doesn't prevent work from being routed to the degraded worker, does it? If I understand correctly, the F5 just directs requests that come from the internet, and maybe background tasks originate with the primary node(?). Is there a way to take one worker out of service until I can put a fix in place? I know that will decrease processing capacity, but I'd rather have all jobs run slower than have some jobs fail on the degraded node. So how do I take that worker offline?

       

      Thanks,

      -Jack May

      IDEXX Laboratories, Inc.

        • 1. Re: Can I take one worker node offline?
          Mark McGhee

          Hi Jack,

           

          Your understanding is correct: the load balancer only routes incoming requests to the nodes that are running the "Gateway" service. It has no awareness that node "X" is running the Backgrounder service. Each node in a Tableau Server cluster is aware of which services are running on the other nodes. A node running Gateway receives the request and, if Backgrounder is needed, routes the work to one of the Backgrounder nodes.

          If a node is being problematic, you may want to remove it from the Tableau cluster altogether. When you do this, the other nodes will take that node "off their lists". But as long as the Tableau services are running on that worker, it will still try to connect to the primary and can cause issues. One customer I worked with had disabled the Tableau services on the removed node. Their IT team then did Windows OS patching on all the nodes in the cluster and restarted them, which resulted in the "removed node" becoming active again, and it caused issues until we stopped the service. They didn't plan to add the node back to the cluster anytime soon, so I recommended they uninstall the Worker software from that "removed node".
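
To picture why pulling a worker out of the load balancer pool does not keep background work away from it, here is a minimal Python sketch of the routing described above. It is a toy model only, not Tableau's API or its actual scheduling logic: the names `Node`, `Cluster`, and `simulate`, the node names, the service layout, and the round-robin dispatch are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class Node:
    name: str
    services: set = field(default_factory=set)


class Cluster:
    """Toy model: the cluster knows which services run on which node."""

    def __init__(self, nodes):
        self.nodes = {n.name: n for n in nodes}

    def nodes_running(self, service):
        return sorted(n.name for n in self.nodes.values() if service in n.services)

    def remove_node(self, name):
        # Removing a node takes it "off the lists" of the remaining nodes.
        self.nodes.pop(name)


def simulate(cluster, lb_pool, n_requests):
    """Send n_requests through the load balancer; each needs one background job.

    Returns (gateway_hits, backgrounder_hits) as dicts of node name -> count.
    Round-robin dispatch is an assumption of this sketch.
    """
    # The load balancer only sees pool members that run Gateway.
    gateways = cycle([n for n in lb_pool if n in cluster.nodes_running("gateway")])
    # Background work goes to any cluster node running Backgrounder,
    # regardless of what the load balancer pool contains.
    backgrounders = cycle(cluster.nodes_running("backgrounder"))

    gateway_hits, backgrounder_hits = {}, {}
    for _ in range(n_requests):
        gw = next(gateways)
        bg = next(backgrounders)
        gateway_hits[gw] = gateway_hits.get(gw, 0) + 1
        backgrounder_hits[bg] = backgrounder_hits.get(bg, 0) + 1
    return gateway_hits, backgrounder_hits


# Hypothetical three-node layout; "worker2" plays the degraded worker.
both = {"gateway", "backgrounder"}
cluster = Cluster([Node("primary", set(both)), Node("worker1", set(both)), Node("worker2", set(both))])

# Step 1: worker2 is out of the load balancer pool but still in the cluster.
print(simulate(cluster, lb_pool=["primary", "worker1"], n_requests=6))

# Step 2: worker2 is removed from the cluster itself.
cluster.remove_node("worker2")
print(simulate(cluster, lb_pool=["primary", "worker1"], n_requests=6))
```

In the first run the degraded worker receives no Gateway traffic but still appears in the Backgrounder counts; only after it is removed from the cluster does it stop receiving work. The sketch does not model the second point in the reply, that services left running on a removed node can still try to reconnect to the primary.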