
Toby Erkson's Blog


This year of 2019 has been gloomy for my Tableau world as I've been stuck with v2018.1.4 for a year.  Why so far behind?  I have to move from our current but "old" VM (Virtual Machine server) running Windows Server 2008 to a new VM with a current Windows Server 2016 OS; the VM team could not upgrade the OS on the existing VM, plus doing so would wipe out everything on it (their words).  The new environment was not set up exactly like my current production environment as I had requested, so it's been a total hassle working with an IT department that leaves much to be desired.  However, there are a couple of superstars that helped me and progress is happening.  Another time loss for me was taking on the additional server administration duties for three new environments:  Alteryx Server 2018 (which is horrible software), Alation, and on-prem Power BI (because we have a couple thousand SQL Server databases so it's "free"...whatever).

 

First, just a recent version upgrade test

I have the VMs set up and after various tests I'm finally ready for a real test drive.  Obliterating the primary and secondary servers, I installed 2019.3.0 raw -- meaning no production backup restored, just the basic install and nothing more.  I want to keep it simple.  I installed 2019.3.0 first and there was nothing noteworthy about it.  Next came the first real test (it had originally failed prior to some VM tweak):  the upgrade from 2019.3.0 to 2019.3.1 took just under 20 minutes.  Spinning up TSM was about 5 minutes.  So in less than 30 minutes I had upgraded versions!  That's fast compared to the pre-TSM versions.  Uninstalling 2019.3.0 took under 30 seconds.  Nice.
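For anyone curious about the mechanics, here is roughly what that upgrade looks like from the command line -- a hedged sketch only, since the scripts folder name below depends on the exact build the 2019.3.1 installer laid down:

REM Run the upgrade script that ships with the newly installed version (the folder name is a placeholder).
cd "C:\Program Files\Tableau\Tableau Server\packages\scripts.<2019.3.1 build number>"
upgrade-tsm.cmd
REM Once the script finishes, start the server back up.
tsm start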

 

During the upgrade script I refreshed my TSM screen and got this:

Panic!  But I reminded myself that TSM was being upgraded so it would make sense that it would "fail".  I sat on my hands and waited.

 

After several updates in the command window I clicked the refresh button in the TSM browser window and got this:

Okay, that makes sense.  I like that kind of error message.

 

The script popped this up...

...and I thought, great, this is gonna be a long one, but it was only a couple of minutes and it continued on.  Yay!

 

Restore a backup

Next, I restored our production backup.  That took an hour sitting at the 24% mark, 20 minutes at the 27% mark, etc.  Here's an interesting bit:  the necessary services required for indexing were spun up (at the 75% mark) and here they are:

 

So we get an idea of the basic processes necessary to perform indexing.
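For reference, the restore itself boils down to a single TSM command (a quick sketch -- the backup filename here is just an example):

tsm maintenance restore --file prod-backup.tsbak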

Okay, from stop to start -- looking at it from TSM's viewpoint -- the total restore time was less than 100 minutes.  The complete output:

According to the Server Disk Space report my primary is 295GB and secondary is 168GB in size.  I happened to open the Background Tasks for Non Extracts and saw this big surprise!

Over 1.6K pending tasks!  And this is after an hour of the restore!  Scrolling through the list there was one line for One Time Enable Flows Site Settings OnPrem, one line for Enqueue thumbnails upgrade, and all the rest were Upgrade thumbnails.  Interesting.  Another reason for performing the upgrade during off-peak times.
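If you want to watch those numbers outside the browser, you can also count the pending jobs straight from the repository -- a hedged sketch only, assuming the standard background_jobs table and that the column names are as I remember them:

-- Count unfinished background jobs by type (e.g. "Upgrade thumbnails");
-- assumes the job_name and completed_at columns exist in your version.
SELECT job_name, COUNT(*) AS pending
FROM background_jobs
WHERE completed_at IS NULL
GROUP BY job_name
ORDER BY pending DESC;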

 

Next?

Actually, now I will blow it all away, downgrade to our current version, restore a backup, then upgrade to 2019.3.1 along with all the server config tweaks we use.  I expect things to flow smoothly now so there's no need for me to document it.  If anything surprising pops up I'll mention it.  If you have questions I highly recommend asking them in the Server Administration forum where you'll have a much better chance of someone answering...and probably someone more knowledgeable!

Ciara Brennan posted a link to an interview with current CEO Adam Selipsky, "Tableau Software - tackling the data mountain" in Gigabit magazine, and focused on the community piece of it.  To recap:

 

From Adam Selipsky in the GigaBit magazine interview, Tableau Software: Tackling the data mountain | Cloud Computing | GigaBit:

“The company also has this really unusual and unique asset – and that’s the Tableau community. It's an incredibly energizing group to be a part of and frankly, it's also an incredibly important asset for the company. It's not easily matched by spending money on it. It’s something I think has been very carefully nurtured over a great number of years.”

 

This is my commentary on it:

 

Adam is correct in that a great community is not something that can be created by just throwing money at it, and it does take time.  Now, that's not to say money is completely out of the equation!  Funding is necessary for hiring quality personnel for the social spaces and, in the case of our community forums, for decent software.  I'm sure monies are needed in other aspects of creating, building, and maintaining a customer community as well -- Zen and Ambassador programs, Tableau Conferences, swag, etc.

 

For over 20 years I've been in a lot of community forums (email lists and bulletin boards before forums existed) -- both for my hobbies and for work.  I even set up and admin'd a forum on my server for a buddy's automotive repair shop.  While every forum will have its share of experts, it's the sense of positive community and friendships that can pull people together to build something greater.  I left a leading, subject-specific enthusiast forum because I finally had enough of the negativity.  Having a thick skin is one thing, but needing to be armored for most conversations is ridiculous!  Establishing and maintaining a friendly environment definitely takes more effort, predominantly on the house side of the equation:  The admin/moderation team must have rules that they enforce and model (i.e. walk the walk).

 

What I find unique about the Tableau community forums is that they are highly successful despite being owned & operated by Tableau.  The majority of forums are started by an enthusiast, and if one is created early enough it tends to become the leader -- the one most people turn to.  Enthusiast-owned forums typically run pre-built forum software that is regularly updated, make use of custom settings, and quickly establish a moderation team.  On the flip side, corporate-run forums are typically sterile:  cheap software, minimal moderation, minimal answers from staff with little product knowledge, relying on canned scripts and end users to fill the gaps.  They tend to generate more frustration than answers when it comes to questions not readily found in the documentation.  Tableau was smart to use a dedicated forum package, and after a few years they finally implemented forum moderators.

 

While I believe the gamification (points & badges) helps engage and encourage participation, I think it was a small yet active cohort of Tableau Desktop users that provided the spark that started the strong community we see today.  Their posts -- gathering details and then explaining their answers, being professional yet personable and caring -- set the tone of the forums.  Setting a consistent example and encouraging others to learn and increase their own skills by participating in answering questions really got the "community" ball rolling!  I think this was the "special sauce" that gave life to the Tableau community and what makes it unique compared to other corporate-run forums.  With the inclusion of additional social media outlets the community grew, and TC attendance -- putting faces to the names along with face-to-face interaction -- accelerated the maturation of the community as we now know it.

 

Tableau's community is a rare asset that not only benefits the company but the users of the products it produces.  It's a synergistic system that other companies could emulate to positively differentiate themselves from competitors and provide a more satisfying user experience.

A question was asked today on Twitter about the handling of production Tableau Server environments.  There were a variety of responses and they varied according to the needs of the particular company, thus there was no -- and never will be! -- one single correct answer.  However, I did notice that there were those who were not employing their Tableau Server environment in accordance with Tableau's EULA (End User License Agreement).  Tableau only allows the following:

  1. One production environment:  Where content (workbooks, data sources, etc.) is published and consumed in ALL ASPECTS, be it QA/testing, user acceptance, or production usage.
  2. Two non-production environments:  Where software testing can take place.  For example, new software version testing and/or beta testing.  I believe very specific trouble-shooting can happen here if it would normally impact the production environment.

(Like how I used a numbered list that matches the EULA requirements?)

 

In a nutshell

What this means is that Tableau does NOT allow the common SDLC usage of non-production report testing (QA/DEV) in one server environment and then pushing production-ready reports into another, separate [production] server environment.  All QA/DEV and PROD work must happen in a single Tableau Server environment!

 

Oops and now

Well, we used to follow the common SDLC as that is what our Cognos team uses and what the prior BI supervisor expected.  It wasn't until some time later, during a move from a single node to a 2-node environment, that we discovered we were not following the EULA.  Our Tableau sales reps were very understanding in allowing us time to make the change.  Luckily, due to the self-service process already in place there was very little consternation and my end users took to the new process quite rapidly.  When our 2-node environment was switched over, becoming our production environment, we implemented our new Tableau BI paradigm, the continuously-improving life cycle, a.k.a. CILC.

 

How CILC came into being was driven by me but decided by my end users.  Yep, I gave all 200+ publishers the option to vote on the method they wanted to employ!  I've attached the two-page document I sent to my publishers, SDLC_Proposition.docx.  I briefly explained what was going to happen and gave five options on how to begin the new SDLC.  They were given ample time to respond and provide any feedback.  These were their options:

  1. Have a QA Site and a PROD Site
  2. Have a QA Project and a PROD Project
  3. Use a suffix or prefix on workbook names to designate QA or PROD
  4. No QA on the Tableau Server at all.  Instead, have a "continuously improving", or agile, PROD Server.
  5. A combination of any of the above.

 

The overwhelming vote was for #4, with only one publisher needing separate QA and PROD areas (I had them use option #2; this was before child Projects were implemented in TS).  Technically speaking, there's nothing stopping Project Owners from implementing options #2 (as child Projects in our latest version of TS) & #3 if they want, and this works just fine within the spirit of CILC.

 

I thus began working on what "continuously improving" meant and how to convey it to my end users.  The below document is what came about after the voting.  Since my end users are notorious for not reading emails and documentation...or at least not reading them comprehensively...I had to keep it short, providing just the necessary info.  This is one of the first documents my users need to read when they are granted the site role of Publisher.  It's also attached if you want to copy and use it in your own organization.

 

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

 

The Tableau Self-Service & CILC Paradigm

By Toby Erkson, Tableau Administrator, February 2018

With our multi-node environment we are truly an organization that functions in the self-service and continuously-improving life cycle methodology.

What does this mean?

Well, by using Tableau you are already involved in self-service BI (business intelligence reporting).  It basically means you don’t have to rely on another team or person to create your reports because you can easily do it yourself.

What does continuously-improving life cycle mean?  This is where we're making a slight change to what is called the software development life cycle, or SDLC, which is how traditional BI reporting works (think Cognos). This is basically what it looks like:

Many of you are used to creating a workbook (Design & Develop), publishing it to the QA Server (Develop & Testing), getting end user acceptance (Testing), and then publishing to the PROD Server for production use (Deploy & Maintenance). Well, this is going to change just a little bit…

The QA Server as you know it in the SDLC is gone.  This is 100% driven by the end user license agreement (EULA) from Tableau, because Tableau considers how we were using the QA Server to be a production environment and the EULA only allows for one production environment[1].

 

So what you will now be doing is what I’ve termed continuously-improving life cycle, or CILC for short (pronounced silk).  You will develop the workbook in Tableau Desktop and when it gets to the point where it’s working for the end user you will publish it directly to the production server (PROD Server).  The end user can use it immediately and if it needs updating/tweaking/maintenance you will make the necessary changes and publish over it, thus creating a revision history.  Rinse and repeat.  This is similar to agile software development and is actually not uncommon in other companies that use Tableau for enterprise reporting.

 

Not too crazy about this? Does it seem too millennial?  Fear not!  You have options to revert back to a more traditional SDLC:

  • Add a suffix or prefix to the workbook name that designates it as a QA report.
  • Set its permissions such that only certain people can see and interact with it while it’s in the UAT phase.
  • Create a sub-project for QA that has greater permission restrictions.  Once the workbook is ready for production usage you simply move it from the QA sub-project to the production project.
  • Or implement a combination of the above options.

It’s up to you or your Project Leader or Project Owner to decide upon the particular development life cycle you wish to implement, SDLC or CILC, as there is no wrong way thanks to Tableau’s fast and flexible reporting environment.

-- End Of Document --

 


[1] We are allowed two non-production environments.  For example, testing Tableau Server software before implementing it into production.

Check this out:

 

This was my production Tableau Server with a 1.99TB hard drive that had only 403GB of free space.  Wow, that is a poo-poo ton of usage!!  I was getting disk space warnings every morning when the automatic backup process was running.  While we are an extract-heavy environment, I couldn't believe that I was actually running out of hard drive space when the biggest extract was 9GB and the majority of them were 1GB or less!  What the heck was going on?

 

I mentioned my situation to our Tableau tech representative and he asked if the drive was getting defragged and if cleanups were being performed.  I knew the drives on all instances were being defragged so that wasn't the issue.  Because I didn't want to lose TS (Tableau Server) history for debugging, I wasn't performing cleanups.  I disliked the idea of doing a cleanup because of the information that is lost.  Archiving the files (.zip) is very inefficient and not ideal; a db that is continuously being updated would be a better alternative.  Also, performing a cleanup with the TS stopped isn't a good option as that, of course, would mean the TS is unable to do any reporting work.  Since it's a production server running 24x7, any downtime is totally undesirable and will impact someone, somewhere.

 

However, since I was gradually losing disk space I had to do something...

 

So I bit the bullet and ran a cleanup while the TS was running:
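For reference, here is roughly what that looked like from the command line -- a hedged sketch, assuming a default 10.5 install path for the pre-TSM tabadmin tool:

cd "C:\Program Files\Tableau\Tableau Server\10.5\bin"
REM Running cleanup while the server is up clears what can safely be removed
REM (in my case that turned out to be a LOT of log files and http_requests rows).
tabadmin cleanup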

 

20 minutes later the cleanup was done and now here's what my free space looks like:

 

  WOW!  That was a LOT of log file and HTTP requests cleanup goin' on!!

 

Here's how my disk space usage looks now:

 

Quite an improvement, right?

 

An embarrassing but educational moment for me.  Performing a regular cleanup is a necessary routine the TS admin needs to perform.  How often depends on one's particular environment, but once a month seems reasonable and that's what I was previously doing on our QA Server.

 

How to preserve the history in an efficient and easy-to-use format is a whole other subject; search the Server Admin forum for suggestions or leave a comment below on how you do it.

Looking over one of my blog posts, Tableau Server licensing & hardware upgrade, I can tell I was frustrated with my lack of server knowledge, both in terms of hardware and Tableau software.  Brandyn Moody did an admirable job of stating facts and providing a friendly & helpful reply -- something that I still need to work on.

 

As a followup, our Tableau technical account representative has been running Tableau Server health checks for us for the last few months, showing how things are trending in our environment.  We've also upgraded our hardware to 24 total cores, moved to a two-node environment with a dedicated 100% Backgrounder worker, and just last month upgraded to 10.5.2.  Thus the server health checks have been helpful in that they show a "before and after" of our whole environment with a focus on how it affects the end user, so we can see what is doing better (or worse), how things are trending, etc.  If you don't have a technical account rep., ask your sales rep about getting one -- along with a Tableau Server health check -- if you are lacking Jedi server skills.

 

Not to go unmentioned, our sales rep. stays in contact with me/us to make sure support tickets are getting the appropriate response times & priority, that we're aware of any training and events, and works with our Tableau technical rep. to keep me, my users, and my manager all happy and productive.  Thanks, Tableau, for being easy to work with.

 

Some examples from the health check:

Purpose*

This unlocks the postgres database on the Tableau Server, allowing full access to all of the tables in the database and not just the views. This gives the Tableau Server administrator visibility into the database. Please note that this is not supported by Tableau.

 

Get the password for the postgres db

This tip comes from Zen Master Tamas Foldi and you need to use it in order to get the password for the access operation below.

To get the pgsql password use the following command on the Tableau Server you will be accessing:

tabadmin get pgsql.adminpassword

The blurred-out text in the output above is where you'll find the password for your db.

Access for the postgres db

This hack is VERY DANGEROUS and UNSUPPORTED by Tableau! Unless you are VERY EXPERIENCED you should only access the database tables using Tableau Desktop or other read-only tool! This is not the preferred method of gaining access to the Tableau server database. Use at your OWN risk!

How-To

1. Open a command window (DOS prompt) and go to the Postgres binary directory for your Tableau Server, …pgsql\bin (most installs will look like C:\Program Files (x86)\Tableau\Tableau Server\8.0\pgsql\bin). The DOS command prompt in the Postgres binary directory:

2. Now ‘open’ the PostgreSQL command prompt by typing in this:

psql -p 8060 workgroup tblwgadmin

Please note that if you use a different port then you need to change the 8060 value to the value of the port you use.

Here’s what my result looked like:

3. Notice that you need to enter a password.  That's why you grabbed the password mentioned in the beginning.  Enter your password and then hit the 'Enter' key.  Note that the cursor won't move and you won't see any text, so type carefully.  I recommend using copy/paste.

 

4. At the workgroup=# prompt you can now execute commands:

For example, enter this command to change the role of the tableau user to have READ and WRITE access:

alter user tableau superuser;

After the above command executes you’ll see “ALTER ROLE” display and then an empty prompt:
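If you do flip that switch, it's worth knowing how to flip it back when you're done (standard Postgres syntax):

-- Revert the tableau user to its normal, non-superuser role when finished
alter user tableau nosuperuser;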

Or in the situation I had, delete a custom view from the database that was no longer needed using the DROP command:
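In my case that was a one-liner along these lines (the view name here is made up -- substitute your own):

DROP VIEW my_obsolete_custom_view;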

5. That's it. When you're done close the DOS command window.  Either CTRL + c  or  \q  (backslash followed by the letter Q in lower case) will exit you from the workgroup=# prompt and put you back into the cmd prompt.

 

Remember, you’ll now have the ability to write to your database and delete things! This is VERY DANGEROUS and UNSUPPORTED by Tableau! Unless you are VERY EXPERIENCED you should only access the database tables using Tableau Desktop or another read-only tool! Make sure you back up your database regularly.

 

 

 

*Adapted from my Tabwiki document.

I saw it today on Twitter.  Someone who I consider a key player at Tableau, a self-admitted Tableau stalker, left the company.  Just time for them to move on.  We know the drill.

Last week...or at least sometime between now and the first of the year...two other great Tableau employees I knew had left the company.  There was a hint there are some more.

 

I grew up during, and into the end of, the "work for one company until retirement" paradigm.  As my work life has progressed, that paradigm is no longer the expected norm.  Now, thanks to social media and job-focused media like LinkedIn, even people who are not looking for a job are still getting offers.  Having worked for a few contracting agencies, this happens to me -- I stay in contact with them juuuuust in case, as I learned early in my career that corporate loyalty to employees is an illusion, subject to immediate dismissal at their whim.  I, too, left a great company because after a time there I knew my position was a dead end, so when a previous manager reached out to me with a professional skills growth opportunity I went for it (and thus was introduced to Tableau), so I get it.

 

It saddens me to see such good people leave because I'll miss them.  It's like growing up with a friend during primary school and in high school their family moves out of town.  It's like having a sibling move out of the house.  Their absence is felt in the knowledge void left behind.  You get used to having these people around and using their knowledge to help others, heck, to help yourself, too!  The loss is magnified by the deeper product and tribal knowledge that goes with them.

 

I wish them well.  I know that the companies that now have them have gained an excellent resource.  Just know that you are missed.

 

 

Note:  I'm not diminishing their replacements.  This has nothing to do with them.  There's no doubt Tableau will do its best to replace them with great folks who could possibly <gasp!> be even better, and I do look forward to meeting and interacting with them.

 

-- Twitter headstones --

You know, being a relative newbie to the server side of IT can be frustrating because there are so many aspects to it, and networking-type issues are my bane.  Firewalls, proxies, VIPs, SSL, etc. are all quite different and have their own teams [at least where I work] but can seem like the same thing from an inexperienced viewpoint.

 

While testing our upcoming 2-node Tableau Server environment I was having an issue with the "pretty" VIP (Virtual IP) that my end users would use to reach the Tableau Server.  Screen results from using the VIP address were very random.  Sometimes the login page would display, sometimes not.  Sometimes it would show the list of Sites, other times just a partial list.  If I managed to log in I would get some workbooks, but clicking on one would result in a "content not found" or "page could not be accessed" type of error.  Web page rendering performance was often terrible.  As part of the load-balancer setup I carefully followed the instructions from Add a Load Balancer, several times in fact, but still no joy.  I set up a case with Tableau Support, sent them my log files, and waited.

 

Two weeks went by and no reply.  I pinged them -- because normally they are very quick to respond -- and was told an engineer was looking over my case.  Sigh.

 

My manager, who is much more skilled with networking than I, asked if the F5 was pointing to both the primary server and the worker server.  I said "both".  He suggested that the F5 team point it only to the primary server since there is no need for it to point to the worker server.  So I confirmed with my contact that, yes indeed, the F5 was pointing at both (a load-balancing function) so I asked them to have the "pretty" VIP only point to the primary Tableau Server.  They made the switch within minutes and I tested that.

 

BOOM!  It worked!

 

Lesson learned for this newbie:  Make sure load-balancers, proxy redirects, etc. point only to the primary, and make sure you emphasize that requirement to whoever your contact is if you aren't doing it yourself.

[Image:  an "old age and wisdom" vs. "youth and ignorance" meme]

How...subtle

 

 

A few mornings ago I was going through my Twitter feed and saw a reference to an article that looked good.  I'm not a huge believer in certifications and make no secret of it if you search the forums.  It's mostly because I've seen so many people get certified who don't have any -- or enough -- real-life experience.  I've been in the business intelligence realm for over 20 years so I do have the experience and wisdom that comes with living it, not simply reading about it.  The author is someone I respect who could give me a look at the other side of the story, so I read their opinion in their blog post.  They did a good job, but I tweeted my viewpoint.  Here's part of our conversation (which Twitter absolutely SUCKS at):

For both of us a real conversation would have been nice to clarify (argue?) our viewpoints.  I let my subconscious sit on this and decided that I really do disagree with the two rebuttals (which were thought-provoking), so here are my thoughts, and you're free to disagree.

 

Experience vs. Study

Okay, why do I emphasize on-the-job experience and claim it's better than "individual practice and mastery"?  While the whole article is good, it's item #2 that makes a perfect point in this piece by Chris Love, Quality Assurance: A dirty word for Data? 7 tips for getting QA right.  There is only so much one can plan for or practice in a closed environment.

Here's a real-life example

I took Tae Kwon Do years ago and in class we learned our kicks, our blocks, our punches, as well as close-in combat techniques.  We would practice and practice and become very good with them.  But practice is different from the real world, so we would also have the occasional sparring match and that is where things fell apart for some students!  Actually being in a situation where you don't know exactly what your opponent is going to do is quite different.  You can anticipate, but you may anticipate wrong.  You could even perform the move correctly but do it too slowly, or at the wrong location, or not with enough strength, and fail.

 

Fail!?  How could you fail if you can demonstrate a perfectly executed example?  Because reality is different from study, that's why.  As such, in the workplace things come up that you didn't or can't expect.  Which brings us to...

 

A lack of vision

Okay, so the slash in "chaos/random factor" wasn't interpreted as I expected; it's meant to indicate two words that mean pretty much the same thing -- trunk/boot when talking about the rear luggage space in a car, wrench/spanner when talking about a mechanic's tool for working with bolts, etc.  The word "chaos" seems to be defined differently between us, too.  Either way, the randomness of humans is chaos; the aspects one can't plan for, can't see coming.  When working with Tableau Desktop and Tableau Server I've had users come to me with questions that I couldn't immediately answer.  Does that mean I have a lack of vision?  No, it means I can't plan for every single eventuality.  NOBODY CAN.

 

As a parent -- and I've found pretty much anyone with children, be they parents, guardians, or our under-appreciated teachers, can relate to this -- I know that children are a perfect example of chaos, and no amount of planning, book reading, internet searching, etc. will give one the "vision" to counter 100% of the issues they may create.  That is not some shallow statement; it has more depth than most think.  Having been childless for decades and then suddenly becoming a parent has given me this experienced knowledge.  Ask any parent and they can confirm it, usually with a reminiscing smile.  Oh, and every child is different, even among siblings, so what works for one child may not work on another.

 

Think of getting your first driver's license and being on the road by yourself for the first time versus how you are as a driver now -- we all have stories of some of the silly and terrible things we've seen (or done) while driving that we could never dream up or simulate.

 

Experiencing subject-related chaos leads to deeper knowledge

Something happens -- a piece of chaos wedged in the teeth of your machine, something your "vision" was unable to foresee despite all your planning and studying -- and now you have to ask an expert and/or delve deeper into a subject you thought was unnecessary, boring, or had forgotten due to lack of use (which happens more than you think).  It's the solution you get after those "I don't know, but let me get back to you" moments where I believe sticky knowledge is gained.  By "sticky knowledge" I mean information that isn't quickly forgotten.  It's that "oh, that happened to me once and I had to do this..." knowledge.  This knowledge stays with you because it's gained through on-the-job experience, and it gives you extrapolated knowledge -- that leap from A to C without needing step B.

For example, you can see a file on a network drive but you can't open it, so you contact the folder administrator and explain your issue.  They tell you that you don't have permission to read files.  Why weren't you already given access?  Well, you are the bit of chaos that the administrator didn't know would be needing the access (very common in an enterprise environment).  You ask for permission to read files on the network folder and they grant your username the permission.  Now you can read files from the network folder.  Weeks later, a Tableau Desktop publisher tells you the Tableau Server is broken because when they refresh their new report extract on the Tableau Server the data does not come through, even though they used the proper UNC (Universal Naming Convention) file path to their Excel workbook in one of the many corporate file directories.  Because of your prior experience with being unable to read data in a network drive folder, you use your sticky knowledge to surmise that the Tableau Server doesn't have Read permission on the folder where the Excel file resides.  This experience allows you to trouble-shoot more quickly, reducing downtime compared to going to the forums and posting a question or submitting a ticket to Tableau Support.

 

Oh, by the way, the tidbit of knowledge I just shared above isn't found in the documentation, so if it were part of a question on a certification test your chances of getting it wrong would be greatly increased.  Just sayin'...

Update

I wrote the original post yesterday but due to delays I didn't publish it until today.  I've played around with this and came up with an improvement.

 

Onward

Seeing that Tableau translated TRUE to 1=1 -- further complicated by Tableau wrapping logic around it via a CASE statement -- why bother having a CASE statement slow things down when I'm only interested in one output of it?  The only output I care about from the CASE statement is 1, so I changed the left side of the Join Condition from TRUE to just 1:

 

 

When I looked at the SQL using the Convert to Custom SQL that second CASE statement was gone!

...
"ITEM_LOC_VENDOR_DIM_V"."CURR_TS" AS "CURR_TS",
(CASE WHEN ("ITEM_LOC_VENDOR_DIM_V"."PRIM_ALT_VNDR_CD" = 'P') THEN 1 WHEN NOT ("ITEM_LOC_VENDOR_DIM_V"."PRIM_ALT_VNDR_CD" = 'P') THEN 0 ELSE NULL END) AS "$temp0"
  FROM "CUSTOMS"."ITEM_LOC_VENDOR_DIM_V" "ITEM_LOC_VENDOR_DIM_V"
  ) "t0" ON
  (
   ("RVC_TOOL_NAFTA_V"."CO_CD" = "t0"."CO_CD")
   AND ("RVC_TOOL_NAFTA_V"."ORG_CD" = "t0"."ORG_CD")
   AND ("RVC_TOOL_NAFTA_V"."PLNT_CD" = "t0"."PLNT_CD")
   AND ("RVC_TOOL_NAFTA_V"."SUPLR_CD" = "t0"."SUPLR_CD")
   AND ("RVC_TOOL_NAFTA_V"."CHILD_ITEM_NO" = "t0"."ITEM_NO")
   AND (1 = "t0"."$temp0")
  )
  LEFT JOIN "CUSTOMS"."PURCHASING_BUYER" "PURCHASING_BUYER" ON
  ( ("t0"."PPF_BYR_CD" = "PURCHASING_BUYER"."BYR_CD") AND ("t0"."SUPLR_CD" = "PURCHASING_BUYER"."SUPLR_CD") )

 

See that?  I went from a time-consuming CASE statement to a simpler condition of (1 = "t0"."$temp0").  I canceled the custom SQL and ran the extract.  It ran almost 30 seconds faster!

A report developer (a non-IT business user) came to me regarding an issue in a workbook of theirs.  The reason isn't relevant because there was something else in the workbook that challenged me.  I saw that it used Custom SQL -- something I heavily frown upon! -- with a JOIN filter that took some pondering to get right.  Please note that I have modified the SQL for brevity as it's the concepts we are concerned with.  The data source is an IBM DB2 database.

 

Here's the first part that furrowed my brow, highlighted in orange:

select
     V1.CO_CD
    ,V1.FG_MDL_NO
/*
    ,t4.OEM_MAKE_CD
    ,t4.OEM_MAKE_NAME
    ,t4.MAKE_MDL_GVW_RATG_CD
    ,t4.MDL_SER_CD_BY_GOVT
    ,t4.FIN_CNTLG_REF_CD
*/
    ,V1.MAKE_BUY_CD
    ,V1.COMP_GRP_NAME
    ,V1.SUPLR_CD
    ,'XXXXXXXX' as SUPLR_NAME
    ,V1.SUPLR_TYPE_CD
    ,V1.HT_CD
    ,'######' as CURR_HTS
    ,case
        when v1.ht_cd='######'
        then null
        else '•'
     END AS HTS_CHGD
    ,V1.PREF_CRITR
    ,V1.PRODCR_INDC
    ,Trim(T1.ITEM_DESC) as DESCRIPTION
    ,V2.FACILITY_ABBR
    ,T3.DEPT_CD
    ,trim(T3.BYR_CD) ||'-'|| trim(T3.BYR_NAME) as BUYER
from ...

 

Why the bulk comment (/*...*/)?  If it's not needed then don't include it; it can always be added back later if necessary.  Get rid of it!

'XXXXXXXX' as SUPLR_NAME served no purpose, it wasn't used.  Get rid of it!

Same for '######' as CURR_HTS , get rid of it!

The CASE statement can be made into a calculated field.  Get rid of it!

TRIMming the description can be done in a calculated field or where needed.  Get rid of it!

The BUYER field can be a calculated field, too.  Get rid of it!

We need to remember that if we are not going to use Custom SQL then we need to convert it into a form that doesn't require functions in the SELECT body.
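Here are rough sketches of the calculated fields that would replace those SQL expressions -- the Tableau field names are my guesses based on the column names above, so adjust them to match your data source:

// HTS Chgd -- replaces the CASE ... END AS HTS_CHGD expression
IF [Ht Cd] = '######' THEN NULL ELSE '•' END

// Description -- replaces Trim(T1.ITEM_DESC)
TRIM([Item Desc])

// Buyer -- replaces trim(T3.BYR_CD) ||'-'|| trim(T3.BYR_NAME)
TRIM([Byr Cd]) + '-' + TRIM([Byr Name])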

 

Next came the FROM body:

...
from
    customs.rvc_tool_nafta_v V1
        LEFT OUTER JOIN CUSTOMS.ITEM_MASTER T1
            ON V1.CHILD_ITEM_NO=T1.ITEM_NO
            and v1.co_cd=t1.co_cd
        left outer join customs.facilities_dim_v v2
            on v1.plnt_cd=v2.plant_cd
        left outer join customs.supplier_parent t2
            on v1.suplr_cd=trim(t2.suplr_cd)
        left outer join customs.item_loc_vendor_dim_v v3
            on v1.co_cd=v3.co_cd
            and v1.org_cd=v3.org_cd
            and v1.plnt_cd=v3.plnt_cd
            and v1.suplr_cd=v3.suplr_cd
            and v1.child_item_no=v3.item_no
            and v3.prim_alt_vndr_cd='P'
        left outer join customs.purchasing_buyer t3
            on v3.ppf_byr_cd=t3.byr_cd
/*
        left outer join customs.udi_itc_mdl_xref t4
            on trim(V1.fg_mdl_no)=trim(t4.oem_mdl_cd)
*/
where
    v1.make_buy_cd='B'

 

Again with the comment!  Get rid of it!

Notice the blue line (and v3.prim_alt_vndr_cd='P'), which is a condition of the JOIN, and the magenta line (v1.make_buy_cd='B'), which is a condition on the entire output.

 

Getting started

What I did was add the necessary tables to the Tableau data source canvas and created the basic JOINs.  This got the information into Tableau so I could then work with the conditionals.  Here's what it looks like and this shows the ITEM_LOC_VENDOR_DIM_V table (technically a view but just roll with me on this, okay?) that will be LEFT JOINed to the RVC_TOOL_NAFTA_V table with that blue condition:

 

Taking care of the WHERE condition

Next, doing the easy part first, I created a data source Filter:

Thus the overall WHERE condition, v1.make_buy_cd='B', is now taken care of.

 

Taking care of the JOIN condition, starting on the right side

Now the conditional part (blue text) for this LEFT JOIN:

     ...left outer join customs.item_loc_vendor_dim_v v3
            on v1.co_cd=v3.co_cd
            and v1.org_cd=v3.org_cd
            and v1.plnt_cd=v3.plnt_cd
            and v1.suplr_cd=v3.suplr_cd
            and v1.child_item_no=v3.item_no
            and v3.prim_alt_vndr_cd='P'
        left outer join customs.purchasing_buyer t3...

This was a real head-scratcher for me because I'm not familiar with creating 'custom' Join Clauses on Tableau's data source canvas, thus it took some experimenting before I got the correct results.  Here's the final solution:

See the last line?  Well, it turns out both the left and right sides MUST be filled out, otherwise the custom JOIN clause won't "stick" -- it will disappear.  If you don't see it then it's not there.  So, how do we get that?  It's very easy:  just click the drop-down arrow in the last row of the Join box and select the option "Create Join Calculation..."

You'll be presented with the calculated field editor where you can create your Join Condition just as you would create a Calculated Field:

When done simply click the green OK button and it will be added.  You must have a Join Clause for the other side of the JOIN as well.  After BOTH sides of the Join Clause have been created you can then exit the Join box.

I used parentheses simply to make sure the order of operations happens as I expect it to, but they are not required.

 

Setting the Join Condition on the left side

Both sides of the Join Condition need an entry, a value, since there's that equal ( = ) operator between them.  Since I don't care about what's on the left side of the JOIN I just make sure it is always a Boolean TRUE.  This just fulfills the need to have something there.  If TRUE is not valid for your situation simply use (1=1).  Conversely, if you need a FALSE condition you can alternately use (1=0).

 

Now, on the other side of the equal condition is where I need to make sure only rows with the letter "P" for the [Prm Alt Vndr Cd] are pulled.  The result of this needs to be Boolean (TRUE or FALSE).  The equation for the JOIN condition is now completely set.
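To make that concrete, here's what the two join calculations look like (sketched from the description above; your field name may differ slightly):

// Left side (RVC_TOOL_NAFTA_V):  a constant placeholder that is always TRUE
TRUE

// Right side (ITEM_LOC_VENDOR_DIM_V):  only match rows flagged as the primary vendor
[Prm Alt Vndr Cd] = 'P'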

 

How Tableau translates it

Here is what the query looks like when I select Data >> Convert to Custom SQL.  This is only for the curious and isn't necessary to do when converting Custom SQL to the data source canvas.  I'm only focusing on the FROM body since the SELECT body is now no longer "interesting" to look at:

...
FROM "CUSTOMS"."RVC_TOOL_NAFTA_V" "RVC_TOOL_NAFTA_V"
  LEFT JOIN "CUSTOMS"."ITEM_MASTER" "ITEM_MASTER" ON
    (
     ("RVC_TOOL_NAFTA_V"."CHILD_ITEM_NO" = "ITEM_MASTER"."ITEM_NO") AND ("RVC_TOOL_NAFTA_V"."CO_CD" = "ITEM_MASTER"."CO_CD") AND ("RVC_TOOL_NAFTA_V"."ORG_CD" = "ITEM_MASTER"."ORG_CD")
    )
  LEFT JOIN "CUSTOMS"."FACILITIES_DIM_V" "FACILITIES_DIM_V" ON ("RVC_TOOL_NAFTA_V"."PLNT_CD" = "FACILITIES_DIM_V"."PLANT_CD")
  LEFT JOIN "CUSTOMS"."SUPPLIER_PARENT" "SUPPLIER_PARENT" ON ("RVC_TOOL_NAFTA_V"."SUPLR_CD" = "SUPPLIER_PARENT"."SUPLR_CD")
  LEFT JOIN
    (
     SELECT "ITEM_LOC_VENDOR_DIM_V"."CO_CD" AS "CO_CD",
      "ITEM_LOC_VENDOR_DIM_V"."ORG_CD" AS "ORG_CD",
      "ITEM_LOC_VENDOR_DIM_V"."PLNT_CD" AS "PLNT_CD",
      "ITEM_LOC_VENDOR_DIM_V"."PPF_BYR_CD" AS "PPF_BYR_CD",
      "ITEM_LOC_VENDOR_DIM_V"."ITEM_NO" AS "ITEM_NO",
      "ITEM_LOC_VENDOR_DIM_V"."SUPLR_CD" AS "SUPLR_CD",
      "ITEM_LOC_VENDOR_DIM_V"."PRIM_ALT_VNDR_CD" AS "PRIM_ALT_VNDR_CD",
      "ITEM_LOC_VENDOR_DIM_V"."CURR_TS" AS "CURR_TS",
      (CASE WHEN ("ITEM_LOC_VENDOR_DIM_V"."PRIM_ALT_VNDR_CD" = 'P') THEN 1 WHEN NOT ("ITEM_LOC_VENDOR_DIM_V"."PRIM_ALT_VNDR_CD" = 'P') THEN 0 ELSE NULL END) AS "$temp0"
     FROM "CUSTOMS"."ITEM_LOC_VENDOR_DIM_V" "ITEM_LOC_VENDOR_DIM_V"
    ) "t0" ON
    (
      ("RVC_TOOL_NAFTA_V"."CO_CD" = "t0"."CO_CD")
      AND ("RVC_TOOL_NAFTA_V"."ORG_CD" = "t0"."ORG_CD")
      AND ("RVC_TOOL_NAFTA_V"."PLNT_CD" = "t0"."PLNT_CD")
      AND ("RVC_TOOL_NAFTA_V"."SUPLR_CD" = "t0"."SUPLR_CD")
      AND ("RVC_TOOL_NAFTA_V"."CHILD_ITEM_NO" = "t0"."ITEM_NO")
      AND ((CASE WHEN (1=1) THEN 1 WHEN NOT (1=1) THEN 0 ELSE NULL END) = "t0"."$temp0")
    )
  LEFT JOIN "CUSTOMS"."PURCHASING_BUYER" "PURCHASING_BUYER" ON
    (
     ("t0"."PPF_BYR_CD" = "PURCHASING_BUYER"."BYR_CD")
     AND ("t0"."SUPLR_CD" = "PURCHASING_BUYER"."SUPLR_CD")
    )

See where $temp0 is created in the sub-SELECT body?  That is the right side of the Join Condition I created on the data source canvas.  The result set of that sub-SELECT is given the alias "t0".

Inside the ON body of the LEFT JOIN is where the Join Condition I created on the data source canvas gets applied.  This is where the TRUE = ([Prm Alt Vndr Cd] = 'P') Join Condition I set up on the data source canvas comes into play.  What is interesting about the translation is that Tableau converted TRUE to (1=1).

 

In closing

Even with the poor Custom SQL query, the extract created by that data source took slightly less time than the one built on the Tableau data source canvas.  I'm no expert by any means, but my guess is that the Tableau-generated CASE statements in the JOINs don't help.  The fact that both sides of a Join Calculation are required is likely the culprit.  It would be nice to have one-sided Join Calculations and thus more closely mimic what is written in the SQL query.

How my Sunday evening went

Okay, after justification and multiple approvals I finally got our Tableau Server core-license upgraded for an additional 8 cores and the additional "hardware" for it.  This means we are now a licensed 16 core environment and have 128GB RAM.  To be exact, our Tableau Servers run in a VM (Virtual Machine) environment.  I know it's just "flipping switches" for our resource increases but as far as my end users are concerned it's simply an improvement in processing power due to additional [virtual] hardware upgrades.  They're happy so I'm happy.

 

Right now I'm working with our architecture expert on how to turn our QA Server into a two-node environment.  Currently it is still the same 8 core, 32GB environment since it gets little usage so there's no need for the whole enchilada.  However, our PROD Server is very actively used and since we are allowed 16 cores my manager suggested we upgrade it now and after the QA Server passes inspection we can then convert our PROD Server into a two-node environment.  So yesterday (Sunday afternoon) the virtual environment folks upped my PROD Server to be 16 cores and 128GB RAM.  I then decided to adjust the processes in the Servers tab of the Tableau Server Configuration program to take advantage of the extra processing power.

 

I decided to give her the beans* and edited my production server like so:

After clicking the OK button I got this message:

Hmmm...well, despite the scary message I decided to start the Server and this popped up shortly...

...but the Server did come up and everything was configured as I had set it.

 

Cool.  However, I was admittedly nervous about how this would impact server performance so I decided to dial it down a tad.  Here's what I set up my next attempt as:

I got this warning...

...which is odd because it showed a value of nothing for the data engine even though I put 4.  I clicked the OK button and got this message:

I clicked the OK button and got this message during the Server start-up:

Yet when I went to the Server Status page after the Server booted up it appeared that everything is exactly how I set it up...

...including my tabsvc.yml file

And?  How's it runnin'?

In a word:  Fantastic!

Every day there are 65 extracts executing at 5AM Pacific Time with another 24 added at 5:30AM (much of this is for the east coast users).  These are just the hourly extracts (15/30/60 minute schedules).  There are also weekly and monthly schedules that add to the load.  Oh, and then there are subscriptions, too.  Wow, those really suffered!  I'm talking one was 9 hours late.  Holy Rip Van Winkle, Batman!!!


I would normally see in-process tasks waiting for over two hours during the mornings.  Now I'm seeing 8 backgrounders chugging away and the task queue is dramatically reduced.  By 7AM everything is done, no backlog like before where things wouldn't wrap-up until after 8AM.  Sweet.

 

Wow....so many backgrounders.  ~David Pires via Twitter

Yep, so, why did I do what I did?  Yeah, no freakin' clue.

 

Considering we dropped a couple hundred extra grand into Tableau's bank account, one would hope to get a little support with it -- not a lot, but at least some solid recommendations on how to start out, as I've communicated that I am not a server guy; I'm a programmer by trade, so a newb to server admin.  I understand a company needs to make money, but a pitch for consultation services kinda rubbed me the wrong way after this purchase.  I initially sent a Tableau-created workbook that contained info on how our previous production environment was working (actually, over-working).  It was agreed that we definitely needed additional hardware.  Given the info in it, along with the known hardware upgrades we were planning, I was hoping I could get a recommendation of which processes should be increased and by how much.  I'll give them the benefit of the doubt and say that maybe I wasn't clear enough with my request.

 

Based on what little info I could find in the admin guide and the community forums, I finalized the setup for now.  Note:  Once our architecture expert is done with testing in QA, our PROD Server will transform again, with two nodes instead of the current single node.

  • Now that the box has more cores and RAM why not bump up the VizQL performance?
  • Extracts and Subscriptions were very crippled so definitely go for the cores/2 maximum!
  • I upped the Data Server since it "...loads extracts into memory and performs queries against them":  Extracts and Refresh Schedules and Tableau Server Processes.  Interestingly enough, from the same link is this:  "Every instance of a data engine process will always have one instance of the file store process present as well."  Not according to my setup.  However, everything is running smoothly from what I can tell and I've received no complaints from my Server or my end users.  My hope is that this helps with the user experience while on the Server.
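For reference, the same process counts can also be set from the command line in the pre-TSM world.  This is a hedged sketch only -- the worker0.* key names and the counts shown are from my memory of the tabadmin-era configuration, so double-check them against your version's admin guide:

tabadmin stop
tabadmin set worker0.backgrounder.procs 8
tabadmin set worker0.vizqlserver.procs 4
tabadmin configure
tabadmin start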

 

For better or for worse, that is my semi-ignorant logic for this setup.  In my opinion Tableau's documentation is very ambiguous...or maybe too general...for setting up a Tableau Server, thus I'm learning the hard way through experimentation.  I hope I can make Mark Wu proud.

 

Recommendations and comments welcomed below!

 

 

 

* "give it the beans" is an expression meaning "giving it all".  For example, in a car, if you "give it the beans" it means step all the way down on the gas pedal and go as fast as you can.

I attended a Cognos seminar on Wednesday because I wanted to learn more about what Cognos has to offer with its version 11 (Cognos Analytics) upgrade.  Though I am the Tableau Administrator for our company (and was originally employed as a Cognos developer, too), there is a Cognos admin and a webFOCUS admin on our team and they sit next to me.  Knowing what the other products have to offer and how they differ is useful.

 

Here are a few take-aways from that session that made me envy Cognos over Tableau, and thus what I would like to see in Tableau:

  • Their portal has a list of recently viewed reports.  Tableau Desktop does this, why not Tableau Server?
  • Their community support page has How-To documents and short videos of new features as they are released.  Users are immediately 'trained'; no reading, no experimentation necessary.
  • Their reporting web portal is very customizable for branding it to the company.  Tableau Server is terribly weak in this aspect, nor is it easy to hack.  Tableau APIs/JavaScript?  Not what I would call 'user-friendly' or simple.  The ability to import a couple of images and change a name in a simple fashion is the best we've got.
  • Users create their own robust schedules (e.g. 2nd Tuesday of the month @ 1:35PM) for subscriptions; there's no need to rely upon an administrator to create them...and we already know the scheduler needs a big revamp for greater flexibility.  Information Builders' product, webFOCUS, is this way as well.  As the Cognos product manager stated, it's all part of them moving towards self-service.

I know there is the Ideas section, but not everything should be subject to democracy; some things should just be.  Besides, the Ideas section just isn't effective for some intrinsic, should-already-be-there things, nor does it do a good job of getting that information "out there" (granted, there isn't an easy solution for that).

 

It was obvious to me that Tableau has muscled into Cognos' territory and they are seriously working to simplify and improve their products.  It's still not a threat to Tableau at this time, but those locked into Cognos will at least see much-needed improvements, and I think Cognos is finally moving in the right -- and, regarding all this "self-service stuff", now different for them -- direction.  For example, they were proud of the fact that a small, new server environment install will only take about 1/2 to 3/4 of a day.  Meanwhile I'm smugly thinking to myself that such an install for Tableau Server takes half that time.

 

People I'm spamming cuz I want them to see this:

Sean Boon

Tableau Community

While trouble-shooting a subscription issue with an end user o' mine they sent me the URL of their viz:

http://Prod.Server.com/#/views/UTPLiveConnection/WeeklyStudy/CHKOOL@active.directory.com/CoburgWeekly?:iid=3

That is a URL I had never seen before and it totally baffled me.  I tried entering my user ID instead of theirs, TERKSON@active.directory.com, and got the generic "That page could not be accessed." error.

 

So I asked the user about it and they said it probably had something to do with it being a saved custom view.  Huh.  So I went to the viz and sure enough, it was a saved view:

 

 

We use Active Directory for authentication so the custom view was saved under the AD account ID followed by the custom name.  Mystery solved and now you know how to read these different view URLs! 
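For reference, the general pattern (pieced together from the example above) is:

http://<server>/#/views/<WorkbookName>/<ViewName>/<OwnerUsername>/<CustomViewName>?:iid=<n>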

 
