1 of 1 people found this helpful
I don't think it's as simple as saying what functions 'should' be available.
Tableau is really constrained by support for functions in the back and database. So largely it's a question of what is defined in the SQL standard and then of those, what is actually supported by all the back end databases Tableau supports. Generally Tableau only exposes functions that are common to all databases (though there are a few exceptions like median() which is only available for Oracle and the data engine and countd() which isn't there for the Jet engine). You can always get at other functions that happen to be available in your particular back end with the RAWSQL functions. I think that is absolutely the right way to do it.
The other option is for Tableau to implement functions in Tableau itself, but then they have to operate on the result of database aggregate functions (eg WINDOW_MEDIAN (AVG ())) otherwise it doesn't scale.
I suppose there may be some scope for adding more window functions or similar, but it's really not as simple as just sifting through the menu in excel and taking your pick, I don't think.
Richard is spot on.
Let me re-frame the question: What Excel functions do you wish were in Tableau even if performance were 10x worse than any average query issued by Tableau? 100x worse? Etc...
Ok, you may have opened a can of worms.
Richard, you're right, it's not as simple as handpicking formulas available in Excel and saying they should be available in Tableau. I think my (admittedly few) frustrations with the Tableau calculations are rooted in the sparse documentation. When I learned Excel, I was able to pick up 90% of what I needed from the help file, and the really gritty technical stuff like array formulas are well documented in an army of longstanding forums and blogs. Excel has a few decades on Tableau to amass documentation, that is a reality. So there are a few things at play here with people like me, who learned to use Excel to do magic in the office, but are picking up Tableau and being wooed by it's obvious *** appeal (and other virtues), but are wired in an Excel mindset.
1) Excel is amazing at finding, calculating, storing, and tossing around variables - and lots of them.
Variables in Excel can be entered by a user, they can be pulled from another source, they can be calculated based on data in the source, outside the source, next to the source, a combination of any of those. When the vast ocean of available formulas just isn't enough, users have full latitude to just write their own formulas to use within the spreadsheet with a relatively easy to learn programming language. Great spreadsheets use variables to create more variables, which feed other variables, making the mechanics of spreadsheets very flexible.
2) Excel has no native form factor for analysis.
Not only are there many different ways to store/find/calculate variables, they can be represented as a single value or a range of values in one or 2 dimensions. This freedom of form causes a few issues with hardcore Excel users trying to adopt a technology like Tableau in the fact that a lot of what's happening with Tableau calculations is kind of abstract. With Excel, if you got it right, you know it right when you type it in because the numbers are right there in front of you. Where Tableau is designed to go straight to marks and labels are secondary, Excel starts with data and visulization is secondary.
To exacerbate why Excel users get confused by the calculations, in Tableau some things that affect the outcome of the formula are controlled by contextual controls through dialog boxes outside of the calculation window (compute using, etc). In Excel, if the result is not what you expected, the problem is in the formula. Something like "compute using" is part of the formula itself in Excel.
So, with that context, here are a few thoughts on the actual topic of this thread.
There are a crapload of formulas in Excel that just probably have no place in Tableau. Lookup formulas which are centered around ranges, a bunch of boolean tests, and a number of very specific-use financial formulas, etc. that don't have native relevance.
That said, there are hundreds of functions in Excel, lots that could be directly translated. But the horrible truth is 90% of Excel functions are redundant (don't tell anyone!). After all, average is just sum/count - we don't technically need an average function, since we could always just write the function as sum(field)/count(field).
The fact is we need an average function because it's used all the time, and it's way easier than writing out sum/count every time we need average. The same could be said of almost every suggested formula that isn't already in Tableau. I can't think of very many particular calculations that are not possible in Tableau, it just comes down to how easy they are to execute, and I think that's what needs to drive the conversation.
Having said that, I don't often find myself wishing there were function in Tableau that doesn't exist - it's usually more about I don't know how to execute them (like the recent sumif thread - simple in Excel, just totally different in Tableau).
A few that come to mind, remembering that there are ways to accomplish them - just less work with a dedicated formula:
text() is an extremely useful and flexible text manipulation formula in Excel. If I want to change a date from the normal look of 1/1/2013 to 20130101, I just enter text([DateField],"yyyymmdd"). Sure you can do the same thing with Tableau using a bunch of date formulas crammed together, but you have to admit, that's pretty **** elegant. It does a bunch of other manipulations too, not just dates, making it a one stop shop for simple text transformations (look it up if you're not familiar).
There are a couple of other date formulas that periodically make date calculations easier to implement in Excel, like networkdays() and weekday(). A bunch of those are probably covered by date functions in Tableau I just haven't used yet.
Choose() is an interesting one, though almost identical to Case() in Tableau. I'm wondering if you could nest table calculations in both the expression and return for that formula in Tableau.
Well, enough banter for now.
Summary: Tableau's calculations are capable, just confusing and need more/clearer documentation, especially around the table calculation dialogs outside of the formula window.
+1 on what Justin said, with one exception:
Justin, you wrote: "Where Tableau is designed to go straight to marks and labels are secondary, Excel starts with data and visualization is secondary."
I have a different perspective on this. My users are folks who know Excel and generally have had very little exposure to databases, so I've had "a bit of a think about this," to use Richard's phrase.
I'd say that Excel starts with numbers & strings & dates i.e. raw data, whereas Tableau starts with the structure of the data (more on that in a second). As you noted, "Excel is amazing at finding, calculating, storing, and tossing around variables - and lots of them." In my mind, that's because Excel doesn't actually care about the structure of the data, you can put all your calculations in cells C2-C20 that pull from who-knows-where and the final total ends up on a separate worksheet in G56. There's a huge amount of flexibility and control in that, and if we're not careful then maintenance can be a nightmare.
Also, Excel also starts with presentation and formatting. On Excel 2011 for the Mac, after the copy/paste, the first buttons in the upper left on the ribbon in Excel are about formatting the font, and the rest of the Home ribbon is about formatting, the next two ribbons are for Layout and Tables. The effect of this on my users is that historically they've created a lot of worksheets that look great for printing or inclusion in PowerPoint, but are awful for using for further analyses, like alternate timeseries, etc.
Tableau starts with the structure of the data - identifying data types, and organizing into dimensions and measures, discrete & continuous. In the interface, the most prominent positions in the upper left are given over to working with the data (undo, redo, save, and connect to data).This requires more up-front work than Excel, where in Excel we can start throwing data onto a worksheet and move stuff around as necessary, and make a VLOOKUP or SUMIF point to whatever we want to.
Then Tableau uses the structure of the data to derive attributes of the view, i.e. what pills are on the Rows and Columns Shelves, the Marks Card, etc. Again, this is different from Excel, where we can arbitrarily choose one batch of cells to be the data series and choose a totally different batch for the labels.The advantage to Tableau in doing this is making possible the cycle of visual analysis, where in moments we can generate dozens of views of the data as we look for insights, and adding additional data (in the form of new records or blending in other sources) is a point-and-click operation that doesn't require tons of additional formulae.
Given that the same output can generally be created in both applications, I don't think most users coming from Excel really appreciate how fundamentally different the two applications are.
Ok, back to the topic at hand, Excel functions in Tableau. I believe the most-wanted calculations already have Ideas for them and/or have been feature requests in the past, ones I can think of are some sort of FORMAT() or TEXT() function, PERCENTILE(), the NETWORKDAYS() and WEEKDAY(). Also, having the ability to have user defined functions would get around some of these questions because we could build them ourselves.
A couple more comments on this.
I was thinking about missing functions that often come up. One example is floor () and ceiling () - where expressing them for yourself is really quite tricky. The effect is that people usually just use int () which may be anywhere from slightly wrong through to sometimes horribly wrong.
I've always assumed that there was no support for floor () and ceiling () because they aren't in the SQL standard. But I just had a look and it seems they are in the standard, but just not universally implemented. For example, I don't think Postgres implements them. So it seems to be a case of lowest common denominator.
So two thoughts about cases like this.
1) Is there a case for implementing functions like these for those databases that support then? They could just be greyed out in the interface for those that don't.
2) Failing that, it would be really handy to have a comprehensive list of functions that are supported by each back end database Tableau supports, which are available from the RAWSQL pass through functions.
I think the answer to that lies in the creation of User Defined Functions that we could share on the forums, once used would be available to each new workbook, and would be saved along with the workbook when used.
I'm sure this opens a can of worms (UDFs with the same name, etc.), but ceiling and floor can be expressed 100% correctly using existing calculations that work on every datasource.
+1 on floor and ceiling. Those could prove super handy implemented as table calcs.
**edit, having reread the post above, I think I may have misinterpreted what a floor() or cieling() formula would be expected to do. What I had in mind was reporting back the max/min values of a field visible in a view. Highest and lowest month of sales on a line chart, etc.
The other one that I forgot to mention earlier is rounding. roundup/rounddown are helpful in addition to the standard round() function.
Also, for modeling, rand() and randbetween() have common uses, and I imagine would have a plethora of applications with viz as well as modeling. (jittering, anyone?)
Visual Basic for Applications
Agreed Justin - for non JET sources, this works: RAWSQL_REAL("RAND()")
Power View in Excel 2013 (Pro) really appears to be an attempt to provide some Tableau-like functionality in Excel. I think the differences (unless Tableau raises the bar) are starting to become less noticeable and less critical.