6 Replies Latest reply on May 31, 2018 4:32 PM by Eron Lloyd

    Working with data represented as arrays

    Pinney Colton

      Postgres (and other databases) allow for data to be represented as variable length arrays within a single database column.  This is particularly helpful for storing what would normally be a very high number of boolean fields in a single column.  Unfortunately, it doesn't appear Tableau visualizes the data as I would hope.

       

      For example, let's assume I have some survey data about the various condiments people like on their hamburgers.  I have two respondents in my dataset, one respondent said "Mayonnaise" and the other said "Mayonnaise, Ketchup".  If I wanted to show a bar chart of the results, Mayonnaise would have a value of 2 and Ketchup would have a value of 1.  Unfortunately, Tableau seems to be working with the array returned by Postgres as text and is showing two unique values:  {Mayonnaise,Ketchup} with a value of 1 and {Mayonnaise} as a value of 1.

       

      Unfortunately, representing the data as a series of boolean values is problematic as well, in terms of visualizations.  Does anyone have any recommendations for how to achieve what I'd like to do?  Effectively, creating a set of boolean values treated as a single data point for visualizations.

        • 1. Re: Working with data represented as arrays
          Tamas Foldi

          Yes, I had the same problem with Oracle's VARRAY/NESTED TABLES as well with postgres/greenplum arrays. The solution / workaround is to use unnest for field definition (in underlying table views or with RAWSQL_STRING).

           

          So, instead of COLUMN try UNNEST(COLUMN) where COLUMN is your array type. It will generate something from:

           

          ID | ARY

          ---+----------

          1  | {foo,bar}

          2  | {barfoo,foobar,foofoo}

           

          if you select like SELECT ID, UNNEST(ARY) from TABLE you will get

           

          ID | ARY

          ---+----------

          1 | foo

          1 | bar

          2 | barfoo

          2 | foobar

          2 | foooo

           

          thus you can create your very own bar chart. The fun that you don't have to use custom SQL, as a single RAWSQL can unnest your array

          2 of 2 people found this helpful
          • 2. Re: Working with data represented as arrays
            Pinney Colton

            Thanks - that is helpful when I'm working with a single column, but it would result in double counting if I wanted to include another field (like Gender, for example) in my results.  I'm hoping for some way to work with that data simply by dragging and dropping fields in a workbook using "select * from table" as my query, rather than having to write specific queries for each worksheet.

             

            It doesn't sound like that ability exists in Tableau currently.  I can certainly use various queries to aggregate the data before working with it in Tableau.

            • 3. Re: Working with data represented as arrays
              Tamas Foldi

              First of all, you can still pivot a content of an array to column using UNNEST + CROSSTAB keywords. But that one needs a custom SQL, but at least you will see all of your array values as columns.

               

              To be honest even if SQL arrays are part of SQL92 standard only just a few SQL engines implemented and even fewer people using it. It is just a dirty hack to put relations into the same DB block even if by relational design they should be located in different tables. And there are many functions which converts arrays into columns or rows, so there are always workarounds. Thus, I don't see this as a critical "lack" in Tableau's data access. But that's only my opinion.

               

              However, you can still open feature request in "Idea" section and maybe tableau software will implement it in a forthcoming version. I will vote for that definitively as I still like to use arrays even it's just an unrelational hack.

              • 4. Re: Working with data represented as arrays
                Pinney Colton

                Oh I agree it's not as much a Tableau shortcoming as it is an issue with the usefulness of standard relational tools when working with this type of data.  Other tools like SPSS are a little better at dealing with this type of data.

                 

                This does impact my ability to put Tableau desktop in the hands of someone who doesn't really know SQL all that well and still produce useful results.  It's not a big deal when you have just one or two sets of boolean values you want to work with, but if you have a survey with 200 questions and 50 of which contain lots of boolean values, you end up writing a lot of very specific queries to work with the data.

                 

                When I have a moment to write this requirement up, I will definitely add it to the idea section.

                • 5. Re: Working with data represented as arrays
                  Pinney Colton

                  @Tamas:

                   

                  I have created an "idea" at:  http://community.tableau.com/ideas/1767

                   

                  I decided to go about this from the angle of improving how we work with large numbers of boolean fields, rather than trying to get Tableau to support columns with array data types.  I think we both agree that the latter is not universally supported across relational tools.  The former - boolean fields - are universally supported, and anyone who has ever had to work with a few hundred of them to build reports can relate to this issue.

                  • 6. Re: Working with data represented as arrays
                    Eron Lloyd

                    I have begun heavily utilizing arrays instead of basic many-to-many tables, and it has been going well. Whether one agrees if it's an acceptable 1NF that ensures atomicity is up for debate, but the fact is that arrays have been included as part of the SQL standard since the 1999 release, and with SQL:2003, things like MULTITYPE are now also supported. Having support for array display and manipulation in Tableau would be very much appreciated for anyone doing data science and working with all the various array-based data sets.