9 Replies Latest reply on Dec 29, 2015 4:02 AM by Simon Runc

    Restructure our database into the right format to easier use in tableau

    Randy Adhitama Setiadi

      Hi all,

       

      I want to ask what and how to get the right database format in tableau to make our work easier.

       

      This is the example of database format that i have right now

       

      PersonIDa1a2a3b1b2b3
      1100001
      2010100
      3001010
      4010001
      5100100
      6010010
      7001001
      8010100
      9100010
      10010001

       

      Let say i get this data from surveying certain people. On the column there is a1,a2,a3,b1,b2,b3, where actually the numbers represent the brand, and the alphabet represent the different question for each brand. So a1 and b1, basicly is the differen question for the same brand. For example : question a is "Which brand is good?" , question b is "Which brand is more reliable"

       

      the purpose is i want to show the data into percentage for each question. Something like this

       

       

      QuestionBrand 1Brand 2Brand 3
      a30%40%30%
      b30%30%40%

       

      Currently what i done was, i created a calculated field for calculation for each column (a1,a2,a3,b1,b2,b3) which is really time consuming and i have more than hundred question for each brand.

       

      Is there a better way to make this simple? restructing database from tableau or do it by my own? or change the variable name?

       

      Thanks

       

      Randy

        • 1. Re: Restructure our database into the right format to easier use in tableau
          John Sobczak

          One option is to reshape all the measure columns into a single column.  This can be done in the data source connection tab through the pivot columns feature.  Once this is done only one Table calculation is needed.  I have attached the sample workbook.

          • 2. Re: Restructure our database into the right format to easier use in tableau
            Rody Zakovich

            Hello Randy,

             

            The PIVOT Option in Tableau is only available for EXCEL and TEXT based files. So you will want to "re-structure" your data on the DB side via a new table or view, or through CUSTOM SQL.

             

            As far as "Shape" of the data, it really depends on what the end goal is. Most times having a Transactional or Long table, is going to be the ideal, but it really depends on what you are trying to do.

             

            Here is a great article on the topic.

             

            http://redheadedstepdata.io/lookup-vs-transactional/

             

            Note* alot of the more "complex" questions we see on the forums are related to making Tableau do something, when the data is not in an ideal "shape". Generally, the more Data modeling you can do on the backend, the easier your life will be in Tableau.

             

            Simon Runc I know you like these topics, any thoughts?

             

            Regards,

            Rody

            • 3. Re: Restructure our database into the right format to easier use in tableau
              Simon Runc

              Hi Randy,

               

              Note* alot of the more "complex" questions we see on the forums are related to making Tableau do something, when the data is not in an ideal "shape". Generally, the more Data modeling you can do on the backend, the easier your life will be in Tableau.

              ...very true! Randy you are going about things exactly the right way...a bit of time thinking about and reshaping your data is got back 20 fold in the end...and Tableau(ing) is a lot more fun working with it rather than against it!

               

              So as Rody says, 99% of the time you want you dimensions going down the page in single columns/fields, and a column/fields for each measure.

               

              So for what you have there, it would look something like

               

                 

              Personal_IDQuestion_IDBrand_IDScore
              1a130%
              1a240%
              1b150%
              1b220%
              1c190%
              1c250%
              2a160%
              2a250%

               

               

              From this structure (which is probably how it is stored in the DB) you can then join the Question_Master (which would put things like 'question title') and Person_Master (which would then contain any other information you collected on the participant, such as age)...or anything else you want to slice the data up by!

               

              From this 'shape' the example table you posted would be very quick and easy to build.

               

              As you're looking at survey data, you might find this excellent 3 part blog on the subject using Tableau interesting

               

              Using Tableau to Visualize Survey Data — Part 1 » Data Revelations

               

              Hope this helps, but please post back if you can't create what you need with this structure (ideally with some dummy data) and I'm happy to take a look

              • 4. Re: Restructure our database into the right format to easier use in tableau
                Randy Adhitama Setiadi

                Hi John thank you for your suggestion. I want to know how you manage to reshape the data like yours? do you do it manually on excel? or using pivot table on tableau or "Reshape data" from tableau excel add-ins ?

                 

                Hi Rody and Simon, thank you offered me another suggestion. I've tried to reshape it using both pivot on tableau and "Reshape data" on tableau excel add-ins. But how can i reshape the data into something like John's?

                 

                And want to make sure something, if the position of PersonID is in front of another variable it should be not affect the calculation right?

                 

                Thank you for you all participation

                • 5. Re: Restructure our database into the right format to easier use in tableau
                  Simon Runc

                  So you can do this with either!! If you look at John's workbook, you'll notice a '=' sign next to the Question and Brand dimensions. This is because John has used the Pivot feature in Connect to Data, which creates the 'Pivot field names' and 'Pivot field values' fields. He's then taken a left and right to create a 'Question' field (a,b,c...) and and 'Brand' field (1,2,3...). I've also attached the same thing done via the Excel Add-in. So I used the add-in as per the image below

                   

                   

                  From the resulting pivotted view, I then added in the LEFT/RIGHT functions to pull out the a,b,c or 1,2,3...I could have done these formulas in Tableau (but have put into Excel for Explanatory reasons).

                   

                  Using Tableau Pivot on the Connect Screen, I'd select all by the Person_ID column, and then select Pivot (as shown below)

                   

                   

                  I'd then use the LFET/RIGHT functions to create the Brand and Question dimensions

                   

                  Hope that makes sense?

                  • 6. Re: Restructure our database into the right format to easier use in tableau
                    Randy Adhitama Setiadi

                    Hi Simon,

                     

                    Yes, that actually makes sense how to do it. I tried to generate the same way to my original data, but i found a problem. My data contain more than 2k column to reshape. The last time i tried, i manage to take out unuseful column and make it just into 666 column (still too much i think). Then i tried to run tableau add ins in excel, it said because the data is too big it'll convert into .csv files, but even with .csv file it still can't fit all the data. I think it's because we work with big data.

                     

                    Any other solutions?

                     

                    Thanks

                     

                    Randy

                    • 7. Re: Restructure our database into the right format to easier use in tableau
                      Simon Runc

                      Yes that is a lot of columns!!

                       

                      Where is the data originally stored?

                      ...if it's in a Database (SQL, MS Access, MySQL...etc.) then you (or your DBA if you have one!) can create a view of the data using the UNPIVOT function. Although, I suspect (if it is in a DB) it is stored in this 'Long' format, and is actually PIVOTED so that it fits in an Excel worksheet.

                      • 8. Re: Restructure our database into the right format to easier use in tableau
                        Randy Adhitama Setiadi

                        HI Simon,

                         

                        sorry for late reply, had a vacation for a few days

                         

                        Yes, the data is originally stored in excel.

                        What i've done till today is..

                        - I manually added the brand column in the excel make it into 1 row of brand and duplicated it for numbers of brand (example: i have a data for 1k rows, and i have 5 brand. So i duplicate the data 5 times for each brand in every 1k rows. So my current data is 1k*5 = 5k rows).

                        - But i encountered a new problem, when i add the excel data into tableau it took very long time to make a workbook works. So i tried Simon's suggestion by using SQL. First i add the excel into SQL and make table on it. Then i use the SQL table as the data source in tableau, and it works much faster than excel. But then again there's another problem, i figured that SQL only can read not more than 512 column while my data is contain more than 2k rows. Any suggestion ?

                         

                        Thanks

                         

                        Randy

                        • 9. Re: Restructure our database into the right format to easier use in tableau
                          Simon Runc

                          In terms of other solutions...I think you're best to use some sort of ETL/Data-wrangler tool. I'm fortunate enough to have Alteryx (for Ad Hoc/Consultancy projects) or to define my own data model for production projects (where we use Python to get the data into the 'right' shape as part of the ETL).

                           

                          As such I've not had the need to use other tools...I have found the following link

                           

                          Unpivoting Data with Excel, Open Refine and Python | Open Data

                          OpenRefine

                           

                          where the Open Refine looks like the best option.

                           

                          Once you've got it into a CSV (in the desired long thin shape), you could connect this directly to Tableau. However performance wouldn't be great as any queries sent by Tableau are using CSV's Jet Engine which is pretty slow, to get round this I'd extract the CSV into a TDE and work from this. If you get new data you can just refresh the TDE and it will rebuild.

                           

                          Let me know how you get on