12 Replies Latest reply on May 18, 2016 7:16 AM by mortenbodaugaard.jrgensen

    Multiple Fact Tables - Common Dimensions

    Rishabh Dhingra

      Hi,

       

      I have a few measures in one fact table and another set of measures in second fact table. Both these fact tables share common dimensions and are joined to the dimensions at the same level.

       

      Can I simply add these two fact tables and create a join for these two fact tables to the dimensions? I am not sure if it would create any duplicates. Or I should be using data blending in this case?

       

      This is really urgent!! Any help would be appreciated.

       

      Thanks,

      Rishabh

        • 1. Re: Multiple Fact Tables - Common Dimensions
          Jonathan Drummey

          In general, a join will be faster than a data blend so I'd suggest going with the join. In the connection editor you can choose to hide fields from any of the tables and thereby remove duplicates.

          • 2. Re: Multiple Fact Tables - Common Dimensions
            Rishabh Dhingra

            Hi Jonathan Drummey

             

            Thanks for your response. Will it not duplicate the data?

             

            Example:-

             

            Fact 1

            Product    Customer  Sales

            P1               C1         200USD

            P2               C2         100USD

             

            Fact 2

            Product    Customer  Discount

            P1               C1         50USD

            P2               C2         10USD

             

            Product Dimension

            P1

            P2

             

            Customer Dimension

            C1

            C2

             

            Now, can I join Fact 1 with Product and Customer Dimension and also Fact 2 with Product and Customer Dimension and report the right Sales and Discount figures?

            • 3. Re: Multiple Fact Tables - Common Dimensions
              Jonathan Drummey

              Assuming that all you have in your fact tables is those fields, then yes, you won't be duplicating data. If you have other fields (such as purchase date, etc.) then those would also need to be join keys to prevent unwanted duplication. There are situations where adding too many join keys causes undesired performance, that's one of the exceptions to the "in general, a join will be faster than a blend" statement I made in my prior post.

              • 4. Re: Multiple Fact Tables - Common Dimensions
                Rishabh Dhingra

                I have multiple fields in the fact tables, like surrogate keys which would be joined to the respective dimension tables on the key column. I think I have around 8 dimension tables, hence 8 such columns in the fact tables. These key columns remain the same in both the fact tables.

                 

                Will it work in this case?

                • 5. Re: Multiple Fact Tables - Common Dimensions
                  Jonathan Drummey

                  If the data has the granularity that you've said it does and you are able to join across all of the dimensions, then yes, it will work (subject to my caveat around performance). However, since you seem to have an incomplete understanding of joining across tables (the questions you ask are not unique to Tableau, and instead normal for any relational database), I suggest you do thorough testing to be sure that it is working as desired.

                   

                  Jonathan

                  • 6. Re: Multiple Fact Tables - Common Dimensions
                    Rishabh Dhingra

                    Thanks Jonathan Drummey,

                     

                    I shall try this and do a thorough testing. Also, could you please help me in understanding which approach works best?

                     

                    a. Having one single fact table joined to all the dimensions in Tableau.

                    b. Have one single flat table containing both dimensions and facts

                     

                    Thanks,

                    Rishabh

                    • 7. Re: Multiple Fact Tables - Common Dimensions
                      Jonathan Drummey

                      Like many things in Tableau, the answer is, "it depends." What is "best" depends on a number of factors, including (but not limited to):

                       

                      - volumes of data

                      - speed of sources

                      - relative sparseness of data

                      - freshness of data

                      - live connection vs. extract

                      - granularity of the data (particularly when facts/measures are at different granularities)

                      - granularity of the analysis/es

                      - interactivity & view load/refresh time requirements

                       

                      The easiest thing for Tableau to work with is a data source that is flat, but that doesn't mean creating a single flat table. For example, when using a live multiple tables connection to a source with referential integrity Tableau can use join culling to only query what is necessary, so a single fact table joined to the dimension tables could work. A Tableau data extract essentially creates a flat table that is highly optimized for the kinds of queries Tableau makes.

                       

                      However, there are dependencies on the structure of the data and the analyses you are performing, here are some examples:

                      - Doing a ton of joins in a live connection when there is high cardinality (lots of distinct values to join) might end up being too slow, so doing some pre-computation to flatten out the source might be better.

                      - If you have budget data at a coarser granularity than detailed expenditures or a similar situation, the work to make the two different granularities line up in Tableau might be more easily/quickly done in the data source.

                      - The major granularity of the analysis might be at some high-level roll-up, so creating a (drastically smaller) aggregated source might be worthwhile to keep that fast, then have a second (larger and slower) source for drill-down into the details.

                      - A daily executive dashboard might demand a load time of 5 seconds to display and therefore require more effort in the data source, while a monthly report might be ok taking several minutes.

                       

                      I'm sorry I can't give you a single "best" answer, the use cases to which we put Tableau are so varied that there's only a best answer for each distinct situation. Given a few sentences of description on the forums, all I can provide are some guidelines.

                       

                      Jonathan

                      • 8. Re: Multiple Fact Tables - Common Dimensions
                        kettan

                        which approach works best?

                         

                        a. Having one single fact table joined to all the dimensions in Tableau.

                        b. Have one single flat table containing both dimensions and facts

                        This  dilemma  would  diminish  if we had  Dimensionally Modeled Extracts 

                        • 9. Re: Multiple Fact Tables - Common Dimensions
                          Rishabh Dhingra

                          Hi Jonathan,

                           

                          Thanks a lot for your inputs, this is really helpful.

                           

                          I agree that the best answer varies for different use cases. I am looking to see what works the best for you, I am currently using a single fact table connected to multiple dimensions to use join culling.

                           

                          Also, I am thinking to explore creating a single flat table to simply remove any scope of joins to make it even more faster.

                           

                          Just to give you a background, we had initially developed the dashboards on SQL Server Analysis cube, however we faced lot of performance issues. I had tried simplifying the dashboards and was able to bring the time down but still we have one dashboard which takes around 50 seconds to load and around 80 seconds for an action filter. Therefore, we thought lets try connecting to the data warehouse directly to see the performance and we are just looking for the best way to get speedy dashboards. I know extracts would be the best option to get the desired performance results. But we also need to consider minimal changes to our current architecture and maintenance strategy.

                          • 10. Re: Multiple Fact Tables - Common Dimensions
                            Jonathan Drummey

                            I totally understand the need to balance between costs of maintenance, changes, load times, etc. At a higher level, I've come to frame this as a problem of design needs. Very often we are working with systems (software, data structures, processes & procedures, etc.) that were designed for operational needs, marketing needs, legal needs, billing needs, but not necessarily the requirements of analytics. In my case, I work in a healthcare system with software & processes that are designed for clinicians and for billing, but not for what I need for quality improvement. So I spend a majority of my time researching & wrangling data.

                             

                            You asked what works best for me, see I Have Wee Data – Microsoft Access and Tableau | Drawing with Numbers for details. I generally use queries to create flat data sources and then use extracts, but that's due to the particular limitations of my environment.

                             

                            Also, if you haven't seen this already, definitely check out Designing Efficient Workbooks | Tableau Software by Alan Eldridge, it's the best resource on improving performance that I know of.

                             

                            Jonathan

                            • 11. Re: Multiple Fact Tables - Common Dimensions
                              Stephen Moody

                              I know this is a year old but maybe others still reference this post?

                               

                              I am assuming that you have more than one fact table because of different grains. Unless Tableau has figured out how to write coalesce queries to write separate queries for each fact and then join the results (which avoids duplication), the answer is no.  Some try to address it with data blending but that isn't what it was designed for (and it is typically slow).  In my opinion, the best approach is to denormalize (flatten out) the data before you introduce it to Tableau (which is the conclusion you came to.)

                              • 12. Re: Multiple Fact Tables - Common Dimensions
                                mortenbodaugaard.jrgensen

                                Otherwise you can union all the data in a single table with a lot of columns and then load it into an extract (The same way you do it when you use QlikView or Qlik Sense)