9 Replies Latest reply on Nov 13, 2013 5:43 PM by Julius

    Does join culling work in Netezza?

    anders hoff

      I am accessing a star scheme stored on Netezza via Tableau. I am aware that Netezza does not enforce primary key and foregin key restrictions, but I was hoping that join culling would still work. I have followed the steps outlined in this post: http://tableaulove.tumblr.com/post/11692301750/what-i-learned-about-tableau-join-culling-over-summer , but I do not get the same results as in the post.

      In short: Tableau still executes the unnecessary join statement when I add the "units" column.

       

      My access to Netezza is via the Netezza ODBC driver, which I am guessing might be relevant.

       

      Have anyone gotten join culling to work in Netezza, or does anyone have any suggestions about what I am doing wrong?

       

      Using Tableau Desktop 8.0 and Netezza NPS 7.x.

       

      Anders.

       

      -----

       

      my tables:


      create table tmp_dim_pets(
          id integer primary key not null,
          name varchar(20) not null
        )
        distribute on random ;

      create table tmp_fact_pets(
          date datetime,
          id integer,
          unit integer
        )
        distribute on random ;


      insert into tmp_fact_pets values('1/1/2012',1,10);
      insert into tmp_fact_pets values('1/1/2012',2,2);
      insert into tmp_fact_pets values('1/2/2012',3,3);
      insert into tmp_fact_pets values('1/2/2012',2,5);
      insert into tmp_fact_pets values('1/2/2012',1,6);
      insert into tmp_fact_pets values('1/2/2012',3,6);
      insert into tmp_fact_pets values('1/3/2012',4,7);

      insert into tmp_dim_pets values(4,'horse');
      insert into tmp_dim_pets values(3,'fish');
      insert into tmp_dim_pets values(2,'dog');
      insert into tmp_dim_pets values(1,'cat');


      alter table tmp_fact_pets
        add constraint fk_pets
        foreign key(id) references tmp_dim_pets(id) ;

       

       

      the sql executed by netezza:

       

       

      SELECT SUM(cast("TMP_FACT_PETS"."UNIT" as BIGINT)) AS "sum_UNIT_qk"

      FROM "TMP_DIM_PETS" INNER JOIN "TMP_FACT_PETS"

      ON ("TMP_FACT_PETS"."ID" = "TMP_DIM_PETS"."ID")

      HAVING (COUNT(1) > 0)

        • 1. Re: Does join culling work in Netezza?
          Cristian Vasile

          Anders,

           

          the author of the article posted on tumblr is Russell Christopher, so let's ping him to be aware of your post.

          >ping -w 390 -n 32 -l 512 Russell

           

          Regards,

          Cristian.

          1 of 1 people found this helpful
          • 2. Re: Does join culling work in Netezza?
            Russell Christopher

            It will not. Referential Integrity constraints must be in place.  Sorry.

            • 3. Re: Does join culling work in Netezza?
              anders hoff

              It turns out that I get the join culling behaviour that Russel Christopher describes in his post if I connect to Netezza via the IBM NETEZZA option rather than via the ODBC option. This is in Tableau Desktop 8.0 (I forgot to add the version in my original post, will add it now.).

              • 4. Re: Does join culling work in Netezza?
                Robert Morton

                Hi Anders,

                 

                Yes, you can define constraints in Netezza. Even though they are not actually enforced by Netezza, Tableau is able to determine the PK/FK references between tables, and use that to perform join culling.

                 

                -Robert

                • 5. Re: Does join culling work in Netezza?
                  Julius

                  (Resurrecting this conversation.)

                   

                  There's a new feature in 8.1 called Enable Tableau referential integrity. Will this direct Tableau to do join culling in Netezza, negating the requirement to define constraints within the database? I'm in the DW school that RI is not needed in a star schema design if you have unknown handlers but join culling didn't work in 7.0 if FKs weren't defined. The 8.1 feature is most welcome.

                   

                  Can you also comment if this dialog will still require inner joins, or can enabling this dialog force join culling whether we inner joined or left joined the star schema query.

                   

                  Thanks.

                  • 6. Re: Does join culling work in Netezza?
                    Russell Christopher

                    It will indeed. The point of the feature is to allow you to get the behavior you want without litterning up the dw with constraints you might not want.

                    • 7. Re: Does join culling work in Netezza?
                      Julius

                      Thanks Russell. What about the joins? When using Enable Tableau referential integrity, does it matter if the joins are inner or outer (which was the case pre 8.1)?

                      • 8. Re: Does join culling work in Netezza?
                        Russell Christopher

                        Haven’t tested that myself yet, sorry.

                        • 9. Re: Does join culling work in Netezza?
                          Julius

                          Tested it on 8.1 beta 6 and it looks like it requires inner joins. Adding Tables to the Data window

                           

                          -- If outer joined, the dimension does not do join culling, even if option is checked

                          2013-11-13 01:10:09.588 (-,-,-,-) 1250: <QUERY protocol='005ff7c8'>

                          2013-11-13 01:10:09.588 (-,-,-,-) 1250: SELECT [DimCustomer].[LastName] AS [none:LastName:nk]

                          2013-11-13 01:10:09.588 (-,-,-,-) 1250: FROM [dbo].[FactInternetSales] [FactInternetSales]

                          2013-11-13 01:10:09.588 (-,-,-,-) 1250:   LEFT JOIN [dbo].[DimCustomer] [DimCustomer] ON ([FactInternetSales].[CustomerKey] = [DimCustomer].[CustomerKey])

                          2013-11-13 01:10:09.588 (-,-,-,-) 1250: GROUP BY [DimCustomer].[LastName]

                          2013-11-13 01:10:09.588 (-,-,-,-) 1250: </QUERY>

                           

                          -- If inner joined, the dimension does join culling, even if option is not checked

                          2013-11-13 01:20:51.673 (-,-,-,-) 0680: <QUERY protocol='03a2ab30'>

                          2013-11-13 01:20:51.673 (-,-,-,-) 0680: SELECT [DimCustomer].[LastName] AS [none:LastName:nk]

                          2013-11-13 01:20:51.673 (-,-,-,-) 0680: FROM [dbo].[DimCustomer] [DimCustomer]

                          2013-11-13 01:20:51.673 (-,-,-,-) 0680: GROUP BY [DimCustomer].[LastName]

                          2013-11-13 01:20:51.673 (-,-,-,-) 0680: </QUERY>

                           

                          -- The fact table does join culling only if inner joined, and in cases I've found it does join culling even when option is not checked. Not sure when it determines this.

                          2013-11-13 01:12:45.998 (-,-,-,-) 1a7c: <QUERY protocol='009af9e8'>

                          2013-11-13 01:12:45.998 (-,-,-,-) 1a7c: SELECT SUM([FactInternetSales].[SalesAmount]) AS [sum:SalesAmount:ok]

                          2013-11-13 01:12:45.998 (-,-,-,-) 1a7c: FROM [dbo].[FactInternetSales] [FactInternetSales]

                          2013-11-13 01:12:45.998 (-,-,-,-) 1a7c: HAVING (COUNT_BIG(1) > 0)

                          2013-11-13 01:12:45.998 (-,-,-,-) 1a7c: </QUERY>


                          The implications of this for me is that I will most likely always use inner joins as a rule of thumb and that it really makes this approach [ multiple tables with dimensional design > single denormalized table > custom sql ] possible.