3 Replies Latest reply on Dec 13, 2013 10:06 AM by Robert Morton

    top n filter to impala

    scott wu

      Hello All,

           I use Tableau desktop 8.05, choose "single table" ==> "connect live" ==> cloudera impala ,

      When I apply top n filter, The query tableau issued to impala was

          SELECT `zip_more`.`description1` AS `none_description1_nk`, SUM(`zip_more`.`income`) AS `sum_income_qk` FROM `default`.`zip_more` `zip_more` JOIN ( SELECT `zip_more`.`description1` AS `none_description1_nk`, COUNT(1) AS `xtableau_join_flag`, SUM(`zip_more`.`income`) AS `x__alias__1` FROM `default`.`zip_more` `zip_more` GROUP BY 1 ORDER BY `x__alias__1` DESC LIMIT 10 ) `t0` ON (`zip_more`.`description1` = `t0`.`none_description1_nk`) GROUP BY 1

      The same result could retrieve without 'JOIN'

      SELECT `zip_more`.`description1` AS `none_description1_nk`, COUNT(1) AS `xtableau_join_flag`, SUM(`zip_more`.`income`) AS `x__alias__1` FROM `default`.`zip_more` `zip_more` GROUP BY 1 ORDER BY `x__alias__1` DESC LIMIT 10 ) `t0` ON (`zip_more`.`description1` = `t0`.`none_description1_nk`

      I use impala-shell to test this, the query without 'JOIN' only took half of time (4 min v.s 8 min),

      Do I miss anything? or anyway I could try to optimize this?

       

      Thanks in advance!

      --Scott