6 Replies Latest reply on Sep 29, 2016 2:28 AM by kettan

    Accessing Duplicate rows in a Column

    John Mervin

      Hi All,

      I have a task in Tableau, I am facing this issue since 1 month but I am not able to resolve it. Any help would be greatly appreciated.

       

      The first 4 columns in the image is coming from the excel file and the columns highlighted in the yellow color is the expected output/result. The Sample data for this task is attached in the workbook. By looking at the image you can come to know what is the output. If you need further explanation regarding how I need the result Is explained as below.

       

      Screen Shot 2016-08-03 at 6.54.20 PM.png

       

      • Is Repeat..?: if there is more than one “Part Number” for the particular “Series Name”, the second and consecutive row with respect to the same “Part Number” and “Series Name” should be passed with “YES”, and rest all should be “NO”.
      • Previous Serial Number: if there is more than one “Part Number” for the particular “Series Name”, the second and consecutive row with respect to the same “Part Number” and “Series Name” should be passed with the previous “Serial Number” with respect to the same “Serial Name” and “Part Number”, and rest all should be blank or null or empty.
      • Months Difference: if there is more than one “Part Number” for the particular “Series Name”, the second and consecutive row with respect to the same “Part Number” and “Series Name” should be passed with the difference in months for the previous “Part Number” with respect to the same “Serial Name” and “Part Number”, and rest all should be blank or null or empty.

       

      Thanks,

      John

        • 1. Re: Accessing Duplicate rows in a Column
          Michel Caissie

          John,

           

          If you are still stuck on this, you can check in the attached.

           

          Sheet 3  in the final view , similar to your mockup.

          In Sheet 2  I rearranged the dimensions so it's easier to understand the data.

           

          I used a lod calc to count the number of  Serial Numbers per SeriesName-Part Number

          {fixed [Series Name],[Part Number]: COUNT([Serial Number])}

           

          For the Part numbers having more than one serial number, I use the index() function to compute  Is Repeat

          if ATTR( [Cnt Sn per Pn-Sn] ) > 1 then

              if index() = 1 then  'No' else 'Yes' end

          else 'No' end

           

          Next , I use the lookup() function to get the previous Serial Number and the also using the datediff() function I get the Months difference

          if [Is Repeat]  = 'Yes' then lookup(ATTR( [Serial Number]),-1 ) end

           

          if  [Is Repeat]  = 'Yes' then datediff('month', lookup(ATTR( DATETRUNC('month', [Month])),-1 ),ATTR(  DATETRUNC('month', [Month]) )) end

           

          Now the index() and lookup() functions are table calculation so you have to set the computing on those.

          if you right-click the blue pills with a little triangle,

            go in Edit Table Calculation,

            Computing using:  click Advanced

            In the addressing Pane you need to add all the dimension in the same order that they are on the Sheet 2 Rows

            Restarting: Part Number

           

          Also, Previous Serial Number  and  Months Difference  are nested calculation. So when going in Edit Table Calculation  you have to set the computing on the two calculated field

          that you have in the Calculated field drop down.

           

          Michel

          2 of 2 people found this helpful
          • 2. Re: Accessing Duplicate rows in a Column
            John Mervin

            Hi Michel,

             

            Thanks for the reply, it works for all the condition except one thing.

             

            If you sort the "Month" field in Descending order, the output changes, it should give the same output irrespective of sorting (i.e. ascending or descending) or adding an extra field to the row or column shelf.

             

            Thanks,

            John

            • 3. Re: Accessing Duplicate rows in a Column
              Michel Caissie

              John,

               

              Table calculations are directly tied to the dimensions in the view. So if you add dimensions  on the row shelf you will have to adjust the computing of the table calculations accordingly.

               

              Also, you cannot drill down in a hierarchy because it adds dimension and the computing of the table calculation is not dynamic, so it wont work anymore.

               

              Regarding the sorting, since we are using the lookup function, relative position changes and it returns a different result.

              We can resolve this by putting a copy of the Month(Month)(copy) on the Detail shelf  and adding it in the addressing on top of the Month(Month) .This copy will remain unsorted even if you sort the Month(Month) in the view, so the result will be unchanged.

               

               

              Sorting Concepts Related to Quick Table Calculations | Tableau Software

              1 of 1 people found this helpful
              • 4. Re: Accessing Duplicate rows in a Column
                kettan

                Table calculations are fragile and complicated - a dangerous mixture of properties!

                Of course, if we have a table calculation that works, we will use it, if it is the best or only known solution.

                 

                This said, I think we all agree that having data as row level dimensions & measures makes life simpler.

                This can for some scenarios be done with Level of Detail calculations (LoD), but guess "previous row" data can't be fetched by this otherwise powerful feature.

                 

                Another approach is to do this outside Tableau.

                If your original data source is Excel, I would simply sort data and use formulas for the new fields.

                If your real data is stored in a relational database, I would probably use a window function for this ( or a correlated subquery column ).

                 

                You can see (and play with) an example of a window function (LAG) for this purpose here:   SQL Fiddle

                Referred link uses your sample data.

                 

                SELECT "Month", "Series Name", "Part Number", "Serial Number"
                , LAG("Serial Number") OVER (PARTITION BY "Series Name", "Part Number" ORDER BY "Month") "Previous Serial Number"
                FROM "Sheet1" t1
                ORDER BY "Series Name", "Part Number", "Month"
                

                 

                The syntax used here is for Postgresql 9.3 but believe it is more or less the same in other databases.

                 

                Having built-in support of SQL Window functions would probably be valuable and therefore posted this idea:

                 

                1   SQL Window Functions

                .

                1 of 1 people found this helpful
                • 5. Re: Accessing Duplicate rows in a Column
                  John Mervin

                  Thank you for all the response.

                   

                  -John

                  • 6. Re: Accessing Duplicate rows in a Column
                    kettan

                    If you are connected to a database, you may be able to use a correlated subquery inside a RAWSQL pass-through function. I intend to collect answers in this regard in  Re: Correlated Subquery