1 2 Previous Next 17 Replies Latest reply on Feb 22, 2017 9:37 AM by Joe Oppelt

    Aggregate with Table Calculation as Dimension

    Beckett Simmons

      Good day,

       

      I am working on a scheduling worksheet that needs to display batch data in a particular way. Effectively, we need to use a table calculation to create "jobs" and then we want to display the total duration of each job. The problem is that we cannot base LOD calculations on table calculations and we also cannot use a table calculation as a dimension for another table calculation (say, sum duration restarting every job). So I am stuck on how to calculate what I need. I've attached a packaged workbook as an example.

       

      A little terminology for the workbook. A batch can be broken into one or many "jobs". A job is simply a group of back-to-back "sub-processes" so that there have no other jobs or batches separating them in terms of time (that is why we need a table calculation) (Also note that the actual value of Job Id doesn't matter, just that they are unique for each job.). A sub-processes is simply a record in the database and each sub-processes has a unique start time.

       

      troubless.png

       

      As you can see above in the "What I kinda want" sheet, we have used WINDOW_SUM() table calculation to get half way to what I want. The problem is that it gets too granular. You can see that the first job is broken out into two lines instead of staying as one row and displaying the sum of both sub-processes (You can see this for the red at the bottom too.). And for some reason Job ID three doesn't even have the right duration sum.

       

      Ideally we could just do a simply LOD like {FIXED [Job ID]: SUM([Duration (days)])} but unfortunately Tableau's order of operations doesn't permit this.

      Any ideas on how to achieve this?

       

      Thanks

        • 1. Re: Aggregate with Table Calculation as Dimension
          Joe Oppelt

          (note to self:  V10.0 workbook here)

           

          See attached.  Sheet 4.  I just changed the table calc settings.

          • 2. Re: Aggregate with Table Calculation as Dimension
            Beckett Simmons

            Hey Joe,

             

            Good try, but it seems like sheet 4 just sums by batch and then repeats that same sum for each job (both blues are 7 and both reds are 10). Basically what the "What I have" sheet did in the first place. What I really want is the sum of each job. So the first three duration calculations rows should be: 4, 2, 3 The underlying sum of sub-processes for the first three rows would be something like: 1+3, 1+1, 3.

             

            Hope that that makes sense.

            • 3. Re: Aggregate with Table Calculation as Dimension
              Joe Oppelt

              I don't want to dig into it.


              What are the correct values and sizes of your various blocks.  How many rows do you want to see.  You know your numbers and what things mean.  I don't have my head wrapped around it.

               

              I can make it do what you want, but it's not clear what the final output should look like.

              • 4. Re: Aggregate with Table Calculation as Dimension
                Joe Oppelt

                Do you even want Job ID displayed?  What is that actually doing?

                • 5. Re: Aggregate with Table Calculation as Dimension
                  Beckett Simmons

                  correct_output.png

                  See above. I want six rows with those particular Total Duration by Job values. The sizes of each of the blocks are correct already.

                   

                  You are right, we don't need the show the Job ID column, it is made on the fly to break up the Gantt bars. (See "What I don't want" to see what happens if we don't use this.)

                   

                  (If you want to dig in: From the "What I have worksheet" you can scroll over the bars in each row of the Gantt and view the duration for each. Some rows have just one bar. Some rows have multiple bars that look like a single bar because they are actually two or three bars where the next bar starts as soon as the last one ends. For each row we want to get the sum of the duration for each bar in that row. For instance, the first row has two bars. The first has a duration of 1 and the second a duration of 3. Therefore we want a Total Duration by Job value of 4.)

                  • 6. Re: Aggregate with Table Calculation as Dimension
                    Joe Oppelt

                    What causes Batches 1 and 3 to break in half like that?  (Not the calc itself.  I mean what business logic makes you want to split them?)

                    • 7. Re: Aggregate with Table Calculation as Dimension
                      Joe Oppelt

                      The reason I ask is this.  You use a table calc to split them.  But you can't use a table calc as a dimension in addressing the [Total Duration] table calc.  But you need it to get the viz you want.

                      • 8. Re: Aggregate with Table Calculation as Dimension
                        Beckett Simmons

                        It has to do with the details of the particulars of how the batches are processed. Imagine more columns in the real data with things like trim settings, grade, input stock etc. So one batch could be processed with two different trim settings or two different input stocks and that will make the batch break up into multiple blocks. I removed all of those details for simplicity in my example.

                         

                        And yes, they need to be split in this particular fashion in order to get the view I'm looking for. The data source that is being used doesn't include this type of Job ID data and probably can't be changed to do so. That is the whole reason I'm using Tableau, so that we can do this type of analysis.

                        • 9. Re: Aggregate with Table Calculation as Dimension
                          Joe Oppelt

                          See Sheet 5.

                           

                          I took your data and made a new data source, adding another column (Other Factors) so that there is a dimension to separate the batches.  Your actual data has other columns that will do this for you, but I needed something in there.

                           

                          I kept Job ID to sort the "chunks" according to our desired order.  I just took it as black box that the calc does what you need it to do.  (Note, I have it on ROWS, at the front of the list, so that it does the sorting.  But I unchecked "Show Header" so it doesn't display.)

                           

                          We don't really need Total Duration on the ROWS any more, but I left it there for the sake of display.


                          Because there are multiple Start Dates in the data for some of the Batches, you get multiple marks.  But you really only want one, that starts on the first start date, if I'm understanding this correctly.  So I did a calc to capture INDEX(), and I put it on FILTERS, and I selected only value = 1.  The subsequent (if any) marks on a given row don't display.

                          • 10. Re: Aggregate with Table Calculation as Dimension
                            Beckett Simmons

                            Sorry, I guess I gave you misleading information.

                             

                            More correctly, the business logic is that we have different sub-processes (records of data) because there is a potential for other factors to change, not that they actually do change. To be honest, I don't really know why the batches are split up into sub-processes like that, but in the data there are sometimes places where sub-processes (records) may be in the same job (row in the Gantt) but have different other factors (both a and b). (I dunno, we could make up an example like when a washing machine stops. We could change some of the settings, but we could also just run another cycle with the same settings - with the only difference between the last two cycles being start time.) See sheet 6 to see how this can cause problems (I changed some data). So unfortunately we can not use job, date, and other factor as a composite key (if you know database terms) to differentiate jobs (Gantt rows). The only field that is guaranteed to be unique, a primary key, is the start date. Hope this makes sense.

                             

                            Effectively, Job ID is not dependent (even partially) on other factors. It is solely dependent on records having the same batch ID and also running back-to-back with each other.

                             

                            Also, I don't mind the multiple start dates, they can be left as is. It might be helpful to actually have multiple bars that look like one because they could have differing tool-tips. Though that's a nice little trick you did, I'll keep it in my back pocket if I need it someday.

                            • 11. Re: Aggregate with Table Calculation as Dimension
                              Joe Oppelt

                              I don't get what you're looking for with the new data.

                               

                              What do you want sheet 6 to look like?

                               

                              You can see how Sheet 5 handled it.  Given my understanding of the data, Sheet 5 is already doing what I would expect that new set of data to do.

                              • 12. Re: Aggregate with Table Calculation as Dimension
                                Beckett Simmons

                                Sheet 5 is visually what I am looking for. You are correct in that way.

                                 

                                But Sheet 5 makes some assumptions about the data that cannot be made. I made a new data source that is very similar to the one you made except that it changes some of the other factor values. In the real data the other factor isn't uniform like you originally understood. Hence Sheet 6. Basically, Sheet 6 is a duplication of Sheet 5 but just with a different dataset. Though I also made some changes to Sheet 6 so it would display better. But try it yourself. Duplicate Sheet 5 and replace the data source, you'll see the problems.

                                • 13. Re: Aggregate with Table Calculation as Dimension
                                  Joe Oppelt

                                  What you uploaded with the new sheet is using your new data set where you have the extra "a" for one of the Batch-3 rows.  Both your Sheet6 and my Sheet5 are using it in your workbook (Version D).

                                   

                                  Looking at my sheet 5 (which is using your new data), what do you need to see differently?

                                  • 14. Re: Aggregate with Table Calculation as Dimension
                                    Beckett Simmons

                                    Sorry, I'm not communicating very well am I.

                                     

                                    Below is  your original Sheet five. It is exactly what I want. The problem is that it depends on the Other Factor data having particular values and follow rules that the real data doesn't follow.

                                    sheet_5_old_data.png

                                    If I make two changes to the data we run into problems. All I did was change two Other Factor values.

                                    new_data_changes.png

                                    And then we get batches overlapping and not displaying as it originally did. (Originally like in the first image.)

                                    sheet_5_new_data.png

                                    I want Sheet 5 with my new data to display like Sheet 5 with your data.

                                    Practically, we shouldn't need to have the Other Factor column involved with this at all anyways. I doesn't help at all with determining how to organize and layout the data in the view.

                                     

                                    Forget about Sheet 6, I didn't know what I was talking about when I posted it.

                                     

                                    I hope this makes more sense.

                                    1 2 Previous Next