11 Replies Latest reply on Feb 25, 2014 11:25 AM by Matt Lutton

    Count transactions that have one item, but not another

    jonathan.richman.2

      I have data structured with a transaction_ID and then items tied to that transaction. It's simple to count the number of each item purchased and the number of transactions of course.

       

      What I'd like to do is count the number of transactions that have one item, but not another. Is there any way to perform a count like this?

       

      Unfortunately, I don't have a sample that I can post due to the sensitivity of the data. Any direction you can give would help.


      Thanks,

      JR

        • 1. Re: Count transactions that have one item, but not another
          Matt Lutton

          Should be fairly simple to mock up an example using Superstore data set provided with Tableau.

           

          Or simply create a random set of transactions that are in the same shape as your dataset, pull that into Tableau, and post the packaged workbook here along with a mock up of what you're trying to do.

           

          A lot depends on how your transactions and items are structured in the data.

           

          Thanks.

          • 2. Re: Count transactions that have one item, but not another
            jonathan.richman.2

            The problem is I haven't the foggiest idea how to even begin this. Clearly, there's some calculated field I'd need to create, but I'm not sure what that would be. I'll take a shot, but I'm not quite sure of how conceptually to make this work. Basically, the calculated field would be something like this (and I know this isn't the right syntax):

             

            If receipt_ID has item 1 and Item 2 (which is housed in dimension called item_name) then "Item 1 and 2"

            elseif receipt ID has only item 1 then "Item 1 Only"

            elseif receipt ID has only item 2 then "Item 2 Only"

            END

             

            And so on doing this will all my items (actually grouped items) are categorized like this. The idea would then to be able to count Item 1 and 2, Item 1 Only, etc. I don't know how to correctly tie all of this together or to do the exclusion logic (e.g., Item 1 Only).

             

            Any ideas how to get started?

            • 3. Re: Count transactions that have one item, but not another
              Matt Lutton

              The first step would be to show us how your data is structured.  Create a sample set of transactions that matches the structure of your raw data, pull that into Tableau, post a packaged workbook file here and that will allow you to get the best help possible.

               

              I understand what you want, but the method used will depend on how the data is stored.  For example, for each transaction, is there one row with comma separated items, or do you have multiple rows for each transaction?  The raw data matters, in a scenario like this.

              • 4. Re: Count transactions that have one item, but not another
                jonathan.richman.2

                One more obvious question and then I'll do what you suggest.

                 

                Is there a way to sample data using Tableau or do I need to do this on my own and pull into a workbook? Also, I have the issue that the item names will give away the customer and the data needs to be confidential or blinded. Ideas on how to handle?

                 

                It would be great if there's a simple solution for this, as I frequently have this issue, but can't exactly share the raw data I'm working with.

                 

                Thanks for your help.

                • 5. Re: Count transactions that have one item, but not another
                  Matt Lutton

                  There are several sample datasets (Superstore, Coffee Chain, etc) that come with Tableau, and for most scenarios, that data can be used to re-create the problem.  However, it is important that the structure of the data matches your production workbook--so for this example, are your transactions stored in the same way as the sample data?  If so, we can use that to create a simple example of what you are trying to do.  If not, its best to create your own sample that represents the structure of your data.

                   

                  For a scenario like this, I would simply look at my raw data for a sample set of transactions, open Excel, copy and paste a few records, randomize them, then pull that data into Tableau.

                   

                  There are some tips for creating sample data in Tableau with your actual data.  But I find its usually much simpler to take a sub-set of the data and randomize that in Excel, extract it, and post a packaged workbook file.

                   

                  Don't use real item names and customer data.  Just make it up.

                  • 6. Re: Re: Count transactions that have one item, but not another
                    jonathan.richman.2

                    Finally getting around to making this extract Matt Lutton as I now have a deadline to complete this analysis, so hoping you can help.


                    Here's an example workbook with how the data is structured in my full workbook.

                     

                    The one tab shows transaction ID (row) and item_name (column) and the values are SUM of quantity. So, this shows you how many of each item were purchased for each transaction. For example, transaction 8058498 included Items 1, 2, and 3 while transaction 8079310 included Items 4, 5, 6.

                     

                    What I want to do is to be able to count the different combinations of items ordered. For example, count transactions that only contain items 1, 2, 3. This is the part I have no idea how to do. The general idea is to create a calculated field like this:

                     

                    If transaction_ID has item 1 and Item 2 (which is housed in dimension called item_name) then "Item 1 and 2"

                    elseif transaction_ID has only item 1 then "Item 1 Only"

                    elseif transaction_ID has only item 2 then "Item 2 Only"

                    and so on...

                     

                    I'd then be able to sum the number of transactions with each label (e.g., "Item 1 Only") by using the calculated field.

                     

                    I'm open to this working in another way if anyone thinks that makes more sense.


                    Thanks,

                    JMR

                    • 7. Re: Re: Count transactions that have one item, but not another
                      Matt Lutton

                      This is a tough one for a number of reasons.  I can create conditions for "Item 1 Only", "Item 2 Only", etc. but cannot find a way to analyze transactions with multiple items.  This is one instance where storing all the items ordered in one column might actually benefit you, because you could then write a calculated field based on what an Item Name CONTAINS.  But since they are stored in separate records (which is what we usually want with Tableau), this becomes more challenging.

                       

                      I'm sure there is a way, but I don't have any really helpful advice to provide here.  If you were simply analyzing transactions based on the number of items ordered, we could do that, but I do not know how to do what you are wanting with this data set.  I tried using Sets, Combined Sets, and calculated fields, but couldn't figure it out.

                       

                      For single items, we can use something like:

                       

                      IF SIZE()==1 AND CONTAINS(attr([item_name]), "Item 1") then COUNTD([transaction ID]) end  // Item 1 only calc (must use the proper compute using settings for the view--addressing on Transaction ID, then Item_Name, restarting every Transaction ID--these fields must be in the view for this to work as well, so use of the Detail shelf may be needed)

                       

                      Even if I could figure out "Item 1 and 2 Only", it would be a table calc with an offset to look at the next row value, etc.  That would require a lot of calculations (one for each combination of items you want to analyze) and they'd be somewhat complicated table calculations..

                       

                      Anyway, I hope this helps.  I'm not saying there isn't a simpler way; I'm just saying I cannot figure it out.

                      1 of 1 people found this helpful
                      • 8. Re: Re: Count transactions that have one item, but not another
                        jonathan.richman.2

                        Fair enough. Thanks for looking into it. Good to know I wasn't the only one that couldn't figure out a way to do it.

                        • 9. Re: Re: Count transactions that have one item, but not another
                          Matt Lutton

                          With enough time, I could figure out the rest, but it would be very clunky and difficult to understand.

                           

                          If this type of analysis is needed, you can reshape the data so there's only one record for each transaction ID, with comma separated Items, then you could do what you want more effectively.

                           

                          But again, there's always a chance I'm missing a simpler way.  Hopefully someone else will see this and provide some help.

                          • 10. Re: Re: Count transactions that have one item, but not another
                            jonathan.richman.2

                            Assuming I could convert the data to what you're saying, what would the process be then? Can you point me in the right direction and I'll take a shot?

                            • 11. Re: Re: Count transactions that have one item, but not another
                              Matt Lutton

                              Something like:

                               

                              COUNTD(IF CONTAINS([Item], "Item 1") AND CONTAINS([Item], "Item 2") then [TransactionNumber] end)

                               

                              This assumes items do not have similar names, like "Pepsi" and "Pepsi One" or something like that--if that is the case, doing a CONTAINS on the word "Pepsi" wouldn't work because it would count both instances.

                               

                              Again, I do not know that reshaping the data is necessary; I'm just providing my experience.