8 Replies Latest reply on Dec 9, 2016 2:40 PM by Sarah McGraw

    Help needed--data blending plus calculated field for crosswalk.

    Sarah McGraw



      I've been trying a couple of different approaches to bring together three different data sources into one dashboard, but I've only been successful with getting two out of three sources in correctly. I've attached a workbook so you can take a look. I think the answer is going to involve fixing a calculated field...scroll down for details, or read on for more background.


      Here's a summary of what I'm trying to do:


      I'm using custom SQL to bring in performance data from SQL server. Unfortunately, that data source lacks a critical crosswalk, which is located on a different server (the table is pretty small, and I can also just dump it into excel, as I did for the attached workbook--either way it's a separate source). I'm also linking to target data in excel.


      In my first few attempts, I discovered that it was easiest to bring the target data in using data blending (when I tried to use a cross-database join, the data kept blowing up and giving me incorrect totals). The file I've attached here shows the performance data (distinct count of participants) correctly charted against targets using the blend approach, but with the crosswalk not yet incorporated.


      The problem is that I also need to use the crosswalk to reassign a substantial percentage of the participants to the correct geographic area.


      First, I tried to bring the crosswalk in as a cross-database join. In some ways that worked--I was able to write a calculated field to pull the correct area for each seeker--but it caused the charts to break. Suddenly, my distinct count of participants, charted against the targets, was no longer working. The error message indicated that the problem was with using a distinct count and a blend at the same time. It seems odd to me that the problem only appeared once I added the cross-database join, since the blend was there all along. Also, the cross-database join did not use the field involved in the distinct count, and the blend also doesn't link on that field.


      Next, I got rid of the cross-database join and tried to bring in the crosswalk as another data blend. However, now the calculated field that associates the participants to the correct area is not working.


      This formula worked when I was using the cross-database join, but doesn't work with the blended data (with the full datasource.page.field name used for [WDA_Crosswalk]:


                     IF [WDA_ETOData] = "Central Office" THEN [WDA_Crosswalk]

                     ELSE [WDA_ETOData]




      However, with the data blend (linked on the same field, SeekerZIP), I get the error "All fields must be aggregate or constant when using table calculation functions or fields from multiple data sources". The source fields in the calculated field are text, and in the attached file, that field is called "Final WDA". (It's also stuck in the Measures category when it should be a Dimension).



        • 1. Re: Help needed--data blending plus calculated field for crosswalk.
          Joe Oppelt

          I just took a cursory look here.  When I opened the workbook (in 10.1) it upgraded, so I'm not going to post an updated workbook here that you probably can't read.


          I changed the calc with the error to this:



                 IF attr([Area]) = "Central Office" THEN attr(([Zip_WDA_Crosswalk (OfficeZipCrosswalk)].[WDA]))

                  ELSE attr([Area])




          Now there is no error.  That's how you get around "must aggregate" errors.  I don't know if this fixes things for you, but that's something you'll need to do to use stuff from the secondary source.  (You can also do MIN() or MAX() on strings.  That would also satisfy Tableau's need to bring aggregated data from the secondary source.)

          • 2. Re: Help needed--data blending plus calculated field for crosswalk.
            Sarah McGraw

            Hi, and thanks for the quick reply. Unfortunately, that doesn't fix the problem (I guess what it does is reveal a different problem, which is still useful).


            What is supposed to be happening is this:


            The source file has (among other data), an ID, an Area (13 choices, one of which is undesirable), and a zip code. The crosswalk has a zip code and an Area (12 choices). The blend (join) is on ZIP.


            The calculated field is supposed to check to see whether the "source" Area is the undesirable 13th choice. If it is, it's supposed to supply the "crosswalk" Area that is associated with the blended ZIP. And if the ZIP is not in the crosswalk, the field should supply the value "Statewide".


            Using "ATTR", "MAX" or "MIN" is not working at all (it's supplying a single value for all rows--either one of the Areas or "*" for multivalue). This seems like a simple matter of the blend on ZIP not working. So, why isn't it working? Could this be a data type issue?

            • 3. Re: Help needed--data blending plus calculated field for crosswalk.
              Joe Oppelt

              I was hoping you would mention what version your workbook is so that any modifications I make will be accessible to you. 

              • 4. Re: Help needed--data blending plus calculated field for crosswalk.
                Sarah McGraw

                Ah! 10.0. Been meaning to get around to the upgrade.


                -Sarah McGraw

                • 5. Re: Help needed--data blending plus calculated field for crosswalk.
                  Joe Oppelt

                  See attached.


                  What I did in here was made a duplicate of the ZIP data source, and blended just on AREA.  I did this because you probably still need to do your other blend by ZIP for other stuff, and you can use the original copy to do that.


                  So I blended just on AREA.  See Sheet 3.  It's just a simple viz to show how the blend is working.  But notice that ""Central Office" is null.  So See Sheet 4.


                  I made a calc to do a ZN on number of records.  (See Calc 2.)  That function forces a zero if the value is null.  Now we can hook text to the null" bar as well.

                  See [Calc 1].  You will use that value instead of WDA.  It's a value that resides in the primary source, so no matter what other blends you also have on the sheet, this value is available to you for each primary-source area.


                  BTW, the IF logic in Calc 1 could also say:


                  IF ATTR([Calculation 2]) = 0 then "Statewide"



                  If you ever end up with multiple [Area] values that are "undesirable", this would set them all to "Statewide".

                  • 6. Re: Help needed--data blending plus calculated field for crosswalk.
                    Sarah McGraw

                    Thank you so much! Unfortunately, this ends up looking pretty different than what I need, assuming I've understood your workbook correctly (which I shouldn't assume). The calculation I need isn't something I'd use as a measure at all: it's just a dimension to bucket distinct SubjectIDs by area. I won't put in on the chart in any way except as a filter, so that the user can choose an area and see the Distinct SeekerID totals for that area (it would replace the "Choose a WDA slicer/filter on the dashboard). Is there a way to move this element into a filter that I haven't figured out? I don't seem to be able to create an additional "Area chooser" calculated field and set it equal to "Calculation1". Is there a way to pass it through a parameter?


                    Also, it is definitely necessary to go through the zips to get to the areas in the crosswalk--that is because there are many records that have an invalid area ("Central Office") but a valid zip in the crosswalk, which allows me to bring in the valid area with the calculation. When the areas are replaced there should only be a couple thousand unassigned SeekerID remaining. (There are also records where both areas are valid but don't match each other--in those cases I need to use the value from the SQL table. I'm not sure what would happen to these if I blend on Area--would the record get duplicated?).


                    Would it be helpful if I attached the cross-database join version? Of course, that one won't let me do the distinct count while simultaneously bringing in targets.

                    • 7. Re: Help needed--data blending plus calculated field for crosswalk.
                      Joe Oppelt

                      A lot of what you are saying requires more knowledge of your data than I have time to digest.


                      But let me point out that you can still use Calculation1 as a filter.  You need to have [Area] on the sheet to do that, so I added it to the Targets by Region sheet to demonstrate this.  Then I put Calculation1 on the filter shelf.  there it is.  I added it to the dashboard, and it works on the Targets by Region sheet.  (See attached.)


                      You can still do whatever you want with the ZIP-blended original data source.  You can have yet another calc overriding the values in Calculation1 (call it Calculation3 just for identification purposes here.)  If you have a valid ZIP as identified in the original crosswalk data source, pull that value from the original source.


                      Or, perhaps you can try doing a blend on the original data source using ZIP AND Area.  You might be able to work out all the logic in one calc.  I just don't know enough about the data to work out the details on that.

                      • 8. Re: Help needed--data blending plus calculated field for crosswalk.
                        Sarah McGraw

                        Wow, this is going to take a bit to digest. It does look like the math is off here (some duplications happening somewhere), but that may be something I can fix. Thanks for your help, and hopefully this solution works out.