5 Replies Latest reply on Sep 27, 2018 1:22 PM by Josh Brockner

    Flow Failed: An error has occurred while running the flow.

    Josh Brockner

      Hi. I have a flow that will not run the output step. No errors are listed in my flow until I attempt to run the output; then an error appears in the upper right of the canvas. This error was "System error:null" on the original flow, and is now "System error:java.lang.RuntimeException:java.lang.NullPointerException" in a test flow that I made to reduce the number of joins in the flow.


      In the original flow I am joining data from 6 tables. I added output steps after each step to see where in the flow the error begins. All table joins work fine. I then run one aggregate on all fields to remove duplicate rows; this also works fine. The next step is a second aggregate that counts records for each courseID: the rows have courseID and professorID, and some courses have multiple professors, so this aggregate tells me how many professors I have for each courseID. I am able to output this second aggregate as well. However, I then need to join these two aggregate steps to add the total number of professors in the course back to the original (unduplicated) data, and it is at this point that my outputs start failing.
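For anyone trying to reproduce the shape of this flow outside Prep, here is a minimal pandas sketch of the same three steps. Only the courseID/professorID field names come from the description above; the data values are invented:

```python
import pandas as pd

# Made-up rows standing in for the result of the 6-table join;
# only courseID/professorID follow the post's naming.
rows = pd.DataFrame({
    "courseID":    ["C1", "C1", "C1", "C2", "C2", "C2"],
    "professorID": ["P1", "P1", "P2", "P1", "P3", "P4"],
})

# Aggregate 1: group on all fields to remove duplicate rows.
deduped = rows.drop_duplicates()

# Aggregate 2: count professors per courseID.
prof_counts = (deduped.groupby("courseID", as_index=False)
                      .agg(professor_count=("professorID", "size")))

# Join the count back onto the deduplicated data on courseID -
# the step where the flow in this thread starts failing.
result = deduped.merge(prof_counts, on="courseID")
```

In pandas this join succeeds as long as the key column has no missing values, which is one reason to suspect the data rather than the flow shape.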


      I created the test flow by running off an output file of the joins of the first 5 tables to see if I was hitting a memory error. I still get a failure after joining the aggregate steps.


      Has anyone seen anything similar or have any feedback on the system errors?

        • 1. Re: Flow Failed: An error has occurred while running the flow.
          Jim Dehner

          Hi Josh

          I would really like to see the flow - can you post the file?


          There is something in the wording of your post that I question -

          Prep is designed as a preprocessor to shape, clean and connect data files that are then used in Tableau (or other) for analysis -

          When you "Aggregate" the data you are reducing the dimensions - think of it like using an LOD in Tableau - you are creating a different level of detail in the data - except in Prep the "different level" is not virtual: you have re-shaped the data to reduce the dimension. Now you pivot the data - reshaping again - and then try to rejoin it with the original file? What did you join on that wasn't already consolidated or reshaped?




          • 2. Re: Flow Failed: An error has occurred while running the flow.
            Josh Brockner

            I'm using the aggregate to create a measure, for the most part, then adding that measure back to the rest of the data with a join. In my case I have a field for class ID that is not unique to each row; the unique field is a calculated field that combines the class ID with the Faculty ID, since each class can have one or more faculty. I need to find the percentage of the class that should be assigned to each Faculty member for that class. To do that, I run an aggregate on the classID field to create a measure with the total number of rows in which each classID appears. This step leaves me with a classID and a total of the Faculty for that class. When I join back to the rest of the data on classID, I am just adding the total Faculty measure to the rest of the data, and from there I can create a calculation to find the % of the class to attribute to each faculty member for each class. It may not be the best way to create the calculation.
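For what it's worth, outside Prep this measure does not need an aggregate-then-join round trip at all. In pandas, for example, a grouped transform attaches the per-class count directly to every row. The classID/FacultyID names follow the post; the values are invented:

```python
import pandas as pd

# Invented example rows; classID/FacultyID follow the post's naming.
df = pd.DataFrame({
    "classID":   ["A", "A", "B"],
    "FacultyID": ["F1", "F2", "F3"],
})

# transform("size") broadcasts each group's row count back onto
# every row of that group, replacing the aggregate + self-join.
df["total_faculty"] = df.groupby("classID")["FacultyID"].transform("size")

# The share of the class attributed to each faculty member.
df["faculty_pct"] = 1 / df["total_faculty"]
```

This is the same idea as an LOD expression in Tableau: the count lives at the classID level but is displayed at the row level.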


            I tested it further yesterday on a system with more memory but it still failed. It was actually failing before the join and before any aggregation, while in earlier testing it failed after the aggregation. One thing that I didn't mention was that this flow also contains custom SQL as one of the data sources. I then created a new test flow using the output of the first 5 table joins and the output of the custom SQL, and it still failed. So now I am wondering if there is something in the data I am pulling with the custom SQL that is causing Tableau Prep to fail.


            Is it possible for the flow to fail based on something in the data? Does Tableau Prep somehow sanitize the data it pulls from a data source in a way that I am missing in my custom SQL?
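One cheap way to test the "something in the data" theory is to audit the custom-SQL output for nulls in the join keys before it ever reaches the flow. A sketch in pandas - the frame and column names here are stand-ins, not the real query output:

```python
import pandas as pd

# Stand-in for the rows returned by the custom SQL.
sql_rows = pd.DataFrame({
    "classID":   ["A", None, "B"],
    "FacultyID": ["F1", "F2", None],
})

# Nulls per column: null join keys are a classic source of
# NullPointerException-style failures further down a pipeline.
null_counts = sql_rows.isna().sum()

# Pull out the rows whose join key is null for inspection.
bad_rows = sql_rows[sql_rows["classID"].isna()]
```

If `bad_rows` is non-empty, a COALESCE (or an explicit WHERE key IS NOT NULL) in the custom SQL is usually the simplest fix.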

            • 3. Re: Flow Failed: An error has occurred while running the flow.
              Josh Brockner

              So my tests were a red herring. My current hypothesis is that the error is the result of a join on calculated fields. I am using a calculated field that is a concatenation of three other fields to make a unique identifier.
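That hypothesis is easy to demonstrate outside Prep: if any one of the three fields is null, the concatenated identifier is null too, and every such row gets a missing join key. A pandas sketch - the three field names are invented stand-ins for whatever is being concatenated:

```python
import pandas as pd

df = pd.DataFrame({
    "classID":   ["A", "B"],
    "termID":    ["T1", None],   # one null component
    "FacultyID": ["F1", "F2"],
})

# str.cat propagates missing values: any null component makes
# the whole concatenated key missing for that row.
df["key"] = df["classID"].str.cat([df["termID"], df["FacultyID"]], sep="|")

# Guard: replace nulls with a sentinel before concatenating.
df["safe_key"] = df["classID"].str.cat(
    [df["termID"], df["FacultyID"]], sep="|", na_rep="")
```

The equivalent guard in a Prep calculated field would be wrapping each component so that nulls become a fixed value before concatenation.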

              • 4. Re: Flow Failed: An error has occurred while running the flow.
                Jim Dehner


                This is the heart of the question I asked before -

                did you create a "Calculated field" or did you do an "aggregation"?


                Calculated field

                see the csv file attached - it is the output of the step 


                I can't see your data so I don't know what you want to do - but I think it is the latter that reshapes the data



                If this post assists in resolving the question, please mark it helpful or as the 'correct answer' if it resolves the question. This will help other users find the same answer/resolution.  Thank you.

                • 5. Re: Flow Failed: An error has occurred while running the flow.
                  Josh Brockner

                  I have aggregated steps and calculated fields. I am using an aggregation step then rejoining to the rest of the data on the one field that I ran the aggregation on in order to get a sum of the rows that share that field entry. As far as I can tell, I can't calculate this with a calculated field. For example, using your images as an example, it would be like trying to find the total number of orders from each customer and adding that field to the original row level data. I can't find a way to do that using the calculated field in tableau Prep, but I can and have done it as I described above. Create an aggregate step to group on customerID and aggregated Number of Rows, then join that aggregation step back to the previous step on customerID to then add the number of rows containing customerID to the original row level data.