5 Replies Latest reply on Apr 12, 2016 3:52 PM by Dmitry Chirkov

    Extracting Data

    Chrissy Scott

      Hi All,

       

      I am having some real problems creating an extract for my data. It's less than 250,000 rows and 30 columns. Left join on two tables. All unused fields are hidden. It's taking about 2 hours to complete. Ideas? Is 30 columns too many?

       

      I created a workbook with the same two tables, not joined. Refreshing the two extracts takes about 6 minutes. Does the join slow down the extract?

       

      Thanks,
      Chrissy

        • 1. Re: Extracting Data
          Andrew Connolly

          Hi Chrissy--

           

          The answer will come from the data side here. Are you joining on the correct field(s)? In the database, are these fields indexed? If you were to select the same rows/columns directly from the database, how long would it take?

          • 2. Re: Extracting Data
            Chrissy Scott

            Hi Andrew,

             

            Thanks for the response. Let me try to explain my logic. The data is survey data. Calls come in to the service desk and are assigned a "Call Number." If a survey is triggered for that Call Number a survey is sent. We have one table that lists all the call numbers where a survey was sent. And another table that lists just the surveys that were returned, by call number. Note: in one table the field is "Call Number" and in the other it's "Call No."

             

            So, right now I'm linking the the data by "Call Number."

             

            The table on the left lists all the call numbers where surveys were sent and the table on the right lists only call numbers if the survey was returned (and I'm using a Left Join) so the result lists all Call Numbers where surveys were sent and the survey scores if the survey was returned. The return rate is low, 18%-ish. So there are a lot of null values in the table.

             

            When the table was created directs from the database, it took less than 1 minute.

             

            Should I be joining differently?

            • 3. Re: Extracting Data
              Chrissy Scott

              Hi,

               

              I did some testing with the join field. Depending on the field I choose, it goes a lot faster. But... it's saying 1 million + rows when there are only about 250 tickets. Do you know what would cause that?

               

              Thanks,

              Chrissy

              • 4. Re: Extracting Data
                Andrew Connolly

                Sorry Chrissy--without knowing your database and data model, it's tough to answer confidently. If you're returning more rows than you expected, it's probably not the correct field to join on--i.e. that field does not have a 1 to 1 relationship between the rows in your two tables. Have you tried joining on multiple fields? Maybe that will return the right rows with the increased performance.

                 

                You mentioned that you had hidden all unused fields, which help. Have you also tried enabling the "Aggregate data for visible dimensions" option? This can improve extract creation performance as well.

                • 5. Re: Extracting Data
                  Dmitry Chirkov

                  Two simple questions:

                  1) What is the underlying datasource? Files? Database?

                  2) How many records (put "Number of Records" on the viz) do you get from the join and each individual table?

                   

                  -Dmitry