7 Replies Latest reply on Sep 21, 2016 10:06 AM by Luciano Vasconcelos

    Desktop: 11min duplicated entries from csv

    David Hodgson

      I have a CSV with one row per minute, which I'm analysing further.  With Tableau Desktop 9.3.6 I've started seeing duplicated data in the results, even though there are no duplicates in the source data.  I hadn't noticed this previously, but that doesn't mean it wasn't there.

       

      Here's how Tableau presents a count of the entries for the first few minutes:

      Measure Names            | Minute of time_of_day | Measure Values
      Count of Live.Households | 1 January 1900 00:00  | 2
      Count of Live.Households | 1 January 1900 00:00  | 2
      Count of Live.Households | 1 January 1900 00:01  | 1
      Count of Live.Households | 1 January 1900 00:01  | 1
      Count of Live.Households | 1 January 1900 00:02  | 1
      Count of Live.Households | 1 January 1900 00:02  | 1
      Count of Live.Households | 1 January 1900 00:03  | 1
      Count of Live.Households | 1 January 1900 00:03  | 1
      Count of Live.Households | 1 January 1900 00:04  | 1
      Count of Live.Households | 1 January 1900 00:04  | 1
      Count of Live.Households | 1 January 1900 00:06  | 1
      Count of Live.Households | 1 January 1900 00:06  | 1
      Count of Live.Households | 1 January 1900 00:07  | 1
      Count of Live.Households | 1 January 1900 00:07  | 1
      Count of Live.Households | 1 January 1900 00:08  | 1
      Count of Live.Households | 1 January 1900 00:08  | 1
      Count of Live.Households | 1 January 1900 00:09  | 1
      Count of Live.Households | 1 January 1900 00:09  | 1
      Count of Live.Households | 1 January 1900 00:10  | 1
      Count of Live.Households | 1 January 1900 00:10  | 1
      Count of Live.Households | 1 January 1900 00:11  | 2
      Count of Live.Households | 1 January 1900 00:11  | 2
      Count of Live.Households | 1 January 1900 00:12  | 1
      Count of Live.Households | 1 January 1900 00:12  | 1
      Count of Live.Households | 1 January 1900 00:13  | 1
      Count of Live.Households | 1 January 1900 00:13  | 1
      Count of Live.Households | 1 January 1900 00:14  | 1
      Count of Live.Households | 1 January 1900 00:14  | 1

       

      The time_of_day is calculated: DATETIME([Time]-DATETRUNC("day", [Time] ))
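
For comparison, here is a rough Python analogue of that calculation: subtract midnight of the same day from the timestamp, then re-anchor the remainder on a fixed base date. This is only a sketch using exact `timedelta` arithmetic; it does not reproduce how Tableau represents dates internally. The base date of 1 January 1900 is an assumption taken from the view output above, and `time_of_day` and `BASE` are names made up for this example.

```python
from datetime import datetime

# Assumption: base date copied from the "1 January 1900" shown in the view.
BASE = datetime(1900, 1, 1)


def time_of_day(ts: datetime) -> datetime:
    """Subtract midnight of the same day, then re-anchor the remainder on BASE."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    return BASE + (ts - midnight)


# One of the timestamps from the CSV, in its dd/mm/yyyy format.
sample = datetime.strptime("16/09/2016 00:05", "%d/%m/%Y %H:%M")
print(time_of_day(sample))
```

With exact arithmetic like this, every source minute maps to its own minute of the day, which is the behaviour I expected from the Tableau calculation.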

       

      Here's the actual source data from the CSV file:

      time             | Live.Households
      16/09/2016 00:00 | 57
      16/09/2016 00:01 | 50
      16/09/2016 00:02 | 50
      16/09/2016 00:03 | 59
      16/09/2016 00:04 | 53
      16/09/2016 00:05 | 52
      16/09/2016 00:06 | 48
      16/09/2016 00:07 | 47
      16/09/2016 00:08 | 44
      16/09/2016 00:09 | 40
      16/09/2016 00:10 | 43
      16/09/2016 00:11 | 57
      16/09/2016 00:12 | 50

       

      Notice no duplicates!
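
The same check can be done outside Tableau. The sketch below counts how often each timestamp occurs in the sample rows quoted above; the rows are pasted inline as comma-separated text, which is an assumption about the file layout, and only this excerpt is checked, not the full file.

```python
from collections import Counter
from datetime import datetime

# Sample rows copied from the excerpt above (assumed comma-separated).
rows = """16/09/2016 00:00,57
16/09/2016 00:01,50
16/09/2016 00:02,50
16/09/2016 00:03,59
16/09/2016 00:04,53
16/09/2016 00:05,52
16/09/2016 00:06,48
16/09/2016 00:07,47
16/09/2016 00:08,44
16/09/2016 00:09,40
16/09/2016 00:10,43
16/09/2016 00:11,57
16/09/2016 00:12,50"""

# Parse the timestamp column and count occurrences of each minute.
counts = Counter(
    datetime.strptime(line.split(",")[0], "%d/%m/%Y %H:%M")
    for line in rows.splitlines()
)

# Any timestamp seen more than once would be a genuine duplicate in the source.
duplicates = [ts for ts, n in counts.items() if n > 1]
print(len(counts), "distinct timestamps; duplicates:", duplicates)
```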

       

      The duplicates occur about every 11 minutes, which is close to 1/128 of a day (11.25 minutes), which makes me think it's a data truncation / rounding issue.
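
To make the 1/128-day arithmetic concrete, here is a small sketch that converts that step to minutes and lists which whole minutes the first few multiples fall in. It only works through the numbers behind the hypothesis and says nothing about what Tableau actually does internally; `step` and `boundaries` are names invented for this example.

```python
from fractions import Fraction

MINUTES_PER_DAY = 24 * 60

# 1/128 of a day, expressed exactly in minutes.
step = Fraction(MINUTES_PER_DAY, 128)
print(float(step))  # 11.25

# Whole minute containing each of the first six multiples of 1/128 day.
boundaries = [int(k * step) for k in range(6)]
print(boundaries)  # [0, 11, 22, 33, 45, 56]
```

The first two values, minute 0 and minute 11, line up with the two rows in the view above that show a count of 2. That is consistent with the rounding hypothesis but does not prove it.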

       

      Is this a bug or did I make a mistake in my calculation of the time_of_day?

       

      It's quite dramatic in the graphs: