    Tableau Prep Sampling

    Will Griffiths

      Hi there,


      Could someone clarify how the sampling is calculated as part of the data input? There is a basic explanation, but it would be good to get a fuller explanation of what is going on behind the scenes.




      For example, if I were to give it a table with 20 fields and 1,000,000 rows, how many rows would be returned?




          Isaac Kunen

          Hi Will,


          Roughly, we look at the schema for the table and try to make a guess at the number of bytes in a row. We then adjust the number of rows we pull in to hit a target size. So we'll pull in fewer rows for wider tables with larger data types (like strings), and more rows for narrow tables.
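
          To make the idea concrete, here is a minimal Python sketch of that kind of calculation. It is not Tableau Prep's actual code: the per-type byte sizes, the target size, and the function names are all assumptions made up for illustration.

```python
# Hypothetical per-type size guesses in bytes (assumed values, not Tableau Prep's).
ASSUMED_BYTES_PER_TYPE = {
    "integer": 8,
    "float": 8,
    "date": 8,
    "boolean": 1,
    "string": 64,
}

# Hypothetical sample budget in bytes (assumed value).
ASSUMED_TARGET_BYTES = 72_000_000


def estimate_row_bytes(schema):
    """Guess the width of one row by summing a size guess for each column type."""
    return sum(ASSUMED_BYTES_PER_TYPE[col_type] for col_type in schema.values())


def rows_for_target(schema, total_rows, target_bytes=ASSUMED_TARGET_BYTES):
    """Rows that fit in the byte budget, never more than the table holds."""
    return min(total_rows, target_bytes // estimate_row_bytes(schema))


# A 20-field table: 10 string columns and 10 integer columns (assumed mix).
string_cols = {f"s{i}": "string" for i in range(10)}
integer_cols = {f"n{i}": "integer" for i in range(10)}
wide_schema = {**string_cols, **integer_cols}

# The same table with the string columns removed in the input step.
narrow_schema = dict(integer_cols)

wide_rows = rows_for_target(wide_schema, total_rows=1_000_000)
narrow_rows = rows_for_target(narrow_schema, total_rows=1_000_000)
print(wide_rows, narrow_rows)  # 100000 900000
```

          With these made-up numbers, the 20-field table is estimated at 720 bytes per row and gets 100,000 rows, while the 10-integer version is estimated at 80 bytes per row and gets 900,000. The real figures will differ; only the shape of the trade-off is the point.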


          BTW, we make this row-size estimate *after* looking at the choices you make in the input step. So if you can cull the columns that you pull in, we'll pull in more rows for the columns that remain. We also sample *after* the filters you apply in the input step, so adding filters there will give you a sample of the rows you actually care about.
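
          The effect of filtering before sampling can be sketched in a few lines of Python. This is only an illustration under assumptions: the data is invented, the function names are hypothetical, and "sampling" here simply means taking the first rows up to a limit, which may not match what Prep does internally.

```python
from itertools import islice


def sample_after_filter(rows, keep, limit):
    """Apply the filter first, then take up to `limit` of the surviving rows."""
    return list(islice((row for row in rows if keep(row)), limit))


def sample_before_filter(rows, keep, limit):
    """For contrast: take `limit` rows first, then filter what was taken."""
    return [row for row in islice(rows, limit) if keep(row)]


# Invented data: 1,000 rows, one in ten belongs to the "EU" region.
rows = [{"id": i, "region": "EU" if i % 10 == 0 else "US"} for i in range(1000)]


def is_eu(row):
    return row["region"] == "EU"


after = sample_after_filter(rows, is_eu, limit=50)
before = sample_before_filter(rows, is_eu, limit=50)
print(len(after), len(before))  # 50 5
```

          Filtering first fills the whole 50-row sample with rows that match, whereas sampling first leaves only the 5 matching rows that happened to fall in the first 50.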

