Roughly, we look at the schema for the table and try to make a guess at the number of bytes in a row. We then adjust the number of rows we pull in to hit a target size. So we'll pull in fewer rows for wider tables with larger data types (like strings), and more rows for narrow tables.
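To make the idea concrete, here is a minimal Python sketch of that kind of estimate. It is purely illustrative: the per-type byte sizes, the target size, and the function names are all made-up assumptions, not the values or code the product actually uses.

```python
# Hypothetical per-type byte estimates (assumptions, not the product's real values).
ASSUMED_BYTES_PER_TYPE = {
    "int": 8,
    "float": 8,
    "bool": 1,
    "date": 8,
    "string": 64,  # strings are assumed to be much wider than numeric types
}

# Hypothetical target sample size in bytes (assumption).
ASSUMED_TARGET_BYTES = 64 * 1024 * 1024


def estimate_row_bytes(schema):
    """Guess the size of one row from a {column_name: type_name} schema."""
    return sum(ASSUMED_BYTES_PER_TYPE[type_name] for type_name in schema.values())


def rows_to_sample(schema, target_bytes=ASSUMED_TARGET_BYTES):
    """Return how many rows fit in the target size (at least one)."""
    row_bytes = estimate_row_bytes(schema)
    if row_bytes == 0:
        raise ValueError("schema has no columns")
    return max(1, target_bytes // row_bytes)


# A narrow table versus a wide table with several string columns.
narrow = {"id": "int", "amount": "float"}
wide = {"id": "int", "name": "string", "address": "string", "notes": "string"}

narrow_rows = rows_to_sample(narrow)
wide_rows = rows_to_sample(wide)
print(narrow_rows, wide_rows)
```

With these assumed numbers the narrow schema gets many more rows than the wide one, and dropping a string column from the wide schema raises its row count.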
BTW, we make the determination *after* looking at the choices you make in the input step. So if you can cull the columns that you pull in, we'll pull in more rows for the columns that remain. We also sample *after* the filters you apply in the input step, so adding filters there will give you a sample of the rows you actually care about.

Hope this helps,
-Isaac