Thank you, Norbert. I was hoping for more detail on what "Tableau takes a best guess" actually means.
Hmmm.... What I expect is the following.
All fields with "text"-values will be assigned as a Dimension.
All fields with "date"-values will be assigned as a Dimension.
All fields with "integer" values will be assigned as a Measure.
If the "best guess" turns out to be wrong, a simple drag & drop is enough to correct it.
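To make that expectation concrete, here is a minimal Python sketch of the mapping described above. It only encodes what I expect to happen, not Tableau's actual logic; `default_role` and the type labels are names made up for illustration.

```python
def default_role(data_type: str) -> str:
    """Return the role expected for a field of the given type.

    Hypothetical helper: mirrors the three rules listed in this post,
    not anything taken from Tableau itself.
    """
    expected_roles = {
        "text": "Dimension",
        "date": "Dimension",
        "integer": "Measure",
    }
    return expected_roles[data_type]


for data_type in ("text", "date", "integer"):
    print(data_type, "->", default_role(data_type))
```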
Up front: understanding your underlying data model is key.
Tableau has done a good job of documenting this -- see the links below:
More helpful info:
I'd also recommend viewing all the training material -- specifically, the "Why is Tableau Doing That?" videos should help you understand the types of fields we can use in Tableau:
Thank you, but your answer still does not address my question. What is "all fields"? Does Tableau look at all of the values in a column to determine whether it is a date, does it sample the top 100 rows, or does it look at the information schema? You keep giving me a "human" front-end answer, whereas I am looking for an exact, deep technical understanding.
The link you provided does not address my question. I was not looking for a lesson on the differences between Dimensions and Measures, but rather asking a specific technical question about what Tableau uses to identify data types.
Best wishes to both of you and thank you.
Thank you everyone,
Let me restate the question, as I am obviously confusing everyone. If we take a table of a billion rows and join it with another table of a billion rows in a Custom Query data connection, Tableau has a few options for determining each field's data type:
1) Look at information tables that hold the metadata. But this would mean Tableau needs to decode the custom SQL and determine which fields belong to which table.
2) Take a sample of X rows and then parse the data
2a) Take first X rows
2b) Do random sampling
3) Something else I have not considered?
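On option 1, one general database technique avoids decoding the custom SQL at all: treat the query as an opaque subquery, ask for zero rows, and read the result-set metadata that the driver returns. The sketch below shows the idea with Python's built-in `sqlite3` module. It is an assumption about what a client tool could do, not a statement of what Tableau does; the tables, the query, and `result_columns` are invented for illustration, and the `LIMIT 0` syntax varies by database.

```python
import sqlite3


def result_columns(conn, custom_sql):
    """Return (column names, rows fetched) for a query, requesting zero rows."""
    # The custom SQL is never parsed here; it is wrapped as an opaque subquery.
    cur = conn.execute(f"SELECT * FROM ({custom_sql}) AS t LIMIT 0")
    rows = cur.fetchall()  # empty, because of LIMIT 0
    names = [col[0] for col in cur.description]
    return names, rows


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, customer_id INTEGER, amount REAL);
    CREATE TABLE customers (id INTEGER, name TEXT);
""")

# Stand-in for a user's Custom Query joining two (possibly huge) tables.
custom_sql = """
    SELECT o.id AS order_id, c.name AS customer_name, o.amount AS amount
    FROM orders o JOIN customers c ON c.id = o.customer_id
"""

names, rows = result_columns(conn, custom_sql)
print(names)  # ['order_id', 'customer_name', 'amount']
print(rows)   # []
```

The `sqlite3` driver fills in only the column names in `cursor.description`; the Python DB-API also defines a `type_code` slot there, which drivers for other databases populate with the column type.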
The key here is that the tables could be huge, so Tableau cannot realistically examine every value in every field. How, then, does it stay efficient when determining the data type?
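To show why the choice between 2a and 2b matters, here is a small Python sketch of sample-based type inference. Everything in it is an assumption made for illustration (the `classify` rules, the single date format, the sample size of 100); it does not describe Tableau's behavior.

```python
import random
from datetime import datetime
from itertools import islice


def classify(value: str) -> str:
    """Classify one raw string as 'integer', 'date', or 'text'."""
    try:
        int(value)
        return "integer"
    except ValueError:
        pass
    try:
        datetime.strptime(value, "%Y-%m-%d")  # assumed date format
        return "date"
    except ValueError:
        return "text"


def infer_type(sample) -> str:
    """Pick one type for a sample; any disagreement falls back to 'text'."""
    kinds = {classify(v) for v in sample}
    return kinds.pop() if len(kinds) == 1 else "text"


def first_rows(rows, n):
    """Option 2a: take the first n rows."""
    return list(islice(rows, n))


def random_rows(rows, n, seed=0):
    """Option 2b: take n rows uniformly at random (needs the full row list)."""
    rows = list(rows)
    return random.Random(seed).sample(rows, min(n, len(rows)))


# A column that looks numeric at the top but is mostly text further down.
column = [str(i) for i in range(100)] + ["n/a"] * 900

print(infer_type(first_rows(column, 100)))  # integer
print(infer_type(column))                   # text
print(infer_type(random_rows(column, 100)))
```

Here the first 100 rows suggest an integer column while a full scan says text, which is exactly the risk of 2a. A random sample is far more likely to catch the stray values, but drawing one from a billion-row join is itself expensive unless the database can do the sampling.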