In my case we don't have too many publishers, so as admin I disable publishing for everyone by default and ask them to let me know when they are ready to publish for the first time. Then I take them through the process and explain the difference between an extract and a live connection (if needed) and all the other aspects that you mentioned in your post. After that I enable the publishing permission. I can still intervene afterwards to mitigate any overloads and other inefficiencies.
I guess this wouldn't work if you have many publishers across many sites. You can either set up a process where only you can publish, so publishers submit workbooks to you and you publish them (a Server Nazi option), or, perhaps better, allow them to do whatever they want, monitor extract sizes, refresh times, etc. via the Server admin views, and intervene as required.
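To make the "monitor and intervene" option a bit more concrete, here is a rough sketch of the kind of check I have in mind. It assumes you have somehow copied extract statistics out of the admin views into plain records; the `ExtractStat` fields, the function names and the sample numbers are all made up for illustration and are not part of any Tableau API. It only ranks extracts relative to each other rather than applying a fixed limit.

```python
from dataclasses import dataclass


@dataclass
class ExtractStat:
    # Hypothetical record; field names are illustrative, not a Tableau schema.
    workbook: str
    owner: str
    size_mb: float
    refresh_seconds: float


def slowest_refreshes(stats, top_n=3):
    """Return the top_n extracts by refresh time, slowest first."""
    return sorted(stats, key=lambda s: s.refresh_seconds, reverse=True)[:top_n]


def largest_extracts(stats, top_n=3):
    """Return the top_n extracts by size on disk, largest first."""
    return sorted(stats, key=lambda s: s.size_mb, reverse=True)[:top_n]


if __name__ == "__main__":
    # Made-up sample data standing in for whatever the admin views report.
    sample = [
        ExtractStat("Sales Overview", "alice", 120.0, 95.0),
        ExtractStat("Custom SQL Report", "bob", 4.0, 1800.0),
        ExtractStat("Web Traffic", "carol", 900.0, 240.0),
    ]
    for s in slowest_refreshes(sample, top_n=2):
        print(f"{s.workbook} ({s.owner}): {s.refresh_seconds:.0f}s refresh")
    for s in largest_extracts(sample, top_n=1):
        print(f"{s.workbook} ({s.owner}): {s.size_mb:.0f} MB")
```

The point of ranking instead of thresholding is that the list tells you whom to talk to first, without pretending there is one size or duration that is "too much" for every data source.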
I don't think it makes sense to set any limits on data size or number of rows, as Server and extract performance depend on so many factors. A full refresh of 1M rows straight from a well-indexed table might run much faster than a convoluted custom-SQL query on 1K rows with 20 joins and no indices. Also, some databases are even faster than Tableau extracts, e.g. VectorWise, so having a live connection to several million rows in one of those is no issue.
Thanks DB, good points. Yeah, I've got multiple users across the globe. Definitely not a lot but growing.