516 Replies. Latest reply on May 24, 2017 11:30 AM by Minh Nguyen

    Grow your own Filled Maps

    Richard Leeke

      Version 7 filled maps are great, but the one thing you can't do is define your own shapes.

       

      There was a lot of discussion on the version 7 beta forum about how cool it would be to be able to import definitions of shapes while defining custom geocoding. I was particularly interested in that having spent a few years as the technical lead on the project that digitised New Zealand's survey records and put them online - so I was really keen to see what I could visualise with all that data I helped make available.

       

      So when the answer came back from the Tableau folk during the beta that there would definitely be no support for extensibility in version 7, I couldn't resist the challenge. I'm not known for taking no for an answer.

       

      And the answer is that if you're prepared to rummage about under the covers a bit in unsupported territory, it is possible to add shapes to custom geographic roles. There's quite a bit of work to get it going, but with the help of some open source GIS libraries I've managed to automate the whole process and bundle it all up as a tool which makes loading data from a shape file quick and relatively easy (once you've got the hang of it). I've shared it with a couple of people to get a bit of feedback on how usable it is, and I'll happily share it with the wider community once I've tidied up a few of the loose ends the "beta testers" highlighted.

       

      At this point I should stress that this is an unsupported hack, and by unsupported I mean several things:

      1. If it doesn't work as you expect, there's no guarantee it will ever get fixed. Best endeavours, if I'm interested and not too busy, that sort of thing.
      2. If you have any problems with a workbook that uses this approach, don’t even think about asking Tableau for help until you remove the custom geocoding. (I have no idea what Tableau’s attitude would be, but I know what mine would be if I were them.)
      3. It is virtually certain that a future release of Tableau will change how geocoding works in some way that will stop this from working altogether – simply because this approach relies on very specific (and unpublished!) details of the internal structure of the geocoding database. That is bound to change at some point. Hopefully any release that changes things in this way will also include adding support for similar extensibility capabilities, but there’s absolutely no telling.
      4. It uses an open source GIS library - and at least one of the features I'm using (simplification of complex shapes) doesn't work as well as I'd like - but there's nothing I can do about it.

       

      The implications of all of this are clear: don’t use it for anything which you care about. In particular don’t use it for anything which needs to keep working beyond the next release of Tableau.

       

      Personally, I intend to use it for point-in-time, throw-away analysis: blog posts and the like, and also to explore how this sort of capability would be useful if it were a supported part of the product. I strongly suggest you limit your use similarly.

       

      You have been warned.


      Example

       

      Here's a sample viz on Public which illustrates what it can do - and also highlights some of the issues to watch out for which I'll discuss below.

       

      Edit: November 2014 - A recent change on Tableau Public has reduced the maximum precision at which maps can be shown - hence the very jagged coastline shown here. The Tableau Public change also briefly broke a lot of filled maps generated with the hack completely, due to a subtle difference in the internal encoding generated by the hack. Which all goes to emphasize the risks of relying on hacks.

       

      This viz shows the Tsunami warning zones for the area around where I live in New Zealand, with red, orange and yellow reflecting the areas affected by progressively larger or more local events. The green area shows the "meshblock" (a collection of land parcels) which contains my house.

       

      What it Does

       

      The utility takes one or more spatial data files containing polygon data, transforms them to an appropriate geographic coordinate reference system for Tableau to use (i.e. to lat/long coordinates) and generates CSV files in the format needed for creating custom geocoding roles.  After the custom geocoding has been imported to Tableau the utility is run again to insert the polygon boundary data into the custom geocoding database.

       

      The source spatial data can (in principle) be in any spatial data format supported by the Geographic Data Abstraction Library (GDAL, an open source GIS library).  I say in principle because I’ve only tested with ESRI shape files and a handful of other formats, but I see no reason why it shouldn’t work with anything the GDAL utilities can understand (which is an extensive list).

       

      The utility also supports purging of unneeded geographic roles from the resulting custom geocoding database. Reducing the size of the database in this way can improve performance and also reduces the size of any packaged workbooks (which minimises the use of Tableau Public quota when publishing the workbook).

       

      One of the key factors which determines the viability and usability of the resulting geocoding database is the number and complexity of loaded shapes. Too many shapes, or shapes that are too complex, can lead to very poor performance or even an out of memory error - it can simply take Tableau outside the envelope it is designed for.

       

      To help ensure you don't overload it, the utility provides the option to simplify the boundaries of the shapes using the GDAL library and also to display statistics about the complexity which help in deciding the appropriate simplification settings.

       

      However, don’t expect too much.  Simplification of spatial data is a notoriously difficult task and can often lead to anomalies and artefacts in the simplified data (such as missing or overlapping “slivers” at the boundaries of adjoining shapes).
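To make the effect of the tolerance concrete, here is a minimal Python sketch of Douglas-Peucker line simplification. This is an illustration only: the post does not say which algorithm the GDAL simplification uses, so treat the choice of Douglas-Peucker as an assumption, and the function names and sample coordinates are made up.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (all are (x, y) tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def simplify(points, tolerance):
    """Douglas-Peucker: drop points lying within `tolerance` of the simplified line."""
    if len(points) < 3:
        return list(points)
    # Find the point farthest from the line joining the two end points.
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], points[0], points[-1])
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= tolerance:
        return [points[0], points[-1]]
    # Keep the farthest point and simplify each half separately.
    left = simplify(points[:index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right

# A wiggly boundary line in metres (made-up sample data).
line = [(0, 0), (100, 40), (200, -30), (300, 900), (400, 20), (500, 0)]
coarse = simplify(line, 1000)  # tolerance of 1,000 metres
fine = simplify(line, 100)     # tolerance of 100 metres
print(len(line), len(fine), len(coarse))
```

With an approach like this each shape is simplified on its own, so a boundary shared by two neighbouring shapes can end up reduced differently in each of them, which is one plausible way for slivers and overlaps to appear.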

       

      For example, the two screenshots below show a sample of New Zealand electoral boundaries simplified with a tolerance of 1,000 metres (left) and 100 metres (right).  The original shape file with no simplification results in almost 600,000 boundary points being loaded.

       

      Simplifying at 1,000 metre tolerance reduces that to 3,000 boundary points (a factor of 200), which makes the view much more responsive, but clearly introduces a lot of error.  At 100 metres tolerance the number of points is around 16,000 (down by a factor of 40 on the original), which still allows the view to respond quickly whilst also retaining acceptable accuracy.

       

                            Electoral 1000.png Electoral 100.png

       

      Finding the best compromise between simplicity (and hence performance) and accuracy can involve a lot of trial and error. Getting satisfactory results may require manual intervention using a GIS tool. It can be particularly difficult if there is a wide range of sizes of shapes in the one file, since the same level of simplification has to apply to the whole file.

       

      The second tab in the viz embedded above shows a few more examples of different levels of simplification, which you can explore interactively. The reason I've been using the Tsunami Zones shape file for a lot of my testing is because the shapes are extremely complex and clearly very difficult to simplify well.  It's very hard to simplify long, thin highly detailed shapes like that without breaking them. In fact, in this case, it's probably most appropriate to leave them detailed - the exact position of the boundary relative to my house is something I want to know as accurately as possible, and as I've only loaded the data for my local area, there isn't too much data and Tableau can cope with the detail.

       

      The third tab in the viz above demonstrates some of the differences between what you can achieve with filled map support and what you can achieve with "the old approach" of joining all the boundary points as polygons pioneered by Joe Mako a couple of years ago. For one thing, with the filled map approach, there are only three rows in the data source and three (very complex) marks displayed. With the polygon approach, the data source has over 20,000 rows - one for each boundary point. Having only one row per shape makes things much simpler when it comes to using the shapes for actually displaying analysis results (here I'm just using it as a drawing tool, really).

       

      Another difference is that the polygon approach doesn't allow for holes in shapes. The best you can do is draw another shape over the top, representing the hole. I've illustrated this with the Tsunami zone around a headland across the bay from my house. Whilst support for holes in shapes matters a lot for certain types of GIS applications (such as if you're a resident who wants to know if their house is at risk of Tsunami damage), in practice it's probably not often important for the sort of visualisations being done with Tableau.

       

      But there are pros and cons of the custom geocoding hack and the polygon approach - not least that the polygon approach is completely supported by Tableau. So I've also built into the tool the ability to output the data in much the same format as Joe's utility does, allowing the individual points to be plotted as polygons. This just has the advantage over Joe's process that it automates all of the manual steps needed to transform the shape file to an ESRI format file in the right coordinate reference system, and it also supports simplification of complex shapes.

       

      I'm hoping to finish off tidying up loose ends and make this available sometime within the next couple of weeks.

        • 1. Re: Grow your own Filled Maps
          Richard Leeke

          Well "tidying up the loose ends" has turned into a bit of a mission, but it's almost there, and I hope to share it within the next couple of days or so, as soon as I've got some final feedback from the folks who've been testing it for me.

           

          In the meantime, I realised that something which would probably be much more use to most people would be a simpler way to extract the boundary points from shape files to allow drawing maps with polygons. That approach has the very distinct advantage of being fully supported by Tableau. No messing with the internals of the custom geocoding database and no danger that it will stop working at the next release.

           

          So I've chopped the relevant bits of code out of my "geocoding hack" and turned it into a simple DOS command line tool which does the whole job of extracting the feature and boundary point data from a shape file with a single command.

           

          Several people, most notably Joe, have shared methods of doing this in the past, so there's nothing particularly new here. It's just that with the work I had to do to get my hack going I had all the bits I needed to turn it into a very simple utility which does the whole thing with no manual intervention.  No more messing around with copying and pasting text from a GIS tool. No need to worry whether the shape file uses the right geographic coordinate reference system (or at least not usually - as long as it can recognise the file's CRS it converts it automatically). It has support for lots of different shape file formats including ESRI shapefile, MapInfo files, GeoJSON and a whole lot more (though I haven't tried many of them). It also has an option to simplify the boundary to reduce the number of points (which is subject to exactly the same caveats as I described above).

           

          Here's a brief explanation of how to use it. Download links and installation instructions are at the bottom of this posting.

           

          Usage

           

          The basic command to use it is very simple.  Given a shape file <shapefile_name>.shp (it may be a different extension if it's not an ESRI shape file), the command:

           

          shapetotab  <shapefile_name>.shp

           

          will generate 2 CSV files: <shapefile_name>_Features.csv and <shapefile_name>_Points.csv.  The first one will contain a row for each row in the shape file with all the feature details (including the calculated coordinates of the centres of the shapes), the second one will contain a row per boundary point, structured as needed to draw polygons in Tableau. Used like that, with no optional parameters, both files will include a generated unique ID column, which can be used to join the two files, so classifying details from the shapefiles can be used as dimensions in the Tableau workbook.
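As a rough sketch of how the two files relate, the Python below joins a tiny stand-in for the points file to a stand-in for the features file on the ID column. The column names are borrowed from the Tsunami example later in this post; the exact layout of the real files and all of the values are assumptions.

```python
import csv
import io

# Tiny stand-ins for <name>_Features.csv and <name>_Points.csv.
# The exact column layout and all values are assumptions for illustration.
features_csv = """Tsunami_Zones_ID,COL_CODE
1,Red
2,Orange
"""
points_csv = """Tsunami_Zones_ID,sub_polygon_id,point_order,Latitude,Longitude
1,1,1,-41.1000,174.8000
1,1,2,-41.1000,174.9000
2,1,1,-41.2000,174.8000
"""

# Index the feature rows by their unique ID.
features = {
    row["Tsunami_Zones_ID"]: row
    for row in csv.DictReader(io.StringIO(features_csv))
}

# Attach the classifying detail from the features file to every boundary point.
joined = []
for point in csv.DictReader(io.StringIO(points_csv)):
    feature = features[point["Tsunami_Zones_ID"]]
    joined.append({**point, "COL_CODE": feature["COL_CODE"]})

for row in joined:
    print(row["Tsunami_Zones_ID"], row["point_order"], row["COL_CODE"])
```

This is the same one-to-many join that Tableau performs when the two files are connected on the ID field: one feature row fans out to many point rows.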

           

          The command has a few options, as shown below.  I'll explain the most important ones and then walk through a brief example of using it.

           

          Usage: shapetotab [options] <shape_file>

          OPTIONS:

            --name <output_stem>  the prefix of the output points and features files (defaults to

                                  shapefile name)

            --id <field_name>     the name of the unique identifier from the shape file (if any)

            --keepinner           include points for any inner rings (by default these are ignored)

            --simplify <number>   simplify the boundary lines, with the given tolerance (in the

                                  units of the source file)

            --no_transform        don't transform the coordinate reference system

            --from_crs <CRS>      the coordinate reference system of the source file (expressed

                                  in the format used by the GDAL libraries, useful if the source

                                  file does not define its CRS)

            --to_crs <CRS>        the coordinate reference system to use for the output points

                                  (defaults to WGS84, probably no need to change it)

            --precision <N>       the number of digits of precision to be used for output

                                  coordinates (default 4)

            --info                run ogrinfo to display details about the source shape file

            --version             display the program version number and exit

           

          The shape file <shape_file> may be any format of spatial file supported by the GDAL utilities.
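If you want to drive the tool from a script, a small helper along these lines can assemble the command line. The flags are the ones listed in the usage text above; the helper itself (`build_command` and its parameter names) is hypothetical and not part of the utility.

```python
def build_command(shape_file, name=None, id_field=None, simplify=None,
                  keep_inner=False, precision=None):
    """Return the shapetotab argument list for the chosen options."""
    args = ["shapetotab"]
    if name is not None:
        args += ["--name", name]
    if id_field is not None:
        args += ["--id", id_field]
    if keep_inner:
        args.append("--keepinner")
    if simplify is not None:
        # Tolerance is in the units of the source file.
        args += ["--simplify", str(simplify)]
    if precision is not None:
        args += ["--precision", str(precision)]
    # The shape file always comes last, after the options.
    args.append(shape_file)
    return args

cmd = build_command("porirua-tsunami-evacuatio.shp",
                    name="Tsunami_Zones", simplify=100)
print(" ".join(cmd))
```

The resulting list could be passed to `subprocess.run(cmd, check=True)`, assuming shapetotab is installed and on the PATH.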

           

           

          The --info option is usually the place to start. Running:

           

          shapetotab --info <shape_file>

           

          will display details about the file, including listing all the field names contained in the file.

           

          The --name option simply allows you to specify a shorter alias for the shape file (often shape file names are long and cumbersome). The name is used as the stem of the features and points files and also as the stem of the generated unique ID field if an identifier from the shape file is not given.

           

          The --id option allows you to specify a field in the shape file to use as the unique identifier.  If it's not unique you'll get an appropriate error.

           

          The --keepinner option allows you to generate polygons for any "inner rings" representing holes in any of the polygons in the file. Often these holes are filled by other shapes representing "islands" within the one with holes in it. It's generally too difficult to do anything useful with these inner rings, simply because it's hard to get the right one on top all of the time. It's also rare for there to be inner rings of a size which matter for Tableau visualisations. So the default is that the inner rings are just ignored. This option lets you get at them if you really want them. It generates an extra couple of fields in the points file: ring_id and ring_type ("inner", "outer"). See the third tab of the workbook I embedded in the original posting for an example of using inner rings.
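To show why ignoring inner rings can matter, here is a small Python sketch of a point-in-polygon test that treats inner rings as holes. It uses the standard ray-casting (even-odd) rule; the function names and the square sample shapes are made up for illustration and have nothing to do with how the utility or Tableau work internally.

```python
def in_ring(point, ring):
    """Ray-casting (even-odd) test: is the point inside the closed ring?"""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def in_polygon(point, outer, inners=()):
    """Inside the outer ring and not inside any inner ring (hole)."""
    return in_ring(point, outer) and not any(in_ring(point, r) for r in inners)

# A square zone with a square hole in the middle (made-up coordinates).
outer = [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)]
hole = [(4, 4), (6, 4), (6, 6), (4, 6), (4, 4)]

print(in_polygon((5, 5), outer))          # True: hole ignored
print(in_polygon((5, 5), outer, [hole]))  # False: hole respected
print(in_polygon((2, 2), outer, [hole]))  # True: inside the zone, outside the hole
```

A location inside a hole is reported as inside the zone when only the outer ring is used, which is exactly the case a resident checking their own house would care about.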

           

          The --simplify option allows you to specify the size tolerance to use for simplifying the boundary by reducing the number of points. The size tolerance is expressed in the units of the source shape file (which can be seen in the output of the --info command). Typically this may be metres (or presumably yards or miles or something for the folks in countries that haven't yet heard of the SI system) for a "projected" coordinate reference system and degrees for a "geographic" (lat/long) coordinate reference system. Finding the appropriate value depends on the area covered and the size of the individual shapes - typically probably in the range of a few metres to several kilometres, or 0.0001 through to 0.1 degrees.

           

          The --no_transform, --from_crs and --to_crs options are there to cope with some fairly unusual cases with shape files which don't have all the necessary details internally. I'll explain that a bit more when I post the "hack" utility.

           

          The --precision option lets you override the default precision used for latitude and longitude values. The default is 4 digits after the decimal point, which works out to be around 10 metres. It's rare to need more than that.
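The "around 10 metres" figure can be checked with a quick back-of-envelope calculation. The Python below assumes roughly 111,320 metres per degree of latitude and scales the east-west distance by the cosine of the latitude; both are approximations for illustration, and the function name is made up.

```python
import math

# Approximate length of one degree of latitude (an assumption, good enough here).
METRES_PER_DEGREE_LAT = 111_320

def precision_in_metres(digits, latitude_deg=0.0):
    """Ground distance represented by the last decimal digit of a coordinate."""
    step = 10 ** -digits  # e.g. 0.0001 degrees for 4 digits
    north_south = step * METRES_PER_DEGREE_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west

for digits in (3, 4, 5):
    ns, ew = precision_in_metres(digits, latitude_deg=-41.0)
    print(digits, round(ns, 2), round(ew, 2))
```

At 4 digits this gives a north-south step of about 11 metres, and somewhat less east-west at New Zealand latitudes.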

           

           

          Example

           

          The zip file containing the utility also contains a sample directory with a copy of the shape file for the Tsunami zones around my home that I used in my original posting. Once you've downloaded and installed the utility, follow these steps to create the Tsunami map.

           

          1) Start a DOS command window in the sample directory with the shape file.

           

          2) Run the command:

           

          shapetotab --info porirua-tsunami-evacuatio.shp

           

          That should display details about the shape file. If it doesn't, double-check that you have followed all of the installation instructions correctly.

           

          3) Run the command:

           

          shapetotab --name Tsunami_Zones porirua-tsunami-evacuatio.shp

           

          That should say:

           

          Generated 3 feature rows for Tsunami_Zones

          Generated a total of 20041 points (min: 4418, avg: 6680, max: 8709)

          Overall totals: 3 rows, 20041 points

          Done in 1 seconds

           

          It will have generated two files: Tsunami_Zones_Features.csv and Tsunami_Zones_Points.csv.

           

          4) Open Tableau and create a new Multiple Table connection from these two files. Tableau will automatically join them on the generated ID field [Tsunami_Zones_ID].

           

          5) Set the mark type to polygon and drag fields onto the view as follows:

           

          Tsunami_Zones_Points#csv_Longitude -> Columns

          Tsunami_Zones_Points#csv_Latitude -> Rows

          point_order -> Path

          COL_CODE -> Color

          Tsunami_Zones_ID -> Level of Detail

          sub_polygon_id -> Level of detail

           

          Make sure you pick the [Latitude] and [Longitude] fields from the points file rather than the features file. The ones in the features file are the coordinates of the centres of the shapes. Make sure the fields point_order, Tsunami_Zones_ID and sub_polygon_id are set as dimensions.
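The way those fields work together can be sketched outside Tableau. In the Python below, the Level of Detail fields (Tsunami_Zones_ID and sub_polygon_id) decide which points belong to the same polygon, and point_order decides the order in which they are joined up. The rows are made-up stand-ins for the points file.

```python
from collections import defaultdict

# Rows as they might appear in the points file (values are made up).
rows = [
    {"Tsunami_Zones_ID": 1, "sub_polygon_id": 1, "point_order": 2, "Latitude": -41.10, "Longitude": 174.90},
    {"Tsunami_Zones_ID": 1, "sub_polygon_id": 1, "point_order": 1, "Latitude": -41.10, "Longitude": 174.80},
    {"Tsunami_Zones_ID": 1, "sub_polygon_id": 2, "point_order": 1, "Latitude": -41.30, "Longitude": 174.70},
    {"Tsunami_Zones_ID": 2, "sub_polygon_id": 1, "point_order": 1, "Latitude": -41.20, "Longitude": 174.80},
]

# Group by the fields placed on Level of Detail: one group per polygon.
polygons = defaultdict(list)
for row in rows:
    key = (row["Tsunami_Zones_ID"], row["sub_polygon_id"])
    polygons[key].append(row)

# Within each polygon, follow point_order (the Path field) to get the outline.
paths = {
    key: [(r["Longitude"], r["Latitude"])
          for r in sorted(pts, key=lambda r: r["point_order"])]
    for key, pts in polygons.items()
}

for key, path in sorted(paths.items()):
    print(key, path)
```

If sub_polygon_id were left off, the points of separate sub-polygons of one zone would be joined into a single path, which is why it needs to be on Level of Detail as a dimension.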

           

          This should now look like this:

          Tsunami_Zones.png

           

          At this point you can either join or blend in the data that you want to visualise.

           

          As easy as that.

           

           

          Known Issues

           

          I have come across several issues caused by invalid shape files (often distributed from reputable sounding sources like government statistics departments). These are the ones I can remember.

           

          • Divide by Zero Error. This was caused by a shape with only three points (which, as the start and end point have to be the same, means that it was just a "there and back" line). The area of the shape was therefore zero, which breaks the library routine I'm using for calculating the centres of the shapes.
          • Out of Memory Error. This turned out to be caused by a shape with a boundary line which crossed over itself, when attempting to simplify the boundaries with the --simplify option.
          • Error message about boundary line crossing itself. The same file as in the previous example gave a hint about the problem when run without the --simplify option. The library didn't say which feature was broken, though.
          • Error message about a shape which did not form a closed ring. Again, the message correctly identified the problem, but did not identify which row was broken.
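Some of these problems are easy to screen for before running the tool. The Python sketch below checks each ring for too few points, for not being closed and for zero area (using the shoelace formula), and reports which feature is affected. It is not what the utility or the GDAL libraries do internally; the function name and sample features are made up, and it does not attempt to detect boundaries that cross themselves.

```python
def ring_problems(ring):
    """Return a list of problems found in a ring of (x, y) points."""
    problems = []
    if len(ring) < 4:
        # A valid closed ring needs at least a triangle plus the repeated start point.
        problems.append("fewer than 4 points")
    if not ring or ring[0] != ring[-1]:
        problems.append("not closed")
    else:
        # Shoelace formula: twice the signed area enclosed by the ring.
        twice_area = sum(x1 * y2 - x2 * y1
                         for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
        # With real coordinates, compare against a small tolerance instead of 0.
        if twice_area == 0:
            problems.append("zero area")
    return problems

# Made-up features keyed by an identifier.
features = {
    "square": [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)],
    "there_and_back": [(0, 0), (1, 1), (0, 0)],
    "open": [(0, 0), (1, 0), (1, 1), (0, 1)],
}

for feature_id, ring in features.items():
    for problem in ring_problems(ring):
        print(feature_id, "-", problem)
```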

           

          All of these issues needed edits to the shape files with a GIS tool to fix them. I have been using QGIS, which is an open source tool. In some cases the invalid row was reported by the QGIS validation routines, in other cases it wasn't.

           

           

          Download and Installation

           

          1) Download ShapeToTab.zip from here:


          http://dl.dropbox.com/u/59458890/ShapeToTab.zip


          and save it somewhere appropriate on a local drive.
           

          2) Unzip the file, which will create a subdirectory of “ShapeToTab”, containing the utility plus a configuration file.

          There is also a sub-directory: “Sample” containing the shape file for the example above.
           

          3) Add the location where you have installed the utility to the PATH (edit system environment variables from control panel), e.g.:

          “C:\Data\Tableau\ShapeToTab”
           

          4) Download version 1.9 of GDAL (Geographic Data Abstraction Library) and save it in an appropriate location (such as a directory called gdal under the ShapeToTab directory, or wherever you prefer to save program files).  The current stable release of GDAL and a nightly build of the latest version are available at GISINTERNALS. I have been using release-1600-gdal-1-9-0-mapserver-6-0-1 (choose the zip file containing all components).

           

          As the GISINTERNALS site often seems to be unavailable, I’ve put a copy of the version I’ve been using in my Dropbox account, here:

           

          http://dl.dropbox.com/u/59458890/release-1600-gdal-1-9-0-mapserver-6-0-1.zip

           

          5) Unzip the GDAL package.

           

          Running GDAL components requires various directories to be on the path and other environment variables to be set. This is done automatically by shapetotab, based on a setting in the configuration file (step 6).

           

          6) Modify the configuration file (shapetotab.yml), which is located in the installation directory, specifying the GDAL installation directory.

           

          For example:


          # GDAL installation path
          GDAL_installation_path: C:\Data\Software\GDAL_1.9


          A few other optional and little-used options are available for this file, but I won’t bother describing them here.
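If the --info command fails after installation, one quick thing to check is that the path in the configuration file is what you expect. The Python below is a naive reader for the single setting shown above; it is only a sketch, it is not how shapetotab itself reads the file, and the function name is made up.

```python
from pathlib import PureWindowsPath

def read_gdal_path(config_text):
    """Return the GDAL_installation_path setting, or None if it is missing."""
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Split on the first colon only, so the colon in "C:\..." is preserved.
        key, _, value = line.partition(":")
        if key.strip() == "GDAL_installation_path":
            return PureWindowsPath(value.strip())
    return None

config_text = r"""
# GDAL installation path
GDAL_installation_path: C:\Data\Software\GDAL_1.9
"""

gdal_path = read_gdal_path(config_text)
print(gdal_path)
```

On the machine itself you could then confirm that the directory exists, for example with `pathlib.Path(str(gdal_path)).is_dir()`.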

           

          That all seems a bit long winded - but once you've got it installed and have run it a few times it really makes generating filled maps extremely easy. It's certainly a lot simpler than the "hack" for adding your own filled maps to custom geocoding - which is why it's taken me a bit of time to get that written up and posted.

           

          Message was edited by: Richard Leeke  Added note about changing ID fields to Dimensions (thanks for the heads-up Shawn!).

          • 2. Re: Grow your own Filled Maps
            Alex Kerin

            As I indicated in my email a while back Richard, this is outstanding work. I didn't realize you had come so far with creating the tool.

            • 3. Re: Grow your own Filled Maps
              Richard Leeke

              I've finally finished the tidy up on the original tool for adding shapes to the custom geocoding database.

               

              At the risk of going on about this too much, I'll just stress one more time that this is an unsupported hack and will result in workbooks that I imagine Tableau will not want to support and will almost certainly stop working in some future release of Tableau. To make sure you never forget that it's a hack if you do decide to use it, I've even stuck the word hack in the program name: it's called tabgeohack.

               

              But despite that dire warning, I'm very confident that it's not going to harm your Tableau installation. The only thing it touches is the custom geocoding database in the repository, and in the event of issues that can always be removed through Tableau or even just deleted.

               

              So just to clarify the difference between the two tools I've posted here:

               

              • shapetotab which I posted the other day is very simple to use and just generates CSV files of boundary points which you can use to draw polygons from the boundary data in various types of geospatial files. It doesn't use any back-door methods, so the generated workbooks will be completely supported and will continue to work in the future. However, drawing maps like that has several downsides. It can result in very large numbers of rows in your primary datasource and is best suited to visualising simple data that you can join or blend directly to the point data. It doesn't lend itself well to complex analysis such as table calculations, and lacks other smarts that you can achieve with geocoding filled maps.
              • tabgeohack which is posted below is quite complex to set up and use and takes you out of supported territory. OK I've laboured that point enough, I think.

               

              The instructions for installing and using tabgeohack have grown too big to just paste into this thread, so they are attached as a Word document.

               

              Links for downloading the actual utility and also the spatial libraries needed to make it work are contained in the instructions.

               

              I'll be interested to hear how anyone gets on with either of them. And there's a fully trained team of volunteers on standby to help field the questions. Thanks Shawn and Robert!

              • 4. Re: Grow your own Filled Maps
                Shawn Wallwork

                I've been helping Richard test this excellent new tool he has created. It works great! He asked me to post a bit about the struggles I initially had with DOS. We're hoping this will help others get off to a better start than I did...

                 

                Before last month I hadn't looked at a flashing command prompt in probably 15 years. So I was unaware of the integration between Windows and the DOS window. I spent a lot of time typing (and mistyping) things like: CD C:\Data\Performance Testing\Tableau\Filled Maps\shapetotab and some of Richard's other long paths, until he pointed out these short-cuts. (I work in Windows 7.)

                 

                1. While CTRL-V does NOT work at the command prompt, RT-Click gives you a paste option
                2. To get to the end of the path quickly, instead of typing at the DOS prompt, open the folder in Explorer and SHIFT-RT-Click the folder. This gives you an 'Open Command Window Here' option (very handy!)
                3. If you RT-Click the DOS window banner you get a 'Properties' option, which lets you configure how the DOS window looks and works.
                4. Other than the specific commands tabgeohack uses, you'll probably only need CD (change directory) and DIR (lists everything in the directory). And you probably won't even need these if you use #2 above.
                5. One other useful short-cut is the TAB key. If you type the first few letters of a file name and then hit the TAB key the first match in the directory will be auto-filled. If that's not the one you're looking for, then hitting TAB again will page through your options. Especially handy if you have a set of names all the same with different extensions.

                 

                Here are some pix you might find helpful:

                Richard-1.PNG

                Richard-2.PNG

                Richard-3.PNG

                Most importantly, don't let unfamiliarity with DOS keep you from using these great tools Richard has created!

                 

                --Shawn

                 

                Message was edited by: Shawn Wallwork

                • 5. Re: Grow your own Filled Maps
                  Alan Eldridge

                  Hi Richard,

                   

                  First, let me congratulate you on this tool - it's an amazing piece of work. It addresses one of the most common questions I hear when people first see mapping in Tableau - how do I get my own regions into the tool? Or more commonly from our APAC customers - how do I have filled maps at the postcode/LGA level?

                   

                  I had a play with it this morning, loading the 2011 Australian postcode boundaries from the ABS website. Your instructions were clear and helpful - all up it took about an hour. Here's the resulting workbook I created:

                  Aus Pcods.jpg

                  This makes AUS postcodes (in fact, any custom boundary - I've got mesh blocks, SA's and LGAs all lined up to be imported) a first class citizen in the mapping world and allows us to do mixed marks, value-driven fills, etc. all without complication. Note - the additional detail can make the files BIG - the postcode TWBX  is 27MB now because it contains very granular boundaries. Of course, there are parameters in the solution to tweak this but I wanted to be able to drill right down without too much sharding between the polygons. I expect I could get it much smaller if I tuned the geoDB through your process.

                  Of course, while I personally see the benefit of the tool, I have to reiterate the caveat that this is totally unsupported so if it breaks between upgrades or breaks your Tableau environment our support team won’t help. But for those of you prepared to venture into uncharted territories, this is a great utility.

                   

                  Cheers,

                  Alan

                  • 6. Re: Grow your own Filled Maps
                    Richard Leeke

                    Thanks Alan - good to know someone had managed to make it work. I'd seen some of your Australian polygon map work in the past and wondered if you would have a look at it.

                     

                    I've done exactly as you are intending to do with a whole host of different breakdowns of New Zealand land boundaries. One thing in particular which works quite well is hierarchical drill-down, and it's probably worth a quick comment on that because I spent a while over-complicating it.

                     

                    Initially I thought that if I wanted to drill down through the hierarchy (in NZ terms from Territorial Authority to Area Unit to Meshblock, say) that I would need to define that hierarchy in the custom geocoding structure. But actually, as all of those breakdowns are uniquely identified, there's no need to do anything to show the hierarchy in the custom geocoding definition - just define them as three distinct geographic roles and then create a custom hierarchy with the dimensions in the data source. Then just drop the hierarchy on Level of Detail.

                     

                    As far as I can work out, the custom geocoding hierarchy is only needed if the hierarchy is needed for unique identification (same City name in different Countries, or whatever).
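A quick way to check whether a level needs the geocoding hierarchy at all is to test whether its identifiers are unique on their own. This is only a minimal sketch: the field names and sample rows are made up for illustration and have nothing to do with the tool itself.

```python
from collections import defaultdict


def ambiguous_names(rows, child_key, parent_key):
    """Return child identifiers that appear under more than one parent.

    An empty result means the child level is uniquely identified by itself,
    so it can be a standalone geographic role.
    """
    parents = defaultdict(set)
    for row in rows:
        parents[row[child_key]].add(row[parent_key])
    return {name: sorted(p) for name, p in parents.items() if len(p) > 1}


# Hypothetical data: meshblock codes are unique across the whole country.
meshblocks = [
    {"meshblock": "MB001", "area_unit": "AU1"},
    {"meshblock": "MB002", "area_unit": "AU1"},
    {"meshblock": "MB003", "area_unit": "AU2"},
]

# Hypothetical data: the same city name occurs in two countries.
cities = [
    {"city": "Springfield", "country": "Country A"},
    {"city": "Springfield", "country": "Country B"},
    {"city": "Riverton", "country": "Country A"},
]

print(ambiguous_names(meshblocks, "meshblock", "area_unit"))
print(ambiguous_names(cities, "city", "country"))
```

If the first call comes back empty, separate roles plus a custom hierarchy in the data source should be enough. Anything reported by the second call is a name that can only be resolved through its parent.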

                     

                    The other thing with hierarchies like that is being careful to get the colouring right as you expand the hierarchy.

                     

                    As you say - loading the full detail makes the geocoding database get large - which can make it slow and also too big for Public. I have a database with 9 different breakdowns of NZ, simplified to the level where the slivers and overlaps are still barely noticeable, and it's 72 MB (though of course NZ is tiny compared with Australia). I actually loaded down to the individual Primary Parcel level for NZ - which was a 2 GB geocoding database - but that just gave an out of memory error when I tried to use it since Tableau needs to cache the entire geographic role.

                     

                    Next project is to try to come up with a better simplification scheme which doesn't break the topology - and ideally doesn't need lots of trial and error. But that is orders of magnitude harder than what I've done so far - which is basically just a bit of plumbing.

                    • 7. Re: Grow your own Filled Maps
                      Alan Eldridge

                      One challenge I had to overcome... the ESRI .shp files I downloaded from the ABS were PolygonZ data (as opposed to Polygon) - this caused the tool to fail. It was not a difficult workaround as I just loaded the data into QGIS and saved it back out again, but it would be nice if the tools could handle the difference transparently. FYI.

                       

                      Cheers,

                      Alan

                      • 8. Re: Grow your own Filled Maps
                        Richard Leeke

                        Thanks for the heads-up - I haven't tried any PolygonZ files (obviously). I'll see what's involved - I'm guessing the GDAL libraries will make that very easy.

                        • 9. Re: Grow your own Filled Maps
                          Richard Leeke

                          Well the only reason it didn't handle PolygonZ was that I'd put some validation in to stop people getting confused by trying point and line shape files - and I'd forgotten to allow for PolygonZ. So I've now allowed that - and PolygonM. I just ignore the Z or M values.
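For anyone curious what that sort of validation amounts to, here is a rough standalone sketch. It is not the tool's actual code (which goes through the GDAL libraries); it reads the shape type straight out of the 100-byte .shp file header, using the type codes from the ESRI shapefile specification. The helper `fake_header` exists only to build a test header.

```python
import struct

# Shape type codes from the ESRI shapefile specification.
POLYGON, POLYGON_Z, POLYGON_M = 5, 15, 25
POLYGON_TYPES = {POLYGON, POLYGON_Z, POLYGON_M}


def shape_type_from_header(header: bytes) -> int:
    """Read the shape type from the 100-byte main header of a .shp file."""
    if len(header) < 100:
        raise ValueError("a shapefile header is 100 bytes long")
    (file_code,) = struct.unpack(">i", header[0:4])  # big-endian magic number
    if file_code != 9994:
        raise ValueError("not a shapefile")
    (shape_type,) = struct.unpack("<i", header[32:36])  # little-endian
    return shape_type


def is_polygon_file(header: bytes) -> bool:
    """Accept Polygon, PolygonZ and PolygonM; reject points, lines, etc."""
    return shape_type_from_header(header) in POLYGON_TYPES


def drop_z_or_m(point):
    """Keep only x and y, ignoring any Z or M value."""
    return point[0], point[1]


def fake_header(shape_type: int) -> bytes:
    """Build a minimal header for testing (hypothetical helper)."""
    return (
        struct.pack(">i", 9994)          # file code
        + bytes(24)                      # unused fields and file length
        + struct.pack("<i", 1000)        # version
        + struct.pack("<i", shape_type)  # shape type
        + bytes(64)                      # bounding box, zeroed here
    )


print(is_polygon_file(fake_header(POLYGON_Z)))
print(is_polygon_file(fake_header(1)))  # 1 is the Point shape type
```

On a real file you would pass in the first 100 bytes, e.g. `open(path, "rb").read(100)`.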

                           

                          Trying it out on the ABS Postal Areas file it then broke because of invalid shapes (there are 3 entries for various sorts of unclassified or non-existent locations which don't have valid polygon data - which I don't think is legal in a shape file). That led to a divide by zero error when calculating centroids. Converting PolygonZ to Polygon with QGIS must have just got rid of those - or did you have to deal with those 3 rows in some way?

                           

                          Anyway - I've hit that divide by zero issue on the centroids with bad shapes before, so as that looks like being something which may happen quite a lot I decided I'd better figure out how to handle that, too. So I now trap errors on the centroid calculation and try a few approaches to approximate the position before resorting to setting the centroid to (0, 0) if there's simply no shape there at all.
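To illustrate where the divide by zero comes from, here is a minimal sketch of an area-weighted centroid with fallbacks. It is an assumption about the general shape of the fix, not the code in shapetotab: the only fallback shown is the plain average of the vertices, followed by (0, 0) when the ring is empty.

```python
def polygon_centroid(ring):
    """Centroid of a simple polygon ring given as [(x, y), ...].

    Uses the standard area-weighted formula. A ring with zero area
    (a single point, or points all on one line) makes the divisor zero,
    so in that case fall back to the average of the vertices. An empty
    ring gets (0.0, 0.0).
    """
    if not ring:
        return (0.0, 0.0)

    n = len(ring)
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross

    try:
        return (cx / (3.0 * area2), cy / (3.0 * area2))
    except ZeroDivisionError:
        # Degenerate shape: approximate with the mean of the vertices.
        return (sum(p[0] for p in ring) / n, sum(p[1] for p in ring) / n)


print(polygon_centroid([(0, 0), (2, 0), (2, 2), (0, 2)]))  # a square
print(polygon_centroid([(0, 0), (2, 0), (4, 0)]))          # collinear points
print(polygon_centroid([]))                                # no shape at all
```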

                           

                          I also had a bit of a play with different simplification settings on that Postal Areas file. It's a bit challenging for the simplistic simplification used by the GDAL libraries because that uses a uniform simplification scheme across the whole file - and Australian postal areas are a great example of the sort of non-uniformity of sizes of objects that you often get - ranging from very small areas in the middle of densely populated urban areas to vast swathes of outback. With no simplification there are 4.6 million boundary points and screen refresh is quite painfully slow. Simplification to 0.01 degrees brings that down to 100,000 boundary points and screen refresh is very snappy. The resolution doesn't look bad for the whole of Australia, but zooming in on central Melbourne you get lots of gaps and overlaps:

                          Melbourne - simplified to .01 degrees.png

                           

                          At 0.001 degrees, there are 400,000 boundary points, still pretty snappy refresh and now only a few small slivers of gaps and overlaps:

                          Melbourne - simplified to .001 degrees.png

                           

                          That brings the size of the packaged workbook down to under 3 MB, too. So I think it's well worth trying it with a setting of: "simplify_tolerance: 0.001".
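For a feel of what a single tolerance does, here is a small pure-Python Douglas-Peucker sketch. As far as I know the GDAL/GEOS simplify call is Douglas-Peucker based, but treat this as an illustration of the technique rather than what the libraries or the tool actually run; the sample line is made up.

```python
import math


def _point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Douglas-Peucker simplification of an open line of (x, y) points.

    A point is kept only if it lies further than `tolerance` from the
    straight line joining the kept points on either side of it.
    """
    if len(points) < 3:
        return list(points)
    keep = [False] * len(points)
    keep[0] = keep[-1] = True
    stack = [(0, len(points) - 1)]
    while stack:
        first, last = stack.pop()
        worst, index = 0.0, None
        for i in range(first + 1, last):
            d = _point_segment_distance(points[i], points[first], points[last])
            if d > worst:
                worst, index = d, i
        if index is not None and worst > tolerance:
            keep[index] = True
            stack.append((first, index))
            stack.append((index, last))
    return [p for p, k in zip(points, keep) if k]


# Coordinates in degrees; the wiggle at x=1 is smaller than 0.001.
line = [(0, 0), (1, 0.0005), (2, 0), (3, 0.5), (4, 0)]
print(simplify(line, 0.001))
print(simplify(line, 1.0))
```

The same tolerance is applied everywhere, so a 0.01 degree wiggle that is invisible on an outback boundary can be most of the outline of an inner-city postcode. And if each polygon is simplified on its own, two neighbours may keep different points along their shared edge, which is one plausible source of the gaps and overlaps.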

                           

                          The algorithm I've got in my head is intended to deal with that non-uniformity much better - but that may just be brave talk - we shall see. That Postal Areas file will be a good test if I ever get around to trying it...

                           

                           

                          The new version of tabgeohack also includes fixes for a few other minor issues people have noticed:

                          • It now handles purging rows from CMSA properly in 7.0.1 (and still works with 7.0.0)
                          • It now correctly reports total rows purged
                          • Validates field aliases and disallows any that will clash with Tableau reserved column names

                           

                          I've also added support for PolygonM and PolygonZ, handled the divide by zero issue and now create a dummy polygon for the invalid shapes in "shapetotab".

                           

                          Just download them both again from the same links:

                           

                          shapetotab: http://dl.dropbox.com/u/59458890/ShapeToTab.zip

                          tabgeohack: http://dl.dropbox.com/u/59458890/TabGeoHack.zip

                          • 10. Re: Grow your own Filled Maps
                            Alan Eldridge

                            Nice work, Richard. I hit the same problems that you did, fixed it the same way (into QGIS and back out) and found the same resolution/tolerance (0.001).

                             

                            If anyone is interested, here are some workbooks I've created for AUS postcodes, LGAs and suburbs. Restricting the geoDB to have just Australia in scope makes the files MUCH smaller:

                             

                            http://dl.dropbox.com/u/3987438/Custom%20Geo%20Boundary%20Examples.zip

                             

                            I also had a play around with some cadastral data to show land parcels - another potentially interesting application:

                            Land Parcels.jpg

                            Lots of interesting fun to be had with this hack. Thanks again for sharing...

                            • 11. Re: Grow your own Filled Maps
                              Richard Leeke

                              > I hit the same problems that you did, fixed it the same way (into QGIS and back out)

                               

                              Should all be fixed in version 1.0.2. Sing out if you hit any more issues and just let me know exactly what you were doing when it happened. Ideally post the tabgeohack.log file from your install directory. Same applies to shapetotab (shapetotab.log).

                               

                              Nice examples by the way. But what's with the holes in South Australia LGAs and Suburbs?


                              • 12. Re: Grow your own Filled Maps
                                Richard Leeke

                                I hit another bug while trying to load 400,000 German buildings - there's just so much variability in the internal format of the shape data. Anyway - I fixed that, so the latest version is now 1.0.3.

                                 

                                Here are some buildings in central Berlin. Lots of scope for interesting visualisations with this sort of data, too.

                                 

                                Berlin Castle.png

                                • 13. Re: Grow your own Filled Maps
                                  Richard Leeke

                                  Robert Mundigl has just posted a great "quick start" guide to using tabgeohack in this posting on his Clearly and Simply blog site.

                                   

                                  As Robert was putting the article together he came across a few tricky things that I've managed to help with a bit by adding a few new features. The main things are that it now displays a small sample of the data from the shape file, which makes it easier to work out which fields you need; it has an option to filter the rows that you import, based on any of the fields in the shape file; and it now recovers better from certain types of invalid shapes, which seem to occur a lot in the shape files out there in the wild.

                                   

                                  If you download now you'll get the new version that contains all of those.

                                  • 14. Re: Grow your own Filled Maps
                                    Kirdan Lees

                                    Hey there Richard,

                                     

                                    Trust you are well. Had a frustrating day here working through the new hack you have created that holds so much promise. I must be doing something silly - yet can't for the life of me see what.

                                     

                                    I have the regional and territorial authority databases from Statistics New Zealand set up in Tableau as polygons, and while that works well for static images there is too much detail there for the animated versions.

                                     

                                    So I'm trying to use your tabgeohack files to first create a regional map (and will look at TA later) with less detail. I can download all the progs. I can get the roles command to kick out a simple csv file of regional coordinates. The shapes command looks like it is working pretty sweet too but no cigar when I come to work with the filled map. The custom geocoding picks up the regions OK but there is a link missing to the shapes.

                                     

                                    I'll think on this again tonight but if you did have thoughts outside your work and responding to other queries would love to know.

                                     

                                    Cheers,

                                    Kirdan
