1 2 3 4 5 Previous Next 72 Replies Latest reply on Apr 17, 2019 1:29 PM by Mrinalini Kulkarni Go to original post
      • 60. Re: Concentric circles
        Richard Leeke

        Good to understand the scale and the objectives of the analysis - and also the constraints on using the live connection to the database.

         

        I actually think that with appropriate indexing of the spatial columns you might find that the analysis is surprisingly quick against the server, so you might not have to use an extract like that. The trouble with using an extract is that all the spatial calculations would need to be static, so for example you would have to use a fixed radius (or a few fixed values of radius), rather than being able to enter any value of the fly. Also, finding the households within a given radius of a shop can use a spatial index with the data in SQL Server, which means it only needs to consider households that are "near" the shop, rather than calculating the distance from the shop to all 12 million households.

         

        If you do go the extract route it would almost certainly be best to use a SQL Server spatial function to calculate the distance from every shop to every household and store that in the extract, rather than doing the distance calculation on the fly with a trigonometry expression based on the lat/lon coordinates - which would potentially have to evaluate the distance for all 12 million households every time you refresh a view.

         

        One option you might want to consider rather than using an extract is just to load the tables into a SQL Server Express database on your local PC. That way you can make full use of the spatial capabilities without putting any load on a shared server apart from taking a copy of the tables.

         

        Let me have a bit of a play with some data I've got (1.2 million NZ street addresses) and get a feel for how that performs - then I'll try to explain the pros and cons of some options.

        • 61. Re: Concentric circles
          kristian fraser

          Morning,

           

          Thanks, its been a torrid few days, Whilst i have been encouraged to investigate the server rout i am observing push back in some quarters because there are lots of other users who nee the resource. its a reasonable position for them to take.

           

          I like your suggestion of bringing data into my local DB and then working from there, it has many advantages.

           

          As a hedge, i have continued to work on the application side calculations, i know its not optimal but its a good fall back position if decisions don't go my way. I have attached my current workbook (same underlying data) where i have made progress but i'm stuck on correctly calculating distance, i'm sure it's just a tweak here and there so tweak i go.

           

          Thanks again for your time, if i could buy you a Friday afternoon beer i would.

          • 62. Re: Concentric circles
            Richard Leeke

            I haven't had much chance to look at it, but it has peeked my interest so I've been thinking about it in the background and will definitely make time to have more of a look over the weekend.

             

            One thing I have realised is that my earlier comment about pre-calculating all the distances if you go the extract route just isn't viable.With 12 million households and 220 shops that would mean 2.6 billion distance calculations - and 2.6 billion rows in the extract. That's not going to fly.

             

            I have had a brief go at seeing how fast (or slow) the queries would be if you do it using SQL Server. I used those 1.2 million NZ street addresses and timed a few things like counting addresses and also returning the actual addresses within 1 km and 5 km of a set of 250 or so points. That took about a minute for a 1 km radius and 3 or 4 minutes for a 5 km radius. Assuming NZ and Australian housing densities are roughly the same, I wouldn't expect that to take significantly longer on your household data set (the spatial index will mean the query only has to process roughly the same number of locations near the shops in each case).

             

            I also had a bit of a go at working out what the queries would need to be like to support the sorts of visualisations you are likely to want - but I haven't managed to get my head around anything that works so far. But I tend to be a dog with a bone on this sort of thing, so I'll keep worrying away at it.

             

            I'll take a look at your latest version and see if I can help with that route, too.

             

            Thanks for the virtual beer - it's a nice thought!

            • 63. Re: Concentric circles
              kristian fraser

              Afternoon,

               

              I had a chat with a consultant friend and they suggested creating a persistent column in the tables so when running a calculation the answer is stored there and does not change until there is a variation in the calculation inputs. This sounded like a reasonable suggestion but not having any real knowledge i don't know how practical this is. They also suggested looking at the ordering of the equations to try to remove data before a calculation is made, again a reasonable suggestion but not something i really know how to do without research.

               

              in terms of Server assets/hardware I might have a skunk works option available to me where i can use a server to run the calculations and then pass the results into tableau, join on the household ID's and bring in the other data that way. I also have my local SQL Server instance but i'm not sure how that will actually go. This means pulling all the household ID's out of the current tables with lat and long and storing them in a separate DB which i have just about finished. I have never heard a computer groan under pressure like my little laptop...

               

              anyway, thanks again for your time... it's BEER O'CLOCK right now and god i need it... enjoy

              • 64. Re: Concentric circles
                Richard Leeke

                OK, I've got my head around some options that I think will work well for you and I think you will be able to build something which performs well and doesn't have too much resource needs. I'll put something together which demonstrates what I'm thinking of, but there are quite a lot of options, so it will probably need a bit of to-and-fro (or maybe talking it through with a screenshare).

                 

                What I think will work best is probably a hybrid approach whereby you calculate housing density around all existing stores statically and save that as a data extract, but then have the ability to drill-down to individual stores, for which I think it will be quite viable to do that live against SQL Server - either your shared server or your local one would be fine. That will allow analysis where you focus on the stores with the lowest nearby housing density. I've also been thinking about how best to explore the optimum locations for new stores. I think a different approach may work best there - but I'll experiment to see that it's viable before I get you too excited about that one.

                 

                Just one clarifying question at this stage:

                 

                What sort of radius around stores will you be focussed on? I see from your current test that you are using circles from 1 to 5 km, which I think works well - just wondering how far out to do the analysis for the static extract. That mainly just affects how long it will take to build the extract - but also affects what is workable as a visualisation (if the circles are too big and overlap too much the approach of showing the circles might not work very well).

                • 65. Re: Concentric circles
                  Richard Leeke

                  One other comment.

                   

                  I had a quick look at the approach you are trying for doing all the calculations locally. I can see what you are trying to do - and I think I even remember some old workbooks where I or others have used the approach you're using to work out the location of the selected centre using table calculations - I'm guessing you may have drawn some of your calculations from there.

                   

                  I think I could see at least one reason why the calculations currently aren't working: I'm pretty sure the calculations for Lat Centre Global and Long Centre Global, which use TOTAL(AVG([Lat Centre])) and TOTAL(AVG([Long Centre])) would need [Lat Centre] and [Long Centre] to be NULL rather than zero for all points other than the selected centre.

                   

                  I didn't spend any time trying to get it working, though, because I'm sure that approach isn't going to scale anywhere near what you need for your data volumes. The key to this is all in the smarts of the SQL Server spatial processing - and especially how spatial indexes work to minimise the amount of processing. The local calculation approach would need the distance between every shop and every household to be calculated (that's 2.6 billion evaluations of that distance calculation, which isn't going to happen any time soon).

                   

                  I'm pretty confident we'll get something viable going using the combination of SQL Server and data extracts that I mentioned earlier, though.

                  • 66. Re: Concentric circles
                    kristian fraser

                    Hi Richard, you are a machine.... Thanks for your considerations so far.

                     

                    To your question, i haven't asked anyone exactly what is needed but 5 km seems to be the logical at the moment given other considerations on shopping center locations etc.

                     

                    I think a hybrid approach has the most appeal because i get the feeling that adding any more load to a stretched resource might be challenged in my current environment. I get the feeling this is heading into new thread territory, given the requirement is a little different to the original and i think it would be a great opportunity to let others see how this unfolds for their projects. I can do a screen share as well as bouncing around on the community board, it is really up to your level of comfort there.

                     

                    To your follow on post, this is directly inspired by the Clearlyandsimply blog written by Robert Mundigl where you were a contributing writer as well, that was actually my first find which led me here. I have tried to carbon copy some of the ideas and they look like they will work (once i stop making mistakes) but i agree i think this will be redundant however very instructive, i have learnt allot through that exercise so not a waste at all.

                     

                    p.s. unrelated to this subject: i'm super proud of the response that NZ as a whole demonstrated over the last 48 hours. It's inspiring in the face of the actions of that individual.

                    • 67. Re: Concentric circles
                      Richard Leeke

                      A quick update in case you're wondering why I've gone quiet.

                       

                      I've been having a lot of fun exploring different ways to attack the problem - picking up and/or developing a few new (to me) tricks along the way. I've now got pretty much all of the bits I was thinking of going, though there's a bit of refinement needed and a bit more work to glue it all together into something which I think will address your questions. I'll keep picking away at that over the next day or two - but then of course I have to try to explain it all.

                       

                      As I mentioned, I think a mixture of data extracts and live SQL will work best. Functionally the most flexible and powerful is to drive it all from a live connection to SQL Server - but some of the views I've tried are taking half an hour to refresh and you will have 10 times as much data, so some compromises will be necessary. Those views which are really slow against a live connection are very fast with an extract - though the extracts may take a few hours to create. The other main downside of using extracts is that some of the parameters driving the analysis need to be locked in at the time of creation (for example the radius of the circles around the shops).

                       

                      I'll explain what I'm suggesting in more detail over the next day or two - I'll probably drip-feed it - there's quite a lot to explain. Here's the first installment:

                       

                      I'm thinking you will probably want to take two different approaches to looking at the overall picture: one focused on existing shops and one allowing you to find good candidates for new shops. In both cases you may then want to drill-down to detailed views around a particular (candidate or actual) shop location. Here are a few prototype views showing the sort of thing I'm thinking of.

                       

                      Density of households near shops:

                      This shows the housing density in each of 5 concentric rings around each 'shop' in my sample data. Blue is lowest density, red is highest. You can also colour the rings by total number of households, which may be more effective. This view was taking several minutes to draw for the whole of NZ against a live database connection, but takes a second or two against an extract (though the radius of the circles is locked in to the extract).

                       

                      Filtering the above view will let you focus on shops with lower nearby numbers of households. You can then drill-down on those with queries against the live database to see the detail of where the households are. I haven't yet got the plumbing sorted out to do that, but you should be able to drill to a view like this:

                       

                       

                      Housing density for whole country:

                      The previous views will help review shops with low nearby numbers of households, but won't help answer the question "where should we open new shops?". The approach I'm thinking of here is to display the housing density across the whole country by breaking the country into a grid (hexagons work best). You will be able to filter this view to show only hexagons with more than <N> households and less than <M> shops - probably zero shops would be the starting point.

                       

                      Then zoom in on areas of interest and increase the granularity of the hexagons.

                       

                       

                      Finally you should then be able to drill down to the same detailed view as before showing all households within a few km of a possible location.

                       

                      I initially tried creating the HexBin view dynamically against SQL Server, generating the hexagons as spatial polygons. That works really well and could be a really powerful technique if you then wanted to do spatial operations like intersection between the hexagons and other polygons - so I'll definitely add that to my bag of tricks - but it was way too slow even for NZ data volumes, let alone your volumes. So I've currently got it working off data extracts using table calculations. That is really quick, and even creating the extract for that didn't take too long. The main downside I've seen so far is that it's not easy to work out housing density properly without the spatial functions - which means it's hard to compensate for the distortion due to the map projection - which in your case will be significant (a polygon in Tasmania is going to be a lot smaller on the surface of the earth than a polygon in the Northern Territory).

                       

                      I'll keep picking away at this and provide more detail once I've got it all glued together. In the meantime, let me know your thoughts.

                      • 68. Re: Concentric circles
                        kristian fraser

                        Richard,

                         

                        Thanks, very detailed.... bullets as answers to some of the questions/suggestions:

                         

                        • i'm fine with locking in the radius out to 5km, there has been a few points made asking for the view to be at 500m intervals and not 1km intervals.
                        • i think mission 1 is to review current status of in place shops.
                        • The first view of the various circles highlighting the 'shops' as centers with volume at each radius is very close to what we need. I think the only variant on this is to have a cumulative calculation rather than a calculation from last radius to new radius but i think that will be easy.
                        • Drip feeding is fine, i will keep a close eye on things and respond as quickly as possible

                         

                        As a test i loaded the 12m households into my local DB and ran the custom SQL, it looked to have worked where i could add my shops as points, i also counted the amount of housholds and the volume looked about right.. Downside it took 4 hours to generate the tableau data and shop point view, i didn't even dare to look at the household view on a map or table. It is currently sitting at 767 min to build the extract, i'm letting it go just to see how long it takes. these time frames are not good though as im sure you would see.

                         

                        I am investigating creating a new table in my local DB based on the same type of equation to draw the locations into Tableau to see if this will speed things up but im getting a little stuck. code below:

                         

                        SELECT Top 10       

                        [dbo].[Households].[LocationID] AS 'HH ID',

                        null AS 'Site ID',

                        [dbo].[Households].[HHLocation] AS 'HH Point',

                        null AS 'Site Point'

                        *[dbo].[Households].[HHLocation].STBuffer(5 * 1000) as 'Dynamic Buffer'

                        FROM [Households] union all

                         

                         

                        SELECT Top 10       

                        null AS 'HH ID',

                        [dbo].[Sites].[LocationID] AS 'Site ID',

                        null AS 'HH Point',

                        [dbo].[Sites].[SiteLocation] AS 'Site Point',

                        null as 'Dynamic Buffer'

                        from [Sites]

                        *ON [dbo].[Sites].[SiteLocation].STIntersects([dbo].[Households].[HHLocation].STBuffer(5 * 1000)) = 1

                         

                        looking forward to the next installment and thanks again

                        • 69. Re: Concentric circles
                          Richard Leeke

                          Sorry to have gone quiet on you - a couple of other things have come up and I've got a bit of work on. Very brief comments on your last post below, I'll try to write up a bit more detail on what I have done so you have something to work with in the next few days - probably at the weekend.

                           

                          • i'm fine with locking in the radius out to 5km, there has been a few points made asking for the view to be at 500m intervals and not 1km intervals.

                          Easy to set whatever radii you want - I've just got that driven off a table in the D/B

                           

                          • i think mission 1 is to review current status of in place shops.

                          Understood. I'll write up what I did there first - it's much easier anyway.

                           

                          • The first view of the various circles highlighting the 'shops' as centers with volume at each radius is very close to what we need. I think the only variant on this is to have a cumulative calculation rather than a calculation from last radius to new radius but i think that will be easy.

                          Hah. Cumulative is actually easier - I had to do sums to get the increments. I'll describe both.

                           

                          • Drip feeding is fine, i will keep a close eye on things and respond as quickly as possible

                          Drip.

                           

                          I'm not quite sure what the extract you're creating is and why it is taking that long. Hopefully when I describe the detail of what I've done that will work fast enough for you.

                           

                          I haven't had time to figure out exactly what is going on with the query you posted. It looks as if it has got a bit manged, though: there are two lines starting with an asterisk which shouldn't be there and in the second case it looks as if there is an INNER JOIN to the Household table missing. I also think you might have your STBuffer the wrong way round - it looks as if you are getting all shops within a 5 km radius of each household.

                          • 70. Re: Concentric circles
                            kristian fraser

                            Thanks Richard,

                             

                            Don't worry about the code i included above, i'm just trying to work through things at the same time so i can get me head around things and learn. your point on the STBuffer was a good hint and i am looking into that for a few other tests i am running where the radius was not presenting correctly.

                             

                            I look forward to your next installment.

                            • 71. Re: Concentric circles
                              Erin Thibodeaux

                              I'm stuck on the "Degree" component. where do i add that in my xls file? i'm confused

                              • 72. Re: Concentric circles
                                kristian fraser

                                Hi Erin...

                                 

                                The goal is to have 360 degree points around a central point of interest as well as an amount radii that you need, in the examples that are included in this thread there are 5 individual radii options expanding 1km each time.

                                 

                                The easiest way i made this work was to create a excel single file with a single reference center point (in my case a shop), a list of center points with a radius (in my case 5 radius for 4 shops) and a list of 360 degrees for each shop. I then then merged the various outcomes into a single sheet and connected to it in Tableau. You can see an example of this in the excel snip below. The rest of the process is in in this thread.

                                 

                                As a side note, i am just repeating what i did to get a prototype working,  i'm still working on getting this working faster with a massive data set as the current hardware and SQL process i have is very slow for my application. I am using the hints from Richard above to help guide me, from what i can read he is the only one i can see who has real detailed knowledge and answers on this topic.

                                 

                                    

                                TypeShop IDLatitudeLongitudeDegreeRadius
                                Shop Point4-35.3455149.090
                                Shop Point4-35.3455149.091
                                Shop Point4-35.3455149.092
                                Shop Point4-35.3455149.093
                                Shop Point4-35.3455149.094
                                Shop Point4-35.3455149.095
                                Shop Circle1-35.2796149.1300
                                Shop Circle1-35.2796149.1310
                                Shop Circle1-35.2796149.1320
                                Shop Circle1-35.2796149.1330
                                Shop Circle1-35.2796149.1340
                                Shop Circle1-35.2796149.1350

                                 

                                I hope this helps

                                • 73. Re: Concentric circles
                                  Mrinalini Kulkarni

                                  Hi Richard,

                                   

                                  This post is great. Its exactly what I needed for a project where I wanted to map out certain retail stores using just their ZIP codes and a parameter function that basically drew a circle of an X mile radius around it. I could do that using the tips you provided in this thread. What I am struggling with now is that I want to identify only those stores that don't have any other retail store within the X miles of it. Basically for the view I have in front of me, Tableau should highlight all the stores that don't have an intersecting circle and maybe I could list the names of these stores on the side. Is there a way I can do this? I tried creating some calculation that would calculate the distance of each point on the circle to every other point on the other circles. Doesn't seem to work.

                                  My data source has only the retail store names and the ZIPs they are located in. I added the latitude, longitude and the degree and radius based on your directions.

                                   

                                  Any help is greatly appreciated!

                                   

                                  Mrinalini

                                  1 2 3 4 5 Previous Next