Friday, December 19, 2008

Clustering 20K+ Map Points

Was asked "what is the best way to display 20,000 markers on a map ?". Well, displaying all 20K map points does not make whole lot of sense, as you just see nothing, just a blanket of markers. A smarter approach is to cluster the points and display the clusters. Somebody pointed me to mapeed, and asked me to duplicate this using the Flex API for AGS. Here it is. I've loaded 20,000 map points from geonames and created dynamically clusters based on the scale of the map and its extent. BTW, this is all client side processing - gotta love the JIT in Flash Player :-)
[Update] Sooo many people asked me for the source code of the app - here it is. Any feedback is more than welcome.

21 comments:

bryanc said...

Very, very cool! Any chance that we'll ever see the source?

ThunderHead said...

Sure - email me at mraad@esri.com

vitch said...

Very interesting stuff - I was just about to tackle a problem like this myself... I was wondering about using Geohash and shortening the identifiers to group close places but I'l also be very interested in the algorithm you used...

Paul Hastings said...

mansour, everybody & their uncle's going to want to see the source, can you make it available via "view source"?

thanks.

Rev. Dan said...

Howdy,

Paul Hastings is spot-on. My uncle and I both just read this blog post and thought "man, I hope he shares the source code."

Yours,

- Everybody

ThunderHead said...

Please send me email to mraad@esri.com, and will be more than willing to share :-)

Steve.McKinney@SICS said...

Mansour you are awesome!!!! Thanks again. I have used a TON of your code in our project. Here is the dev site if you get a chance to see all your hard work in play. Thanks again!

Steve.McKinney@SICS said...

http://arms.sicsconsultants.net

ThunderHead said...

You are more than welcome :-)

matthew said...

Thanks u r information





web design company

Christian Junk said...

Hi Mansour,

first of all thank you very much for sharing your work with us. I'm working on a server-side clustering method. I'm very new to this and so I would be glad if you could tell me the name of the algorithm you've used for this client-side clustering.

Much thanks in advance!

Christian

ThunderHead said...

You are very welcome - came up with the algo and have not named it. Check out also http://thunderheadxpler.blogspot.com/2009/05/map-points-cubic-clustering.html

Prash said...

Thanks a lot for sharing the code. Awesome work. I used it in my project and it was a hit. Everybody loved it.

ThunderHead said...

Glad it worked or you :-)

Guiles78 said...

Hye Mansour and thanks for this great sample.

I’m using your sample to cluster entities in my project. In my case, I don’t want to determine clusters only for the current map extent and each time map extent change but for all my set of mapPoints and only when map scale change. In order to do that, I change few things in your code and I’ve got some problem…

It seems to be due of negative values for mapPoint west and/or north of the current map extent when they are converted to screenPoint. That give me some negative value for the key of the cluster dictionary determined by what I call “the magic formula” : var ci : int = (cx << 16) | cy, who is the only point of your sample that I don’t understand (specially the syntax)…

Can you explain to me this formula or/and give me some indication in order to cluster point for every point of my set of mapPoints? I think the solution is to determine unique key value even when I’m working with negative value.

Thanks a lot.

ThunderHead said...

Working on a new clustering - but there is a quick fix for ur negative problem, is to make sure u only cluster points in the map extent.
As to the "magic" formula, yeap - this is the trick - it assumes that the cx and cy values will be positive integers less that 2^16. this way, I can combine the two values (by bit shifting) into _one_ value that becomes a key value into a dictionary - cool eh ? anyway, as I said, there is that assumption which is "inherited" if u cluster the point _only_ in the extent. hope this make sense. Working on a newer version that will be part of version 2 API - maybe will backport it to 1.3.

Andy said...

Hi, first of all. Thank you for sharing your solution. It's just awesome!

I like this approach (http://thunderheadxpler.blogspot.com/2008/12/clustering-20k-map-points.html) better than that (http://thunderheadxpler.blogspot.com/2009/05/map-points-cubic-clustering.html), because the positioning of the clusters don't look like a grid.
However, it is disturbing to my client, that the clusters change even if I just move the map and not zoom. I tried to call "clusterMapPoints()" only if levelChange of "ExtendEvent" is true. That works, but of course clusters outside the extend will not appear. Is it anyway possible with this approach that the clusters don't change while moving the map?

ThunderHead said...

U R Welcome-I'm aware of this problem and have a fix - will be posting it soon.

DC said...

I have impleneted this code into an application but now I need to update the text file that stores the point info (geonames.txt in your example). However I cannot access the text file after the project has been compiled. This is due to the 'include' statement. Is there a way to use the text file and not use the 'include' statement?

Thanks

ThunderHead said...

DC - sure - u can add and remove item from the input collection at runtime - at start up, I just load that input collection with data from a txt file :-)

DC said...

Would you happen to have an example on how to do this?