Monday, March 27, 2017

ArcGIS, Spark and Alluxio Integration

There exists a plethora of backend distributed data stores. In my GIS applications, I regularly read geospatial data from, and save data to, S3, Hadoop HDFS or OpenStack Swift. Some of these distributed data stores are not natively supported by the ArcGIS platform. However, the platform can be extended with ArcPy to handle these situations. Depending on the data store, I have to use a different API (mostly Python based) to read and write geospatial information. This is where Alluxio comes in very handy. It provides an abstraction layer between the application and the data store and (here is the best part) it caches this information in memory in a distributed and resilient-to-failure manner. So, at the application level, the code to access the data is invariant. On the backend, I can configure Alluxio to use S3, HDFS or Swift. Finally, the advent of a REST endpoint in Alluxio eases the integration with ArcGIS to write, read and visualize geospatial data.


Like usual, all the source code for this integration can be found here.

Saturday, March 18, 2017

ArcGIS, Spark & MemSQL Integration

Just got back from the fantastic Strata + Hadoop 2017 conference, where the topics ranged from BigData and Spark to lots of AI/ML, with not so much on Hadoop explicitly, at least not in the sessions that I attended. I think that is why the conference is being renamed Strata + Data from now on, as there is more to BigData than Hadoop.

While strolling the exhibition hall, I walked into the booth of our friends at MemSQL and got a BIG hug from Gary. We reminisced about our co-presentations at various conferences regarding the integration of ArcGIS and MemSQL as they natively support geospatial types.

This post is a refresher on the integration with a "modern" twist, where we are using the Spark Connector to ETL geospatial data into MemSQL in a Docker container. To view the bulk loaded data, ArcGIS Pro is extended with an ArcPy toolbox to query MemSQL, aggregate and view the result set of features on a map.


Like usual, all the source code can be found here.

Monday, March 6, 2017

GeoBinning On IBM Bluemix Spark

This is a proof of concept project to enable ArcGIS Pro to invoke Spark based geo analytics on IBM Bluemix and view the result of the analysis as features in a map.


Check out the source code here

Wednesday, March 1, 2017

Space Time Ripples

Start by looking at this application and that one. Make sure to tilt the map by holding down the right mouse button and sliding the mouse up. Then, slide the bottom slider back and forth to see the data "ripple" through time.


This type of visualization is something I have wanted to do for a long time and is now possible with the advent of the new 4.2 ArcGIS API for JavaScript. The new API has "hooks" to enable a developer to invoke WebGL shaders directly, which can render a massive amount of data very efficiently and very quickly.


The authoring of the data for the above applications is based on ArcGIS Pro extended with a custom ArcPy based toolbox. The tool queries features from a user selected feature class, bins the features by space and time, and emits a space-time "cube" in the form of a Dojo AMD module to be loaded by a JavaScript application. The source of the feature class can be a geodatabase, a relational data store, or the new Spatiotemporal Big Data Store.
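To make the binning step a bit more concrete, here is a minimal sketch in plain Python. It is not the toolbox code: the function names, the `(x, y, t)` tuple layout and the shape of the emitted objects (`col`, `row`, `slot`, `count`) are all assumptions made for illustration, and a real tool would read the features with ArcPy rather than from a list.

```python
import json
from collections import Counter


def bin_space_time(features, cell_size, time_step):
    """Count features per (column, row, time slot) bin.

    features: iterable of (x, y, t) tuples in any consistent units
    (hypothetical layout, chosen for this sketch).
    """
    counts = Counter()
    for x, y, t in features:
        col = int(x // cell_size)    # spatial bin along x
        row = int(y // cell_size)    # spatial bin along y
        slot = int(t // time_step)   # temporal bin
        counts[(col, row, slot)] += 1
    return counts


def to_amd_module(counts):
    """Serialize the bins as the text of an AMD module returning an array."""
    bins = [
        {"col": c, "row": r, "slot": s, "count": n}
        for (c, r, s), n in sorted(counts.items())
    ]
    return "define([], function () { return " + json.dumps(bins) + "; });"


if __name__ == "__main__":
    sample = [(0.5, 0.5, 0), (0.7, 0.2, 5), (1.5, 0.5, 12)]
    print(to_amd_module(bin_space_time(sample, cell_size=1.0, time_step=10)))
```

The emitted text can be saved as a `.js` file and pulled in by the JavaScript application through its AMD loader, which is what makes a static "cube" file workable without a web service.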

Yes, I should have written a web service to do that, but this is my blog post and I am leaving that as an exercise for the reader :-)

I have to admit that I was a bit selfish in building the JavaScript application by "mixing" two languages: JavaScript and TypeScript. I wanted to try out the TypeScript extension to our JavaScript API, and I long for a type-safe language when building front end applications like in my olde Flex/AS3 days. It turned out that TypeScript is very cool, especially when used within IntelliJ :-)

Like usual, all the source code can be found here, and I will be talking about it more next week at my presentation at DevSummit. See some of you in Palm Springs.

Wednesday, May 25, 2016

Snapping Points To Lines And ArcGIS Pro

Been wanting to post on this subject for quite some time (actually over a year), as associating a world coordinate with the proper nearby linear feature provides tremendous insight based on the fusion of their attributes. Moreover, doing that on a massive scale and quickly is even more imperative in today's BigData world, thus the usage of Apache Spark. I’ve posted a standalone implementation that relies on well-documented simple math and a published methodology to perform searches on massive datasets in batch mode. What was exciting to me in writing this post was viewing the snap results in ArcGIS Pro. My lack of knowledge in extending ArcGIS Pro with downloadable Python modules contributed to the delay (and a slight case of procrastination :-). However, with the help of a colleague, I was able to pip install modules that can be imported by my custom ArcPy based toolboxes without any errors.
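For readers who have not seen the "simple math" behind a snap, here is a minimal sketch of the classic point-to-segment projection in plain Python. It assumes planar coordinates and is not the Spark implementation from the repository; the function names and the `max_dist` cutoff are hypothetical and only illustrate the idea.

```python
import math


def snap_to_segment(px, py, ax, ay, bx, by):
    """Project point P onto segment AB; return (x, y, distance)."""
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0.0:
        t = 0.0  # degenerate segment: A and B coincide
    else:
        # Parameter of the perpendicular foot, clamped to stay on the segment.
        t = ((px - ax) * dx + (py - ay) * dy) / seg_len2
        t = max(0.0, min(1.0, t))
    sx, sy = ax + t * dx, ay + t * dy
    return sx, sy, math.hypot(px - sx, py - sy)


def snap_to_polyline(px, py, vertices, max_dist):
    """Return the closest snap within max_dist over all segments, or None."""
    best = None
    for (ax, ay), (bx, by) in zip(vertices, vertices[1:]):
        cand = snap_to_segment(px, py, ax, ay, bx, by)
        if cand[2] <= max_dist and (best is None or cand[2] < best[2]):
            best = cand
    return best


if __name__ == "__main__":
    print(snap_to_polyline(1.0, 1.0, [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0)], 1.5))
```

Testing every point against every segment like this does not scale, which is why a batch implementation on massive data would first narrow down the candidate lines (for example with a spatial index or grid) before running this projection.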


Also, since this is all based on BigData, well, it has to be tested in a BigData environment. The post describes the usage of Docker and the Cloudera QuickStart container to check the snap and the visualization. The following illustrates my development environment.


Like usual, all the source code can be found here.

Sunday, April 10, 2016

Vector Tiles: The Third Wave

When it comes to web mapping, we are surfing on a third wave in our digital ocean. And the “collaborative processing” between the digital entities while surfing that wave is making the ride more fun, insightful and expressive.

The first web wave was back in the mid 1990s, where interactive maps in the form of HTML image tags relied heavily on the server and request parameters to regenerate the image when you clicked on the edge arrows to pan and zoom. Remember MapQuest and ArcIMS ?

Then in the mid 2000s came the second wave, or more like a tsunami: Google Maps. You hold down the left mouse button on the map and drag to pan, you use the scroll wheel to zoom in and out, and… when you click on the map, a bubble appears on the map showing the details of the clicked location. Disruptive ! And all was smooth, responsive and AJAXy. This is when I believe that this collaborative processing concept took root and materialized itself in the web mappers’ minds. Soon after, more expressiveness was required as HTML was lacking in power and functionality, and the capitalization on browser plugins emerged to create Single Page Applications. Remember Flex and Silverlight ?

We are now in the mid 2010s. Flash is dead because he ate an “Apple”. HTML5, CSS3 and JavaScript are in full swing, and though Tile Services are fast, as the tile images are preprocessed and prepared to be displayed, they are still image based, and dynamic styling of the features in a tile is not easy. In addition, with the ubiquity of GPUs on edge devices, faster rendering for expressiveness is now possible through the elusive “collaborative processing”.

Enter Vector Tiles. Mapbox has defined a vector tile specification that we at Esri have adopted in our JavaScript API, and we demonstrated its versatility at the 2015 User Conference. Andrew Turner has a nice writeup about it. I also found this nice in-depth paper that analyses the dynamic rendering of vector-based maps with WebGL.

I wanted to know more about it and I learn by doing. So I implemented two projects, a Mapbox Vector Tile encoder and a visualizer, as heuristic experiments to be used with the Esri JavaScript API. Again, these are experiments and I will report on more updates.
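To give a taste of what an encoder has to do, here is a minimal Python sketch of the geometry encoding described in the Mapbox Vector Tile specification: each geometry becomes a list of command integers followed by zigzag-encoded coordinate deltas in tile space. This is not the code from my encoder project; the function names are made up for this illustration, and the protobuf packaging, layers and attributes are left out entirely.

```python
# Command ids from the Mapbox Vector Tile specification.
MOVE_TO, LINE_TO, CLOSE_PATH = 1, 2, 7


def command(cmd_id, count):
    """Pack a command id and its repeat count into one integer."""
    return (cmd_id & 0x7) | (count << 3)


def zigzag(value):
    """Zigzag-encode a signed 32-bit delta as a non-negative integer."""
    return (value << 1) ^ (value >> 31)


def encode_point(x, y):
    """Encode a single point given in integer tile coordinates."""
    return [command(MOVE_TO, 1), zigzag(x), zigzag(y)]


def encode_linestring(coords):
    """Encode a linestring as MoveTo + LineTo with relative deltas."""
    (x0, y0), rest = coords[0], coords[1:]
    out = [command(MOVE_TO, 1), zigzag(x0), zigzag(y0),
           command(LINE_TO, len(rest))]
    cx, cy = x0, y0  # cursor: each delta is relative to the previous vertex
    for x, y in rest:
        out += [zigzag(x - cx), zigzag(y - cy)]
        cx, cy = x, y
    return out


if __name__ == "__main__":
    print(encode_point(25, 17))
    print(encode_linestring([(2, 2), (2, 10), (10, 10)]))
```

Small deltas produce small integers, which the protobuf varint encoding then stores in very few bytes, and that is a big part of why vector tiles stay compact.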

Tuesday, March 15, 2016

ArcGIS For Server On Docker

“But…It works on _my_ machine !!!” How many times did you hear that ? That is exactly one of the use cases of Docker for developers - Create an exact reproducible environment for each developer, even down to the hardware specification. And, that same environment can be on premise or in the cloud.
With the advent of ArcGIS For Server 10.4, I wanted to run it on my Mac so I can try out some of the new features like chaining multiple SOIs.
I could have started a Windows based VM and gone through the GUI based setup, which is a pretty straightforward process (My friend Georges G. calls this a PhD process, Push Here Dummy). But, I wanted to automate the whole install process in a headless way (I’m sure there is a way to do that using Windows, I just do not know how, maybe a blog post for another day).
Enter Docker. After downloading the ArcGIS For Linux tarball and the license file from, you can write a Dockerfile that automates the whole install process in a headless way (DevOps love this). In addition, once a build is done, you can run the image on premise or in the cloud by referencing a docker-machine.
Like usual, you can check out the whole source code on how you can do this here.