Monday, March 27, 2017

ArcGIS, Spark and Alluxio Integration

There exists a plethora of distributed backend data stores. My GIS applications routinely read geospatial data from, and save data to, S3, Hadoop HDFS or OpenStack Swift. Some of these distributed data stores are not natively supported by the ArcGIS platform. However, the platform can be extended with ArcPy to handle these situations. Depending on the data store, I have to use a different API (mostly Python based) to read and write geospatial information. This is where Alluxio comes in very handy. It provides an abstraction layer between the application and the data store and (here is the best part) it caches the data in memory in a distributed and failure-resilient manner. So, at the application level, the code to access the data stays the same. On the backend, I can configure Alluxio to use S3, HDFS or Swift. Finally, the advent of a REST endpoint in Alluxio eases the integration with ArcGIS to write, read and visualize geospatial data.
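To make the REST idea concrete, here is a minimal Python sketch of reading a file through the Alluxio proxy. It is not the code from the linked repository. The host, the port (39999), the `/api/v1/paths/.../open-file` and `/api/v1/streams/.../read` routes, and the helper names `path_url`, `stream_url` and `read_file` are all assumptions on my part, based on my recollection of the Alluxio proxy REST API, so check them against the documentation of your Alluxio version.

```python
import json
import urllib.request
from urllib.parse import quote

# Assumption: an Alluxio proxy is reachable here (39999 is the usual default port).
PROXY = "http://localhost:39999/api/v1"


def path_url(alluxio_path, action, base=PROXY):
    """Build the URL of a path-level action, e.g. open-file or create-file."""
    # The Alluxio path keeps its leading slash, hence the double slash in the URL.
    return "{}/paths/{}/{}".format(base, quote(alluxio_path), action)


def stream_url(stream_id, action, base=PROXY):
    """Build the URL of a stream-level action, e.g. read, write or close."""
    return "{}/streams/{}/{}".format(base, stream_id, action)


def _post(url, data=b""):
    """Send a POST request and return the raw response body."""
    req = urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()


def read_file(alluxio_path, base=PROXY):
    """Open a file, read its bytes, and always close the stream (hypothetical helper)."""
    # Assumption: open-file answers with the integer id of a new stream.
    stream_id = json.loads(_post(path_url(alluxio_path, "open-file", base)))
    try:
        return _post(stream_url(stream_id, "read", base))
    finally:
        _post(stream_url(stream_id, "close", base))
```

An ArcPy tool could call something like `read_file("/data/points.csv")` (a made-up path) and turn the returned bytes into features, without knowing whether the bytes ultimately live in S3, HDFS or Swift.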


As usual, all the source code for this integration can be found here.

Saturday, March 18, 2017

ArcGIS, Spark & MemSQL Integration

Just got back from the fantastic Strata + Hadoop 2017 conference, where the topics ranged from BigData and Spark to lots of AI/ML, and not so much on Hadoop explicitly, at least not in the sessions that I attended. I think that is why the conference is being renamed Strata + Data from now on, as there is more to BigData than Hadoop.

While strolling the exhibition hall, I walked into the booth of our friends at MemSQL and got a BIG hug from Gary. We reminisced about our co-presentations at various conferences on the integration of ArcGIS and MemSQL, which natively supports geospatial types.

This post is a refresher on the integration with a "modern" twist, where we use the Spark Connector to ETL geospatial data into MemSQL running in a Docker container. To view the bulk-loaded data, ArcGIS Pro is extended with an ArcPy toolbox that queries MemSQL, aggregates the data, and displays the resulting features on a map.


As usual, all the source code can be found here.

Monday, March 6, 2017

GeoBinning On IBM Bluemix Spark

This is a proof of concept project that enables ArcGIS Pro to invoke a Spark-based geo analytics job on IBM Bluemix and view the result of the analysis as features on a map.


Check out the source code here.

Wednesday, March 1, 2017

Space Time Ripples

Start by looking at this application and that one. Make sure to tilt the map by holding down the right mouse button and sliding the mouse up. Then, slide the bottom slider back and forth to see the data "ripple" through time.
This type of visualization is something I have wanted to do for a long time and is now possible with the advent of the new 4.2 ArcGIS API for JavaScript. The new API has "hooks" to enable a developer to invoke WebGL shaders directly, which can render a massive amount of data very efficiently and very quickly.
The data for the above applications is authored in ArcGIS Pro, extended with a custom ArcPy-based toolbox. The tool queries features from a user-selected feature class, bins the features by space and time, and emits a space-time "cube" in the form of a Dojo AMD module to be loaded by a JavaScript application. The source of the feature class can be a geodatabase, a relational data store, or the new Spatiotemporal Big Data Store.
Yes, I should have written a web service to do that, but this is my blog post and I am leaving that as an exercise for the reader :-)
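The binning step described above can be sketched in a few lines of plain Python. This is only an illustration under my own assumptions, not the actual toolbox code: features are reduced to `(x, y, t)` tuples, the cells are square, and the names `bin_features` and `to_amd_module` as well as the shape of the emitted module are hypothetical.

```python
import json
import math
from collections import Counter


def bin_features(features, cell_size, time_step):
    """Count features per (col, row, slot) cell of a space-time grid.

    features  : iterable of (x, y, t) tuples (assumed layout)
    cell_size : width and height of a spatial cell, in the units of x and y
    time_step : duration of a time slot, in the units of t
    """
    counts = Counter()
    for x, y, t in features:
        # floor() keeps negative coordinates in the correct cell.
        col = int(math.floor(x / cell_size))
        row = int(math.floor(y / cell_size))
        slot = int(math.floor(t / time_step))
        counts[(col, row, slot)] += 1
    return counts


def to_amd_module(counts):
    """Serialize the cell counts as the text of an AMD module (assumed shape)."""
    cells = [
        {"col": c, "row": r, "slot": s, "count": n}
        for (c, r, s), n in sorted(counts.items())
    ]
    return "define([], function () { return " + json.dumps(cells) + "; });"


if __name__ == "__main__":
    points = [(0.5, 0.5, 10), (0.7, 0.2, 50), (1.5, 0.5, 10), (-0.1, 0.0, 70)]
    print(to_amd_module(bin_features(points, cell_size=1.0, time_step=60)))
```

In a real tool the tuples would come from a search cursor over the selected feature class, and the module text would be written to a `.js` file that the web application loads.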
I have to admit that I was a bit selfish in building the JavaScript application by "mixing" two languages: JavaScript and TypeScript. I wanted to try out the TypeScript extension to our JavaScript API, and I long for a type-safe language when building front end applications, like in my olde Flex/AS3 days. It turns out that TypeScript is very cool, especially when used within IntelliJ :-)
As usual, all the source code can be found here, and I will be talking more about it next week in my presentation at DevSummit. See some of you in Palm Springs.