Thunderhead Explorer: Spark, Cassandra, Tessellation and ArcGIS

Sunday, January 18, 2015

Spark, Cassandra, Tessellation and ArcGIS

If you do BigData and have not heard or used Spark then…..you are living under a rock!
When executing a Spark job, you can read data from all kind of sources with schemas like file, hdfs, s3 and can write data to all kind of sinks with schemas like file and hdfs.
One BigData repository that I’ve been exploring is Cassandra. The DataStax folks released a Cassandra connector to Spark enabling the reading and writing of data from and to Cassandra.
I’ve posted on Github a sample project that reads the NYC trip data from a local file and tessellates a hexagonal mosaic with aggregates of pickup locations. That aggregation is persisted onto Cassandra.
To visualize the aggregated mosaic, I extended ArcMap with an ArcPy toolbox that fetches the content of a Cassandra table and converts it to a set of features in a FeatureClass. The resulting FeatureClass is associated with a gradual symbology to become a layer on the map as follows:

Like usual all the source code is here.

1 comment:

Unknown said...: Don't think you can publish 3 blogs this month and take a break (haha). Looking forward to February...

/r
Rick; January 23, 2015 at 6:54 PM