Here is a proof-of-concept project that spatially enables a Crunch pipeline with a point-in-polygon function, matching a very large set of static point data against a small set of dynamic polygons.
Crunch simplifies the process so much that it comes down to a single statement:
final PTable<Long, Long> counts = pipeline.readTextFile(args).parallelDo(new PointInPolygon(), Writables.longs()).count();
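The PointInPolygon function itself is not shown in this post, so purely as an illustration, here is a minimal plain-Java sketch of what a step like this might compute, with no Crunch dependency. Everything in it is an assumption on my part: the class name PointInPolygonSketch, the "x,y" text format for each input line, the use of the array index as the polygon id, and the ray-casting test. The actual project may well rely on a geometry library and a spatial index instead.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the per-point work; not the project's actual code.
final class PointInPolygonSketch {

    // Ray-casting test: true if (x, y) lies inside the ring described by xs/ys.
    // The closing edge from the last vertex back to the first is implied.
    static boolean contains(double[] xs, double[] ys, double x, double y) {
        boolean inside = false;
        for (int i = 0, j = xs.length - 1; i < xs.length; j = i++) {
            // Only edges that straddle the horizontal line through y can be crossed.
            boolean straddles = (ys[i] > y) != (ys[j] > y);
            if (straddles) {
                // X coordinate where the edge meets that horizontal line.
                double xCross = xs[j] + (y - ys[j]) * (xs[i] - xs[j]) / (ys[i] - ys[j]);
                if (x < xCross) {
                    inside = !inside; // each crossing to the right flips the state
                }
            }
        }
        return inside;
    }

    // Mirrors the shape of the pipeline: read lines, map each point to a
    // polygon id, then count the points per id. Lines are assumed to be "x,y".
    static Map<Long, Long> countByPolygon(List<String> lines, double[][] polyXs, double[][] polyYs) {
        Map<Long, Long> counts = new TreeMap<>();
        for (String line : lines) {
            String[] parts = line.split(",");
            double x = Double.parseDouble(parts[0].trim());
            double y = Double.parseDouble(parts[1].trim());
            for (int id = 0; id < polyXs.length; id++) {
                if (contains(polyXs[id], polyYs[id], x, y)) {
                    counts.merge((long) id, 1L, Long::sum);
                    break; // first matching polygon wins; unmatched points are dropped
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Two axis-aligned squares used as toy polygons.
        double[][] polyXs = { {0, 10, 10, 0}, {20, 30, 30, 20} };
        double[][] polyYs = { {0, 0, 10, 10}, {0, 0, 10, 10} };
        List<String> lines = Arrays.asList("5,5", "1,9", "25,5", "50,50");
        System.out.println(countByPolygon(lines, polyXs, polyYs)); // prints {0=2, 1=1}
    }
}
```

In a Crunch pipeline the inner loop would presumably live inside the function handed to parallelDo, emitting one polygon id per matched point, and count() would take the place of the map bookkeeping. Looping over every polygon for every point is only reasonable here because the polygon set is small.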
Crunch's strength is in processing big data that cannot be stored by "traditional means", such as time series and graphs. It will be interesting to perform some kind of spatial and temporal analysis with it in a follow-up post.
As usual, all the source code can be found here.