Gremlin is a free Java/Groovy system for traversing graphs, including but not limited to RDF. This post is based on example code from Marko Rodriguez (@twarko) and the Gremlin wiki and mailing list. The test run below goes pretty slowly when run with 4 or 5 loops, since it uses the Web as its database, via […]
Tag Archives: data processing
Map-reduce-merge and Hadoop/Hbase RDF
Just found this interesting presentation, Map-Reduce-Merge: Simplified Relational Data Processing on Large Clusters by Hung-chih Yang, Ali Dasdan, Ruey-Lung Hsiao, D. Stott Parker; as presented by Nate Rober (PDF). Excerpts: Extending MapReduce 1. Change to reduce phase 2. Merge phase 3. Additional user-definable operations a. partition selector b. processor c. merger d. configurable iterators Implementing […]