Commandline PHP for loading RDF URLs into ARC (and Twinkle for query UI)


#!/usr/bin/php
<?php
if ($argc != 2 || in_array($argv[1], array('--help', '-help', '-h', '-?'))) {
?>
This is a command line PHP script with one option: URL of RDF document to load
<?php
} else {

$supersecret = "123rememberme"; #Security analysts recommend using data of birth + social security ID here
# *** be careful with real msql passwords ***

include_once("../arc/ARC2.php");
$config = array( 'db_host' => 'localhost', 'db_name' => 'sg1', 'db_user' => 'sparql',
'db_pwd' => $supersecret, 'store_name' => 'crawl', );
$store = ARC2::getStore($config);
if (!$store->isSetUp()) { $store->setUp(); }
$profile = $argv[1];
echo "Loading data from " . $profile ;
$store->query('DELETE FROM <'.$profile.'>');
$store->query('LOAD <'.$profile.'>');
}
?>

FWIW, this is what I’m using to (re)load data into an ARC store from the commandline. I’ll try wiring up my old RDF crawler to this when I get time. Each loaded source is stored as a named graph, with the URI it is loaded from being the named graph URI. ARC just needs the path to the unpacked PHP libraries, and connection details for a MySQL database, and comes with a handy SPARQL endpoint script too, which I’ve been testing with Twinkle.

My public sandbox data is currently loaded up as follows. No promises it’ll stay there, but anyway, adding the following to Twinkle 2.0’s config file section for SPARQL endpoints works for me. The endpoint also directly offers a basic Web interface too, with HTML, XML, JSON etc.


<http://sandbox.foaf-project.org/2008/foaf/ggg.php>
a sources:Endpoint; rdfs:label "FOAF Social Graph Agggregator sandbox".