Interesting applied SemWeb job, from public-semweb-lifesci:
The National Center for Biomedical Ontology (NCBO) is one of the seven National Centers for Biomedical Computing supported by the NIH Roadmap. The NCBO is administered at Stanford University, with partners at the Mayo Clinic, the University at Buffalo, the University of Victoria, and UCSF. The Center provides national technological infrastructure to support the creation, dissemination, and management of biomedical information and knowledge in machine-processable form.
The laboratory of Dr. Mark Musen, principal investigator of the NCBO, is seeking a highly motivated and independent post-doctoral trainee to conduct research projects at the interface of the life sciences and the Semantic Web. The post-doc will be involved in ongoing collaborative work that concerns archiving, querying, and reasoning about biological data over the Web.
See Mark Musen’s post for full text. I guess the majority of likely candidates will already be on public-semweb-lifesci, but I thought I’d air this more widely just in case. BTW I don’t have any further information than offered here, except to say it seems like a great project to be involved in.
Just found this interesting presentation,
Map-Reduce-Merge: Simplified Relational Data Processing on Large Clusters
by Hung-chih Yang, Ali Dasdan, Ruey-Lung Hsiao, D. Stott Parker; as presented by Nate Rober (PDF)
Excerpts:
Extending MapReduce
1. Change to reduce phase
2. Merge phase
3. Additional user-definable operations
a. partition selector
b. processor
c. merger
d. configurable iterators
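To make the outline above a little more concrete, here is a minimal in-memory sketch of the three phases. It is my own illustration, not code from the slides or the paper: the helper names (map_reduce, merge), the sample data, and the sort-merge iteration strategy are all assumptions, and the partition selector and processors are left out for brevity. The point it tries to show is that reduce keeps the key in its output, so a later merge step can line up two reduced datasets.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run map then reduce in memory (hypothetical helper).

    Unlike a reduce that emits only values, this one keeps the key in
    its output, returning (key, reduced_value) pairs sorted by key.
    """
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return [(key, reducer(key, values)) for key, values in sorted(groups.items())]


def merge(left, right, merger):
    """Merge two key-sorted reduce outputs, sort-merge style.

    The two-pointer loop stands in for a configurable iterator; the
    merger callback is the user-defined part that combines a matching pair.
    """
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, left_value = left[i]
        right_key, right_value = right[j]
        if left_key == right_key:
            out.extend(merger(left_key, left_value, right_key, right_value))
            i += 1
            j += 1
        elif left_key < right_key:
            i += 1
        else:
            j += 1
    return out


# Made-up sample data: (employee, dept_id, bonus) and (dept_id, dept_name).
employees = [("alice", 1, 100), ("bob", 2, 50), ("carol", 1, 25)]
departments = [(1, "research"), (2, "sales"), (3, "legal")]

# Lineage 1: total bonus per department id.
bonus_by_dept = map_reduce(
    employees,
    mapper=lambda e: [(e[1], e[2])],
    reducer=lambda key, values: sum(values),
)

# Lineage 2: department name per department id.
name_by_dept = map_reduce(
    departments,
    mapper=lambda d: [(d[0], d[1])],
    reducer=lambda key, values: values[0],
)

# Merge phase: join the two reduced outputs on department id.
joined = merge(
    bonus_by_dept,
    name_by_dept,
    merger=lambda lk, total, rk, name: [(name, total)],
)
print(joined)
```

In a real cluster the two lineages would be partitioned across many reducers, and a partition selector would decide which reducer outputs each merger reads; none of that is modelled here.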
Implementing Relational Algebra Operations
1. Projection
2. Aggregation
3. Selection
4. Set Operations: Union, Intersection, Difference
5. Cartesian Product
6. Rename
7. Join
[for more detail see full slides]
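As a rough illustration of how some of the simpler operations in that list can be expressed with map and reduce alone, here is a small sketch. It is not taken from the slides: the run helper, the row layout, and the salary threshold are hypothetical, and everything happens in one process rather than on a cluster. Selection and projection live in the mapper, aggregation in the reducer, and union falls out of grouping identical tuples under the same key.

```python
from itertools import groupby
from operator import itemgetter


def run(records, mapper, reducer):
    """Tiny in-memory map/reduce runner (hypothetical helper)."""
    pairs = sorted(
        (pair for record in records for pair in mapper(record)),
        key=itemgetter(0),
    )
    return [
        (key, reducer([value for _, value in group]))
        for key, group in groupby(pairs, key=itemgetter(0))
    ]


# Made-up relation of employee rows.
rows = [
    {"name": "alice", "dept": "research", "salary": 120},
    {"name": "bob", "dept": "sales", "salary": 90},
    {"name": "carol", "dept": "research", "salary": 100},
    {"name": "dave", "dept": "sales", "salary": 40},
]


def select_and_project(row):
    # Selection: keep only rows with salary >= 50.
    # Projection: emit only the dept and salary attributes.
    if row["salary"] >= 50:
        yield (row["dept"], row["salary"])


# Aggregation: sum the projected salaries per department in the reducer.
totals = run(rows, select_and_project, sum)
print(totals)

# Union: emit each tuple as a key; grouping by key collapses duplicates.
r = [(1, "a"), (2, "b")]
s = [(2, "b"), (3, "c")]
union = [key for key, _ in run(r + s, lambda t: [(t, None)], lambda values: None)]
print(union)
```

Join and Cartesian product are the cases that need two differently keyed inputs brought together, which is where the merge phase comes in.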
Conclusion
MapReduce & GFS represent a paradigm shift in data processing: use a simplified interface instead of an overly general DBMS.
Map-Reduce-Merge adds the ability to execute arbitrary relational algebra queries.
Next steps: develop SQL-like interface and a query optimizer.
Research paper: Map-reduce-merge: simplified relational data processing on large clusters (PDF for ACM people)
Linked from the HRDF page in the Hadoop wiki, where there appears to be a proposal brewing to build an RDF store on top of the Hadoop/HBase infrastructure.
Nearby: LargeTripleStores in ESW wiki
Not entirely unrelated: Google Social Graph API (which parses FOAF/RDF from ‘The Web’ but currently discards all but the social graph parts)