WordPress, TinyMCE and RDFa editors

I’m writing this in WordPress’s ‘Visual’ mode WYSIWYG HTML editor, and thinking “how could it be improved to support RDFa?”

Well let’s think. Humm. In RDFa, every section of text is always ‘about’ something, and then has typed links or properties associated with that thing. So there are icons ‘B’ for bold, ‘I’ for italics, etc. And thingies for bulleted lists, paragraphs, fonts. I’m not sure how easily it would handle the nesting of full RDFa, but some common case could be a start: a paragraph that was about some specific thing, and within which the topical focus didn’t shift.

So, here is a paragraph about me. How to say it’s really about me? There could be a ‘typeof’ attribute on the paragraph element, with value foaf:Person, and an ‘about’ attribute with a URI for me, eg. http://danbri.org/foaf.rdf#danbri

In this UI, I just type textual paragraphs and then double newlines are interpreted by the publishing engine as separate HTML paragraphs. I don’t see any paragraph-level or even post-level properties in the user interface where an ‘about’ URI could be stored. But I’m sure something could be hacked. Now what about properties and links? Let’s see…

I’d like to link to TinyMCE now, since that is the HTML editor that is embedded in WordPress. So here is a normal link to TinyMCE. To make it, I searched for the TinyMCE homepage, copied its URL into the copy/paste buffer, then pressed the ‘add link’ icon here after highlighting the text I wanted to turn into a link.

It gave me a little HTML/CSS popup window with some options – link URL, target window, link title and (CSS) class; I’ve posted a screenshot below. This bit of UI seems like it could be easily extended for typed links. For example, if I were adding a link to a paragraph that was about a Film, and I was linking to (a page about) its director or an actor, we might want to express that using an annotation on the link. Due to the indirection (we’re trying to say that the link is to a page whose primary topic is the director, etc.), this might complicate our markup. I’ll come back to that later.

tinymce link edit screenshop

tinymce link edit screenshot

I wonder if there’s any mention of RDFa on the TinyMCE site? Nope. Nor even any mention of Microformats.

Perhaps the simplest thing to try to build into TinyMCE would be for a situation where a type on the link expressed a relationship between the topic of the blog post (or paragraph), and some kind of document.

For example, a paragraph about me, might want to annotate a link to my old school’s homepage (ie. http://www.westergate.w-sussex.sch.uk/ ) with a rel=”foaf:schoolHomepage” property.

So our target RDFa would be something like (assuming this markup is escaped and displayed nicely):

<p about=”http://danbri.org/foaf.rdf#danbri” typeof=”foaf:Person” xmlns:foaf=”http://xmlns.com/foaf/0.1/”>

I’m Dan, and I went to <a rel=”foaf:schoolHomepage” href=”http://www.westergate.w-sussex.sch.uk/”>Westergate School</a>

</p>.

See recent discussion on the foaf-dev list where we contrast this idiom with one that uses a relationship (eg. ‘school’) that directly links a person to a school, rather than to the school’s homepage. Toby made some example RDFa output of the latter style, in response to a debate about whether the foaf:schoolHomepage idiom is an ‘anti-pattern’. Excerpt:

<p about=”http://danbri.org/foaf.rdf#danbri” typeof=”foaf:Person” xmlns:foaf=”http://xmlns.com/foaf/0.1/”>

My name is <span property=”foaf:name”>Dan Brickley</span> and I spent much of the ’80s at <span rel=”foaf:school”><a typeof=”foaf:Organization” property=”foaf:name” rel=”foaf:homepage” href=”http://www.westergate.w-sussex.sch.uk/”>Westergate School</a>.

</p>

So there are some differences between these two idioms. From the RDF side they tell you similar things: the school I went to. The latter idiom is more expressive, and allows additional information to be expressed about the school, without mixing it up with the school’s homepage. The former simply says “this link is to the homepage of Dan’s school”; and presumably defers to the school site or elsewhere for more details.

I’m interested to learn more about how these expressivity tradeoffs relate to the possibilities for visual / graphical editors. Could TinyMCE be easily extended to support either idiom? How would the GUI recognise a markup pattern suitable for editing in either style? Can we avoid having to express relationships indirectly, ie. via pages? Would it be a big job to get some basic RDFa facilities into WordPress via TinyMCE?

For related earlier work, see the 2003 SWAD-Europe report on Semantic blogging from the team then at HP Labs Bristol.

Local Video for Local People

OK it’s all Google stuff, but still good to see. Go to Google Maps, My Maps, to find ‘Videos from YouTube’ listed. Here’s where I used to live (Bristol UK) and where I live now (Amsterdam, The Netherlands). Here’s a promo film of some nearby art installations from ArtZuid, who even have a page in English. I wouldn’t have found the video or the nearby links except through the map overlay. I don’t know exactly how they’re geotagging the videos, I can’t see an option under ‘my videos’ in YouTube, so perhaps it’s automatic or viewer annotations. In YouTube, you can add a map link under ‘My Videos’ / ‘Edit Video'; I didn’t see that initially. I made some investigations into similar issues (videos on maps) while at Joost; see brief mention in my Fundamentos Web slides from a couple of years ago.
Oh, nearly forgot to mention: zooming out to get a Europe or World-wide view is quite striking too.

Open social networks: bring back Iran

Three years ago, we lost Iran from Internet community. I simplify somewhat, but forgivably. Many Iranian ISPs cut off access to blogs and social networking sites, on government order. At the time, Iran was one of the most active nations on Orkut; and Orkut was the network of choice, faster than the then-fading Friendster, but not yet fully eclipsed by MySpace. It provided a historically unprecedented chance for young people from Iran, USA, Europe and the world to hang out together in an online community. But when Orkut was blocked at the ISP level in Iran, pretty much nobody in the English-speaking blog-tech-pundit scene seemed to even notice. This continues to bug me. Web technologists apparantly care collectively more about freeing Robert Scoble’s addressbook from Facebook, than about the real potential for unmediated, uncensored, global online community.

Most folk in the US will never visit Iran, and vice-versa. And the press and government in both states are engaged in scary levels of sabre-rattling and demonisation. For me, one of the big motivations for working (through FOAF, SPARQL, XMPP and other technologies) on social networking interop, is so young people in the future can grow up naturally having friends in distant nations, regardless of whether their government thinks that’s a priority. If hundreds of blog posts can be written about the good Mr Scoble’s addressbook portability situation, why are thousands of posts not being written about the need for social networking tools to connect people regardless of nationality and national firewalls?

Some things are too important to leave to governments…

Update: a few hours after writing this, things get hairy in Hormuz.  Oof…

Microblogs and the monolingual

Twitter-like microblogging seems a nice granularity for following thoughts expressed in languages you don’t speak.

In 1988 I passed my French language GCSE exam; it’s been downhill all the way since. A year ago in Argentina, I got to the stage where I could just about express myself in Spanish. But it’s been fading. Nevertheless I’m on the websemantique (french) and web-semantica-ayuda (spanish) lists (this is the good influence of Chaals from W3C and SWAD-Europe days), and will try to follow and occasionally respond (usually just a perhaps-relevant link). It’s a good exercise for English speakers to do, to remind them how many folk experience the primarily English-language dialog that dominates the technology scene.

So just now I happened to notice CharlesNepote‘s name on Twitter, from the websemantique list. And in reading his twitter stream, I realised this: twitter posts in a foreign language are easier to follow than either full blog posts, email threads or realtime IM/IRC chat. It’s a nice level of granularity that can bridge language communities a little, since someone with fading schoolkid or tourist knowledge of another language can use and reinforce it by reading microblogs in it.

A related SemWeb use case: yesterday in #foaf we had someone asking about RDF who would rather have spoken Italian. It should be easier to find the members of a multi-language community who can help in such cases. We have SemWeb vocabulary that lets people declare what language they speak, read, or write; and there are ways of expressing interests in topics, or membership of a community. What we’re missing is stable, reliable and queryable aggregates of data expressed in those terms. I think we can change that in 2008…

HTML Imagemap authoring tool for MacOSX?


I’ve searched around in vain for one. I want to annotate my FOAF spec diagram with mouseover text and links into the documentation. Most of the tools I find are a decade or more old, or pay-to-play. I remember the Gimp image editor can do imagemaps, but it crashes on startup on my MacBook. I did find a nice Javascript editor the other day, but it had no “undo” function, which made complex work impossible. Maybe I’ll try Amaya. Suggestions very much welcomed! Are imagemaps *that* uncool these days? They’re just SVG and image metadata in another notation… (and imho one of the the more interesting scenarios for HTML-based microformattery). That last link has missing photos and out of date SVG, but might still be of interest. A surviving screenshot included here.

Nearby: (from SWAD-Europe hacking days) an imagemap2svg.xslt thanks to Max Froumentin.

Update: I installed Amaya 9.99 for Intel MacOSX. Sadly it couldn’t even display my JPEG properly, although it did do a better job showing OmniGraffle’s SVG output than Firefox.


Symbol languages and the Semantic Web

OK so I just stumbled upon this…

Bomb in Baghdad

…via Jonathan Chetwynd’s ever-inventive and SVG-happy Peepo.com.

The “Car Bomb in Baghdad” story is from a site created by Widgit Software, and explains itself as follows:

Symbolworld has been set up to provide a web site with material suitable for symbol readers of all ages. The internet is an important medium which many people really like to use. Sadly there is very little material that is appropriate or accessible by people with learning difficulties.

The copyright statement for Symbolworld says “symbols on this page are copyright of the commercial owners. They may not be copied or used in any other format without the written permission from the owner.“, which initially struck me as a potential challenge to any use of this particular symbol-set for online communication. But I don’t really know this scene, and I guess this copyright could be just the same as the way eg. fonts are copyrighted.

So I don’t know much about this particular project/company/product, but it reminded me of some similar work I heard about a few years ago. Back when the EU, in their occasionally infinite wisdom, funded SWAD-Europe to run around talking to interesting people about standards and the Semantic Web (and giving them t-shirts) , Chaals organised a great developer workshop in Madrid on Image annotation. We had the usual fun with SVG and RDF (which btw I’m still betting on) and I got to learn a bit about CCF. Seeing the Baghdad example this morning reminded me of all this. I’ve been clicking around and trying to gather my sprawling thoughts.

CCF, the Concept Coding Framework, is kinda image annotation in reverse. Instead of focussing on the description of the content, concepts etc associated with images, the emphasis is on the use of images to illustrate some enumerated set of concepts. From Chaals’ workshop report re outcomes, I’m reminded that we discussed…

How to use Creative Commons and similar vocabularies to determine whether a particular symbol can be freely used (typically in commercial systems the symbols themselves are proprietary, which can be a major barrier to communication between people who have different systems).

CCF was using some variant of SKOS (another SWAD-Europe activity). This found its way into SKOS itself, where we now have prefSymbol and altSymbol relationships that associate a skos:Concept with a dcmitype:Image. Borrowing an illustration from the SKOS Core guide here:

skos symbol diagram

The guide also notes a distinction between symbolic labelling and “depiction” in the FOAF sense; some symbols are purely symbolic, and have no representational content.

So, catching up with this area of work a little, I find the Bliss Symbolics, the WWAAC project final report, and various other accessibility-related efforts. But I’ve not really figured out where I’d start if I wanted to build something simple using a freely available symbol-set, nor what the main options/projects currently are. But there’s plenty of reading, including pages from a recent Bliss “think tank” meeting.

The latest I can find on CCF is that it has moved sites and that there are some Web interfaces available, but “sorry – there are no downloads for this project yet.”. Ah, apparently the SYMBERED project is continuing development of CCF (aside: Bliss with swedish translation; doubly incomprehensible to me!). There is a nice example on their site showing the multilingual aspect to this work, as well as contrasting Bliss with a more representationally-oriented symbol set; see their site for details.

Here’s a simple example just in English:

I want coffee and milk and cookies.

In case anyone thinks this whole exploration is a bit niche and obscure, take a look at how people use MSN and other IM systems sometime. And of course the Jabber/XMPP guys have been exploring specs for standard emoticons. Chaals also points out some connections to VoiceXML where “there are a handful of options available in an interaction designed to be through voice, and developers will define assorted ways of recognising from a user’s speech which of the relevant concepts is being matched.”

We’re also only a few clicks away from the survey conducted by the dreamily named W3C Emotion incubator group into use cases and technologies for the description of emotion using markup languages.

If an RDF-ized Wordnet is also thrown into the mix, assigning URIs to Synsets, WordSenses, Words, I think we might actually be getting somewhere. The version of Wordnet in RDF published at W3C doesn’t currently use SKOS, although this was discussed, and of course there are other representations that make more use of RDF class hierarchies for nouns (at the expense of linguistic lossyness and losing non-noun content). Princeton’s original English-language Wordnet has spawned many related projects and translations, but as far as I know, there is little integration amongst them, and not all of the data is public or freely re-usable.

I once had a silly dream of taking a photo to go with each and every noun term in Wordnet. A kind of SemWeb I-Spy, a cousin to Immuexa’s FOAF bingo. Or better, of doing that with friends. The rise of Flickr and tagging means that we’d probably do this now by aggregate using Flickr and similar sites. But it seems conceivable to me that such an “illustrated wordnet” could be made, using either photo-oriented or symbol-oriented illustrations.

OK what might that buy us? Let’s try these two samples.

Not perfect, … but imho Wordnet gives a nice set of common and identifiable concepts that can be used as a hub for all kinds of different projects. And all we’d need is a huge pile of shared data shaped like this:

wordnet:word-cookie skos.prefLabel wikicommons:Explosion.svg .

OK it’s not going to bring about world peace and the return of esperanto, and of course there’s much more to language and communication than nouns and verbs (the easiest part of wordnet to turn into visual symbols), but it does strike me as a fun little (big) project…. Wordnet is too huge to be useful in every context where we’d want a modest-sized symbol set (eg. IM emoticons), … but it is nicely searchable, and would provide a framework for such subsets to evolve and be interconnected.

GIS and Spatial Extensions with MySQL

GIS and Spatial Extensions with MySQL.

MySQL 4.1 introduces spatial functionality in MySQL. This article describes some of the uses of spatial extensions in a relational database, how it can be implemented in a relational database, what features are present in MySQL and some simple examples.

I’m hoping to understand the commonalities between this and PostGIS. PostGIS follows the OpenGIS “Simple Features Specification for SQL“. As do the MySQL extensions, apparently. The MySQL pages summarise the extensions as follows:

Data types. There needs to be data types to store the GIS information. This is best illustrated with an example, a POINT in a 2-dimensional system.

Operations. There must be additional operators to support the management of multi-dimensional objects, again, this is best illustrated with an example, a function that computes the AREA of a polygon of any shape.

The ability to input and output GIS data. To make systems interoperable, OGC has specified how contents of GIS objects are represented in binary and text format.

Indexing of spatial data. To use the different operators, some means of indexing of GIS data is needed, or in technical terms, spatial indexing.

I’m currently working on some ideas to prototype a new project (to fill the gap that the completion of SWAD-Europe leaves in my schedule). I’ll be revisiting my Gargonza plan to add a basic SemWeb RDF crawler to personal weblog installations, initially prototyping with Redland addons to WordPress. Ultimately, pure PHP would be better, unless Redland finds its way into the default PHP installation. Since WordPress requires MySQL anyway, it seems worth taking a look at these geo-related extensions. A more thorough investigation would take a look at reflecting GIS SQL concepts into RDF, perhaps exposing them in a SPARQL query environment. But that’s a bit ambitious for now.

What I hope to do for starters is use a blog as a personal SW crawler, scooping up RSS, FOAF, calendar, and photo descriptions from nearby Web sites. It isn’t clear yet exactly how photo metadata should most usefully be structured, but it is clear that we’ll find a way to harvest it into an RDF store. And if that metadata has mappable content, whether basic lat/long tags, richer GML, or something in between, we’ll harvest that too. My working hypothesis is that we’ll need something like MySQL spatial extensions or PostGIS to really make the most of that data, for eg. to expose location-specific, app-centric RSS, KML, etc. feeds such as those available from the flickr-derrived geobloggers.com and brainoff flickr.proxy sites. See mapufacture.com for one possible client app; Google Earth as KML browser is another.

That’s the plan anyway. So the reading list grows. Fortunately, OGC’s GIS SQL spec at least has some nice diagrams…

GIS datatype hierarchy