The multiplication of social websites and semantic web technologies
Andrew recently invited me to Dopplr, where I can see his travels and, of course, add my own planned trips. Now why would I want to add yet another Web 2.0 site to LinkedIn, Flickr, Dopplr, Frappr, Twitter, and all the others (including this blog) where I can tell things about myself? Also, shortly after I accepted Andrew's invitation, Ugo added me to his Dopplr contacts. Now why would I want to build, one more time, a network on that site with the same people I'm already linked to on other sites?
Web 2.0 lets people publish things about themselves and build their network. But the explosion in the number and variety of social networking websites makes it a real pain for connected people to maintain their personal information everywhere.
So, rather than users entering and maintaining their data on every site they want to appear on, the opposite should happen. I should describe myself and my network only once, and the websites I find useful and want to register with should pull my information, or at least the parts of it I want to show them.
Now how can we do this? What common format should I use so that all these websites are able to understand my information? How can I describe my connections to other people? The answer is simple: the semantic web. There are RDF vocabularies for most of the things you usually say about yourself on social networking websites, such as FOAF, RDFCal or SIOC.
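To make this concrete, here is a minimal sketch of what such a self-description could look like, built with nothing but Python string formatting. The `foaf:Person`, `foaf:name` and `foaf:knows` terms come from the FOAF vocabulary; the `foaf_turtle` helper and the example.org URIs are made up for illustration, and a real tool would more likely rely on an RDF library.

```python
FOAF = "http://xmlns.com/foaf/0.1/"


def escape(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def foaf_turtle(me_uri, name, knows):
    """Return a small Turtle document describing one person and their contacts.

    me_uri and the entries of knows are URIs identifying people
    (hypothetical values in this sketch).
    """
    props = [f'foaf:name "{escape(name)}"']
    props += [f"foaf:knows <{uri}>" for uri in knows]
    body = " ;\n    ".join(props)
    return (
        f"@prefix foaf: <{FOAF}> .\n\n"
        f"<{me_uri}> a foaf:Person ;\n"
        f"    {body} .\n"
    )


if __name__ == "__main__":
    # Hypothetical URIs for the author and two contacts.
    print(foaf_turtle(
        "http://example.org/me#i",
        "Me",
        ["http://example.org/andrew#i", "http://example.org/ugo#i"],
    ))
```

The point is that the whole description lives in one document at one URL, which any number of sites could fetch instead of asking me to retype it.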
Now this radically changes the social web as we know it today. We first need semantic publishing tools (e.g. RDFa-enabled blog engines) so that creating and publishing our information is easy and natural. Registering with a social website will then simply mean giving the URL of our semantic information (or a subset thereof, which the semantic blog tool should help us define), and of course our OpenID URL (or not even that, since it can be in our metadata). We move from a web of websites hosting descriptions of people to a web of people publishing their own descriptions.
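As a rough illustration of the "subset" idea, the sketch below treats a profile as a list of (subject, predicate, object) triples and filters it per site. The `SITE_POLICY` table, the site names and the `subset_for_site` function are assumptions invented for this example; only the FOAF property names are real.

```python
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical policy: which predicates each site is allowed to see.
SITE_POLICY = {
    "travel-site": {FOAF + "name", FOAF + "knows"},
    "job-site": {FOAF + "name", FOAF + "workplaceHomepage"},
}


def subset_for_site(triples, site):
    """Keep only the triples whose predicate the given site may see.

    Sites without a policy entry get nothing.
    """
    allowed = SITE_POLICY.get(site, set())
    return [t for t in triples if t[1] in allowed]


if __name__ == "__main__":
    me = "http://example.org/me#i"  # hypothetical URI
    profile = [
        (me, FOAF + "name", "Me"),
        (me, FOAF + "mbox", "mailto:me@example.org"),
        (me, FOAF + "knows", "http://example.org/andrew#i"),
    ]
    for triple in subset_for_site(profile, "travel-site"):
        print(triple)
```

Defaulting to an empty set for unknown sites is a deliberate choice here: nothing is shared unless I have said so.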
But this also comes with some problems: if you don't own your domain name, the hosting provider will "own" your data, and moving to a different host will be a pain since your URL changes. Sure, we can use abstract URIs and a naming service to resolve them to the possibly changing URL, but this introduces a level of complexity for average users, and who will own the naming service?
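The indirection itself is simple, as this toy sketch suggests. The `NamingService` class and all URIs are hypothetical, and an in-memory dictionary obviously sidesteps the hard questions of who runs the service and who is allowed to update an entry.

```python
class NamingService:
    """Toy in-memory registry mapping a stable URI to the current document URL."""

    def __init__(self):
        self._locations = {}

    def register(self, stable_uri, current_url):
        """Record (or update) where the description for stable_uri lives now."""
        self._locations[stable_uri] = current_url

    def resolve(self, stable_uri):
        """Return the current URL; raises KeyError for an unknown URI."""
        return self._locations[stable_uri]


if __name__ == "__main__":
    names = NamingService()
    names.register("urn:example:me", "http://old-host.example/me.rdf")
    # After moving to another host, only the registry entry changes;
    # sites that stored the stable URI keep working.
    names.register("urn:example:me", "http://new-host.example/me.rdf")
    print(names.resolve("urn:example:me"))
```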
This will also be a great opportunity for marketers and spammers, since they can learn a lot about you by harvesting your data and that of the people in your network, something the current balkanization of data across many websites makes difficult. But our semantic blog engine can check the OpenID of robots that want to harvest our data, with the dedicated HTTP authentication scheme.
So, in my opinion, the time is approaching when using semantic technologies to represent our personal information only once will be worth it, because of the multiplication of social websites that actually bring some value but require us to enter the same information again and again. But this is a chicken-and-egg problem: websites won't harvest our data if we don't publish it, and we won't publish it if no website harvests it. Piggy Bank and GRDDL make it possible to extract RDF from existing websites; we now need tools that work the other way around, posting our RDF data to the social websites until they harvest it themselves.
And once people are used to expressing semantically rich information about themselves, they will hopefully understand the value of semantically rich documents.
Comments
I totally agree with you Sylvain.
We are at the point where we are starting to understand that the semantic web is needed to solve information overload, the poor organization of content, etc.
Microformats are one way for social networks to interact. LinkedIn is using hResume, for example.
We have to take that into account for our project!!
Posted by: cedric | June 15, 2007 12:18 PM
I completely agree with you, and thank you for this vision you offer us, which seems to me a very interesting way to explain simply what is at stake with semantic web technologies.
I'm quite interested in this question, and regarding the reunification of social networks via FOAF, I ran a small experiment: http://lespetitescases.net/foaf-et-web-services (I notice that my experiment no longer works; I'll fix it tonight).
There is a related question, the centralization of tags across the different services, which could be approached via SKOS: http://lespetitescases.net/skos-l-avenir-de-la-folksonomie-y
Finally, on RDFa: http://lespetitescases.net/amusons-nous-avec-rdfa
I think this might interest you.
PS: sorry for the French; I didn't have the courage to write this in English, and I hope you won't hold it against me.
Posted by: Got | June 15, 2007 04:23 PM
being late on reading this, I'm seeing this only minutes after reading http://intertwingly.net/blog/2007/06/18/Web3S
looks like more people think alike...
Posted by: -marc= | June 19, 2007 04:52 PM
Interesting post :) You might be interested in the foaf:openid thread on foaf-dev ie. http://lists.foaf-project.org/pipermail/foaf-dev/2007-June/thread.html
I've lately been looking around the options for foaf/openid integration. Having a foaf:openid property is the first little step. Also been playing with SPARQL for querying PGP-signed RDF data, where we have named graphs for the result of PGP-checking the RDF/XML doc. FWIW the proposed rdfs:domain of foaf:openid is foaf:Agent, so the possibility of non-human Agents authenticating is left open...
Posted by: Dan Brickley | June 20, 2007 02:55 PM