NovoGeek's Blog (Archive)

Technical insights of a web geek

Semantics: Why should I care?

'Data' is the new currency of the web, and companies are fighting for YOUR data. Facebook blocked Twitter from looking up friends, Google fought with Facebook over data protectionism, and such wars are widespread across the web. There are probably business constraints behind them that are beyond the scope of this discussion, but how does this matter to us, the end users or, more specifically, the developers?


We log in to many social media sites and keep registering for more services every day. We spend a lot of time on this, struggle to maintain so many profiles, and send friend requests to people who are already our friends on other networks, only to end up locking our data in stovepipes. What is the value of data locked in a silo? It cannot be reused anywhere else and is useful in only one place, which means it is not very useful at all. It is YOUR data, and it needs to be available quickly, easily and in a way that suits your needs. We have already moved on from the Web 1.0 era, where applications published data stored in 'private' databases. We are in the Web 2.0 era, where open APIs are available and mashups can be built to serve our purposes and share information. Yet data is still locked in private databases, and building mashups has largely turned out to be a drain on developers' time. Data on a website can be searched with text-matching techniques but cannot be queried. Moreover, we cannot move our content from one place to another, or reuse our Twitter friends list on Flickr.


In an age where data is exploding and the number of machines (computers) rivals the number of people, the amount of machine-readable data (content that a computer can understand when it drops in at your site) is meager. For example, data on a web page does not express the relations between different objects in a way machines can understand. Developing content is expensive, and developing another app just to support a new service is not 're-use'. Social networking sites are like independent islands, each creating its own community of users and data. There is a need to connect these islands, allowing users to move from one place to another along with their data. This is where semantic web standards come in.


Semantic web standards help resolve the issues listed above by creating rich, standard, machine-readable content. They work on 'network effects', which means that the more users adopt semantic web standards, the more benefit everyone reaps. For example, cell phones are most useful when everyone in a community owns one, since communication becomes easier. Did you know that you are already part of the semantic web? If you are using Facebook or Twitter, you are living in a semantic environment without knowing it. The Facebook "Like" button is driven by these standards. Many protocols are in place, and good work is being done by enterprises as well as open source communities in advocating these concepts. RDFa by W3C, Microformats, Abmeta, Yahoo SearchMonkey, Google rich snippets, Facebook Open Graph Protocol, Twitter Annotations, etc., are all efforts to put more machine-readable markup on websites.
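
To make "machine-readable markup" a little more concrete, here is a minimal Python sketch of how a program might pull Open Graph properties out of a page. The sample page, its URL and the names OpenGraphParser and extract_open_graph are made up for illustration; only the og:* property names come from the Open Graph Protocol. It shows the general idea and is not how Facebook itself processes pages.

```python
from html.parser import HTMLParser

# Hypothetical page markup. Only the og:* property names are taken from
# the Open Graph Protocol; the title and URL are invented for this sketch.
SAMPLE_PAGE = """
<html>
  <head>
    <title>Semantics: Why should I care?</title>
    <meta property="og:title" content="Semantics: Why should I care?" />
    <meta property="og:type" content="article" />
    <meta property="og:url" content="http://example.com/semantics" />
  </head>
  <body><p>Plain text that a crawler can only match, not query.</p></body>
</html>
"""


class OpenGraphParser(HTMLParser):
    """Collects <meta property="og:..." content="..."> pairs into a dict."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        # A valueless attribute is reported as None, so fall back to "".
        prop = attr_map.get("property") or ""
        if prop.startswith("og:") and "content" in attr_map:
            self.properties[prop] = attr_map["content"]


def extract_open_graph(html):
    """Return every og:* property found in the given HTML string."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.properties


if __name__ == "__main__":
    for name, value in extract_open_graph(SAMPLE_PAGE).items():
        print(name, "=", value)
```

The point is that the program does not have to guess what the page is about from its text: the title, type and URL are stated explicitly in a form any consumer can read.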


Semantic Web, Open Data, Linked Data, Web 3.0, HTML5: Are these all the same?

This is a huge source of confusion, with so many terms lying around and people using them interchangeably. The UK government opened up its data to the public, and data published this way is what 'Open Data' refers to. 'Linked Data' is a way of publishing structured data on the web, based on certain principles outlined by Tim Berners-Lee. It is essentially aimed at solving the design issues of the semantic web, helping to interlink different islands of web pages. So Open Data can still sit on an isolated island, without being linked to other communities. The 'Semantic Web' is a web of structured, machine-readable data, and it makes sense when data is both open and linked. It is a huge tree with linked data, vocabularies (FOAF, SIOC), ontologies, rules, reasoning, etc., as its branches. 'Web 3.0' is a visionary term for a web where machines can understand and add content, and where artificial intelligence shows its power in search. It is used synonymously with 'semantic web', but the transition to such a generation is nowhere in the near future, since an enormous amount of data still has to be serialized into machine-readable formats. HTML5 is one of the many enablers of the semantic web vision, and its microdata specification is designed to simplify the existing annotation technologies.
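
As a rough illustration of why linking matters, the Python sketch below models data from two sites as (subject, predicate, object) triples using the FOAF vocabulary. The people, their example.org URIs and the helper names match and friend_names are hypothetical; only the FOAF namespace and its name and knows properties are real. A production system would use an RDF library and a query language such as SPARQL rather than plain lists.

```python
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical triples published by two separate sites. Both sites use the
# same URI for Alice, which is what allows their data to be linked.
SITE_A = [
    ("http://example.org/people/alice", FOAF + "name", "Alice"),
    ("http://example.org/people/alice", FOAF + "knows", "http://example.org/people/bob"),
]
SITE_B = [
    ("http://example.org/people/bob", FOAF + "name", "Bob"),
    ("http://example.org/people/alice", FOAF + "knows", "http://example.org/people/carol"),
    ("http://example.org/people/carol", FOAF + "name", "Carol"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


def friend_names(triples, person):
    """Names of everyone `person` foaf:knows, resolved within `triples`."""
    names = []
    for _, _, friend in match(triples, subject=person, predicate=FOAF + "knows"):
        for _, _, name in match(triples, subject=friend, predicate=FOAF + "name"):
            names.append(name)
    return sorted(names)


if __name__ == "__main__":
    alice = "http://example.org/people/alice"
    print(friend_names(SITE_A, alice))            # one island on its own
    print(friend_names(SITE_A + SITE_B, alice))   # both islands merged
```

Queried alone, the first site knows that Alice has a friend but cannot say who that friend is. Once the two lists are merged, the shared URIs connect the islands and the same query can resolve names from both.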


The goal of this article is to show why developers should care about semantics when building the next generation of web applications. It intentionally stays away from deep technical details; subsequent articles will go a bit deeper. Hope this held your interest for some time :)

Comments (3)

  • Rohit Kumar

    3/8/2011 4:25:23 PM |

    seems instead of world war.. its going to be data war :)

  • NovoGeek

    3/8/2011 4:26:20 PM | already started :)

  • Raja Sekhar

    3/8/2011 4:27:32 PM |

    Nice and Interesting Article!!
