Branched Wikis

The more I work with wikis, the more I find that one important thing is missing. Much work goes into making wikis look as if they were not wikis at all, while wiki development itself is not progressing. In my view, the most important development of recent years is the invention of the common wiki markup language Creole.

What I am missing is the possibility to maintain branches of content, just like in version control software, where you check out some content and later pull in changes. For example, you check out the content of Wikipedia.de. You then work on a local copy and merge in the content from the source. New articles would simply be created. Articles that were deleted at the source and that you have not touched could simply be deleted locally as well (you may want an option to prevent this, given that a lot of good, important articles are often deleted).

Maybe you also choose to import only some articles. Say you have a website about composing music and you want to show some articles in that context rather than link to Wikipedia. But maybe you would like to add some content to those articles or remove some sections. Now the source article gets updated. Today you have to look at the changes on Wikipedia and edit your version by hand. That is plain stupid, given that version control software like Mercurial already allows you to maintain branches of content. So we have all the software we need to merge the content intelligently and automatically, or to notify you where the software or wiki needs your interaction and attention.

This feature could maybe even be extended to merge different articles on the same topic. Maybe we need better algorithms to recognize similarities. The software could display the two versions side by side and/or show you a mixed version of the two articles. And maybe it marks sections it thinks say the same thing. For example, two articles about the same person may both mention who the parents of this person are. That similarity can be used to ease the mixing of the content.

Maybe one can develop new approaches by adding the following principles: object orientation and enriched content. Currently wikis contain a lot of free-flowing text. It is divided into sections, sometimes without the software being able to identify what a section contains.
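The branch-and-merge idea described above can be sketched as a three-way merge on article sections. This is only a minimal illustration, not how Mercurial or any wiki engine actually does it: the function name `merge_sections`, the representation of an article as a dictionary of headings, and the `keep_deleted` switch are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple

Sections = Dict[str, str]  # section heading -> section text


def merge_sections(
    base: Sections,
    local: Sections,
    upstream: Sections,
    keep_deleted: bool = False,
) -> Tuple[Sections, List[str]]:
    """Three-way merge of one article, section by section.

    base     -- the version that was originally checked out
    local    -- the locally edited copy
    upstream -- the current version at the source wiki
    Returns the merged sections and the headings that need human attention.
    """
    merged: Sections = {}
    conflicts: List[str] = []
    # Keep the local ordering, then append sections that exist only upstream.
    headings = list(local) + [h for h in upstream if h not in local]
    for heading in headings:
        original = base.get(heading)
        mine = local.get(heading)
        theirs = upstream.get(heading)
        result: Optional[str]
        if mine == theirs:
            result = mine  # both sides agree
        elif mine == original:
            # Only the source changed: take its version, unless it deleted
            # the section and the caller asked to keep deleted content.
            result = mine if (theirs is None and keep_deleted) else theirs
        elif theirs == original:
            result = mine  # only the local copy changed: keep it
        else:
            conflicts.append(heading)  # both sides changed differently
            result = mine  # keep the local text until a human decides
        if result is not None:
            merged[heading] = result
    return merged, conflicts
```

Sections listed in the returned conflict list are the places where the wiki would ask for your attention; everything else is merged without interaction. The same rule could be applied one level up, with whole articles in place of sections.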

Some people think it is not possible to mark up all content. I am also not sure it really makes sense to display meta information in the page itself. Rather, the meta information should be guessed and added automatically. So back to the article about a person: these articles all contain similar sections. One could also treat some links as objects. If one sees text and content as object oriented, it would be stupid to try to mark up the content by hand to indicate what it is. If you look at the example for Semantic MediaWiki on Wikipedia:

… the population is [[Has population:=3,993,933]] …

one could also ask: why is the wording “the population is…” not enough of an indication that the number is the population? Sure, I know that computers do not recognize all content today. But if one had a recognition engine that concentrates on similarities and is trained to identify certain content, I do not think this would be a big problem. For example, all city articles in the English Wikipedia are classified. So we can identify which articles are about cities with no trouble, and there you often find a table where the population is given. So I think it would take almost no effort to find out the population.
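Reading the population out of such a table might look roughly like the sketch below. It assumes the table is stored in the wikitext as `| field = value` rows and that the field is called `population_total` or `population`; both are assumptions for illustration, since real templates differ between wikis. It also only handles plain integers with comma separators.

```python
import re
from typing import Optional

# Assumed field names; real infobox templates vary between wikis and over time.
POPULATION_FIELDS = ("population_total", "population")


def population_from_infobox(wikitext: str) -> Optional[int]:
    """Return the first population value found in a '| field = value' row."""
    for field in POPULATION_FIELDS:
        # Match a row such as "| population_total = 3,993,933".
        pattern = rf"^[ \t]*\|[ \t]*{re.escape(field)}[ \t]*=[ \t]*(\d[\d,]*)"
        match = re.search(pattern, wikitext, flags=re.MULTILINE)
        if match:
            # Drop the thousands separators before converting.
            return int(match.group(1).replace(",", ""))
    return None


sample = """{{Infobox settlement
| name = Example City
| population_total = 3,993,933
| area_total_km2 = 1302
}}"""
print(population_from_infobox(sample))
```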

If this does not work, one could try to find the information in the running text. This could be done by detecting proximity between the word “population” and a number. If there are any doubts, a human can still open the article and mark up the text: you get a menu for the city article category with the task of marking a text passage as containing the population, and then save that information. The knowledge from that example can then be used to find the population data in a new article. Maybe it would also be nice if those city article classes could be extended easily. In fact, “population” by itself does not say very much; for instance, it does not tell you when the population was counted.

I could also imagine that Wikipedia articles could be written by robots. You would tell a robot to fill in the class information for a city article. It could then identify the information on the WWW just as it can in the wiki itself, and write an article from some given templates. Or one could implement a search inside a wiki where you formulate a question from elements that you combine with boolean operators. You then get search results showing which pages on the WWW contain the information you are looking for. You can tell the engine which of the results actually contained that information, and you may also be able to import content by marking text or clicking on an image, video or music file.
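The proximity detection between the word “population” and a number, mentioned above, could be sketched like this. The function name `population_by_proximity` and the 40-character window are assumptions chosen for the example; a real engine would probably need something smarter than character distance.

```python
import re
from typing import Optional, Tuple

NUMBER = re.compile(r"\d[\d,]*")
KEYWORD = re.compile(r"\bpopulation\b", re.IGNORECASE)


def population_by_proximity(text: str, max_distance: int = 40) -> Optional[int]:
    """Return the number closest to the word 'population', if one lies
    within max_distance characters of it; otherwise return None."""
    best: Optional[Tuple[int, int]] = None  # (distance, value)
    for keyword in KEYWORD.finditer(text):
        for number in NUMBER.finditer(text):
            # Gap in characters between the keyword and the number.
            if number.start() >= keyword.end():
                distance = number.start() - keyword.end()
            else:
                distance = keyword.start() - number.end()
            if distance <= max_distance and (best is None or distance < best[0]):
                best = (distance, int(number.group().replace(",", "")))
    return None if best is None else best[1]


text = ("Founded in 1781, the city has grown steadily; "
        "the population is 3,993,933 according to the latest count.")
print(population_by_proximity(text))
```

When no number is close enough, the function returns `None`, which is the point where a human would be asked to mark the passage by hand.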

I haven’t seen many of these ideas mentioned anywhere, nor implemented in any wiki I have seen. But if organizing information is the goal of wikis, we surely need these next steps. If anybody can point me to implementations of any of these ideas, I would be glad to hear about them!

Filed under Technology