Freebase - Now Open For Searching

Phil Butler


Freebase is an accessible database that is editable like Wikia and Wikipedia. The startup has been in private alpha testing but has just opened its beta doors to the public. The service is aimed at organizing the world's data. The database has been seeded with over 2 million topics from Wikipedia and other sources, and this data is currently readable by everyone, with "write" capabilities reserved for registered users. In short, Freebase aims to deliver on Google's promise to organize the world's data.

The heart of Freebase is the searchable database, which filters data down via keyword-narrowed queries: the user narrows subjects until the desired result is obtained. Freebase also allows users to edit incorrect data they find along the way and to submit new data. The similarity to Wikia or Wikipedia is readily evident in how users contribute to the database, but not in the look or feel. Freebase is tackling the vast data of the Web in an interesting, if not unique, way by seeking user collaboration within a new interface.
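The narrowing idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the topic records, the `TOPICS` list and the `narrow` function are hypothetical and have nothing to do with Freebase's actual data model or API.

```python
# A minimal sketch of keyword-narrowed filtering.
# TOPICS and narrow() are hypothetical illustrations, not Freebase's real API or data.

TOPICS = [
    {"name": "Santorini", "keywords": {"island", "thera", "greece"}},
    {"name": "Crete", "keywords": {"island", "greece"}},
    {"name": "Example hotel page", "keywords": {"thera", "hotel"}},
]


def narrow(topics, keywords):
    """Apply one keyword at a time, keeping only topics tagged with it."""
    remaining = list(topics)
    for keyword in keywords:
        keyword = keyword.lower()
        remaining = [t for t in remaining if keyword in t["keywords"]]
    return remaining


results = narrow(TOPICS, ["island", "thera"])
names = [t["name"] for t in results]
print(names)
```

Each added keyword can only shrink the candidate set, which is why a couple of terms is often enough to land on a single "thing" when the underlying topics have been tagged by people.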

Freebase has added a great deal of data since I first visited, with tens or even hundreds of thousands of entries on topics ranging from sports to film. Freebase is also available in read-only form via its API for offsite applications.

Testing Relevance

"Semantic" is another term for "meaning", and several innovative ventures are attacking improved relevance by combining advanced technology with natural language processing, human filtering, or both. I performed a simple test of my own to see just how effective Freebase's variant could be. I was sure that my favorite search engine, hakia, would be able to eclipse anything that Freebase could come up with at this stage of its development. I was surprised to find that, in a keyword search for "Island and Thera", the Freebase result was strikingly relevant, while the hakia results provided more depth.

freebase result

The hakia results captured these keywords contextually within sentences, while the Freebase top results pinpointed exactly the island of Santorini. This demonstrates, to a degree, the power of human-filtered search. The hakia results offered better choices for a broad search intended for narrowing, while the Freebase results were precisely relevant (however limited in scope) to the "thing" itself. This is by no means a clinical analysis of the two services, but it is interesting to see the two philosophies in action.

Man vs. Machine

Obviously Freebase is in a rather embryonic stage of development, as is hakia to some extent. Increased meaning in search has gone from a novelty idea to an accepted eventuality for most people. Jimmy Wales has approached this from the human search angle with Search Wikia, while Riza Berkan and the great scientists of hakia are attacking the problem from an advanced technological standpoint, as are Powerset and others.

It is rather obvious that human search has two fundamental obstacles to overcome. First, the amount of data to be gleaned is massive, so the results for overall queries will remain limited by submissions; this is a function of time. Second, the sources of the information will be subjective unless further refinements are made to results; this is a function of quality. Human search could be the most relevant of all given time and scrutiny, but building a database of the necessary scope is daunting.

Hakia, Powerset and the others are confronted with one big problem in my view: the random occurrence (or deliberate insertion, in the case of SEO) of sentences and even whole pages of seemingly relevant content within a huge number of documents. In the example below, it is evident that a hotel site has crept into the mix. It is obvious that hakia is increasingly "trimming" down these kinds of results, but semantic matches within somewhat irrelevant documents remain an issue. I think this can be overcome (I have an idea, Melek), and we should also not lose sight of the fact that the human that is "us" individually needs to be the final filtering agent.

hakia search

Conclusion - Alberts

Freebase has great potential for helping users find relevance (on some level) easily, and also for collaboration within its community. The power of user-generated content has been revealed via the conduits of Wikipedia, MySpace, Facebook and a host of other communities, but there are limitations. Freebase is dependent on Wikipedians and their cousins resident on the Web. In the end, Search Wikia will obviously aggregate the flock into a much larger human knowledge base over time.

I have always asserted that the perfect search engine is a collaboration between these two philosophies. Hakia, Powerset and Search Wikia combined could search out and narrow until what is presented is geometrically more "exact" than any singular method. I know my good friends at hakia are capable of producing AI, but that one "personality" will still be just one Einstein amidst a sea of Schweitzers. A sea of Alberts faces a daunting task when confronted with an unfiltered universe.