Search and Social Networking

November 5, 2007

With Hakia releasing a new social networking feature on their search engine (see entry in Read/Write Web) and Google sponsoring OpenSocial, clear strides are being made to integrate the search and social networking spaces. To many, including us, this almost begs the exclamation “About time!”

Social networking, for the most part, has been built around imitating (and hopefully positively affecting) the physical relationships we already have in the real world (think LinkedIn, Friendster, Facebook). Only very recently has progress been made in linking people based on their interests (think StumbleUpon to a degree and third-party applications like uPlayMe, etc.), and with the recent announcement of Google’s OpenSocial we expect the development of a richer, more meaningful social networking experience to accelerate. There is still significant work to be done in connecting us to the people we ought to be connected to. This is why we at Youlicit are excited about the “Meet Others” feature on Hakia, which is being called a “peer-to-peer transactional platform”, and why we are building further upon this concept at Youlicit.

Richard MacManus poses an almost rhetorical question in his blog entry, asking if search and social networking go together. We believe that the purpose of a search tool is to help you find the information you need with the least effort possible (see Relevance/Effort metric). To this end, if there is someone who has, and is willing to share, the information or expertise you are looking for, then what better medium to connect you to that person than the one you already use to find your information? Granted, there is a spectrum of modes that different users fall into depending on their personality types (and time constraints), ranging from the solitary to the very social (as pointed out by Charles Knight in his blog). In the end, everyone can and wants to benefit from accessing the information (and people) they need as quickly as possible. This is why we are including “Related Users” for every query you perform on Youlicit (this feature is coming soon and can currently be seen on your Personalized Recommendations and User pages). We are using this as a base to build out a social networking aspect of our website recommendation service.

As you read this, we are working hard to better determine what users are interested in, as well as to allow users to share with others what they are recommending on a certain topic. The end goal is to become an enabler of collaboration between users to better facilitate the discovery and sharing of information. Building a social network based on your real-world relationships with people you already know can help improve and extract more value out of those relationships, but it isn’t the most effective means of introducing you to other people you ought to know. A higher-value social network connects you to people who share your interests and can help you not only discover the information you need more quickly but ultimately increase your productivity and introduce you to more “meaningful” resources in your area of interest (see Expert Systems entry).

This is obviously not an easy feat to accomplish (otherwise it would have already been done!) and there are many hurdles that need to be cleared. How do you learn a user’s interests while safeguarding and protecting his right to privacy? How do you maintain the credibility and quality of such a “transactional platform” (i.e. how do you prevent unwanted information such as spam from diluting the quality of the service)? How do you enable varying levels of collaboration (from direct synchronous communication to asynchronous communication) with minimal distraction and effort from users? How, if possible, do you best monetize such a transactional platform in a way that will incentivize further collaboration?

The creation of such a platform inherently requires the cooperation of users (and of course technology) to make it all happen but we are confident that this is possible and have no doubt that a need for such a platform exists and must be met. As we develop and roll out this platform, we would love to hear your thoughts on this matter and get your feedback on what you would like to see on such a platform.

Facebook flyers experiment – Pt. 2

October 12, 2007

(See Facebook Flyers Experiment Pt 1 for background info)

After running the Facebook Flyers campaigns for a week, the results are in. To recap, the five campaigns were:

  1. All high school students (ages 13-19) with $0.05 per click.
  2. All high school students (ages 13-19) with $0.03 per click.
  3. All people between the ages of 17-40 who live in the top technology cities as identified by Wired and some others (Seattle, San Francisco, LA, Austin, Orlando, New York, Boston, Philadelphia, Washington DC, Pittsburgh & Chicago).
  4. All college students attending some of the top technical schools (MIT, Stanford, Harvard, Princeton, UPenn, Columbia, Cornell, Carnegie Mellon, Illinois, U. Texas, U. Maryland, Georgia Tech, Cal Tech, Berkeley, UCLA & Penn State).
  5. All college students majoring in Engineering and otherwise technical majors (engineering disciplines, computer science, web design, etc.).

As expected, most impressions occurred in campaigns 1 and 2 (high school students), since these were the least specific and hence most likely the largest user groups targeted. It was also interesting to note the negligible difference in impressions between the $0.03 and $0.05 campaigns (roughly 6.4% more impressions at $0.05 per click), indicating that the demand for targeting these groups is less than the supply (i.e. the number of impressions being demanded is far less than the number of potential impressions). In an ideal world, where impressions scale in proportion to the bid, the $0.03 campaign would receive 40% fewer impressions than the $0.05 campaign (equivalently, the $0.05 campaign would receive about 67% more).
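The comparison above comes down to a little arithmetic. Here is a minimal sketch using the impression counts reported for campaigns 1 and 2 in the results table further down. The helper name `percent_more` is ours and purely illustrative, and the idea that impressions would scale in direct proportion to the bid is a simplifying assumption. Note that a $0.03 bid is 40% lower than a $0.05 bid, which is the same as saying the $0.05 bid is about 67% higher.

```python
def percent_more(a: float, b: float) -> float:
    """Return how much larger a is than b, as a percentage of b."""
    return (a - b) / b * 100


impressions_5c = 5904  # campaign 1, $0.05 per click (from the results table)
impressions_3c = 5551  # campaign 2, $0.03 per click (from the results table)

# Observed difference in impressions between the two bids.
observed_lift = percent_more(impressions_5c, impressions_3c)

# Difference between the bids themselves (the "ideal world" benchmark,
# assuming impressions scale in proportion to the bid).
bid_lift = percent_more(0.05, 0.03)

print(f"Observed: {observed_lift:.2f}% more impressions at $0.05")
print(f"Bids:     {bid_lift:.1f}% higher at $0.05")
```

The gap between the two printed figures is what suggests that supply of impressions for this demographic far exceeded advertiser demand.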

The second most impressions came from college students in technology majors, followed by adults in “the top technology cities” and then college students in the top technology schools. Again these are directly related to the specificity of the groups targeted.

As for the click through rates? They were low, atrociously low. The most clicks any campaign received was a whopping 1 click! The table below breaks down the CTRs for each campaign:

Campaign   Impressions   CTR (%)
1          5,904         0
2          5,551         0.018
3          894           0
4          852           0.117
5          1,001         0.0999

Clickthrough rates of 0% to 0.1% were about as low as predicted. No conclusive statements can be made about which campaigns were better targeted due to the incredibly low number of clicks.
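For reference, the CTR column can be reproduced from the impression counts. This is only a small sketch: the click counts are not reported directly in this post, so they are inferred here (an assumption on our part) from the published CTRs together with the statement that no campaign received more than one click.

```python
# Impressions come from the table above; clicks are inferred, not reported.
campaigns = {
    1: {"impressions": 5904, "clicks": 0},
    2: {"impressions": 5551, "clicks": 1},
    3: {"impressions": 894, "clicks": 0},
    4: {"impressions": 852, "clicks": 1},
    5: {"impressions": 1001, "clicks": 1},
}


def ctr_percent(clicks: int, impressions: int) -> float:
    """Click-through rate as a percentage of impressions."""
    if impressions == 0:
        return 0.0
    return clicks / impressions * 100


ctrs = {
    cid: ctr_percent(c["clicks"], c["impressions"])
    for cid, c in campaigns.items()
}

for cid, ctr in ctrs.items():
    print(f"Campaign {cid}: {ctr:.4f}%")
```

With a single click separating 0% from roughly 0.1%, one more or one fewer click would reorder the campaigns entirely, which is why no ranking is attempted.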

Being avid Facebook users ourselves, we rarely find ourselves clicking on such flyers. In analyzing our own actions, we hypothesized some possible reasons to explain these poor click through rates:

  1. Facebook users are rarely in search of “external” information (information not available within Facebook), as opposed to, say, when one is searching on Google.
  2. Users find Facebook content far too engaging to click on a link that will direct them away from the site.
  3. The placement of the Facebook flyers is not at an optimum place on the pages.
  4. The flyer we created failed to capture the interest of the audience.

To perform a more rigorous study, one would need to run these campaigns for much longer than a week. However, it is hard to imagine the CTRs being significantly higher if the campaigns were prolonged (see Mashable post). It would be interesting to examine how more captivating flyer designs (specifically targeted to each user group) would affect the click through rates, if at all. Needless to say, we are exploring other possible ways in which we can tap into the Facebook userbase as a means to generate traffic. The Youlicit team would love to hear from you if you have fared better than us, in terms of CTRs, with Facebook Flyers or are interested in sharing ideas to better target Facebook users.

On a side note, for all campaigns, the most impressions occurred on Friday, Saturday and Sunday (on average 86.8% of the total impressions), which is highly indicative of Facebook usage habits.

Facebook flyers experiment – Pt. 1

October 4, 2007

Facebook recently launched an updated version of their Flyers product called Flyers Pro. The main difference between this and what is now called Flyers Basic is that advertisers can pay per click instead of paying per impression.

In theory, advertising on Facebook has immense appeal. Given its incredible user base (now over 30 million active users) and extremely high visitor-return frequency (ranking second in visits/visitor), it is an advertiser’s haven. With Facebook’s improved targeting options – you can now filter on keywords (“Beatles”, “Lord of the Rings”, “iPhone”), countries, cities, age, workplace (“Google”, “Goldman Sachs”) and colleges – advertisers can now (theoretically) target their ads much better and reach the exact demographics they are looking for. Social networking sites, however, are infamous for their abysmal click-through rates (ranging from 0.01% to less than 1%). Undeterred by these statistics, we decided to try out this new platform.

To test out the new Flyers platform, we created several campaigns to see:

  1. what demographics get us the most impressions for a given cost per click (CPC), and
  2. what demographics have the highest click through rates (CTRs).

The first question gauges the current demand for advertising to certain demographics on Facebook (independent of product and creative), and the second shows us which Facebook demographics Youlicit appeals to the most (highly dependent on the ad, the placement of the ad, the product being advertised and the audience). The different demographics we targeted were as follows:

  • All high school students (ages 13-19)
  • All college students majoring in Engineering and otherwise technical majors (engineering disciplines, computer science, web design, etc)
  • All college students attending some of the top technical schools (MIT, Stanford, Harvard, Princeton, UPenn, Columbia, Cornell, Carnegie Mellon, Illinois, U. Texas, U. Maryland, Georgia Tech, Cal Tech, Berkeley, UCLA & Penn State). Nothing personal if your school isn’t listed, this was just a random sampling.
  • All people between the ages of 17-40 who live in the top technology cities as identified by Wired and some others (Seattle, San Francisco, LA, Austin, Orlando, New York, Boston, Philadelphia, Washington DC, Pittsburgh & Chicago)

We created this flyer to be used on all the campaigns:

Facebook Flyer

Given our current download rates per visit on Youlicit, we set our maximum cost per click to $0.03 in order to keep our user acquisition costs under $1/user (however, we did set up 2 campaigns for the high school demographic to compare the number of impressions for CPCs of $0.05 and $0.03). We launched these campaigns this week for a period of 7 days and will see how each one fares after that. Stay tuned for the results…
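The reasoning behind the $0.03 cap can be sketched as a one-line formula: acquisition cost per user is the cost per click divided by the fraction of visits that turn into a download. The sketch below assumes every paid click results in exactly one visit, and the 5% download rate is a made-up placeholder, since the actual rate is not stated here. The function names are ours.

```python
def cost_per_user(cpc: float, downloads_per_visit: float) -> float:
    """Acquisition cost per user, assuming one visit per paid click."""
    if downloads_per_visit <= 0:
        raise ValueError("downloads_per_visit must be positive")
    return cpc / downloads_per_visit


def min_download_rate(cpc: float, target_cost: float) -> float:
    """Smallest download rate that keeps cost per user at or below target."""
    return cpc / target_cost


# At $0.03 per click, staying under $1/user needs more than 3% of visits
# to end in a download.
break_even = min_download_rate(cpc=0.03, target_cost=1.00)

# Hypothetical download rate, for illustration only.
hypothetical_rate = 0.05
print(f"Break-even download rate: {break_even:.1%}")
print(f"Cost per user at $0.03 CPC: ${cost_per_user(0.03, hypothetical_rate):.2f}")
print(f"Cost per user at $0.05 CPC: ${cost_per_user(0.05, hypothetical_rate):.2f}")
```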

(see the results here)

What Comes After Google?

October 4, 2007

Question: What Comes After Google?

Yahoo just released a new Search Assistant feature this week (TechCrunch) (Read/WriteWeb). Ask has been trying a new interface lift for a while (TechCrunch) (Read/WriteWeb). While these are all very nice incremental improvements to search, are they enough to supplant Google? Do they tackle the fundamental problem of information retrieval in a paradigm shifting way? The answer is probably not.

Now imagine several years into the future. Will you find information in the same way in the future as you do today? Again, probably not.

This may sound like an obvious “duh”-ism, but its ramification certainly is not. As unfathomable as it may seem, Google, as we know it today, will probably not be how we find most of our information in a few years. Since Youlicit is an information retrieval company, we had to ask ourselves, “If not Google, then what?”

What is the logic that dictates the evolution of information retrieval paradigms?

Evolution of Information Retrieval Paradigms

To answer this question, we first plotted the different paradigms of information retrieval on a timeline. If we can figure out what the axis of this graph represents, then we should be able to predict which new solutions will succeed and which will not by simply identifying the solutions that maximize the metric along this axis.

If you’re a start-up, this understanding can guide you in building a successful innovative product. If you’re a venture capitalist or technology evaluator, this insight gives you a criterion for determining which technologies to invest in and which ones will fade away as fads of the day.

After plotting them on a timeline, we then explored the three major paradigms of information retrieval:

  1. Manual Organization
  2. Algorithmic Search
  3. User-Generated Recommendations

Manual Organization

Information retrieval, during its infancy, started off as a very rigid and structured process. Those who remember Gopher or Jerry and David’s Guide to the World Wide Web (later known as Yahoo) know how attempts were made to organize sites into a pre-determined hierarchy. However, as the number of web sites exploded exponentially, manually organizing sites into structured directories became practically impossible:

“A universal ontology is difficult and expensive to construct and maintain when there involve hundreds of millions of users with diverse background. When used to organize Web objects, ontology faces two hard problems: unlike physical objects, digital content is seldom semantically pure to fit in a specific category; and it is difficult to predict the paths, through which a user would explore to discover a digital object.”
Clay Shirky

Algorithmic Search

Too many sites to categorize? No problem. Algorithmic search to the rescue. Web search engines, such as Altavista and Google, arrived and allowed the web to grow in its chaotic unstructured way while still providing a level of organization in the form of keywords. Now instead of having to know the correct directory hierarchy, users only needed to know the keyword combination (and page number) for sites they were looking for.

Counter to intuition, search engines actually decrease the relevance of individual results as compared to those in a manually organized directory. A hand-picked set of results is always better than an algorithmically generated set of results. However, since search engines have much greater coverage of the Web, the average relevance of search results across a given set of topics is usually better than the average relevance of directory results on the same set of topics.
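A toy calculation can make this trade-off concrete. The numbers below are invented purely for illustration (they are not measurements): the directory is assumed to return highly relevant results, but only for the few topics it covers, while the search engine is assumed to return somewhat less relevant results for every topic.

```python
from statistics import mean

topics = ["python", "gardening", "tax law", "jazz", "knitting"]

# Hypothetical relevance scores in [0, 1]; a missing topic means no results.
directory = {"python": 0.9, "jazz": 0.9}
search_engine = {topic: 0.6 for topic in topics}


def average_relevance(source: dict, topics: list) -> float:
    """Mean relevance over all topics, counting uncovered topics as 0."""
    return mean(source.get(topic, 0.0) for topic in topics)


directory_avg = average_relevance(directory, topics)
search_avg = average_relevance(search_engine, topics)

print(f"Directory average relevance:     {directory_avg:.2f}")
print(f"Search engine average relevance: {search_avg:.2f}")
```

Under these assumed scores the directory wins on every topic it covers, yet loses on average because most topics return nothing at all.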

The other improvement made by search was the replacement of directory hierarchies by keywords as the primary recall mechanism. While still not perfect, guessing and checking keywords took a lot less effort than guessing and checking hierarchies. Search engines effectively decreased the recall effort.

User-Generated Recommendations

Recently, we’ve witnessed the niche adoption of tagging, voting, stumbling and other “user-generated relevance” as a means of finding information. Why? It’s because they improved something along either the average relevance dimension or the recall effort dimension.

Take and Digg for instance. In the scope of technology related content, the average relevance of results from these folksonomy sites is better than from search engines because these folksonomy sites have been able to increase coverage by effectively crowdsourcing an easier manual organization process.

StumbleUpon went the other route. Instead of improving average relevance, it decided to reduce the recall effort from guessing and checking keywords to a one-click no-thought “stumble.” In doing so, it did something ingenious: StumbleUpon removed the world’s most scarce resource from the information retrieval process… human thought.

Answer: The Solution that Maximizes the “Search Metric”

As well as they’ve done, both kinds of user-generated recommendation services are plateauing well before crossing the chasm into the mainstream market. Why? We think it’s because they’ve only focused on a singular improvement, either average relevance or recall effort, but not both.

In order for an information retrieval solution to penetrate into the mass market, the solution has to take a dual approach. It has to concurrently maximize average relevance and minimize recall effort. It’s simply a matter of optimizing the Average Relevance / Recall Effort ratio, or as we like to call it, the “Search Metric.” The solution that does this best will probably gain the most mindshare and supplant algorithmic search as the primary mode of information retrieval. And that is what comes after Google.
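As a rough sketch, the Search Metric can be written as a simple ratio. How relevance and effort would actually be measured is left open here, so the scales and every score below are hypothetical placeholders chosen only to show how the ratio rewards improving both dimensions at once.

```python
def search_metric(average_relevance: float, recall_effort: float) -> float:
    """Average Relevance / Recall Effort; higher is better."""
    if recall_effort <= 0:
        raise ValueError("recall_effort must be positive")
    return average_relevance / recall_effort


# (average relevance, recall effort): illustrative values, not measurements.
solutions = {
    "manual directory": (0.4, 5.0),
    "algorithmic search": (0.6, 3.0),
    "recommendations, better relevance only": (0.8, 3.0),
    "recommendations, lower effort only": (0.6, 1.0),
    "recommendations, both improved": (0.8, 1.0),
}

ranked = sorted(
    solutions,
    key=lambda name: search_metric(*solutions[name]),
    reverse=True,
)

for name in ranked:
    print(f"{search_metric(*solutions[name]):.2f}  {name}")
```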


Does this imply that algorithmic search will become extinct anytime soon? Absolutely not. It just means that more and more people will find larger percentages of their daily information through means other than search. Our bets are on online “word of mouth” or user-generated recommendations.

We’ve built Youlicit with this assumption at the core. Youlicit is a “word of mouth” or recommendation engine (as opposed to a search engine). We’re trying to maximize the Search Metric by combining user-generated relevance with one-click no-thought recall. We want to improve the information retrieval landscape by enabling the average user to harness the wisdom of the crowds with very little effort. If you’re as obsessed with this problem as we are, we’d love nothing more than to hear from you!

— The Youlicit Team

Expert Systems & Personalized Recommendations

September 27, 2007

Where is the web going? In an interesting post by Steven Spalding, How to Define Web 3.0, he discusses many trends of the web that are taking place right now and how he thinks they will evolve in the near future.

According to Steven, a user will start his/her journey on the web with one of three tasks – seeking information, seeking validation, or seeking entertainment. The word journey here implies a quest the user embarks on to find new information on the web (hence, this does not include activities such as email, chatting, etc., which are becoming increasingly integrated into the web). I want to point out the distinction he makes between seeking information and seeking validation. He describes seeking information as essentially how we search online today – using search engines to find specific information via keywords. His definition for seeking validation is as follows:

“If I am not necessarily looking for information, but instead am looking for “news” (I use news in as loose a fashion as I can) the way I would use search would be slightly different. Along with the specialized search engines, People Search would be available. You could type in what you were looking for, “conservative viewpoint on Darwin” for example and it would pull up results ordered by relevance (algorithms), tagging, and validation through user voting.”

Here “news” can be extended to seeking opinions on various topics, finding what people are reading or blogging on a given subject, or researching trends. This is primarily a mode of casual information discovery using the “specialized search engines” mentioned above, which aggregate relationships between objects and people. Given that the nature of this information is inherently “people-driven”, it must largely be derived from the “wisdom of the crowds”. There is therefore an undeniable need to aggregate all such information (i.e. tagged, voted, commented information) and its relationships (quantitatively) and make it accessible to everyone on an on-demand and contextually relevant basis.

Steven further makes an interesting prediction in regards to “Expert Systems” (which he defines as systems containing subject-specific information and the knowledge and analytical skills of one or more human experts):

“Ten years from now, Expert Systems won’t only be designed for general cases, but will be able to be easily generated to understand individual’s tastes. Already we see contextual advertising and contextual search, but what if you could extend this concept to a web browser or to your mobile phone. Imagine a world where your computer would generate a profile, a meme map about you based on your interactions with the web and refine your experience based on this map.”

This is precisely what we are working to accomplish at Youlicit. Our vision is to create a Youlicit community of users and model dynamic interest maps of a user based on his online interactions with the web. This includes looking at explicit recommendations made by the user as well as analyzing browsing patterns (if he so chooses) to create such a meme map. We can then use this information to create time-sliced profiles of the user and connect him to other users and relevant content on the web based on this interest map. Picture doing research on where to look for financial aid for college and being led instantly not only to very relevant content (e.g. graduate vs. undergraduate financial aid) that matches your current interest profile but also to users who have expressed strong interests in the college application process and various financial aid sites and have an abundance of relevant and seminal data that you can access. Or better yet, being able to reach out to these users (if they so choose) and communicate directly with them to leverage their expertise. This can all be possible with the wealth of information already present on the web and the evolution of the web from a relatively passive medium to a very dynamic, interactive, collaborative platform. What is needed is a tool to aggregate & analyze this information and provide it to the masses in an effortless, easy-to-use manner.
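To give a flavor of what connecting users through an interest map could look like, here is a deliberately simple sketch. It is not a description of how Youlicit works: representing an interest map as a dictionary of topic weights and comparing maps with cosine similarity are both assumptions made for illustration, and all names and weights are invented.

```python
import math


def cosine_similarity(a: dict, b: dict) -> float:
    """Cosine similarity between two topic -> weight interest maps."""
    shared = set(a) & set(b)
    dot = sum(a[topic] * b[topic] for topic in shared)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def related_users(target: dict, others: dict, top_n: int = 3) -> list:
    """Return up to top_n (user, similarity) pairs, most similar first."""
    scored = [
        (name, cosine_similarity(target, profile))
        for name, profile in others.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical interest maps for one time slice.
me = {"financial aid": 0.8, "college applications": 0.6}
others = {
    "alice": {"financial aid": 0.9, "college applications": 0.5},
    "bob": {"jazz": 1.0},
    "carol": {"college applications": 1.0, "knitting": 0.2},
}

for name, score in related_users(me, others):
    print(f"{name}: {score:.2f}")
```

A time-sliced profile would simply be one such map per time window, so the same comparison could be run against a user's current interests rather than their whole history.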

We’re back…

September 17, 2007

… and darker than before. The team just returned from a relaxing yet productive weekend in Orlando and Tampa. After spending endless hours on the beach and in the pool, we’re back in good ol’ New York. See below for some pictures:

Team Retreat – Orlando!

September 13, 2007

The Youlicit Team is taking a much-needed weekend retreat to the Sunshine State! We’ll be in Orlando for the weekend for some fun in the sun and the chance to do some serious thinking about our direction (and by serious thinking I mean brainstorming on Space Mountain). Stay tuned for some pictures of the weekend…