Hitchhiker’s Guide to 650 :: November :: 2005

Start-Ups November 29, 2005 7:50 pm

Over at John Battelle’s blog a debate was started by Raul Valdes-Perez of Vivisimo on the actual “returns” of investing in personalization for a search engine.

Raul’s arguments can be grouped into two main ones . . .

Problem #2 - The surfing data used for personalizing search is weak. The data that online booksellers like Amazon use is strong: I’m paying $20 for a book and committing ten hours of my life to reading it (let’s ignore the problems with gift purchases). Surfing data involves the minimal commitments of a mouse click and a few seconds to look at a page before leaving.
Problem #3 – If the data used for inferring user profiles is the whole web page that the user visited, then it’s misleading because the user’s decision to visit the page is based on the title and brief excerpt (snippet) that are shown in the search results, not the whole page.
Problem #4 – Home computers are often shared among family members, whose surfing interests obviously diverge.
Problem #5 – Queries tend to be short. My own spouse couldn’t figure out my interests from a one or two word utterance, so how is a computer going to?

Arguments 2-5 are idiosyncratic to search engines rather than to data mining . . . i.e. the way search engines are architected, and the use cases around them, do not provide enough data points to mine meaningful data to use as a predictive variable for future behavior. These problems are actually only applicable to small search engines like Vivisimo. MSN and Yahoo have a treasure trove of data from their portal properties and registration information that they can combine with prior search behavior to predict future behavior (e.g. purchases on YahooShopping). Google recognized this pretty early on and is trying to build out its own suite of products and single sign-on solutions. Furthermore, with its Wi-Fi initiative, Google will be on equal footing with Yahoo and MSN in owning access businesses, which will allow all of them to mine clickstream data for a user’s ENTIRE time on the web. (BTW, I’m not sure if Yahoo or SBC have access to that data, so this is just conjecture.) With such data, Google not only knows when a person clicks through a search result, but also whether he/she added an item to a shopping cart and checked out at Amazon! Personalization for a standalone search engine is really HARD, but not if you know additional information about your searchers the way these gorilla companies do. In short, Economies of Scale & Scope exist for internet companies in the form of data accumulation and knowledge discovery.

Raul’s first point is well taken . . .

Problem #1 - People are not static; they have many fleeting and seasonal interests. A student might intensely research Abraham Lincoln for a school project but may care nothing at all about it later. I’ll read about spectacular tragedies such as a recent fire in Paraguay that led to hundreds of deaths, but am not generally concerned with death, fire, Paraguay, or supermarkets. Seasonal phenomena like elections, the Olympics, sports leagues, etc. also lead to variable interests: I’ll follow the Olympics for the next month or so, but will pay no attention for another four years.

For many dataminers, or people involved in KDD circles, it’s well known that personalization (personal demographic and behavioral data) is useful and indispensable for predicting user behaviour. However, in the last few years another method called “occasionalization” has shown that in many instances, user behavior correlates much more strongly with an “occasion” than with historical data such as demographics or past purchases. In layman’s terms, it’s easier to interpret what you want based on what you are doing at that exact moment or a few moments prior . . . pretty obvious eh? The problem with past data is exactly what Raul mentions . . . that user behavior is often a “response” to an external stimulus, which can’t be modeled unless we know what that stimulus is . . .

There is actually a very good article called “Seize The Occasion” (download here), published by a bunch of Booz Allen consultants in 2001 . . . the paper is very good, the data not so much, since the models they built are pretty simplistic . . . BUT the point is well written and well taken . . .

As for what search engines could do with personalization? They should take a different approach suggested by “occasionalization,” which is to stop treating each search query as an “independent” event unrelated to the others, and instead treat a set of queries within the same visit as a series of attempts to find ONE particular piece of information, with each query acting as a refinement . . . (confused?) Simple and obvious example . . .

Person A: “Car” - result set A
Person B: “Honda” - result set B

Person C: “car” - result set A
Person C (same visit, next query): “honda” - result set C

Result set B /= result set C (i.e. they are not the same) even though both Person C and Person B typed in “honda” . . . that is because we know that result set C should skew towards cars rather than Honda lawn mowers . . .

The applications are much, much wider and the algorithms more sophisticated than this . . . we can infer things about that visit/occasion through the query words and the person’s clickthroughs to help decide what the next result set might be if the user refines that word (again, this needs algorithms that define the adjacency of words/queries to identify whether the next query is a refinement or really a completely different intention) . . .
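To make the car/honda example concrete, here is a minimal sketch of session-aware re-ranking. Everything in it is an assumption for illustration: the `RELATED_TERMS` adjacency table, the `Result` type, the topic tags, the scores, and the `boost` weight are all made up, and a real engine would learn adjacency from query logs and clickthroughs rather than hard-code it.

```python
from dataclasses import dataclass

# Hypothetical adjacency table: terms that tend to follow each other
# within one visit. A real system would learn this from query logs.
RELATED_TERMS = {
    "car": {"honda", "toyota", "sedan"},
    "lawn": {"honda", "mower"},
}


@dataclass
class Result:
    title: str
    topics: set
    base_score: float


def is_refinement(previous: str, current: str) -> bool:
    """Treat `current` as a refinement if the adjacency table links it to `previous`."""
    return current in RELATED_TERMS.get(previous, set())


def rerank(results, session_queries, current_query, boost=0.5):
    """Boost results whose topics match earlier, related queries from the same visit."""
    context = {q for q in session_queries if is_refinement(q, current_query)}

    def score(result):
        return result.base_score + boost * len(result.topics & context)

    return sorted(results, key=score, reverse=True)


honda_results = [
    Result("Honda lawn mowers", {"lawn", "garden"}, 1.0),
    Result("Honda Civic review", {"car", "sedan"}, 0.8),
]

# Person B: "honda" with no earlier query in the visit -> result set B
result_b = rerank(honda_results, [], "honda")
# Person C: "car", then "honda" in the same visit -> result set C
result_c = rerank(honda_results, ["car"], "honda")
```

With these made-up scores, Person B still sees the lawn mowers first, while Person C’s earlier “car” query pushes the Civic page to the top. Same query word, different result set.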

Lastly . . .

If not search personalization, then what? Many companies, including my own, are placing bets on a display of search results that goes beyond simple ranked lists. The idea is to analyze the search results, show users the variety of themes therein, and let them explore their interests at that moment, which only they know. The best personalization is done by persons themselves.

God . . . I wish it were so simple . . . good luck Raul . . . the unfortunate reality is that searchers don’t “talk” back to search engines . . . I wrote an entire post on this topic (and check out the comment section) . . . Google has created an entire generation of lazy surfers :) (or I can blame it on us, that we have yet to figure out the right UI to make it work . . . nah . . . :) )

More interesting posts from Pannus, Clickety Clack, Datamining.

Other, Half Baked Ideas November 26, 2005 12:15 am

Last week I loaded up my AIM and got the following message

AIM added a new AIM Bots Group to your Buddy List
Send IM to moviefone and shoppingbuddy for great holiday flicks and gift ideas

I looked around the blogosphere to find out if I was the only one and found this post on PaidContent which referenced a WSJ article. After confirming it was from AOL and not some random IM spam I was pretty confused . . . I IM’ed with the bots for a while, and came to two conclusions . . .

1. hey that’s pretty cool . . .
2. damn AOL is fucking around with my buddy list . . . if there ever was an egregious violation of privacy/trust this is it

Ideas like these bots have been around since bubble 1.0 but the “warn” feature in AIM prevented scalable implementation of these applications since anyone (including competitors) could shut down a bot app. AOL, being the draconian walled garden enterprise that it was, refused to open up their API to allow “certified” bots to be built that could not be “warned.” As a result, a slew of entrepreneurs threw away their business plans and went after some sort of 2.0 opportunities instead.

Instead of railing on AOL for being a “sbot” master, I’m going to take another (perhaps harsher) approach. That is . . . I can’t believe AOL hasn’t learned its lesson from the last 5 years. After squandering the chance to kick MSN’s ass and failing to realize that it already is what Yahoo! hopes to but may never become . . . (nothing on Google . . . we all fucked up on that one) . . . it’s following a web 1.0, proprietary approach to interactive IM again . . . i.e. AOL is thinking way too small and way too closed . . . Is Virginia really so far from the Valley that these guys just don’t get it?

This is what I think AOL should do with this market opportunity

1. Open up the AIM API
2. Build hosting infrastructure (open, not proprietary) for IM bots (with slews of technical backends - JBoss, PHP, Ruby on Rails, etc.)
3. Build a certification, re-certification, and dispute resolution system (think Verisign + Truste + whitelist infrastructure) for bots
4. Create a simple-to-use “if-then-like” XML-based language for creating simple bots (but still support standard REST and/or SOAP interconnects for more advanced apps) and a simple web-based development environment (like WordPress is for content)
5. Charge for hosting (which is optional) and the certification system
6. Build a marketplace for discovering and rating these bots
7. Build a payment mechanism for bot builders to charge for these bots (and take a slice from each transaction)
8. Build a text based advertising network (more on this later)
9. Lastly, integrate the VoiceXML standard into the API
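To give a feel for item 4, here is a rough sketch of what an “if-then-like” XML bot definition and a tiny interpreter for it might look like. The element and attribute names (`bot`, `rule`, `if contains`, `then reply`, `default`) are invented for this example. They are not any real AIM or AOL schema, and the keyword matching is deliberately naive.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule format, made up for illustration only.
BOT_XML = """
<bot name="moviebot">
  <rule>
    <if contains="showtimes"/>
    <then reply="Which zip code?"/>
  </rule>
  <rule>
    <if contains="tickets"/>
    <then reply="How many tickets do you need?"/>
  </rule>
  <default reply="Sorry, I only know about showtimes and tickets."/>
</bot>
"""


def load_rules(xml_text):
    """Parse the bot definition into (keyword, reply) pairs plus a fallback reply."""
    root = ET.fromstring(xml_text)
    rules = [
        (rule.find("if").get("contains"), rule.find("then").get("reply"))
        for rule in root.findall("rule")
    ]
    default = root.find("default").get("reply")
    return rules, default


def respond(message, rules, default):
    """Return the reply of the first rule whose keyword appears in the message."""
    text = message.lower()
    for keyword, reply in rules:
        if keyword in text:
            return reply
    return default


rules, default = load_rules(BOT_XML)
print(respond("Any showtimes tonight?", rules, default))
```

The point of the sketch is the barrier to entry: someone who can fill in a form like this could have a working bot, while the REST/SOAP route stays open for anything that needs real logic behind it.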

Now, if you ask me whether this is really a big enough market for this kind of investment, I’ll say two words . . . mobile applications. The premium SMS market is way too screwed up (I talked about this in this post . . . BTW, before eBay bought Skype). Essentially, greedy mobile carriers are killing the potential of the market by insisting on taking 30%-70% of the GROSS REVENUE of premium text messages. With AIM having decent penetration into mobile devices, this is an end-around for entrepreneurs to build ANY type of mobile text application that they could not build before due to the economics of the value chain (such as selling movie tickets through the cellphone). The idea here, which I hope AOL gets, is that anyone could get his or her own version of 4info up and running with less than $1,000 in startup costs. The opportunity for running a location-based SMS advertising network embedded into the hosting infrastructure as a monetization tool for app developers is HUGE as well (i.e. AdSense for mobile apps). Furthermore, the VoIP/mobile VoIP opportunity is just as huge if not bigger . . . BUT I’m not going to talk about it since everyone knows about Skype and Nuance . . . (plus other personal reasons).

Come on AOL, you missed out on web 2.0, this is your chance to be the ultimate enabler and aggregator of Mobile 2.0 . . . don’t fuck it up by spamming my AIM instead . . .

Large Caps November 24, 2005 5:52 pm

Jeff Nolan has already called me out for my Google obsession, but I can’t really help it given that it’s part of my job (search, finding, etc. etc.) to track Google and everyone in the industry as closely as I can :) Last week’s launch of local shopping on Froogle sheds some light on Google’s vertical search strategy. It is actually quite brilliant really . . . and quite obvious too now that I think of it . . .

Google’s biggest asset is no longer its technology, it’s the traffic that it can generate. Thus, instead of trying to compete with vertical search engines head on with technology by doing the things that vertical search engines do best (meta tags, application layers etc . . . see the comment section of my previous post . . . here), Google is going to throw its weight around, as well as go back to its roots by remaining an aggregator of aggregators (or a search engine of search engines).

Froogle’s local shopping product is built on top of other vertical search players/aggregators - Getauto, Stepup, ShopLocal (interestingly owned by the Tribune Co. . . . hmm . . . syndicated classifieds by Google imminent?), and others. Why bother with trying to create semantics around millions of data sources when other little startups are already doing it for you? Now, Google only needs to leverage GoogleBase’s folksonomy-based semantic engine/database to aggregate from a few sources. These little players could always block Google, but given that they need the traffic, they can’t say no (like Craigslist did to Oodle) and will instead feed their content to Google through GoogleBase. And as long as each of the verticals remains fragmented, Google does not face a serious threat of re-mediation (think of what Google did to Yahoo or MSFT did to IBM). I wonder which verticals Google will move into (leveraging GoogleBase) and how other vertical search/application players will respond . . . fight the power or fall in line?

Other November 21, 2005 10:06 pm

From the Asiapundit.

Just the other day a video of the “Back Dorm Boys” was passed around the office email . . .

Well, these guys have gone big time . . . they were on live TV in China. . .

here

Large Caps November 16, 2005 5:17 pm

Gonna bring back a post I did thousands of years ago

For e-tailers, Google is like a drug. It makes you feel real good about yourself (’cause it’s profitable and helps your business grow), but you have a sneaking suspicion that it is really the one in control and one day you might need it more than it needs you.

Continue reading here . . .

Fred Wilson agrees with me by calling GoogleBase, FreeBase.

Start-Ups, Half Baked Ideas November 15, 2005 9:04 pm

Back in 2000, Clay Shirky wrote a seminal piece against micropayments.

The original thesis for the need for micro payments goes something like this . . .

P2P creates two problems that micropayments seem ideally suited to solve. The first is the need to reward creators of text, graphics, music or video without the overhead of publishing middlemen or the necessity to charge high prices. The success of music-sharing systems such as Napster and Audiogalaxy, and the growth of more general platforms for file sharing such as Gnutella, Freenet and AIMster, make this problem urgent.

However, Shirky argues that

The Short Answer for Why Micropayments Fail

Users hate them.

The Long Answer for Why Micropayments Fail

Why does it matter that users hate micropayments? Because users are the ones with the money, and micropayments do not take user preferences into account.

In particular, users want predictable and simple pricing. Micropayments, meanwhile, waste the users’ mental effort in order to conserve cheap resources, by creating many tiny, unpredictable transactions. Micropayments thus create in the mind of the user both anxiety and confusion, characteristics that users have not heretofore been known to actively seek out. Embedding the micropayment into the link would seem to take the intrusiveness of the micropayment to an absolute minimum, but in fact it creates a double-standard. A transaction can’t be worth so much as to require a decision but worth so little that that decision is automatic. There is a certain amount of anxiety involved in any decision to buy, no matter how small, and it derives not from the interface used or the time required, but from the very act of deciding.

Micropayments, like all payments, require a comparison: “Is this much of X worth that much of Y?” There is a minimum mental transaction cost created by this fact that cannot be optimized away, because the only transaction a user will be willing to approve with no thought will be one that costs them nothing, which is no transaction at all.

Thus the anxiety of buying is a permanent feature of micropayment systems, since economic decisions are made on the margin - not, “Is a drink worth a dollar?” but, “Is the next drink worth the next dollar?” Anything that requires the user to approve a transaction creates this anxiety, no matter what the mechanism for deciding or paying is.

Of course Shirky is arguing this from the buyer’s angle. Will I pay 10 cents for viewing/reading this article? If it’s a pain in the ass to pay, then I will not . . . and thus his argument holds completely.

Times have changed, however. The hottest topic right now is revenue share with content providers. Pete Cashmore has the latest on the Shoposphere rev share story - with inspirations from Kevin Burton and TechCrunch. This time it is no longer about whether READERS will pay for content. They are free to read whatever they want for nothing. The advertisers themselves are not paying in micro-chunks either . . . no, they are collectively spending billions on search advertising . . . so who are the ones getting paid a few cents or dollars at a time? The so-called “slaves” who graciously fill blogs and whatever social networks full of content will get paid based on the ad clicks they generate.

So then, the better question Shirky would have asked today is . . . Will these content generators want to receive $.96 a month from 15 different sites on 15 different checks? Of course not. I’d rather you pay me one large check . . . better yet . . . I hate going to the bank, so accrue it till it’s at least $20 before sending it to me . . . or even better yet . . . ACH it to my bank account directly . . . As a blogger I know how LITTLE Google is actually paying me. So if I actually had one of these random “pick lists,” that should be about the amount I’d expect, so I think the use case is completely reasonable.

To do everything above, you need a micropayment micro-commission system. If I was PayPal (which I kinda am :) ) I’d be licking my chops right now for a chance to dominate this market . . . and if I was BitPass, Yahoo Wallet (eek), or Google Wallet (a foregone conclusion that it’s coming out), I’d be thanking god for the latest turn of events . . .
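Here is a bare-bones sketch of the accrual part of that idea: pool the tiny commissions from many sites per content creator and only release a payout once the pooled balance clears a minimum. The `CommissionLedger` class and its method names are hypothetical, the 96 cents x 15 sites figures just reuse the example above, and the actual money movement (check, ACH, whatever) is deliberately left out.

```python
from collections import defaultdict

PAYOUT_THRESHOLD_CENTS = 2000  # the "at least $20" rule from the post


class CommissionLedger:
    """Accrues tiny per-site commissions and releases them as one lump sum."""

    def __init__(self, threshold_cents=PAYOUT_THRESHOLD_CENTS):
        self.threshold_cents = threshold_cents
        # creator -> site -> accrued cents (integers avoid float rounding)
        self.accrued = defaultdict(lambda: defaultdict(int))

    def record(self, creator, site, cents):
        """Add a commission earned by `creator` on `site`."""
        self.accrued[creator][site] += cents

    def balance(self, creator):
        """Total unpaid cents for a creator across all sites."""
        return sum(self.accrued[creator].values())

    def payouts_due(self):
        """Return {creator: cents} for balances at or over the threshold, then reset them."""
        due = {}
        for creator in list(self.accrued):
            total = self.balance(creator)
            if total >= self.threshold_cents:
                due[creator] = total
                self.accrued[creator].clear()
        return due


ledger = CommissionLedger()
sites = [f"site{i}" for i in range(15)]

# Month one: 15 sites x 96 cents = $14.40, still under the $20 minimum.
for site in sites:
    ledger.record("blogger", site, 96)
month_one = ledger.payouts_due()

# Month two: the balance reaches $28.80, so one combined payout is released.
for site in sites:
    ledger.record("blogger", site, 96)
month_two = ledger.payouts_due()
```

So instead of 30 checks for 96 cents each over two months, the blogger gets a single $28.80 payment at the end of month two.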

If I was an entrepreneur (I think I am :) ), I’m letting the newsvines, shopospheres, fotolia (sp?) of the world work on their little niches. . . I’m gonna go sell pans and shovels to the gold rushers instead. . . ie provide the infrastructure to make this all happen. My pre-money starts at $12M . . .just like 4info :)

Start-Ups November 14, 2005 7:38 pm

The Monk and The Riddle was one of my favorite business/venture books (along with Startup)

Apparently even the funeral portal idea has a 2.0 incarnation . . . MeM, a startup that I found out about through Alarm Clock.

Move over Flickr, Riya, Shutterbook and Co., Cincinnati-based MeM (Making Everlasting Memories) is an application-specific photo-sharing service that targets the funeral business. The company claims 2M uniques per month and raising rapidly.

We just read one MeM for Maxamilian Warner LeRoy who worked for Yoko Ono in the Dakota apartment and who died at the young age of 30. MeM provides a biography, a movie, slide-show, tributes, and the ability to easily send gifts, flowers, or leave tributes. We can see application specific media sharing sites like this becoming more prevalent.

Geez . . . is this the ultimate sign of bubble 2.0?

Start-Ups 12:32 pm

Vertical search engines add value to the END USERS in two main ways . . . creating semantics around data (metadata) and using that metadata to enable a more interactive search experience (which hopefully translates to higher relevancy). As Google becomes smarter and smarter at solving both problems (look at Froogle and Google Local), vertical search engines are beginning to feel the crunch. In any industry, the natural response to competition coming from a horizontal player is to move up the value chain in order to focus on the entire end user experience, using integration as the main competitive weapon.

For content providers, the main value proposition of a vertical search engine is simply traffic. The problem is that most vertical search engines do not have the traffic scale that lets Google hold content owners hostage - the scale that keeps site owners from closing their sites to search engines despite the obvious threat of disintermediation. Om has taken up the torch in grilling these guys on this topic at the Search SIG and no one had a good response (mp3 here). Again, the “out” for vertical search engines is to go the “application” route. The hope is to convince the content owners that vertical search is creating an application on top of their content rather than simply adding value through aggregation (which translates to disintermediation).

At SDForum’s Search SIG, a lot of vertical search engines are beginning to reposition themselves for the incoming onslaught by calling themselves vertical search applications. Giving credit where it’s due, Dave McClure of SimplyHired has championed this cause since the very beginning, before other players jumped on the bandwagon. His comment in John Battelle’s Service to Application post is characteristic of his usual stance.

There are two major problems with the vertical search application angle (both could be overcome, but they are still issues).

1. From the end user perspective, having a more “interactive” experience is not necessarily important. Anyone who has been building search engines knows that users rarely “interact” with search engines. Perhaps Google has trained searchers to be instantaneously critical, but users rarely go beyond the first few pages of a search result and rarely do more than hit “next page.” What this means is that all the semantics that the vertical players have created around the data are only good for helping the user evaluate relevancy, not for IMPROVING relevancy. Users are not likely to learn to click on functionality that filters and re-ranks search results. Perhaps Ajax will solve this problem in the future by improving user friendliness, but for now, this is a serious concern for many vertical search players: despite their efforts to add more application functionality around the search engine, the users refuse to see it as an application.

2. The content providers themselves also want to be application providers. Old school offline content owners are adept at repackaging data and selling it through multiple channels and formats. Building applications will be the obvious growth strategy for jump-starting their flat revenue base and/or stock price. The online content owners will also have to learn to do so to survive. Perhaps a few of the vertical search app providers will be acquired strictly for this purpose, but these players generally do not have the equity or cash hoard to be able to pay the premiums of internet acquisitions.

Perhaps there is hope for the vertical search app companies, but I think the answer lies not in search applications BUT in “transaction” applications. Look at eBay: without PayPal, it would not look so different from a vertical search app/engine. But with PayPal, eBay is much more valuable and defensible. Moving beyond solving “discovery” issues to “settlement” efficiency could be the next phase of growth for vertical search apps . . . but again . . . they will run up against web 1.0 startups that already do both (i.e. SimplyHired versus Monster).

Product Management November 12, 2005 11:29 pm

. . . for a while . . .

Ken Norton, VP of Product at JotSpot, put up on his blog a presentation he gave at Haas. The content is great (but I really loved the presentation template and the font :) ). I also loved the fact that it’s all text, no consultant mumbo jumbo diagrams, yet it says everything it needs to say in a clear and insightful conversational tone. The hardest thing to do in a PowerPoint presentation is to straddle the line between “argument” and “conversation” . . . how to get your point across yet allow room for exploration, conversation, and eventually knowledge creation between you and the audience. It’s kinda easy to see why Ken is good at what he does just by looking at the presentation.

Anyways, back to the actual presentation . . . at big companies, many functions of the product manager are split across many people in the company. This presentation made me miss being close to the actual development of a product. Knowing that every day, real progress (in creating something tangible) is being made is one of the most satisfying feelings in the world. But really, in big companies the set of stakeholders for anything you do, product or otherwise, is so wide that cross-functional teams/projects are the norm. As a result, everything in the presentation would apply generally as “how to get things done without direct authority.” It’s great to argue over trends, memes, strategies, and visions . . . but once in a while, I need something to remind me of what actually pays the bills . . . and that is . . . getting things done, product pushed out the door, end users/customers using the product (paying for it hopefully), and finally coming back for more.

Large Caps November 9, 2005 5:19 pm

Google was (and still is) the type of company that prides itself on being one of the few companies in the world where most management and strategic decisions are made bottoms-up . . . by engineers and managers who are closest to cutting-edge technology and users. So much so that when Eric Schmidt formed the “corporate strategy” group within Google, it was named Business Operations instead, to quell concerns within the company that an elite group of ex-MBAs and strategy consultants would determine the fate of a company that had previously been a haven for engineers and alpha-geeks. (Biz Ops, as the group is called internally, is quite a cryptic set of meaningless words . . . if I saw that on a business card and didn’t know better, I would have thought of customer service, IT, or some sort of project/process management group.)

Only 12 months ago, Google was the type of company that went out and acquired relatively obscure technology leaders like Keyhole and Picasa. I bet those acquisitions were driven by Google engineers who were trying to solve technical problems for a product they wanted to launch and eventually decided it would be better to acquire rather than build or buy. The idea then bubbled up to some executive who championed the purchase. Since the acquisition cost was relatively cheap, this model was entirely feasible.

Times have changed, however . . . Google is growing up rapidly in front of our eyes. Today these Biz Ops guys are stumping around the Valley, Virginia, Redmond, Beijing, Luxembourg and all across the world looking to put Google’s $100B (close enough) market cap to work. They are not talking about some tiny hundred million dollar acquisition . . . no, these guys are rumored to be buying everything from Skype and AOL to TiVo . . . really anything and everything. Whenever there is an auction for a technology or media company anywhere in the world, these guys will be there waiting to give away some funny money. In short, they are no different than any other large company in the world run amuck with MBAs trying to take over the world in one swoop and moving on to the next target with little remorse or thought.

The “old” Google is still alive and well, but the “new” Google is growing stronger every day . . . there really is no turning back. If Google wants to play with the big boys, this is the road they need to walk down . . . with the tacit understanding (resignation?) that innovation and product development alone will not get them where they need to be . . . that they cannot corner the market on talent and innovation, try as they may. That incumbent market share, experience, relationships, and knowledge still count for something. That PageRank COULD be the only time in Google’s lifetime that a disruptive technology they created turned an industry upside down. As most companies have learned by now, you stack the deck in your favor any way you can . . . and if that means using some stock to your advantage, then so be it. Google is a two-headed beast now, even if they tried to hide one of the heads with some innocuous-sounding business cards . . . it remains to be seen if the two heads will get along or if the infinitely noble and enviable bottoms-up culture and ecology will become the victim of Google’s own growth spurt . . . Really no different than every single start-up turned gorilla in the last 50 years . . .
