Hitchhiker’s Guide to 650 :: Marketplaces

Technology, Marketplaces, Community | August 3, 2006 4:57 pm

Greg Linden’s “eBay, scammers, and self-governance” brings up a good point . . . that the whole idea of community-contributed value (content, commerce, social, etc.) scaling infinitely is somewhat of a myth.

There was a time in my life when I looked at eBay’s business model longingly (5 years ago, before I joined) and thought that if one day we could just achieve critical mass, we would be PRINTING MONEY and I could just retire and watch the zeros grow . . . boy was I wrong . . .

Yes, the margins are great, better than the traditional one-to-many business models. But it does bring up a whole slew of other unexpected issues that threaten to negate network effects . . .

Here is a simple test that almost every company fails . . . if the value added by a company is truly network-effects driven, the difference between its ROI and its cost of capital should increase ALMOST exponentially and indefinitely. Put another way (without the stupid finance speak), gross margins should not only keep increasing but do so exponentially as well. There is not one company on this earth that has done this yet . . .

Even Google has seen its margins decrease and cash flow growth slow; furthermore, as much as people think AdSense/AdWords is this self-sustaining monster, there are thousands of cute/hot/buff Stanford undergrads doing menial AdWords tech support/filtering/placement etc. . . . just for a chance to date Larry (ok, low blow . . . but mommy and daddy didn’t mortgage the house and get you into Stanford so you could attach one of those phone headsets to your head all day).

MySpace is just beginning to run into this issue as well. To police its community, it hired a Chief Whatever Officer to safeguard the community from . . . itself . . .

Digg, with the whole issue of selling IDs, digging for money, etc., will eventually discover that there are certain things algorithms can’t solve for, and that people will be needed to handle the exception cases.

I guess the summary is that nature has a way of finding balance . . . no one organism or company can grow unchecked forever . . . eventually the very thing that made it successful will create some sort of negative externality and bring balance back into the world. (hmm, isn’t this in the preamble to Star Wars VI?) . . . eBay with trust, YouTube with hardware costs, MySpace with sex, and Google with too many Stanford undergrads . . . :)

If network effects can create a platform that enables certain drivers to create value exponentially, it can just as easily enable other drivers to destroy value exponentially.

Technology, Marketplaces | April 27, 2006 9:42 pm

Om Malik is known as the blogger who loves to rag on eBay. Exhibit A, B, C, D, and E . . . (somebody’s got to keep eBay honest, right?)

Most recently Om suggests:

1. Come up with eBay 2.0 and figure out a role for the company in the digital future.
2. Focus on core strengths. Buy Intuit (Quicken) to give eBay buyers and sellers accounting features.
3. Focus of the company should be Paypal and turning it into Citibank of online world. (Very Very Important.)
4. Figure out a way to get into shareware sales business. Perhaps acquire eSellerate. This is where Ebay can put its heft to good use.
5. Get into digital media sales. The recent Skype-EMI deal could be a good start.

I haven’t yet seen a journalist turned CEO become successful (turning VC works sometimes . . . Moritz for one) but I think Om is well on his way :) . . . (as you guys can tell, I’m exercising a huge amount of self-control not to write a retort). So the only thing I’ll write is that some of these ideas are obvious, others too vague, and the rest simply financially infeasible :) (to be honest, I do like one of them though) . . . i.e. not much better or worse than a McKinsey deck :)

Start-Ups, Marketplaces, Community | March 14, 2006 7:53 pm

Building network effects is hard . . . but once a startup has gathered traction, network effects can also reverse themselves pretty quickly. It’s not all rosy, sit back, and collect the money . . . Case in point . . . Miva networks.

The company generated 219 million paid clickthroughs during the quarter. That’s a 13% reduction from what it produced a year earlier, despite growing its base of advertisers by 20%. This isn’t a mixed bag — it’s just bad. Revenues falling faster than clicks means that the company is generating less money per click.

Found it on Motley Fool this morning while checking on my portfolio. . .
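The last sentence of that quote is just arithmetic, and a quick sketch makes it concrete. The 13% drop in clicks comes from the quote; the 20% revenue drop below is a made-up number for illustration only (the excerpt does not give the actual revenue figure), and the function name is my own.

```python
def revenue_per_click_change(revenue_change: float, click_change: float) -> float:
    """Fractional change in revenue per click, given the fractional changes
    in total revenue and total clicks (e.g. -0.13 for a 13% decline)."""
    return (1 + revenue_change) / (1 + click_change) - 1


# -0.13 is the click decline from the quote above.
# -0.20 is a HYPOTHETICAL revenue decline, used only for illustration.
change = revenue_per_click_change(revenue_change=-0.20, click_change=-0.13)
print(f"revenue per click changes by {change:.1%}")
```

Under those assumed numbers, revenue per click falls by about 8%. Whenever revenue shrinks faster than clicks, the result is negative.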

These guys were among the pioneers of the PPC ad network model. However, somewhere along the way, “bad” sites got into their network, and now they are caught in a situation where they are unable to sacrifice critical mass for quality. And thus, the downward spiral begins . . .

1) bad sites join the network
2) low-quality traffic clicks through ads
3) low conversion for advertisers
4) bids for words go lower
5) high-quality sites leave because they can get higher monetization somewhere else
. . . and back to 1) more bad sites that got kicked out of Google/Overture join the network
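The loop above can be sketched as a toy simulation. This is only an illustration of the feedback, not a model of Miva: every number and parameter name here (starting site counts, `bad_join_rate`, `outside_bid`) is an assumption I made up.

```python
def simulate_spiral(good_sites=100.0, bad_sites=10.0, rounds=8,
                    bad_join_rate=0.5, outside_bid=0.8):
    """Toy model of the spiral. All parameters are made-up assumptions."""
    history = []
    for _ in range(rounds):
        # Steps 1-2: bad sites join and dilute the quality of the traffic.
        bad_sites *= 1 + bad_join_rate
        quality = good_sites / (good_sites + bad_sites)
        # Steps 3-4: conversion tracks quality, and bids track conversion.
        bid = quality
        # Step 5: good sites leave once they can monetize better elsewhere.
        if bid < outside_bid:
            good_sites *= bid / outside_bid
        history.append((round(quality, 3), round(good_sites, 1), round(bad_sites, 1)))
    return history


for quality, good, bad in simulate_spiral():
    print(f"quality={quality}  good sites={good}  bad sites={bad}")
```

In this sketch the total number of sites keeps growing while quality (and therefore bids) keeps falling, which is the uncomfortable part: the network looks bigger on paper the whole way down.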

So even though partners increased by 20%, word pricing dropped significantly more . . . this is a cautionary tale for all the web 2.0 plays out there . . . if you attract the wrong kind of community initially, you are building the wrong kind of network effects, which could quickly deteriorate . . . for example, if Digg had attracted spammers when it initially launched, it would not have become what it is today . . . if craigslist had been overrun by Best Buy when it first launched, no one would go there . . . choose carefully where you launch your site . . . not all traffic is good . . .

Technology, Marketplaces | February 14, 2006 6:53 pm

There has been a lot of talk about the “edge” lately . . . not the least of which is Edgeio, which is a classifieds edge aggregator. The whole idea behind the rise of the edge is that traffic patterns are getting more distributed on the web and that core portals/walled gardens are getting less and less of the total traffic while “edge” sites are getting more . . . or, using another term, the long tail of websites is growing longer and longer as niche websites gain traction and usage.

There is some preliminary data actually pointing to the fact that the pendulum MIGHT be swinging in the other direction: that “edge” site traffic growth is slowing and that concentration is actually growing. Looking at Google’s Q4 numbers, it appears that AdSense (edge advertising) revenue is pretty flat and that AdWords (core advertising) revenue is growing significantly faster. Furthermore, I am willing to bet that TAC concentration is also up for AdSense and that the sites AdSense is acquiring now are of much lower quality (traffic/click-through) than only a year ago. While Google is an edge aggregator, it is still a “core” site (walled garden or not . . . actually very much a one-way walled garden! value comes in but never leaves!).

I think there are actually 2 trends driving the slowdown in edge growth . . .

1) PageRank or any other relevancy algorithm favors the incumbent/core sites FOR EACH KEYWORD. In 1999, most of the web was dominated by huge portals that had dominated “relevancy” very broadly . . . in the ensuing years, the edge grew as niche sites found niches to dominate (symbolized by “keywords”). It looks like each one of those niches has essentially been filled up and that 3rd or 4th (actually more like 20th, because the average page click is 2 in a search engine) entrants can no longer gain the traffic that they need from search engines to survive.

This will be an even harder issue to solve for edge commerce aggregators, who need to rely on “relevancy” to determine not only relevance but trustworthiness. In essence, posting “a classified ad” in any blog will be a futile exercise because traffic will eventually be funneled to a select few sites with incumbent relevancy history . . . and those sites will eventually be smart enough to “farm” out their reputation to host listings . . . again augmenting the core rather than the edge. This is not to say that Edgeio or the like won’t be successful . . . it is to say that their key differentiation/barrier to entry will be their DIVERSIFICATION algorithm and not their reputation/relevancy algorithm. (Barriers to entry/exit are extremely low in the aggregation space, and that will be their Achilles heel.)

2) For now, the segment of users that is likely to “own” its own digital presence has probably peaked. Most will likely rely on “core” destinations like TypePad or MySpace (to give 2 really different examples) for hosting needs. These “main street” consumers are truly the long tail, not the blogosphere or the Technorati crowd . . . even more likely, they will still rely on vertical sites for specific purposes rather than a generic presence aggregator. In many ways, task-specific core sites do a much better job of serving the true long tail than a persistent digital presence. This might not be a permanent trend as Gen Y grows up, but until then, the growth of pure edge URLs might be slowing . . .

Large Caps, Research, Marketplaces | December 6, 2005 8:49 am

Content is a dirty business . . . more specifically, content that has commercial/transactional value (i.e. not entertainment and informational content). I had a taste of the business a few years back at my B2B startup, working to get industrial catalogs onto the site. After digging around the industry and meeting with different companies, I quickly realized it was an extremely incestuous circle of people who cycle from company to company in an endless merry-go-round. The hardest part is that every player in this particular industry accuses the others of various forms of copyright infringement, and it is next to impossible to figure out the real story.

The reason is that the law is extremely fuzzy on the actual ownership of the content. For example, if 10 manufacturers hand off their product specifications to a content normalization firm (which is commissioned to do the job by some other company), who really owns the content after all the work is done? Is it the manufacturers who created the content in the first place? The company that hired the content cleansing firm? Or the content normalization firm, because its value add is significant enough to warrant more than just derivative work rights? Or is it public domain? (How can anyone own the fact that a certain grade of steel piping only comes in 6 standard form factors?) Can the content normalization firm resell that content even though it was produced under some sort of consulting contract? If the firm is caught reselling the content to another company, how can anyone prove that it “repurposed” the content rather than “started from scratch”? Does it really matter if the process is manual, done via software, or a combination of the two?

Even if the law were clear, it is next to impossible to prove any type of illegal practice. All it really takes is to burn a CD-ROM and walk it out the door. And in fact, I quickly discovered (unfortunately not early enough) that most of these players are really recycling the same content.

I also quickly discovered that even more important than the content itself are the so-called schemas (aka semantics, aka metadata, aka attributes); these are the most important “IP” in the industry. Once you know how products should be characterized (or how experts in the field define their products), the job of matching and normalizing content to that schema is significantly easier, if not trivial. Furthermore, these schemas are what drive the discoverability of your search engine and create true comparability between products. Having a system that is both structured and responsive to change (for example, if Apple releases a new 1000G iPod Nano today, I had better have those attributes defined ASAP or the discoverability of the iPod will suffer) is a competitive advantage. So the harder, more important question is . . . Is the ownership of content separate from that of semantics? Or is it bundled? Is a schema or semantic even “ownable”?

So why the long setup? By now you have probably figured it out . . . instead of walking out the door with a CD-ROM, what if I created an RSS feed and happened to send it to GoogleBase? What happens then? Is someone violating some sort of copyright law? Maybe not with the content itself, but how about the schema? What if I owned the content but someone else built the semantics around the data? Can I export it and give it to someone else? Can that someone (GoogleBase) use it for their own benefit without notifying me? What if, instead of a CD-ROM full of data, the data actually resides on web pages? And THAT is the issue we are facing today . . .

Today, vertical search engines are normalizing semantics around content that they do not own (similar to the content normalization firms of yesteryear). Sometimes these vertical search engines feed the content directly into GoogleBase; other times, Google simply indexes the content through the search engine. In both cases Google is taking not just the content but the schema and repurposing it for its own search engine. Do the vertical search engines care? Do the content owners? Who owns the content? Who owns the schema? Does anyone have the right to stop Google? I don’t have the answers . . . maybe someone does . . .

In the not too distant future (6 months?), once they gain critical mass and the bell curve reaches steady state, Google will have a pretty good idea of the attributes of most of the transactional products and services (any “physical” or “metaphysical” objects, really) on the web without lifting a finger, by virtue of their folksonomy-based name:value pair attribute engine. At that point, they will be able to extract semantics out of the webpages they crawl without the help of vertical search engines or expansive manual design. When that happens, Google will be able to launch a thousand vertical search engines with the flip of a switch . . . scary thoughts for vertical search engines that invested numerous man-hours in designing their attribute rules and believe them to be their barrier to entry . . .
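To show why a pile of name:value pairs is enough to recover a schema, here is a minimal sketch. It is my own guess at the general idea, not how GoogleBase actually works; the function name `infer_schema`, the `min_share` threshold, and the sample listings are all hypothetical.

```python
from collections import Counter


def infer_schema(listings, min_share=0.5):
    """Return the attribute names used by at least `min_share` of the listings.

    `listings` is a list of dicts of free-form name:value pairs, the way
    different submitters might describe the same kind of product.
    """
    counts = Counter()
    for item in listings:
        # Normalize names so "Brand" and "brand" count as the same attribute,
        # and use a set so each listing votes once per attribute.
        counts.update({name.strip().lower() for name in item})
    threshold = min_share * len(listings)
    return sorted(name for name, n in counts.items() if n >= threshold)


# Hypothetical submissions for the same product category.
listings = [
    {"Brand": "Apple", "Capacity": "4GB", "Color": "black"},
    {"brand": "Apple", "capacity": "2GB"},
    {"Brand": "Sony", "Capacity": "1GB", "Gift wrap": "yes"},
]
print(infer_schema(listings))  # ['brand', 'capacity']
```

Nobody designed that schema. The attributes most submitters agree on simply float to the top once enough listings come in, which is the "bell curve reaches steady state" point above.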

This is GoogleBase: content is just a means to an end . . . and the end is the so-called schemas, semantics, metadata, and attributes . . . this is what I have come to realize . . . that content is simply too perishable to be valuable, but this other stuff with funny names . . . this is worth more than gold . . . this is the ticket to owning the semantic web . . .

Research, Advertising, Marketplaces | July 1, 2005 12:25 am

The best and most relevant (to me) web dissertation I’ve ever read was Clay Shirky’s Ontology is Overrated. I do not hope to come even close to the clarity and relevance of that manifesto, but I hope to add to the discussion with a narrower (commerce) rather than wider (information retrieval) focus, from the perspective of a specific application. Secondly, I want to take a historical & BROADER view of the topic within the context of e-commerce. Lastly, while there is no ultimate winner (the game is not over) nor one right way to architect an “e-commerce information retrieval system,” to date there has been a winning methodology, as proven by revenue, profits, and even market cap. How the pendulum will swing in the future, I don’t know, but recent technology improvements have certainly allowed the various architectures to compensate for each other’s shortcomings.

(BTW, I’m using ontology/taxonomy/attributes as a generalization of any structured content, not technically correct but useful in this case)

At the two ends of the spectrum of e-commerce implementations of a product retrieval system are:

1. Search Engine + Unstructured Content - Product information is created by the product owner (seller, distributor, manufacturer, etc.) in an ad hoc manner with minimal regard for standardization or formatting. A search engine is used to find relevant products for buyers based on various algorithms (keywords, PageRank, etc.).

2. Query + Structured Content - The best way to think about this is an attributed query field and an attributed catalog. Essentially a SQL database with a structured query interface.
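A minimal sketch of the two ends, side by side. The listings, the catalog columns, and the crude word-count scoring are all made up for illustration; a real search engine ranks very differently.

```python
import sqlite3

# End 1: unstructured listings, written however the seller felt like it.
listings = [
    "Apple iPod nano 4GB black - barely used",
    "ipod nano case, leather",
    "Sony Walkman 1GB mp3 player",
]


def keyword_search(query, docs):
    """Rank free-text listings by how many query words they contain."""
    words = query.lower().split()
    scored = [(sum(w in doc.lower() for w in words), doc) for doc in docs]
    return [doc for score, doc in sorted(scored, key=lambda s: -s[0]) if score > 0]


print(keyword_search("ipod nano 4gb", listings))

# End 2: an attributed catalog, queried on exact attributes.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE catalog (sku TEXT, brand TEXT, product TEXT, capacity_gb INTEGER)"
)
conn.executemany(
    "INSERT INTO catalog VALUES (?, ?, ?, ?)",
    [
        ("SKU-1", "Apple", "iPod nano", 4),
        ("SKU-2", "Apple", "iPod nano", 2),
        ("SKU-3", "Sony", "Walkman", 1),
    ],
)
rows = conn.execute(
    "SELECT sku FROM catalog WHERE brand = ? AND capacity_gb >= ?", ("Apple", 4)
).fetchall()
print(rows)
```

Note where the work goes: the keyword search accepts any listing as-is and pushes the effort into matching, while the SQL query is trivial only because someone already filled in every column for every SKU.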

There are several examples along the spectrum.

Google - In the purest sense, Google (not Froogle) is the perfect implementation of the first kind of system, with completely unstructured data and a search engine.

eBay - In the SKU-less world of eBay (circa 2000), sellers enter product information in a semi-structured manner. Furthermore, there is no effort to consolidate listings with the same SKU into one giant listing. As far as eBay or any machine is concerned, each product listed for sale is completely unique. A search engine is implemented to search listing titles and sometimes descriptions.

Delicious - There is really no “tag” implementation of an e-commerce search, so I’m just gonna let Delicious be my straw man. Some might argue that Delicious and eBay should switch; I, however, would argue that the act of tagging a product with a set of specific tags is more restricting and thus more structured than eBay’s “Listing Title.” Furthermore, as you’ll see later, eBay and Delicious create a Recall/Precision tradeoff consistent with the rest of the spectrum. (BTW, eBay does have a categorization scheme, but not in the context of its search engine. The scheme essentially offers an alternative method of navigation. But if you want to, you can switch eBay & Delicious on the spectrum because of this issue.)

Amazon - Amazon has a catalog that is SKU-centric in that the product title and description are standardized for each unique product. Sellers of that product have to list their product under that SKU.

Chemdex – A long dead but very relevant example. In many ways it represents all B2B e-commerce companies back in 2000. Like a lot of B2B implementations of an e-commerce info retrieval system, aka catalog, Chemdex has very sophisticated, highly attributed, highly structured product content. It is the very definition of an Ontology or Taxonomy (depending on your own interpretation of the word).

As we all know, companies that have made the critical product strategy decision to sit on the LEFT side of the spectrum, with unstructured content, have become the dominant players in the e-commerce world. For various reasons I will go into, Google and eBay have garnered a disproportionate amount of the e-commerce spend. Especially in the case of eBay vs. Amazon, the power of unstructured content has won over rigid standards. While many would argue that eBay has a much better business model (no inventory) than Amazon and thus is the leading player, I would argue that because Amazon adopted this virtual model back in 2000 and has yet to narrow the gap, it is actually the superior product architecture that is the driving force of eBay’s growth. Fundamentally, it is also this unstructured product content architecture that has allowed eBay to maximize its virtual model and thus is the true source of its competitive advantage.

There are several key differences in the spectrum:

1. Sophistication/Effort – On one end, the critical product and differentiation factor is a better search algorithm; on the other end, the critical factor is content creation. Essentially, players on the left side of the spectrum decided to spend money on “understanding the mess,” while those on the right side spend it on “cleaning up the mess.”

2. SKU – Due precisely to the decision above, adding & creating content for players on the left is so easy that an unlimited # of products can be sold and managed, leading to breadth. On the right, because the bottleneck of the commerce system is the creation of the catalog, companies are forced to focus on the products they can sell and drive inventory turnover for those SKUs (i.e. depth).

3. Investment – Equally important, Google and eBay live and die by the “power” of their search engines and thus spend significant money on creating the best-of-breed algorithm or user experience. Chemdex and Amazon, on the other hand, must invest in content creation throughput, usually in the form of manpower. (Chemdex spent a disproportionate amount of its money on this task and eventually went out of business because it took so long, the quality was so bad, and it was so expensive.)

There are also some key trade-offs:

1. Speed – Search engines are by definition faster than query engines. Your SQL query on 100X less data is still slower than a Google search. This has serious effects on the user experience, especially in B2B.
2. Precision – A key search engine concept, connoting the “relevancy” of the individual returned results. Structured content typically returns more precise results because more attributes and parameters can be specified by the “buyer.”
3. Recall - Another key search engine concept, connoting the “coverage” of the returned results, i.e. regardless of the # of results returned, as long as all relevant results are included, it has good recall. Unstructured content typically has high recall due to the “fuzziness” & flexibility of its algorithm. Structured content, on the other hand, has serious issues, as mentioned by Clay Shirky.
4. Flexibility - This is THE key reason that unstructured content won over structured. The flexibility to sell ANYTHING (kidney on eBay!) allowed eBay to evolve without management interference while Amazon required the creation of new content and new categories.
5. Data Mining - On the other hand, the ability to understand data through structured content is the key competitive differentiator that Amazon has over eBay or Google. It can mine data extensively to create sophisticated cross-selling, up-selling, recommendation, and personalization features that Google will be hard pressed to implement due to the fact that its data is “dumb.” While this has always been Amazon’s strategy, it was still not enough to overcome the rigidity of its product catalog architecture.
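For anyone who wants the Precision and Recall items above pinned down, here is a small sketch using the standard definitions. The two result sets are invented to caricature the two ends of the spectrum; they are not measurements of any real system.

```python
def precision_recall(returned, relevant):
    """Precision = share of returned results that are relevant.
    Recall = share of relevant items that were returned."""
    returned, relevant = set(returned), set(relevant)
    hits = returned & relevant
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical: the 4 listings a buyer would actually want.
relevant = {"A", "B", "C", "D"}

# Fuzzy keyword search: finds everything relevant, plus noise.
fuzzy = ["A", "B", "C", "D", "X", "Y", "Z", "W"]
# Strict attribute query: only exact matches, so it misses some.
strict = ["A", "B"]

print(precision_recall(fuzzy, relevant))   # (0.5, 1.0)
print(precision_recall(strict, relevant))  # (1.0, 0.5)
```

The fuzzy result set trades precision for recall and the strict one does the opposite, which is the tradeoff the list describes.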

These differences and trade-offs reflect choices made by the various players in the industry. To date, buyers have shown that a good search engine and an unstructured product information source is the superior architecture for creating an e-commerce focused information retrieval system. Thus intelligence has won over brute force. Oh ya, I too think ontology/taxonomy/attributes are overrated, not just philosophically but for business.

I believe the past history of e-commerce search will have serious implications for the so called SEMANTIC web but I’ll save that for the next post when I can think more clearly. (Hint, I’m in Clay’s camp)

Just some of the things I read on tagging recently, there is a lot btw so this is not comprehensive:

Unfolding Ontology from Alex
The Yin and Yan of Tagging
More Clay
More Clay on Tim Bray’s Q
A blog on tagging: You’re It
Fred’s Tags
