Saturday, September 24, 2005

Do hierarchical classification systems always suck?

Clay Shirky argued recently that "classification schemes are going to be largely displaced by tagging". He points to Amazon and Wikipedia as two examples of how classification systems suck, and it would be hard to disagree. Shirky is a smart guy, and having just implemented our own user-driven classification system in Reef, his essay made me stop and wonder whether ours was destined to suck, too.

More to the point, it seemed like a good time to stop and think about how our classification system will integrate with our tagging system (which we have, too). Right next to Shirky's post is an interesting post by Tom Coates on how tags behave when the things being tagged also inhabit a hierarchical system.

There's a basic point that Shirky is muddying: he conflates classification systems with rigid, top-down, professionally applied metadata. He should not be mentioning Amazon and Wikipedia in the same breath, because the former (rigid, top-down, etc.) deserves to be junked, while the latter (flexible, user-driven) is simply an experiment that needs to be improved.

The reason classification in Wikipedia is lousy is not that it's too expensive or too hard; it's that most people don't care about it. It's not terribly useful, because you don't usually browse encyclopedias. I use Wikipedia all the time, but I go there with specific questions, and search, not browsing, is exactly what I need. This is precisely why I rarely (if ever) bookmark Wikipedia pages -- and I never tag them. I don't need to tag them.

The situation in Reef is quite different, because there are all sorts of information types (discussion, events, articles, etc.) that flow past you in this system. There's a need to be able to flag and organize anything and everything in whatever way you want -- Reef works like del.icio.us: you bookmark something by tagging it.

At the same time, articles (and for now, only articles) live in a hierarchical page space. If you want to add a new article, you have to add it as the child of some other page, which means that every article has a place in the page hierarchy. This is different from a traditional wiki because we treat the link as a parent/child relationship and let you explicitly view and edit a page's "paths". That's what turns this into a classification system rather than a link network. This hierarchy is user-created, user-modifiable, and more flexible than your file system because a page can have as many different parents and children as you want it to.
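To make the "paths" idea concrete, here is a minimal sketch of a page space where a page can have several parents. It is not Reef's actual code: the `PageGraph` class, its method names, the "home" root page, and the rule that a link may not create a cycle are all assumptions made for illustration.

```python
from collections import defaultdict


class PageGraph:
    """Hypothetical page space: parent/child links, many parents allowed."""

    def __init__(self):
        self.parents = defaultdict(set)   # page -> its parent pages
        self.children = defaultdict(set)  # page -> its child pages

    def ancestors(self, page):
        """Return every page reachable by walking parent links upward."""
        seen = set()
        stack = list(self.parents.get(page, ()))
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(self.parents.get(node, ()))
        return seen

    def add_page(self, page, parent):
        """Attach `page` under `parent`; call again to add another parent."""
        # Assumption: a page may not end up as its own ancestor.
        if page == parent or page in self.ancestors(parent):
            raise ValueError(f"{parent!r} -> {page!r} would create a cycle")
        self.parents[page].add(parent)
        self.children[parent].add(page)

    def paths(self, page):
        """List every route from a parentless page down to `page`."""
        ups = self.parents.get(page)
        if not ups:
            return [[page]]
        return [path + [page] for up in sorted(ups) for path in self.paths(up)]


graph = PageGraph()
graph.add_page("energy efficiency", "home")
graph.add_page("green homes", "home")
graph.add_page("insulation", "energy efficiency")
graph.add_page("insulation", "green homes")  # a second parent for the same page

for path in graph.paths("insulation"):
    print(" > ".join(path))
```

A file system would force "insulation" into one folder; here it shows up under both parents, and each route to it is one of its paths.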

The reason I think it's imperative that Reef support classification is that, for 2People, browsing is essential. One of our prime use cases is: you don't know what action to take next. You need to be able to, say, go to the section on "green homes" and get an overview of what your options are.

The interesting question is, what's the relationship between the page hierarchy and the tagging system? In Tom Coates' example, they're using tags applied to songs to generate information about albums and artists. This implies a sort of "summation operator" for tags that lets you derive a tagset that could be applied to the "thing" (say, album) that represents the collection of tagged items. I don't think this model really applies in Reef. Let's say you take all the articles that are descendants of the "energy efficiency" article, and look at their tags. I don't see that there would be much benefit in "summing" the tagsets. The nature of this hierarchical relationship is different, and also the tags are different -- people tag songs with tags like "groovy" and "techno", but in Reef the tags are going to be more content-driven.
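For reference, the "summation operator" could be sketched roughly like this: collect every descendant of a page and add up the tag counts. The function names, the dictionary layout, and the sample pages and tags are invented for illustration; neither Tom Coates' post nor Reef specifies an implementation.

```python
from collections import Counter


def descendants(children, page):
    """Return every page below `page` in a parent -> children mapping."""
    seen = set()
    stack = list(children.get(page, ()))
    while stack:
        node = stack.pop()
        if node not in seen:  # a page with two parents is counted once
            seen.add(node)
            stack.extend(children.get(node, ()))
    return seen


def summed_tags(children, tags, page):
    """Add up the tags applied to all descendants of `page`."""
    total = Counter()
    for descendant in descendants(children, page):
        total.update(tags.get(descendant, ()))
    return total


# Hypothetical data: one list entry per time a tag was applied to a page.
children = {
    "energy efficiency": ["insulation", "lighting"],
    "lighting": ["compact fluorescents"],
}
tags = {
    "insulation": ["home", "diy"],
    "lighting": ["home"],
    "compact fluorescents": ["home", "savings"],
}

print(summed_tags(children, tags, "energy efficiency").most_common())
```

With content-driven tags, the summed set mostly restates what the parent page is already about, which is part of why the operator seems less useful here than it is for albums and artists.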

But there's another way to look at it. Tags form an implicit, hierarchical classification system that is derivable from tag co-occurrences. (Ask me about this, if you're interested in the algorithm.) So, in principle, we could generate alternate views of the page hierarchy based on tagsets. But for us, now, this is too complicated. In the meantime, it makes sense to think of integrating tag info into the page hierarchy -- perhaps by using tags to generate lists of "related" pages that can appear alongside each page's list of "child" pages.
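As a rough sketch of the "related" pages idea, one simple option is to rank other pages by how much their tagsets overlap with the current page's. The Jaccard overlap score, the `related_pages` function, and the sample data below are my own assumptions for illustration; this is not the co-occurrence algorithm mentioned above, and nothing here is implemented in Reef.

```python
def related_pages(tags, page, exclude=(), limit=5):
    """Rank other pages by tag overlap (Jaccard score) with `page`.

    `tags` maps page name -> set of tags. `exclude` can hold the page's
    children, so the "related" list does not repeat the "child" list.
    """
    own = set(tags.get(page, ()))
    scored = []
    for other, other_tags in tags.items():
        if other == page or other in exclude:
            continue
        other_set = set(other_tags)
        union = own | other_set
        if not union:
            continue  # neither page has any tags
        score = len(own & other_set) / len(union)
        if score > 0:
            scored.append((score, other))
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:limit]]


# Hypothetical tag data for four pages.
page_tags = {
    "insulation": {"home", "heating", "diy"},
    "weatherstripping": {"home", "heating", "diy", "cheap"},
    "solar panels": {"home", "electricity"},
    "bike commuting": {"transport"},
}

print(related_pages(page_tags, "insulation"))
```

Passing a page's children as `exclude` keeps the two lists distinct, so the related list only surfaces pages the hierarchy does not already show.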

Tuesday, September 20, 2005

Game Theory and Green Consumers

Joel Makower has a thoughtful piece on green consumers and why there's such a gap between green concern and actual buying habits. The gist of his answer is that green products are marketed wrong. I'm oversimplifying a bit, but basically Makower thinks that the average consumer won't make a product choice strictly on ecological concerns, and therefore marketers have to translate green choices into "healthier", "more efficient", or "higher quality" choices. In the political realm, this is the thinking of the Apollo Alliance -- their premise is (again, oversimplifying) that voters will never vote green, so we have to translate green into jobs and security.

As an activist, I think this is fundamentally wrong. It's not that I think Apollo is a bad idea, or that it would be terrible if people bought compact fluorescents just because they save money. But I do believe that we will not market our way to sustainability. If people don't actually "get it", then we will never make the enormous changes we need to make in the tiny amount of time we have to do it. That's why education is a cornerstone of 2People's strategy.

On his side, Makower can point to decades of evidence showing that people don't buy (or vote) green. But he omits some crucial factors in analyzing this evidence. We as consumers have lousy information. Is this product truly green, or is it just hype? How much of a difference will product A make versus product B? Is anyone else buying it, or am I just a lone actor making an idealistic statement that no one hears? For heaven's sake, my neighbor asked me the other day if "organic food" was safe, because he heard that people got sick from the manure!

At the same time, we as consumers always have one signal that's crystal clear: how much does it cost? In the absence of good answers to the good questions, the rational thing to do is to pay more attention to the information we have that's most reliable. And that's what we do. And that's why we're in a race to the bottom. Better marketing is not going to create a race to the top; reliable signals about cause and effect will.

Tuesday, September 06, 2005

Seattle-bound

As of Oct., I'll be working full-time on the 2People project. Yay! I just gave notice at Harvard, and I'll be moving in with my brother in Seattle to save expenses. My aim is to launch the site more or less by the end of the year.

I'm thrilled to announce that Carey McKinley has joined us as a development consultant and fundraiser. She and I will sit down next week and draft our first letters of inquiry to funders in search of seed money to support 2People while it's getting off the ground.