The Lifestreaming Problem
This started as a reply to Mark @Krynsky's blog post titled The Year in Lifestreaming for 2009, but it grew embarrassingly long, so I decided to post it on my own blog and link back. My thanks to Mark for helping solidify a bunch of scattered thoughts that have been bouncing around in my head for so long.
It makes me a little sad to admit, but I'm one of those developers who gave up on a solution to the Lifestreaming problem, for all the reasons @krynsky mentions and one more: I'm not sure it's even a problem anymore.
Aggregation
Years ago, sites began to fill with user-generated content (UGC). Slowly some sites began to open up with RSS feeds, but the problem was: how do we get all the content we make into one place? It didn't take much more than a feed aggregator that pulled all your public streams together to solve that problem. Some people found that solution sufficient so long as they could brand it and host it alongside their blog, but those of us without a dedicated audience didn't get much out of it.
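To show how little an aggregator needs to do, here is a minimal sketch in Python. The `Item` type, the `aggregate` function, and the sample feeds are all made up for illustration; a real aggregator would fetch and parse each RSS feed first, which is left out here.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Item:
    """One entry from one of your public streams (hypothetical shape)."""
    source: str
    title: str
    published: datetime


def aggregate(*feeds):
    """Merge any number of feeds into a single stream, newest first."""
    merged = [item for feed in feeds for item in feed]
    return sorted(merged, key=lambda item: item.published, reverse=True)


# Invented sample data standing in for already-parsed feeds.
blog = [Item("blog", "The Lifestreaming Problem", datetime(2010, 1, 5, 9, 0))]
photos = [
    Item("photos", "Sunset", datetime(2010, 1, 4, 18, 30)),
    Item("photos", "Coffee", datetime(2010, 1, 6, 8, 15)),
]

stream = aggregate(blog, photos)
for item in stream:
    print(item.published.isoformat(), item.source, item.title)
```

Everything hard about aggregation lives in the fetching and parsing; the merge itself is a sort by date.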
Social Filtering
That leads to the next problem: how do we get our Lifestream in front of an audience without making them visit our own site? Or, to put it another way, how do we hide items that aren't posted by people a user knows? At first this required us to build a new friends list on every site we used, but FriendFeed solved this problem better by giving us ways to import our friends from other sites. Now that most social networks allow third-party access to your social graph, and Facebook itself (the one site that almost everyone you know online visits at least once a day) has begun importing external activity streams, this problem can't really be solved any better.
Cross-Posting
But there remained a huge audience outside of a user's social network, and as Mark observed, Twitter, which stubbornly refuses to import activity, became the de facto place to watch the activity of people you wouldn't necessarily call your "friends". So the next problem was: how do I post (or at least link to) my content in multiple places with only one action? Ping.fm, Posterous, Profilactic and other services filled this need. Whether a site pulls our content together to present it to our audience or requires us to push it into its activity stream ourselves, our content is now everywhere it can be, even if that means lots of duplicates all over the place.
General Filtering
That brings us to the tough problems that still need work. By now we're subscribed to so many streams that we can't keep up with them all. Furthermore, our creative efforts are being lost in the deluge. So how does a user find certain kinds of content among all the static? Several of Twitter's recent developments have started to solve that problem already.
First, Lists allow us to group streams any way we like. In my opinion this is the single best feature Twitter has added since I joined back in '06. At last the streams I care about aren't drowned out by the multitude that I'm less interested in. Another significant development is the Geolocation API. Streams that originate nearby often talk about people and places that are of interest to me. Finally, trends, hashtags, and searches allow a kind of rough topic-based filter which is sometimes useful.
Weighting
All these filters are fairly "dumb" in that they can only make binary judgements about content: a tweet either contains a hashtag or it doesn't. There are millions of tweets posted every day about #iPhone, but when you view the topic they are merely listed in chronological order. Ideally we could score content based on "interestingness" so we could hide uninteresting content, or sort so the most interesting appears first. But how do you measure how interesting a piece of content is to the crowd, or to an individual? On Twitter we can determine how many times a tweet has been RTed or @replied to, but these metrics only exist on Twitter. How would you compare a tweet to a blog post? For pull-based lifestreams like FriendFeed, stream providers would have to standardize a protocol to express these metrics, and then decide how to do the math: is a Facebook like worth more or less than a YouTube comment? Furthermore, how do you keep spammers from gaming such a system? If a male-enhancement blog reports that all its posts get a million comments, does FriendFeed just take its word for it?
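To make "deciding how to do the math" concrete, here is a rough sketch of a cross-service score. The signal names and every weight in `WEIGHTS` are invented for illustration; nobody has standardized anything like them. The logarithm is just one arbitrary way to stop huge counts from dominating linearly.

```python
import math

# Hypothetical weights: is a comment really worth four likes? Nobody knows.
WEIGHTS = {"retweet": 1.0, "reply": 1.5, "like": 0.5, "comment": 2.0}


def interestingness(signals):
    """Weighted sum of log-damped engagement counts.

    `signals` maps a signal name to a self-reported count.
    Signals without a known weight are ignored.
    """
    return sum(
        WEIGHTS.get(kind, 0.0) * math.log1p(count)
        for kind, count in signals.items()
    )


tweet = {"retweet": 40, "reply": 5}
blog_post = {"comment": 12, "like": 30}
spam_post = {"comment": 1_000_000}  # the count the spammer claims

for name, signals in [("tweet", tweet), ("blog post", blog_post), ("spam", spam_post)]:
    print(name, round(interestingness(signals), 2))
```

Note that the spam post still comes out on top. Damping shrinks the advantage of a made-up count, but it does nothing about the real issue, which is whether the reported numbers can be trusted at all.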
But it doesn't have to be that complicated. PostRank does a great job of sorting feed items based on the number of other feed items that link to them, much the same way that Google ranks pages in search results. But even that's hard to determine with so many URL shortening services out there. I also suspect that this method would be susceptible to the same link-bombing exploits that Google has had to fend off.
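A toy version of link-based ranking shows where the shorteners get in the way. This is not PostRank's actual algorithm, just a plain inbound-link count; the `SHORT_LINKS` table is a hypothetical stand-in for resolving each short URL over the network, and all the URLs are invented.

```python
from collections import Counter

# Hypothetical, pre-resolved short links. In practice each one costs a
# network request, and every shortening service has to be handled.
SHORT_LINKS = {"http://bit.ly/abc": "http://example.com/post-1"}


def canonical(url):
    """Expand a known short link and strip any trailing slash."""
    return SHORT_LINKS.get(url, url).rstrip("/")


def rank_by_inbound_links(items):
    """Sort item URLs by how many other items link to them.

    `items` maps each item's URL to the list of URLs it links to.
    """
    counts = Counter()
    for source, links in items.items():
        targets = {canonical(link) for link in links}
        targets.discard(canonical(source))  # a self-link is not a vote
        counts.update(targets)  # each linking item counts once per target
    return sorted(items, key=lambda url: counts[canonical(url)], reverse=True)


items = {
    "http://example.com/post-1": [],
    "http://example.com/post-2": ["http://example.com/post-1"],
    "http://example.com/post-3": ["http://bit.ly/abc", "http://example.com/post-2"],
}
print(rank_by_inbound_links(items))
```

Take the entry out of `SHORT_LINKS` and post-1 loses one of its two inbound links, which is exactly the undercounting that shorteners cause.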
For a long time I pursued a machine learning solution to the problem, first with a naive Bayesian classifier which I had to train by "liking" or "disliking" items, and then using latent semantic analysis to rank items based on how similar they were to items which had appeared in my lifestream before. I was successful to some degree, but I wasn't the only one to have the idea. Strands and Popego beat me to the punch. But given that neither really took off, and Strands totally changed their focus, it seems that users just don't care enough about filtering out that last bit of noise.
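For anyone curious what the "like"/"dislike" training looks like, here is a bare-bones naive Bayes sketch. It is not the code I actually wrote back then; the class name, the whitespace tokenizer, the add-one smoothing, and the training titles are all chosen for illustration.

```python
import math
from collections import Counter

LABELS = ("like", "dislike")


class NaiveBayes:
    """Tiny word-count naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {label: Counter() for label in LABELS}
        self.doc_counts = Counter()

    def train(self, text, label):
        self.word_counts[label].update(text.lower().split())
        self.doc_counts[label] += 1

    def log_score(self, text, label):
        # Assumes at least one training example exists for every label.
        vocab = set().union(*self.word_counts.values())
        total = sum(self.word_counts[label].values())
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        for word in text.lower().split():
            count = self.word_counts[label][word]
            score += math.log((count + 1) / (total + len(vocab)))
        return score

    def classify(self, text):
        return max(LABELS, key=lambda label: self.log_score(text, label))


classifier = NaiveBayes()
classifier.train("new python release notes", "like")
classifier.train("python machine learning tutorial", "like")
classifier.train("celebrity gossip roundup", "dislike")
classifier.train("celebrity diet tips", "dislike")

print(classifier.classify("python tutorial"))
print(classifier.classify("celebrity tips"))
```

The catch is the training step: somebody has to sit there clicking "like" and "dislike" long enough for the word counts to mean anything.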
The conclusion I've come to is that this final problem is just more work than it's worth. It seems the filters we have, most importantly the social one, are doing a good enough job of delivering the best UGC to us. Think about it: is there any shortage of things for you to look at these days? Sure, there's still a lot of noise, and sometimes important things slip through, but personally I've started to feel like this is a Sisyphean challenge. There will never be enough time to watch every YouTube video that you may find interesting.
Besides, isn't all this channel surfing distracting us from what was important about Lifestreaming in the first place: creating content? Maybe it's time we stopped focusing on how to share our creations and spent more time actually creating.