
Twitter is Unstructured Silo Push

March 15, 2010

If you spend some time on this web site, you’ll discover there’s a lot here, much more than just the blog. There’s also a Foundations section, where you can learn more about the foundation concepts of pull and the semantic web. One of those important concepts is my semantic web acid test. If you’ve been watching our news feed, you’ve seen that we (usually) label a “solution” as having characteristics of semantic/unstructured/semi-structured, push/pull, and web/silo. I’ve examined Facebook and concluded that it’s a silo holding mostly unstructured information. It has some excellent semantics for disambiguating place, company, and school names, but it’s really push media. Now I’ll look at Twitter and see what we can learn by looking through the same lens.

Twitter lets you say anything you want, as long as it’s 140 characters or fewer. You can include URLs, people’s Twitter names, and hashtags. Could it be part of the semantic real-time web? Let’s see.

First, Twitter content is essentially unstructured. The hashtags make it possible for people to refer to the same thing by the same name, but there could be lots of hashtags that mean the same thing, and there could be lots of people not using hashtags the way they should. I would have a difficult time calling it semi-structured. Searching the Twitterverse gives plenty of false positives and false negatives, so in that sense it’s not much better than the text we have online already. As reported by ReadWriteWeb recently, during an emergency it’s practically impossible to use Twitter to get status updates on things like roads, hospitals, airports, and people, and Project Epic is trying to establish semantic ways to use the hashtags to address the problem.

Will the hashtags for Twitter and Buzz be the same? Will there be a lot of overlap or a little? I wonder. If you have any insights or information about this, please leave it in the comments.

Twitter is missing topics. As this Hubspot article on Twitter as a poor vehicle for marketing points out, many people make up hashtags as they tweet, exploding the semantic graph, creating more semantic dispersion, making it practically impossible to get much mileage using hashtags. Wouldn’t it be cool if Twitter had a topic backbone and you could snap your tweets to it as you write them? That would make your tweets more useful and findable. I’m sure people are working on it. Please leave a comment if you know of anything.
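
As a rough illustration of what snapping tweets to a topic backbone might look like, here is a minimal Python sketch. The backbone, its topic names, and the hashtag variants are all invented for this example; nothing here reflects a real Twitter feature or API.

```python
import re

# Hypothetical topic backbone: each canonical topic lists hashtag variants
# that people might invent for it. All names are made up for illustration.
TOPIC_BACKBONE = {
    "road-status": {"roadclosed", "roadsclosed", "closedroad"},
    "hospital-status": {"hospitalopen", "hospitalstatus"},
}

HASHTAG_PATTERN = re.compile(r"#(\w+)")


def build_index(backbone):
    """Invert the backbone into a lookup from lowercase hashtag to topic."""
    return {
        variant.lower(): topic
        for topic, variants in backbone.items()
        for variant in variants
    }


def snap_to_topics(tweet, index):
    """Return the sorted canonical topics for the hashtags found in a tweet."""
    tags = (tag.lower() for tag in HASHTAG_PATTERN.findall(tweet))
    return sorted({index[tag] for tag in tags if tag in index})


INDEX = build_index(TOPIC_BACKBONE)

if __name__ == "__main__":
    for tweet in (
        "Bridge on Route 9 is out #RoadClosed",
        "Avoid Main St #closedroad #traffic",
    ):
        print(snap_to_topics(tweet, INDEX), "<-", tweet)
```

Two tweets with different made-up hashtags land on the same topic, which is the effect a backbone would aim for. Hashtags the backbone doesn’t know, such as #traffic, are simply ignored in this sketch.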

Twitter is 99.9% irrelevant. There are people I respect and would love to follow, but they keep talking about airport delays, weather, what they are watching on TV, and what they ordered at which restaurant. I can’t find the signal. I like people like Tim O’Reilly, whose tweets are timely, interesting, and useful, and he keeps the channel relatively clear of clutter. But for those people who tweet about navel lint most of the time and then say something interesting (to me, anyway) only once in a while, I have no way to filter their tweets by topic or sentiment. That would be very helpful, and a few companies, like Evri and The Ellerdale Project, are working on it. They call it real-time web intelligence, but it still has a long way to go to pass the acid test.

Twitter names are nicely unambiguous, which makes Twitter more semantic. Unlike Facebook names, Twitter names are unique, and you must have one to play the game. They don’t necessarily identify you, but they can. Organizations can have them, so one Twitter account can be shared by many people, which isn’t necessarily bad; an organization can be a legitimate player. And one person can have several names, which again isn’t bad, since people have different facets of their personalities. The unambiguity is good, but it doesn’t map outside of the Twitterverse, so it’s limited. You could have dozens of accounts and names, and you might want to hook some of them together, but you really can’t. That’s what i-names are for.

Twitter isn’t on the open web. Unlike Buzz, Twitter keeps all the tweets inside a huge database, and you can only get access to the database by paying huge sums of money or sending queries to the Twitter API. They don’t embrace the concept of the open web, so I have to say Twitter is a very wide, fairly shallow silo. Buzz, on the other hand, is online and can be mined by anyone in quantity.

Tweets have a half-life of about 30 minutes.
Twitter is most useful in the seconds, minutes, and sometimes hours after you tweet. Once a tweet is a day old or more, its relevance to others drops dramatically. Not that it has to, but it does, because that’s how Twitter is designed. Yes, you can mine the history of a hashtag, but you really can’t see a coherent discussion. The further back or forward you go in time, the more Twitter breaks down. This gives us something to think about: the information in a daily newspaper has a half-life of about 24 hours. Magazine articles have a half-life about as long as the interval between issues of their magazine. And on an even longer scale, books are like that, too. A nonfiction book might have a half-life of six months to a few years, depending on how often other books come out about the same basic topic.
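
To make the half-life idea concrete, here is a small Python sketch. It assumes relevance decays exponentially, which is my reading of “half-life” and not something measured; the function name and the chosen ages are illustrative, while the 30-minute and 24-hour figures come from the paragraph above.

```python
def relevance_remaining(age_minutes, half_life_minutes):
    """Fraction of the original relevance left after age_minutes.

    Assumes simple exponential decay: relevance halves every half-life.
    """
    return 0.5 ** (age_minutes / half_life_minutes)


# Half-lives in minutes, taken from the text above.
HALF_LIVES = {
    "tweet": 30,
    "daily newspaper": 24 * 60,
}

if __name__ == "__main__":
    for medium, half_life in HALF_LIVES.items():
        for age in (30, 60, 24 * 60):
            share = relevance_remaining(age, half_life)
            print(f"{medium:>15} after {age:>4} min: {share:.2%} left")
```

Under this assumption a tweet keeps half its relevance after 30 minutes and a quarter after an hour, and a day-old tweet has effectively none, while a daily newspaper is only down to half after that same day.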

Twitter is push all the way.
Here’s an example. If you use Google Calendar and you have an iPhone, you may not know there’s an easy way to sync your iPhone’s iCal application to Google every few minutes or every hour. There is a way, but it’s somewhat complicated, and you may not have heard of it. All you need to do is go to this page and follow the instructions. This little tidbit has improved the quality of my life measurably. And it’s probably been tweeted about before somewhere. And now that I’m telling you, you might do it and be a happy camper a few minutes from now. On the other hand, you may be using Google Calendar but not an iPhone, or the other way around. If one of those things changes, you might later try to remember where you read something about syncing the iPhone and Google Calendar. You may find your way to that page by keyword using Google, but I don’t think you’ll do it using Twitter. You can’t really store or bookmark that URL for later use if you don’t really know you’ll need it. What you want is something that lets you learn what you need, when you need it, the way you need it. That’s pull. And Twitter doesn’t deliver pull.

In summary: Twitter is the real-time unstructured web, but it may not last that long. It’s too irrelevant. There’s far too much noise. It will have to evolve. Will it become more semantic? I think it has to. But will it ever switch from push to pull? I don’t think it will. I think something built to pull from the ground up could eventually take its place. I have ideas about this, in case you’re a VC and want to take me to lunch.

So here is Siegel’s rule for information life span:

The half-life relevance of a piece of pushed information is about the same as the frequency of the medium.

Keep in mind that this rule doesn’t apply to information that’s pulled. I hope you find that useful. If you do, please tweet about it and send people my way.
