luckyrobot.com - Gerry Campbell

Collecta on Wired.com

June 30th, 2009 · Comments

Good search market landscape article on Wired.com

Comments · Tags: Uncategorized


Is your Realtime Really Realtime?

May 6th, 2009 · Comments

I guest blogged today on Altsearchengines about realtime search, social search… differences and interrelationship.

Might be interesting to you if you seek clarification about what realtime means and how it all fits together.

Check it out at altsearchengines.com

Comments · Tags: Uncategorized


Swine Flu… in Realtime

April 29th, 2009 · Comments

Realtime search - you can discuss it, theorize about it, and sometimes it’s time to just show it.

I twittered earlier this week that I was watching the swine flu concept evolve in the Collecta realtime search internal product. I got a pretty significant number of responses asking me to share, so why not.

But first, my reflection…

After spending hours entranced by the range of images, news stories, twitter messages and comments flowing from various points on the web, I became even more sure that realtime search is a thing we will rely on in the future.

Why? Because much of the relevant information being published on this super-timely topic is headed for one of the following places:

  • A single site with limited readership
  • A feed, to be experienced in a flow of other homogeneous feeds
  • And worst of all, an index - to be searched later, when the full impact of the facts and opinions has faded.

All 3 places are less than optimal. Readers interested in a topic deserve to have the best and most timely information come to them. Clean and focused.

Another thing occurred to me - and this is a HUGE point - the story I am assembling in my head about the topic is enlightened and in aggregate it’s editorially comprehensive!

Facts and information are forming before my eyes. Within a couple of minutes I literally watched the public awareness evolve. Can it be transferred by eating pork? No. Does Israel have the unilateral right to rename the disease? Maybe not.

It is amazing what the collective consciousness can share when viewed through this new Collecta window.

There’s also weirdness. Right-wing, Left-wing and everything in between. Did Obama ask media to overemphasize it to mobilize the country? Did Rumsfeld cause it to pump his portfolio? Maybe Bush caused it as an act of vengeance… ? It’s all in there.

My point is that for the first time I am in a position to aggregate the stories, blogs, photos and comments while they’re happening. I can get the whole picture right now. Then it’s up to me to sort out bias. To distill facts. To come to an enlightened opinion on the topic. This can take just a few minutes - the information is flowing right in front of me.

So, this site is just a sampler. There are bugs to work out, user experience and design dimensions to share. A long way to go. And when we launch in the near future with a full-fledged search, it won’t be a replacement for traditional web search.

It will, however, give users a new, more comprehensive view of what’s going on in the world right now.

Click the logo below to see realtime web results for swine flu.

Comments · Tags: realtime · search


Gerry Campbell Named CEO of Collecta.com Realtime Web Search

March 24th, 2009 · Comments

Press Release here.

In the past several months I’ve written about the effects of splintering media, the amazing new medium exemplified in Twitter and how social networks are changing news.

Inherent in that has been the idea that the web - or what we know today as our online life - is in mid-transformation and we’re at a unique, punctuated time in the evolution of the Web. A point where the new winners are being defined and previous leaders risk becoming irrelevant.

I’ve also written about how the online world is operating on old ideas. Our previous approaches are not human enough, intuitive enough or timely enough to carry us into the next phase of the Internet.

It’s a world where infinite bandwidth and constant connection to the digital cloud provide us with news, entertainment, communication and commerce in a seamless flow that keeps pace with our human need for More, Faster and Better. Every depiction of the digital future paints a picture of rapid-fire information intake and overload. Who doesn’t feel it?

Where is this leading? How is our reality going to be different?

We’ve got more stuff flying by than we know what to do with - we need Realtime search.

Media has splintered down to an individual level, where all of my acquaintances - as well as the traditional sources of information - are generating vast volumes of information every day. Information I may be interested in if I only had better tools to reach into the flow and collect things that are relevant to me. If I don’t tap into it effectively I’ll lose control and miss critical bits of info altogether.

That’s a problem.

The answer is simple - but until now nearly impossible to create. I should be able to activate a query - set it out like a net to collect results on anything at all - and wait for the results to flow by. Some things like school lunch menus and research on mesothelioma may not be so timely, but it would be good to have an active query watch those things for me.

Other things like trade-able information on a stock in my portfolio, or details on a TV show while I am watching it, may be very valuable to me in realtime.
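To make the "net" idea a little more concrete, here is a minimal sketch in Python of a query that stays active and collects matches as items flow past. It is only an illustration: the names `Item`, `StandingQuery` and `watch` are hypothetical, matching is naive all-terms keyword matching, and nothing here describes how Collecta actually works.

```python
import re
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List, Set, Tuple


@dataclass
class Item:
    """One piece of content arriving from the flow (hypothetical shape)."""
    source: str
    text: str


@dataclass
class StandingQuery:
    """A query that stays active and collects matches as items flow by."""
    terms: Set[str]
    collected: List[Item] = field(default_factory=list)

    def matches(self, item: Item) -> bool:
        # Naive matching: every query term must appear as a word in the item.
        words = set(re.findall(r"[a-z0-9]+", item.text.lower()))
        return {t.lower() for t in self.terms} <= words


def watch(stream: Iterable[Item],
          queries: List[StandingQuery]) -> Iterator[Tuple[StandingQuery, Item]]:
    """Check each item once, as it arrives, against every active query."""
    for item in stream:
        for query in queries:
            if query.matches(item):
                query.collected.append(item)
                yield query, item
```

The point of the design is that the query waits and the content moves: each item is looked at once as it arrives, rather than being stored in an index and searched later.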

This is the next frontier online, and it’s just beginning. Over time there will be the full range of experiences, business model modifications, and controversies to go with it.

There is an opportunity for a company or set of companies to blaze the trail and create the new, fast-twitch experience for realtime. This is not a small opportunity; John Battelle agrees.

So seeing the vision is the first part. The second is mobilizing to make it a reality.

In meeting up with Collecta (formerly Stanziq) I have found a team with the experience and a technology worthy of attacking the opportunity.

Today we are announcing the creation of Collecta and, with the support and excitement of founders Jack Moffitt, Patrick Mahoney and Brian Zisk as well as Jon Callaghan of True Ventures, I am joining as CEO and rolling in my pet project, Shrty.

Based on its roots in Stanziq, Collecta is already a strong search company driven by a truly innovative group of people: Jack Moffitt, Brian Zisk, Derek Powazek and a deep team of experts - focusing singularly on making a search of the web into a realtime experience. Check out the team bios on Collecta.com

Together we make a solid, rounded team for building a realtime search company.

Collecta has created a platform that promises to open up a new dimension in searching. It complements web search as we know it and adds a new level of control and excitement to daily info-gathering.

The platform is in pre-beta; products will come very soon. Keep an eye out.

Comments · Tags: search


The rise of Sensor Media

February 25th, 2009 · Comments

In my last blog post Search is Broken I wrote about the emerging existence of realtime, expressed content. I also explained that the significance of that content to me is heavily influenced by my social graph.

We don’t have any full analyses yet, but at this point in time there are at least two main kinds of information in the “expressed” web.

The first is expression of personal thought, feeling and experience. For instance “I just had a burger at FiveGuys with @chesspark and it was good.” There’s a lot of information. I was with @chesspark. We ate at Five Guys and Gerry, being a known connoisseur of greasy meat-foods, likes the burgers there. That’s interesting and it is packed away in the Twitter feed for eternity, for anyone who wants to know.

That’s sort of interesting when it’s about food. It may be really interesting to find an accidental drug interaction. It might also be of value to see that I know @chesspark. I am going to leave that type of information for now and focus on what I think is the real disruptor, news.

News is changing. And I am not only talking about the stuff we get on the 6:00 news. I am talking about anything that is a new piece of information that is relevant to me.

Historically, agencies have been the herald of the timely, the unexpected and the impactful. They have been the agenda-setter for (at least a chunk of) the things we care about. Journalists, schooled in ethics and fact checking techniques - driven by a desire to be the next Bob Woodward - search for the place and time of the next big occurrence.

Problem is that news happens everywhere, all of the time. So the task of staying on top of it with a professional team of journalists is impossible. Infinite places and times for things to happen – finite number of reporters.

Good for us, things are changing and we now have umpteen sensors out there, collecting and expressing tidbits of detail that have varying degrees of relevance to us. This has an enormous potential for transforming not just the type of “news” we are exposed to, but the proximity of the news too. Specifically, when our friends and connections express information, there is a high potential for that to be relevant to us, possibly more so than our favorite newscaster…

Truth is, every jackass with a blackberry/iphone/run-of-the-mill-phone can snap a picture and shoot it up to Twitter, Facebook and the like, creating an ever-ready blanket of sensors… tapping into everyday events and offering them up for consumption. Jim Hanrahan (@manolantern) snapped the first photo of the plane in the Hudson, literally seconds after splashdown. That’s news. Not to say he’s a jackass, though…

No reporter, no pre-ordained credentials. A guy with a smartphone saw something interesting and snapped a shot of it.

This is happening every second of every day and it is accelerating.

This is sensor media: every microblogger, social network lifestreamer is now sensing and reporting on the world as they experience it.

To drive the point home, Facebook just hit 175 million users worldwide. If only one percent of those people publish observations about the world as they experience it, they will outnumber professional media by more than a million reporters (rough estimate). The best part is that some of these news sensors are my friends and will discover things that are personally relevant to me.
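For anyone who wants to check the rough estimate, here is the arithmetic as a small Python sketch. The post gives no headcount for professional media, so the last line is only the bound the claim implies, not a sourced figure.

```python
# Figures taken from the post
facebook_users = 175_000_000
percent_publishing = 1  # "only one percent"

# Integer arithmetic keeps the result exact
citizen_sensors = facebook_users * percent_publishing // 100

# The post claims a margin of "more than a million" over professional media.
# Assumption: no professional headcount is given, so this is only the ceiling
# that would make the claim hold.
claimed_margin = 1_000_000
implied_max_professionals = citizen_sensors - claimed_margin

print(f"{citizen_sensors:,} people publishing observations")
print(f"claim holds if professional reporters number under {implied_max_professionals:,}")
```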

Is this the end of journalism? Not by a long shot.

Reporting and big media will undoubtedly retain their traditional place in the delivery of news, bringing credibility, focus and fact-filtering to the equation. They will, however, increasingly fall behind in the ability to deliver timely information – unexpected information – to readers. This is the place that will be occupied by social networks, automated status bots and an entirely new breed of active content filters.

If this is the future, there are a few things to contemplate here. Opportunities…

- Whoever comes up with a way to filter and concentrate the breaking news, separating the valuable information out of the noise, will be a disruptor and will be an important part of the realtime web

- Relevance of news is, to a large extent, unique to me and my social world. Not all stories have the general/broad interest of planes in the Hudson. Location and topic are huge influencers on personal impact

- There’s a business model in this around sensing and reacting to news with financial/economic impact. Timeliness is extremely valuable to some

- Search, as well as most accepted organizing formats (feeds, news destinations), is wholly inadequate to concentrate and parse the volume and velocity of valuable realtime information that will be generated

As with my last post, I am going to leave the conclusion a little vague. There is a product or set of products here. Without a doubt.

I leave it vague for two reasons. The first is that I am working on my own ideas and they’re not fully baked yet. The second is that I am energized by the thoughtful comments on the topic. I want to continue to discuss the expanse of thoughts without anchoring to a specific solution yet.

Comments · Tags: Uncategorized


Search is broken – really broken.

February 6th, 2009 · Comments

We are all really fortunate – in this age of the web, we have access to and can share ideas with the greatest minds of our time.

And, on top of this great connective technology, there’s an accompanying culture that values democracy over hierarchy, open sharing of ideas over closed and authenticity over hidden motives. This really is a great environment for recombination and synthesis – similar to the continental shelf in the Cambrian and Northern Italy at the earliest stages of the renaissance.

So, with great opportunity comes great responsibility…. If you’re reading this I consider you to be on the hook. ;-)

Personally, the responsibility I feel is to help people find information – which enhances the quality of their life, their station in life or simply saves time. I’ve said it before. Still true. I am really unsatisfied right now. We’ve been all distracted. Time to get busy.

In John Borthwick’s post this week, he brings up the idea that Google is soon to suffer from Creative Destruction… So true.

I can go beyond that and say that we – the people who make stuff on the web – are suffering from a bigger-than-big creative drought.

Tom cat starts chasing Jerry mouse. They run in a circle… and soon Jerry stops running and steps off to the side while Tom runs faster and faster in a circle completely forgetting what he was up to in the first place. They were both up to something when the games started and now they’re completely distracted.

That’s how we are with Search and information discovery – and possibly the entire web X.0 product. We’ve been running in circles for years and years, chasing index size, market share and revenue growth and have forgotten that the whole point in the beginning was to help people find information.

[sidebar]

There are exceptions – cases where real communities, rich with all of the facets of life exist together, but they are usually concentrated within small subsets of information, within limited groups of people or are only for the most technologically imaginative.

In fact, Howard Rheingold wrote about it in his insightful and brilliant book The Virtual Community way back in 1994. But it’s still elusive.

[/sidebar]

In the flurry of data, charts, investments, revenue and metrics, reorgs and rounds of funding we’ve forgotten that we’re human, and need to build products that scratch the human itch. I’m not talking about the superficial Social thing we’re all a part of, where we connect to 500 of our best friends that we forgot we knew or never knew to begin with… I’m talking about an integrated lifestyle that includes our news, our entertainment, our communication, our shopping and transactions… our emotions and biases… into a coherent whole.

We have it in the physical world. For those of us who identify with a real-life community, it’s completely natural for a trip to the grocery to transform into a discussion about Lost when we bump into a friend. And it’s completely natural for us to skip conversing with the neighborhood acquaintances when we’re at the pool.

And it’s also completely natural for us to hear about a new clothing brand, a new Broadway show, a new car model or a new disease from friends at the coffee shop.

My point is this: in the real world entertainment, transactions, communication and information discovery are facets of a coherent, integrated whole. It’s the richness of life. This integration of things helps us prioritize our interests and time, helps us keep our social relationships in the right proportion and helps us to discover new things with surprising efficiency.

This is not the way it goes online… The web we have built is disassembling our lives and complicating them – not simplifying them.

Since web 1.0, when we took the cost of publishing to near-zero by putting a digital representation of the printed page online, our lives have been fragmenting. Since then, platforms, systems, meta systems and metrics have been built up to generate more content faster. More transaction opportunities. Web 2.0, with its platforms, feeds and social media, has continued the trend. If you doubt me… How many email addresses do you have? How many ways can people message you online? Feel on top of it?

Don’t get me wrong – I am not discounting anything that’s been done to date. It’s all great stuff. Necessary for the next phase.

But let’s get on to the next phase, and remember what the potential is here… what history will measure and what our task is now:

Let’s use the building blocks we have in the form of social networks, search technology and semantic inference to RE-INTEGRATE our lives. To create the online replica of our real offline world. To use social inference to prioritize our information intake.

Enough fluff. The real point.

Our daily lives are rich with social inference, and they happen in real time. Search from Google, Yahoo… you name it – they are all based on published (i.e. considered, thought-through) documents that take minutes-to-weeks to update in the search index.

This is broken broken broken.

And blog search is not much better. Categorically identical.

Twitter search is great (go Summize!) as it is realtime and searches expressed information vs published. But the entire world’s entertainment, news, communication and transactions don’t happen solely there.

So there’s an opportunity. Realtime search, using social inference for discovery, ranking and prioritization.

My last major point here is that we need to think creatively about this. I believe that focusing on the query a user inserts into the box and the resulting blue links is WAY too limited. This is the embodiment of the brokenness.

I like what Kosmix is doing in delivering structured/federated results, it’s like AOL Fullview on steroids. And I like what Twitter is doing – there is a solid example in there for how engaging realtime expressed results can be.

I also love all of the approaches to apply “aboutness” and location to the Web - like OpenCalais, Glue, Outside.in, etc. These go a LONG way toward making the web more human.

These things are necessary but not sufficient.

If it’s not the query in the box and the blue links, what is it???

As humans, our life flows from mode to mode – topic to topic with very few breaks. The query doesn’t show up in the box on its own. It doesn’t materialize at the end of our fingers when they hit the keyboard. It comes from somewhere. The desire for information originates from the stimulus of our environment.

So, in this connected web-world full of smart people, I think this is the problem worth solving. I have been throwing the idea around with [insert pundit 1 here], [insert investor 2 here] and [insert super-clever folks N here] (sic.) and I think we are on to something. I’m getting pretty excited about it.

If you have ideas to share on the topic, or think you can contribute to the answer please let me know.

Comments · Tags: search


Stocktwits Series A

December 17th, 2008 · Comments

If you haven’t seen it, Stocktwits just announced the close of its latest round of funding. Happy to add to my Stocktwits seed participation.

I like this company because it’s a real user-driven community based on top of the Twitter platform. And I don’t use the term community lightly - there is a social norm forming within the Stocktwits contributors that’s very warm and focused. I don’t see this kind of interaction in the other social networks. Credit to Howard, Soren and Phil.

And anyone who knows me knows I am a big believer in the monetization potential of the finance vertical…

To read more, check out Roger Ehrenberg’s post, Fred Wilson’s post or the article in Venturebeat.

Comments · Tags: Uncategorized


Darwin, DaVinci and Accelerated Change

December 16th, 2008 · Comments

Somewhere around 530 million years ago, Opie (opabinia) was swimming around on his favorite continental shelf; propelling himself with his fantail, perusing the scenery with his five eyes and munching on critters he had picked up with his articulated pinching claw-nose. There was a huge mudslide and, unfortunately, Opie and several million of his best friends were trapped in what would come to be known as the Burgess Shale.

The Burgess Shale is important because it’s the subject of criticism brought up against Darwin’s theory of evolution. Specifically, this fossil-rich rock contained greater diversity of life than ever seen before. So there’s no way that simple evolution could account for the explosion and contraction of diversity. Or could it?

Stephen Jay Gould, an American paleontologist and Harvard professor, came up with the idea of Punctuated Equilibrium together with Niles Eldredge and published it in 1972.

The theory proposes that there are times of accelerated change, but they’re still driven by the scientific principles of Darwin’s Natural Selection.

I’ve been able to distill, from sources on the Web, primarily but not limited to Wikipedia, that there are three elements in the process that lead to this rapid adaptation:

1) Presence of enabling building blocks.
2) A change in the environment that facilitates the thing that’s adapting
3) The recombination of existing building blocks into something uniquely suited to succeed or flourish

In the case of Opie and many of his frightfully mutated friends, the building blocks came in the form of millions of years of genetic changes that were ready to differentiate eyes, tails, mouths and claw-noses out of generalized cells. (HOX genes)

His enabling environmental change came in the form of new shallow waters on emerging continental shelves – created by the breakup of the supercontinent of Gondwana, as well as an increase of oxygen in the atmosphere.

So, with genes raring to go, the warm, oxygenated shallow waters of the Laurentian continental shelf gave the perfectly hospitable environment for an acceleration of diversity. Some with five eyes and claw-noses, some with poisonous spikes and wormy bodies and some that were three-foot-long creatures with pinchers and toothed sucker-mouths… The stuff of horror movies.

Good for us that the mudslide happened.

Interestingly, though, all of our current classifications of living things (Kingdom, Phylum, Order… etc) can be traced back to ancestors of the Cambrian period. So something lasting took place there.

Why the science lesson? I have found the pattern to be broadly applicable to lots of things.

Take the Renaissance. (Trust me, I will get to technology eventually…)

It all started in Italy somewhere in the mid-1300’s. The known world had been the stage for the Crusades. Medieval times, also known as the Dark Ages, were winding down. The Catholic Church was experiencing a decline and the Pope was losing power. Not much progress in terms of society, economics or knowledge had taken place for a while.

If you want to understand what the Dark Ages were about, take a tour of the Tower of London with a Beefeater. It was all about the chopping off of people’s heads, from what I can tell.

An interesting thing was happening, though. Northern Italy – specifically Florence – was a hotbed. It was on the trade route between the Eastern Mediterranean and Europe.

This trade route brought both economic benefit and a flow of knowledge through Italy. It created a virtual continental shelf, complete with sunlight and oxygen – in the form of wealth and knowledge.

The last piece of the environmental picture is that the Medici family took power in the 1360’s, and reigned until the 1730’s. The family is famous for its support of art and architecture. And, by chance, also had a big successful banking enterprise. See how the building blocks assembled themselves?

Then an interesting thing happened. France and England fought the Hundred Years’ War, effectively ruining the Western side of the trade route. On the other end of the line the Turks took Constantinople and the Greek scholars fled to the West, many landing in Italy. By 1453 much of the economic power from Europe and the brain power from the Eastern Mediterranean were concentrated in Northern Italy. It was a good time to be there.

So the selective forces of war and power concentrated wealth and knowledge in Northern Italy and the greatest diversity of Art, Architecture and, ultimately, Science was born.

It’s also interesting to note that the men who mastered the emerging activities of the day were deemed “Renaissance Men” and were skilled in all of the relevant fields.

One of the most amazing things about the Renaissance is that we currently live under the structures and celebrate the products of the period. Specifically, our current definitions of the Arts, Sciences, Architecture and Finance were defined then and there. Renaissance men such as DaVinci and Michelangelo gave rise to a new form of excellence and innovation that crossed boundaries. Gutenberg, Copernicus, Descartes and Galileo were all men of the Renaissance. It can be argued that there have been no equals to these characters. And there may never be. At least in those specialties.

Where does this leave us?
I’m going to make the case that we are currently experiencing a flourish similar to that captured in the Burgess Shale and the Renaissance.

Our building blocks are different in form but nearly identical in function. Our environment is no less rich than that of the continental shelves of the Cambrian or Northern Italy at the end of the Dark Ages.

Our expansion is different, too, yet very similar. Our lives, and the lives of every human from now forward, have been fundamentally restructured by the developments of our age.

Out of this time we will define the structure for countless future generations. We will give the stage to our own polymaths and, ultimately, be subject to natural selection processes that dampen the flourish and determine the “winners” for the future.

Stay tuned to LuckyRobot.

Comments · Tags: Instructive History


Interesting gas price tidbit….

December 16th, 2008 · Comments

I snagged this from Swivel.com. Great site for datageeks.

Interesting how miles traveled is resilient to gas prices. Driving demand looks highly inelastic with respect to gas prices… I wonder how high/low gas prices have to go to change behavior…

BTW, this data is a little old.

Vehicle Miles Traveled vs. Gasoline Prices

Also, I am crafting a new post right now. It’s required more research than I expected, look for it soon. As a teaser - it compares our conditions today to those that conspired to create the Renaissance… very interesting comparison.

Comments · Tags: Uncategorized


Semantics, Search and Big Honking Databases

December 5th, 2008 · Comments

In 2003, when I had been heading up AOL Search for a while, we began to bang structured data into search results pages on a query-by-query basis. We used pre-formatted JavaScript templates that were selected based on keywords, and filled from the freshest, most relevant data we could find.

In today’s parlance we had widgetized search. We called them widgets back then, too. The entire system had to be built from scratch, and it was both costly and time consuming to build.

It was worth it. For the user, this meant that a search for “Turkey Recipe” would pull up – amazingly – a turkey recipe right at the top of the page. A search for “Austin Powers” would take your zipcode plus Moviefone data and present you with reviews and showtimes with a single click to purchasing tickets at the theaters near you.
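For readers who want a feel for the mechanics, here is a minimal sketch of keyword-triggered widgets, written in Python rather than the JavaScript templates we used at the time. Everything in it is hypothetical: the trigger sets, the template strings, the fetcher functions and the placeholder data are invented for illustration and are not the AOL system.

```python
from string import Template
from typing import Callable, Dict, List, Optional, Set, Tuple


def fetch_recipe(query: str) -> Dict[str, str]:
    # Placeholder data; a real system would call a recipe content feed.
    return {"title": "Roast Turkey", "minutes": "180"}


def fetch_showtimes(query: str) -> Dict[str, str]:
    # Placeholder data; a real system would combine zipcode and listings data.
    return {"title": "Austin Powers", "theater": "Example Cinema", "time": "7:30 PM"}


# Each widget: trigger keywords, a pre-formatted template, and a data source.
WIDGETS: List[Tuple[Set[str], Template, Callable[[str], Dict[str, str]]]] = [
    ({"recipe"}, Template("Recipe: $title (about $minutes min)"), fetch_recipe),
    ({"austin", "powers"}, Template("$title at $theater, $time"), fetch_showtimes),
]


def render_widget(query: str) -> Optional[str]:
    """Pick the first widget whose trigger keywords all appear in the query."""
    words = set(query.lower().split())
    for triggers, template, fetch in WIDGETS:
        if triggers <= words:
            return template.substitute(fetch(query))
    return None  # no widget matched; fall back to plain results
```

In this sketch a search for "Turkey Recipe" would select the recipe template and fill it from the recipe source, while a query with no matching trigger simply gets ordinary results.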

Users and press raved.

It really was a big deal. In fact, we had this “search programming” on 20% of all queries, across all known categories – sports, autos, entertainment… This was the “Google and More” plan that allowed AOL to go into a relationship with the future juggernaut with confidence – we were going to use real content and clever editors to build an experience well beyond what the bluelinks could provide. It was great being Google’s biggest partner as they entered the world of paid search.

This search editorial program wasn’t an accident. I, as well as several of my contemporaries, had been working on the opportunity for several years by then. (In fact, you could claim that when Ram Shriram’s Junglee announced “the Internet Is the Database” this all began.)

It grew out of a few different streams of activity that had been going on – at AltaVista the Web Search team had been doing a small version of this and I was working on getting shopping data into results completely structured and pre-widgetized. Lycos had been trying things out as well. (AltaVista and AOL people may remember our friend Tim Robinson, who was a major visionary behind this.)

This program is exactly why I went to AOL. AOL had just merged with Time Warner and an entirely new range of content – content from a REAL media company – would be available for search enrichment. What an amazing opportunity. My biggest frustration at AltaVista was the lack of content resources to enrich the experience – to keep people from flowing straight through the system and out to the Web without adding real value.

To make an already too long story shorter, we (AOL) made a boatload of cash on search with Google and the TW merger cratered. I moved on to implement a similar vision within the bounded vertical of Finance and News. That left the world’s most evolved and enriched version of search – AOL’s Fullview – in the hands of aggressive cost cutting, and Fullview was determined to be off strategy. No sour grapes at all. Just a missed opportunity.

Since then, former colleague Jason Calacanis has gone on to create Mahalo under a similar premise, Wikipedia has evolved into an amazing resource, and Google is still pumping out bluelinks, plus just a little more.

Anyway, the title promises that this post is about semantics, and it is. This is just necessary background.

The opportunity is still there, whether in search or in the online world at large to create a virtual fabric of content that can be experienced (browsed and searched) – and even more importantly assembled on-the-fly - based on its relatedness.

What do I mean by that? Whether in search or in socially-relevant widgets or in feed aggregation, we need to link and connect content based on its MEANING and not keyword similarity. Specifically – When I am looking at a Microsoft earnings report and I see related links to “gates” I want it to be about the person, not the thing. The same applies to java, apple and about six million other things. We live in an ambiguous world.

For this to become reality, two things need to happen:

  1. Content needs to be accessible in a format that is native to its type of data. For example, the fundamental information about Microsoft (P/E, market cap, etc) will be in fields, just like a spreadsheet. MSFT’s price history is going to be formatted into a huge list of Bid, Ask and transaction prices (gross generalization). News about the latest earnings will be in text blobs. You can’t index this data with a traditional crawler, and it can’t all be mushed into a single format without losing the unique value.
  2. There needs to be a consistent way, across formats of data, to call out and associate similar items. MSFT is related to Microsoft is related to Steve Ballmer is related to Bill Gates. With this type of linking, we can then understand the interrelatedness of things. In its simplest form, this is semantics.
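As a rough illustration of the second point, here is a small Python sketch that links surface terms to canonical entities and uses the surrounding context to decide which "gates" is meant. The entity ids, the alias table and the scoring rule are all assumptions made up for this example; no real semantic engine is being described.

```python
from typing import Dict, List, Optional, Set

# Hypothetical entity table: canonical id -> surface forms and related entities
ENTITIES: Dict[str, Dict[str, Set[str]]] = {
    "company:microsoft": {
        "aliases": {"microsoft", "msft"},
        "related": {"person:bill_gates", "person:steve_ballmer"},
    },
    "person:bill_gates": {
        "aliases": {"gates", "bill gates"},
        "related": {"company:microsoft"},
    },
    "person:steve_ballmer": {
        "aliases": {"ballmer", "steve ballmer"},
        "related": {"company:microsoft"},
    },
    "thing:gate": {"aliases": {"gate", "gates"}, "related": set()},
}


def candidates(term: str) -> List[str]:
    """All entity ids whose aliases include the term."""
    key = term.lower()
    return sorted(eid for eid, e in ENTITIES.items() if key in e["aliases"])


def resolve(term: str, context_terms: List[str]) -> Optional[str]:
    """Pick the candidate most related to unambiguous entities in the context."""
    options = candidates(term)
    if len(options) <= 1:
        return options[0] if options else None
    # Only context terms with exactly one meaning count as evidence.
    context_ids = {c[0] for c in map(candidates, context_terms) if len(c) == 1}
    scores = {eid: len(ENTITIES[eid]["related"] & context_ids) for eid in options}
    best = max(scores.values())
    winners = [eid for eid, score in scores.items() if score == best]
    # A tie means the context gives no basis for choosing, so stay undecided.
    return winners[0] if len(winners) == 1 else None
```

With "MSFT" in the context, "gates" resolves to the person because the two entities are linked; with no helpful context the function declines to guess, which seems safer than picking a meaning at random.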

Now, to the point of this post.

If you look above, there are two things that need to happen for the content/information experience on the Web to be dramatically improved: we need content in a universally accessible repository (or repositories) and we need a technique for connecting it all together. Also, the web is evolving and we now have the challenge of making that all socially aware and realtime.

Let’s take those two chunks separately.

Big Honking Database of Content - In the last 18 months we have seen an amazing set of resources applied against this.

  • Freebase is the first company of note. It promises to be a huge content stash in the sky and is funded to do it. Very very promising.
  • Amazon released public datasets this week. So now if you want economic and scientific data, it’s there. And you can make your own data available via AWS if you allow it to be freely accessible. This is a HUGE step.
  • Fluidinfo and Terry Jones. It seems fashionable to say “I know Terry Jones” these days. Here’s why: Terry has quite possibly created the database to handle structured, semistructured, tagged and attributed, social and realtime data. In its native format. That’s why pundits like Tim O’Reilly and Robert Scoble are openly excited about Terry and Fluidinfo. Terry and I have spent many hours together contemplating this. (see, I know Terry Jones too!)

Semantic technologies – There is more work to be done here, but it’s on the way. First of all, to fit into the broad model I have laid out, the semantic tagging technology needs to be at the tool or platform level. This rules out most of the activity in the semantic space.

Here’s what I mean: if you want to create, say, a music fan-site application that pulls together artist bios, discography, tour reviews from the Web, user generated content and the ability to purchase both tickets and CDs, you would assemble the content and then you’d need a semantic tool to tag and generate the connections between bands, releases, tour dates and purchasing.

You can’t do that with a semantic application that only provides related links or delivers search results on only the information in its own index/database. You need a tool you can run on all of the sources to generate consistent metadata. Not that Hakia, Twine, Powerset (now MSFT) and Zemanta aren’t useful, but they’re individual applications on top of a semantic engine. Builders need access to the engine itself in order to build a wide range of products and open up the power of the technology.
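As a rough illustration of what "a tool you can run on all of the sources" buys you, the sketch below runs one tagging function over three separate sources for the music fan-site example and then uses the shared tags to find connections. The band, the release, the vocabulary and the substring matching are all invented stand-ins. A real semantic engine would do far more than look up phrases.

```python
from collections import defaultdict

# Hypothetical shared vocabulary: lowercase phrase -> canonical tag.
# The band and release names are made up for this example.
VOCAB = {
    "the placeholders": "band:the_placeholders",
    "first record": "release:first_record",
}


def tag_text(text):
    """Return the canonical tags whose phrase appears in `text`."""
    lowered = text.lower()
    return sorted(tag_id for phrase, tag_id in VOCAB.items() if phrase in lowered)


def build_index(sources):
    """Run the same tagger over every source: tag -> {(source, position)}."""
    index = defaultdict(set)
    for source_name, docs in sources.items():
        for position, text in enumerate(docs):
            for tag_id in tag_text(text):
                index[tag_id].add((source_name, position))
    return index


def connected(index, tag_a, tag_b):
    """Documents that carry both tags, wherever they came from."""
    return sorted(index.get(tag_a, set()) & index.get(tag_b, set()))


# Three independent sources, as a fan site might assemble them.
SOURCES = {
    "bios": ["The Placeholders formed in a garage."],
    "reviews": ["The Placeholders played First Record in full on tour."],
    "store": ["Buy First Record on CD."],
}

if __name__ == "__main__":
    index = build_index(SOURCES)
    print(sorted(index["release:first_record"]))
    print(connected(index, "band:the_placeholders", "release:first_record"))
```

Because every source goes through the same tagger, the review and the store listing end up carrying the same release tag. That consistency is what a closed application with its own index cannot give a builder.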

Unsurprisingly, I am highly in favor of the OpenCalais approach by ThomsonReuters.

The best part is that v 4.0 of Calais will be releasing the “Linked Data Cloud.” It goes after this in a truly powerful way, providing users not only the ability to get their own tags, but to see how those tags relate conceptually to other things in the OpenCalais data model. Rocket Science.

But this post isn’t about OpenCalais either.

This post is about finding opportunity in the world that is evolving.

If I haven’t lost you yet, and you can agree that content+semantic tagging is useful, you can see that there are some problems and opportunities.

  • Completeness of data – DMOZ (aka The Open Directory) used to be under my domain at AOL. I never could invest in it because the community management model was flawed: communities only want to curate the things they’re interested in (thanks to Andrew Cohen for the analogy). Investment in the infrastructure was only going to feed the weediness and patchiness of the garden. Freebase is showing signs of content spottiness and AWS will too. It’s an issue of primary importance. So I think there is an emerging opportunity to provide curation on top of these open services. Like Red Hat to Linux.
  • Quality of data – Just like completeness, if anyone can publish to the datasets, there’s a risk of problematic information. This is where branding and the associated quality control come in. The opportunity is for companies who create content to establish and promote their brand as a sign of quality. Quality wins out over crap time and time again.
  • Universality of tags – Zemanta is admirably trying to get a tagging standard adopted across semantic engines. Whether by agreement (standard) or market leadership (default) the emerging content world will benefit from consistent tags to operate on. More things will be “connectable.”
  • Applications Applications Applications! – This is where I get excited. Really excited. After 15+ years of helping people find what they’re looking for using technology, I scan the horizon and see the building blocks to finally get it done. We (the tech community at large) now have raw content feeds, open and free databases, functionality APIs, open source platforms and development methodologies that free up our minds to think about how users really want their content. We are right on the edge of being able to build what we can imagine – quickly and cheaply. We’ve got the tools to measure it and the social context to present it with personal relevance.

Jason Calacanis recently posted about the responsibility we all have to push forward through this downturn with 120% effort… The part I found specifically valuable was where he calls on entrepreneurs and those with the resources to get out there and start something. I can agree with that.

So, the tools are there and hopefully I’ve given you at least one way to think about it… I am definitely making my bets on where this is going and will probably join in on the app-building side soon.

And yes, this ties in with the Splintering of Media. I’ll get to that soon…

(* Photo “Sound and Vision” copyright Rogiro from Flickr)

Tags: Uncategorized · search