Automatic Content Integration

Written February 13th, 2008 by Aaron Holbrook

Syndicated content. We’ve had the discussions about whether or not it’s beneficial, etc etc. I don’t write today to discuss the benefits vs. drawbacks – that’s a discussion for another time, and another article. I write today to ask why there isn’t a better system of integrating syndicated content into your custom content.

I mean really, we spend a ton of money on healthcare content and CMSs (bought or built) which automate a lot of the day-to-day processes and make things possible which weren’t before. Why can’t these systems automate a seemingly simple task, such as auto-tagging some content?

I currently use a system that is extremely difficult to use (not to mention frustrating). I estimate it takes me about 10 minutes to add just the tag information. Frankly, in my book, that’s unacceptable. So I went looking at what other people do. I’ve seen systems that do what mine does, albeit slightly better, and in a more user-friendly and accessible manner. Ok – that’s cool, I can live with that. Not the ideal solution, but it’ll work.

I’ve also seen a system that dumps all the categories onto one page and lets you figure out what you want to mark it as. Granted, that’s a big page – but still, I like the simplicity and the ease with which you can pick your categories (checkboxes allow you to pick more than one). Again, not ideal in the slightest, but usable. I mean – I can use something like that, but really – we’re not even on the right track.

Here’s my proposition to all you web developers and web vendors: Automatic Content Integration.

That’s right – parse that content! Read it and tell us what it relates to in the syndicated libraries (also: syndicated content companies: could we get a better/standardized tagging system? Please?).

My ideal situation would be something like this:

I sit down with a nice cup of coffee, check my email and find I have a request to add a new page of information about our Diabetes Center. I fire up my CMS, create a new page, and dump in the content. I don’t have to worry about eWebEditPro inserting random spaces, or some other such ridiculousness. I save it (or run the parser) and it comes back with suggested categories, or tags that have the closest matches to the library.

Sweet.

I select the ones that are the most relevant (notice I said ‘ones’, as in more than one) – and finish styling the article.

Done.

Now this automatic tagging would pull in classes, events, physicians, articles from the library, and of course any articles relevant to said tags. It would go in any direction – so if you’re looking at an event about Diabetes, you’d see Diabetes health information, physicians or specialists that can help you with Diabetes, and links to information about your Diabetes Center.
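To make the idea a bit more concrete, here is a rough Python sketch of the two pieces described above: suggesting tags for a new page, and pulling in related items once a human has approved the tags. Everything in it is an assumption made up for illustration: the tag library, its keywords, the sample items, and the function names `suggest_tags` and `related_items`. A real system would use the syndicated vendor's own taxonomy and a much smarter matcher than simple keyword counting.

```python
import re
from collections import Counter

# Hypothetical tag library: each tag maps to keywords that signal it.
TAG_LIBRARY = {
    "Diabetes": {"diabetes", "insulin", "glucose", "blood", "sugar"},
    "Heart Health": {"heart", "cardiac", "cholesterol", "blood", "pressure"},
    "Pregnancy": {"pregnancy", "prenatal", "birth", "maternity"},
}

# Hypothetical items that have already been tagged in the CMS.
TAGGED_ITEMS = [
    {"type": "event", "title": "Living with Diabetes class", "tags": {"Diabetes"}},
    {"type": "physician", "title": "Endocrinology staff", "tags": {"Diabetes"}},
    {"type": "article", "title": "Lowering cholesterol", "tags": {"Heart Health"}},
]


def suggest_tags(content, library=TAG_LIBRARY, limit=3):
    """Return up to `limit` (tag, score) pairs, best match first."""
    # Count how often each lowercase word appears in the page content.
    words = Counter(re.findall(r"[a-z]+", content.lower()))
    scores = {}
    for tag, keywords in library.items():
        # A tag's score is the number of keyword occurrences in the content.
        score = sum(words[keyword] for keyword in keywords)
        if score > 0:
            scores[tag] = score
    # Highest score first; ties are broken alphabetically by tag name.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:limit]


def related_items(approved_tags, items=TAGGED_ITEMS):
    """Return every item sharing at least one tag the editor approved."""
    approved = set(approved_tags)
    return [item for item in items if item["tags"] & approved]


page = ("Our Diabetes Center helps patients manage diabetes, "
        "monitor blood glucose, and adjust insulin.")

suggestions = suggest_tags(page)      # suggestions only; a human still approves
related = related_items(["Diabetes"])  # the editor kept only "Diabetes"
```

With the sample page, "Diabetes" comes back as the strongest suggestion and "Heart Health" as a weak one (it only matches on "blood"), which is exactly why the approval step matters. Once "Diabetes" is approved, the same tag pulls in the class and the physician listing, and the lookup works the same way no matter which item you start from.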

Now – this is what I’d really love to hear:

“We already do that!”

However, I don’t think I’m going to hear that. I’d even settle for someone saying: “We’re actually thinking about doing that” – but again, I’ve got my doubts.

To the best of my knowledge, nothing like this exists – which is a shame, because it’s really a simple concept. Granted, the execution may take some thought – but I’m telling all you web vendors right now, you should really look at this closely and see how this can be done, and how soon you can do it – because someone sooner or later is going to make a CMS that does this, and does it well – and I’ll tell you right now, I’d be on them every day asking them when I could sign up.

Now, tell me why it can’t be done.

-Aaron Holbrook is the Webmaster at Centegra Health System

10 Responses to “Automatic Content Integration”

  1. Capn Says:

    To go a step further, I imagine this would be a learning functionality/device as well. The more content you write, and the more you refine the “suggested meta tagging” for that content – the smarter the system should become in terms of understanding how its content writers formulate those associations.

    Obviously, this functionality would need human intervention to verify suggested tags and check for appropriateness, but besides that, I’m having a tough time coming up with a position as devil’s advocate. Besides the obvious that is – that any unattended or misled system will get out of control – but what if this system could not only learn how its content managers tag their content, but also interface with traffic reporting systems to see how the site’s visitors are getting to the content in the first place? You’d have people concentrating on writing qualified content, while the system would ‘review’ the content, determine the appropriate meta data, match that data up against historical traffic data of similar content, and then determine the best way to put the readers in touch with the content … it really looks win-win.

    This whole concept takes content management another step closer to becoming a singularity: write some content, then let it figure out what back-office ties it needs to establish to existing material. [A perfect singularity would have content writing then tagging itself. We'd be out of a job.] But this really is the next step in content management.

    I’d almost put money behind Google being the first to come up with something that addresses this; if not Google themselves, then I can picture someone building this with Google’s algorithms or search appliance.

    Sign me up.

  2. Aaron Holbrook Says:

    That truly would be amazing. However – right now, I would be happy to see someone working on a simple auto-parsing system. Of course it would definitely need an approval step – I didn’t elaborate on that, but that’s one of the key parts.

    Also, Google could probably easily do something similar to this, but what I’d really like to see is a healthcare web vendor do something like this. Because as far as I know, Google doesn’t provide web development for hospitals :P

  3. Capn Says:

    Well, no, Google doesn’t provide web dev for healthcare, but healthcare isn’t the only market using content management and tagged meta data either. I just meant Google’s algorithms could be used to first parse the content then browse existing resources to compile a list of best matches. It’s not a far stretch from what they have going on already.

    I think we may be onto something, quick Aaron – hide this thread! LOL ;)

  4. Aaron Holbrook Says:

    Hahahaha. I can only hope that someone will take this idea and run with it – that’d be spectacular.

    And if not, what are you doing next month? ;)

  5. Neal Linkon Says:

    If we’re starting a new business, count me in! What a great sounding idea. Keyword associations are as close as we’ve come to making it automatic, and that’s imperfect at best. Most of them fit, but some make no sense. There’s got to be a better way!

  6. Aaron Holbrook Says:

    Haha, why the hell not? I mean – if no one else is going to do it, why shouldn’t we?

    Shit – at least then I’d start seeing the kind of tools I’d really like to use!

    And yea, keyword associations are a half-assed attempt, imho – let’s get on the line to google! I’m gonna submit my resume to them tomorrow :P

  7. Capn Says:

    Easy tiger, you never know who’s reading the blog here. One word for ya: dooce LOL ;)

    Otherwise I’m game. Let’s build this thing. Then retire early on the vast amount of profit it creates and write blogs all day while sipping mint juleps on the veranda, overlooking the Seine … am I getting ahead of myself?

  8. Aaron Holbrook Says:

    Dooce rocks, if I get fired I’ll just start writing here full time. I mean, that’ll support me, right?

    I’m game, we should probably erase our tracks and any mention of this on the internet. Oh wait, it’s the internet… woops :P

    You can waste your time in France, I’m going to be in London looking out of my castle :D

  9. Chris Sadler Says:

    Great post, Aaron, and this is something that’s been bugging me for a long time. We’ve stayed away from 3rd-party content (well, kind of) for this very reason. When I asked a content vendor why nobody provides the content in some sort of well-tagged XML format for me to display it with my own content, the answer I got was not something I really thought about: copyright violation.

    I guess the concern is that if the syndicated content is melded into the local content, the copyright lines are lost.

    Thoughts on that?

  10. Aaron Holbrook Says:

    That is absolutely ridiculous – I don’t understand how providing it in a machine-readable (and therefore much more semantic) format violates copyright.

    This bugs me because one of Greystone’s selling points is that you can completely customize their health information. I could rip it apart if I wanted and there’d be no worries. I’m not sure on ADAM, but I’d be pretty disappointed if this was the ultimate barrier to better functionality. If so, it definitely needs to be addressed.
