So what is the big deal about Regulated Document Management?

There has been so much discussion about this topic lately that I feel compelled to write about it.  We have so many customers asking: can SharePoint be validated?  The answer is a resounding YES!  My colleague Les Jordan has written about this extensively in his Blog.  We even have a Guidance on Configuring SharePoint for Part 11 Compliance, and there is also a very useful recorded Webcast that our partner NextDocs has made available recently.

There is so much FUD spread by our competitors about SharePoint, which we need to address again and again.  Obviously, it is in their best interest to maintain the status quo, and for their customers to keep on paying for their gold-plated, expensive and complex legacy systems.  There is a whole generation of Informatics people whose careers were built on these systems, and reason and common sense have often given way to ‘religious debates’ about repositories.  I have spoken at the DIA EDM Conference for the last two years about the topic, and to some people I may sound like a heretic when I say ‘It is not about the repository!’.  And then I usually add ‘It is more about the Metadata’ (see my other Blog postings about this).

The fact of the matter is that legacy document management systems were initially designed to overcome the limitations of file systems, i.e. the lack of version control, metadata, object-level permissions, audit trails etc.  And then when the FDA’s guidance on 21 CFR Part 11 was published in March 2000, companies paid through the nose to upgrade their document management systems to be 21 CFR Part 11 compliant.  This was a really big deal to them, and cost millions.  Of course it was a big deal, because these systems store all the critical content that a pharmaceutical company has to send to the FDA to get their drugs approved.  
This is their lifeline, and companies were willing to pay any amount to be compliant, and not to delay the approval of their drug by even a single day (a single day of delay could mean millions in lost revenue).

But despite all this, they are still using these ‘glorified file systems’.  I do not mean to trivialize the importance of document management, because file systems are clearly not suitable for compliant applications.  However, there is no reason why these legacy document management systems should be so complex and expensive!  When I came to work for Microsoft, I was really excited by the power of SharePoint as a platform.  I saw huge potential to build the next-generation document management systems on the SharePoint platform.  So one of the first things I set out to do was to write a White Paper called Enterprise Content Management in Regulated Industries, so we could establish our vision.  Here is also another excellent White Paper on Compliance: Compliance Features in the 2007 Microsoft Office System.  The next step was to realize our vision, and to recruit partners to develop SharePoint-based solutions for 21 CFR Part 11 applications.  When we first demoed our solutions back in 2007, even the analysts started taking notice.  Today, we have several large pharmaceutical companies who have already validated SharePoint on their own, or via some of our systems integrator (SI) partners.  See the recently published Case Study about Roche Diagnostics, where they replaced Documentum with SharePoint for all validated IT documentation.  We recently released another Case Study on Affymetrix, who also replaced Documentum with SharePoint.

I also need to address some misconceptions around validation.  First of all, there is no such thing as FDA-validated software.  Any software vendor who states otherwise is showing their ignorance.  Validation is the responsibility of the customer.  And it is not the software alone that needs to be validated, but the whole environment, which includes hardware, software and even internal processes.  And then there is the question of why one would want to validate SharePoint itself.  For sure, SharePoint needs to be a ‘validatable’ platform, which it is.  But instead of validating all of SharePoint, only the application that runs on SharePoint for a particular GxP purpose needs to be validated.  This means that the application needs to be built, and then validated.

However, I am strongly opposed to building one-off applications for several reasons.  I have found that in over 95% of the cases, pharmaceutical customers need the same kind of capabilities.  Therefore, why not use off-the-shelf applications, and configure them?  I always recommend considering this approach first!  This way, the costs of building the application, and developing the validation test scripts and protocols are amortized over many customers, and the overall costs are far less.  Unfortunately, many companies who are used to the old ways of doing things still don’t seem to understand this.  There are several partners who have built off-the-shelf or ready-to-deploy SharePoint-based solutions for 21 CFR Part 11 compliance: NextDocs, OrniPoint, Qumas, FirstPoint by CSC Life Sciences, Montrium, Court Square, GxPi, and several additional solutions are coming along nicely.

Among the off-the-shelf solutions, NextDocs has been gaining a tremendous amount of market momentum, and they have done a superb job with their applications.  I love to see people’s faces when they get a demo of the solution, compare it with the old legacy systems they are struggling with, and get really excited.  It really brings out the best of SharePoint.  I also love their slogan ‘Compliance without the complexity’ – it is spot on!  They have just posted a series of recorded Webcasts on their Site – they are great!  I know that we also have to be realistic, because these highly customized legacy systems are so deeply embedded within corporations that they cannot be just ripped out and replaced overnight.  That is in nobody’s interest, and way too disruptive.  However, there are many GxP applications where people have a real need for such solutions, and are still doing everything on paper because they just cannot afford these gold-plated ancient solutions.  Validation alone for these legacy solutions could take 6-9 months and cost up to 7 figures, whereas we have several cases where one of the above off-the-shelf solutions has been installed and validated within a matter of weeks!  As CIOs are under increasing pressure to cut costs, there are no more sacred cows, and they will be looking at every single legacy system they can replace, and save millions in the process.  I know of several major pharmaceutical companies that have spent in excess of $50 million just upgrading and consolidating their legacy ECM systems.  A brand new implementation of a SharePoint-based ECM system (including migration, or integration) would be a small fraction of this.  And now they have just locked themselves in for the next 5-10 years, and are at the mercy of armies of consultants to keep these monolithic behemoths running, and integrated with their other enterprise systems.  Someone with decision-making powers needs to stand up, and say ‘stop this insanity’!

And now I need to address the issue of scalability.  The legacy vendors are spreading FUD that SharePoint is not enterprise-ready and scalable.  As an example, see here for some information about Pfizer’s implementation of SharePoint.  GlaxoSmithKline has announced that they are rolling out SharePoint Online to over 100,000 users.  BMS runs on SharePoint, and so does the Univadis Site that Merck has launched.  The U.S. Air Force operates what is probably the largest Extranet in the world with over 750,000 users – built on SharePoint.

I have also posted some documents here about SharePoint scalability.  The results are amazing: scalability up to 2 million users!  And here are the latest SharePoint Server 2010 performance and capacity test results and recommendations.  I doubt that any legacy ECM system can produce similar results, when all the parameters are compared.  The main point here is that, as with any other system, it has to be architected right, and deployed in the right manner!


You are wasting time. Find out why – The cost of ineffective search

I have briefly written about this topic in earlier Blog postings, but I wanted to elaborate a bit more on the topic.  Here is an article of fundamental importance that I have kept as reference material over the years:

I think Susan Feldman at IDC is one of the leading thinkers in this area, and I completely agree with her views.  Simply put: content without context is incomplete, just as search without metadata is incomplete.  Here is another article from back in 1999:

The key highlights are as follows:

  • Metadata is one of the biggest critical success factors to sharing information.
  • Metadata can make your information sharing and storage efforts great successes, or great failures. Metadata can get you in trouble with the law, or keep you out of such trouble.
  • The alternative to metadata management is information chaos.

Even the Government is starting to understand the importance of metadata for information sharing:

I have not worked with a single company that has addressed the problems above, and that means that there is still information chaos within every single company.  I know this is a strong statement, but I am willing to stand by it!

There are some great search tools out there.  But search tools can only find what is ‘indexable’ (and a bit more, via combining it with text mining, and semantic approaches).  But this is still not enough.  I strongly believe that what is needed is to track all metadata on the object level across the enterprise, and to combine this with search results, in a faceted result set.  Why is this important?  Because Enterprise Search is not Web Search!  We are not looking for Web pages, or for ranking algorithms based on how many hyperlinks point to a Site!  We are looking for documents, and we often need to find every single one of them, for compliance or other reasons.  The only way to do this is a faceted result set, which allows us to drill down precisely in the result set.  And metadata is the ‘sorting mechanism’ that allows us to do that.  Now: what kind of metadata do we need exactly?  We need the following: taxonomy-driven metadata, folksonomy-driven metadata, user-defined metadata (on an individual level), and semantic (or meaning-based) metadata.
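To make the idea of a faceted result set a little more concrete, here is a minimal sketch in Python.  It is purely illustrative and is not how SharePoint or any particular search product works: the document records, the metadata field names (`doc_type`, `department`, `status`) and the two helper functions are all assumptions made up for this example.  It shows a keyword search returning every match, facet counts computed from the metadata, and a drill-down on one facet.

```python
from collections import Counter

# Hypothetical document records; titles and metadata fields are invented for illustration.
DOCUMENTS = [
    {"title": "Cleaning Validation SOP", "doc_type": "SOP", "department": "Quality", "status": "Effective"},
    {"title": "Validation Master Plan", "doc_type": "Plan", "department": "Quality", "status": "Draft"},
    {"title": "Equipment Validation Protocol", "doc_type": "Protocol", "department": "Manufacturing", "status": "Effective"},
    {"title": "Training SOP", "doc_type": "SOP", "department": "Quality", "status": "Effective"},
    {"title": "Computer System Validation SOP", "doc_type": "SOP", "department": "IT", "status": "Effective"},
]


def search(docs, text="", **filters):
    """Return every document whose title contains `text` and whose metadata match all filters."""
    text = text.lower()
    return [
        d for d in docs
        if text in d["title"].lower()
        and all(d.get(field) == value for field, value in filters.items())
    ]


def facet_counts(docs, fields):
    """Count how many documents in the result set carry each metadata value, per field."""
    return {field: Counter(d[field] for d in docs if field in d) for field in fields}


if __name__ == "__main__":
    results = search(DOCUMENTS, "validation")
    # Facets summarize the whole result set, so nothing is hidden behind a ranking cut-off.
    print(facet_counts(results, ["doc_type", "department"]))
    # Drilling down means repeating the search with one facet value fixed.
    for doc in search(DOCUMENTS, "validation", doc_type="SOP"):
        print(doc["title"])
```

The point of the sketch is that the drill-down is only as good as the metadata: a document with no `doc_type` value simply cannot be reached through that facet.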

The above may sound scary and complicated, no doubt.  But the good news is that a whole generation of new technologies is coming along to solve this problem.  First of all, the Office 2007 System in and of itself is a revolutionary product.  For the first time, what we have is an ‘encapsulated nugget of information’ – that means that the metadata ‘travels with the document’, given that there is a separate ‘document part’ for metadata within the document.  This combines content and context, and solves the problem all legacy ECM systems have: when a document is checked out, it knows nothing about itself any more.  The document has been removed from the system, but its metadata still reside within a database table in the ECM system.  This leads to a huge compliance-related risk that companies are not equipped to handle.  But I will admit that only a small fraction of corporate content resides in Office 2007 today.  However, we have a great set of tools to manage metadata on the back end, on an enterprise level.  As I have written about earlier, the NextPage Information Tracking Platform is able to track any content across the enterprise via its unique ‘digital threading technology’.  When all this comes together in an integrated fashion, we can finally start addressing the information chaos that has been reigning across the enterprise.  And, as I also stated earlier, it is not about technology (which is the enabler, of course), but more about people, processes and change management.  All this has to be seamless, easy to use, and the complexity has to be hidden from the end user.  But I think we are finally getting to that point.  And once we do, then the whole process of e-Discovery will become a far less onerous problem than it is today!  I know of several large companies who are spending between $10 million and $70 million just to address their e-Discovery requirements.  That is almost too hard to believe, but true.  
Of course, when we think about the amount of money involved in class action lawsuits, then we can understand their motivation.  It still boggles the mind, though.
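
For readers who want to see what ‘the metadata travels with the document’ means in practice, here is a small Python sketch.  An Office 2007 file is a ZIP package, and its core properties normally sit in a part named `docProps/core.xml`.  The sketch assumes that conventional part name rather than resolving it through the package relationships, and the demo package it builds is only a stand-in with invented values, not a complete Word document.  The function names are my own for this example.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# XML namespaces used by the core properties part of an Open XML package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def read_core_properties(package):
    """Read a few core properties from an Open XML package (a path or a file-like object).

    Assumes the part lives at the conventional location docProps/core.xml.
    """
    with zipfile.ZipFile(package) as zf:
        root = ET.fromstring(zf.read("docProps/core.xml"))
    props = {}
    for label, path in (("title", "dc:title"), ("creator", "dc:creator"), ("keywords", "cp:keywords")):
        element = root.find(path, NS)
        if element is not None and element.text:
            props[label] = element.text
    return props


def make_demo_package():
    """Build an in-memory stand-in package holding only a metadata part (values are invented)."""
    core_xml = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<cp:coreProperties xmlns:cp="{cp}" xmlns:dc="{dc}">'
        "<dc:title>Cleaning Validation SOP</dc:title>"
        "<dc:creator>J. Doe</dc:creator>"
        "<cp:keywords>GxP; validation</cp:keywords>"
        "</cp:coreProperties>"
    ).format(**NS)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("docProps/core.xml", core_xml)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # The metadata is read from the file itself, with no repository or database involved.
    print(read_core_properties(make_demo_package()))
```

Pointing `read_core_properties` at a real .docx file on disk should work the same way, as long as the file keeps its core properties in the usual place.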