Scientist Innovation – exciting things Microsoft is doing with Office and SharePoint

I am really excited about the capability of the Office stack as a platform: this means Office 2007 (running on OpenXML) and Microsoft Office SharePoint Server 2007 (MOSS).

First of all, there is a treasure trove of information on the projects that Microsoft Research is working on.  The work that is most relevant and exciting for the Life Sciences Scientist community is what our Scholarly Communication team is doing.  There is far too much material to cover, so I want to focus on areas of specific interest to Scientist Innovation in Life Sciences.  To begin with, there is the Research Information Center project we are working on with the British Library.  With some modification, this Portal could very well serve as a foundation for a Scientist Workbench.  Speaking of Scientist Workbenches, the Trident Scientific Workflow Workbench is absolutely amazing, and I am thinking about ways it could be integrated with LIMS systems to automate their workflows.  Microsoft Research has also released the Microsoft Biology Foundation (MBF).  I think it is particularly exciting that MBF will also run on the Azure Services Platform.

Recently, Accelrys also announced that it now supports MOSS as a publication platform for its scientific applications.  There is also a very interesting and powerful application built by some very talented people at Pfizer UK called OnePoint, a combination of OneNote and MOSS.  There is also a Blog Posting about it, and I have uploaded the video and the written case study here.

Pablo Fernicola and his team have built an impressive Ontology Add-in for Word 2007 (make sure you check out the recorded demo at the bottom of the page).  Pablo’s team has also built the Article Authoring Add-in for Microsoft Office Word 2007.  There is also the Chem4Word project.  Our Developer and Platform team has also built an amazing OBA Composition Reference Toolkit, which adds two Word add-in components that enable searching and looking up compound information using an external service and also from a local drug database.  For configuration, see the User Guide.  We are also working on the new Microsoft Semantic Engine – the demo can be watched on demand here.

Health Language has also built a Patient Journey Demonstrator on Office and Silverlight technologies that is truly amazing.

I am also very excited about the capabilities of OpenXML as a Scientific Authoring tool for E-Lab Notebooks (ELNs).  Many of our customers have invested heavily in specialized solutions which keep the data in a proprietary format.  As a result, finding or sharing the information is very difficult or outright impossible, and they are having major performance issues as well.  Some of the top pharmaceutical companies have contacted us to help them with this.  The idea is to extract and convert the relevant data to OpenXML format, and store it as Office 2007 content in SharePoint.  This way, the content can be indexed and searched, and can contribute to Corporate Knowledge instead of being buried.  We are also doing some really exciting work with Neudesic in addressing the E-Lab Notebook challenges.  In addition, we are now working with Neudesic and SchemaLogic to integrate MetaPoint with the Neudesic ELN.  This is all truly groundbreaking stuff, made possible via Office 2007 and SharePoint.
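
To make the "extract and convert to OpenXML" idea a bit more concrete, here is a minimal sketch in Python.  It is only an illustration under my own assumptions, not how Neudesic or any ELN product does it: the function names build_docx and extract_text and the sample notebook text are hypothetical, and the SharePoint upload step is left out entirely.  It writes a few plain-text notebook paragraphs into a bare-bones WordprocessingML package and then reads the text back out, which is roughly what an indexer has to be able to do.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Namespace of the main WordprocessingML vocabulary.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

XML_DECL = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'

# Declares the content type of every part in the package.
CONTENT_TYPES = (
    XML_DECL
    + '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" ContentType="application/'
    'vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    "</Types>"
)

# Points from the package root to the main document part.
ROOT_RELS = (
    XML_DECL
    + '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/'
    'officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>'
    "</Relationships>"
)


def build_docx(paragraphs):
    """Return the bytes of a minimal .docx holding one paragraph per string."""
    body = "".join(
        '<w:p><w:r><w:t xml:space="preserve">%s</w:t></w:r></w:p>' % escape(text)
        for text in paragraphs
    )
    document = (
        XML_DECL
        + '<w:document xmlns:w="%s"><w:body>%s</w:body></w:document>' % (W_NS, body)
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr("_rels/.rels", ROOT_RELS)
        package.writestr("word/document.xml", document)
    return buffer.getvalue()


def extract_text(docx_bytes):
    """Return the paragraph texts of a .docx, the way a simple indexer might."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    paragraphs = []
    for paragraph in root.iter("{%s}p" % W_NS):
        runs = paragraph.iter("{%s}t" % W_NS)
        paragraphs.append("".join(run.text or "" for run in runs))
    return paragraphs


if __name__ == "__main__":
    # Hypothetical notebook entry exported from a proprietary ELN.
    entry = ["Experiment 12: buffer preparation", "Observed pH < 7 after 10 min"]
    data = build_docx(entry)
    print(extract_text(data))
```

The point of the round trip is that nothing in the package is opaque: it is a ZIP file of XML parts, so any tool that can read ZIP and XML can get at the content without the authoring application.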

There is also the groundbreaking work being done by my friend and colleague Sam Batterman.  Sam is a real wizard at data visualization.  I never cease to be amazed by his demos.  He is working on a project called InfoMesa.  It is sort of a Collaborative Digital Scrapbook on steroids, using the whiteboard metaphor.  It uses Windows Presentation Foundation as the UI.  Sam has posted additional information here and is now looking at InfoMesa and Databases in the Cloud and how to make all this work on the Microsoft Azure Services Platform.  Amazing stuff – this can fundamentally change the way scientists collect, view, annotate, tag (yes – there is a MetaPoint play there, too...) and collaborate on data!  Sam has just posted a new Blog entry on his latest amazing endeavor: Firefly – a Network Sensemaking tool.  Where does this guy come up with this stuff?

I think Silverlight technologies have amazing potential to enhance scientific research by enabling annotation of any visual object rendered in WPF.  See here for a simple application called the InkPresenter.  I have also uploaded information to SkyDrive about the Collaborative Molecular Environment (CME) that our partner InterKnowlogy has built for the Scripps Research Institute.  This application uses SharePoint to store images, and uses WPF to add and visualize annotations.  The Case Study is also available for download.  Also check out some of the amazing applications that the InterKnowlogy team is building here.  They also recently recorded a short demo on their vision of the future of Distributed Computing – check out the demo here.


Thoughts about Federated Records Management and E-Discovery

A colleague recently sent me the link to this article:

I am currently involved in several EDRM projects.  This is a hot area right now, and all large companies are actively thinking about E-Discovery.  I would not call myself an expert at E-Discovery, but I have spent quite a bit of time speaking to the leading experts in this area and developing my own views.  I have come to the conclusion that most companies are not taking the right approach to solving the problem.  Throwing a bunch of technology at it will just end up costing a lot of money and lead to failed projects.  I have seen this time and time again in other areas.  The first things to think about are change management, setting up the right internal processes, and addressing the simple question end users always have: ‘What’s in it for me?’  Individuals can’t be made wholly responsible for corporate compliance when it comes to Records Management, for example, because you just can’t have the ‘compliance police’ checking up on everyone to make sure they apply the right classification to email and documents (although there are some really good tools on the market today that are tightly integrated with Outlook and Office, such as Titus Labs).  I tend to believe that the right way to address E-Discovery, within the larger space of Information Governance (a new term, not in Wikipedia yet...), is to have the right approach to ECM to begin with.  When it comes to E-Discovery, you certainly need all the right technology tools to support the process, and I am particularly bullish about combining our FAST Enterprise Search technology with Concept Searching to support the E-Discovery process.  However, what I am finding is that most companies are also looking to implement Federated Records Management tools as part of the EDRM project.  And this is an area I have been thinking about a lot lately.  Does Federated Records Management really work?  I actually don’t think so.
The main problem is that every single large legacy ECM vendor sees this as an opportunity for a Trojan Horse approach to putting its own repository underneath.  I do not believe in this approach at all.  Here is where I like to bring up one of my favorite quotes: “It is not about the repository, it is about the metadata”.  This is why I am such a huge fan of the promise of MetaPoint.  Why would enterprises need yet another repository, just to be able to do Records Management across the enterprise?  All this adds is unnecessary cost and complexity, which is downplayed at the beginning, until it is too late.  I believe the NextPage Information Tracking Platform gives companies a much better alternative.  This innovative technology can track documents across the enterprise (and soon even documents embedded within email), no matter where they are located!  Not only can it be used to clean up all the redundant content (I already wrote in a previous posting that, according to Cohasset Associates, each content artifact has up to 18 identical copies scattered all over the place), but it also enables companies to track down the authoritative original of a document that is subject to records retention policies, to move it to an ECM system that supports Electronic Records Management, and to get rid of the 17 other redundant copies which, when unmanaged, expose the company to liability and risk.  This approach is far more pragmatic, and simpler from an architectural perspective, than the complicated and expensive ‘dual repository’ approaches proposed by the legacy ECM vendors.  And, since SharePoint is the fastest-growing ECM platform within enterprises, and it has a pretty robust Electronic Records Management module included for free, why not just use it?
I am not claiming that it is the most advanced Electronic Records Management system there is, or that it meets every single customer’s needs (even though I also know of some really powerful enhancements that some of our partners like Applied Information Sciences and OmniRIM have built), but I do think that in combination with the NextPage Information Tracking Platform, it can present an elegant alternative to address the requirements of many Records Management scenarios.  And as I stated earlier, it is more about people, processes and change management than about technology!
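
For readers who want a feel for what tracking down identical copies involves, here is a small Python sketch.  To be clear, this is not how the NextPage Information Tracking Platform works (I am not describing its internals); it is the simplest technique I can think of, grouping files by a SHA-256 hash of their content.  The function name group_duplicates, the sample locations, and the rule that the oldest copy counts as the authoritative original are all assumptions made purely for illustration.

```python
import hashlib
from collections import defaultdict


def group_duplicates(documents):
    """Group byte-identical documents and pick one copy as the original.

    documents: iterable of (location, content_bytes, modified_timestamp).
    Assumption for illustration only: the oldest copy is treated as the
    authoritative original, and every other copy is reported as redundant.
    """
    groups = defaultdict(list)
    for location, content, modified in documents:
        digest = hashlib.sha256(content).hexdigest()
        groups[digest].append((modified, location))

    report = []
    for digest, copies in groups.items():
        copies.sort()  # oldest first; ties are broken by location
        report.append(
            {
                "digest": digest,
                "original": copies[0][1],
                "redundant": [location for _, location in copies[1:]],
            }
        )
    return report


if __name__ == "__main__":
    # Hypothetical inventory gathered from file shares, SharePoint and email.
    inventory = [
        ("fileshare/contracts/msa.docx", b"master agreement v1", 100),
        ("sharepoint/legal/msa.docx", b"master agreement v1", 250),
        ("mail/attachments/msa-final.docx", b"master agreement v1", 300),
        ("fileshare/contracts/nda.docx", b"non-disclosure", 120),
    ]
    for entry in group_duplicates(inventory):
        print(entry["original"], "->", entry["redundant"])
```

A content hash only catches byte-for-byte copies, so a document that was re-saved or lightly edited would not match.  Which copy is the record, and what happens to the rest, is still a policy decision for people and process rather than for code.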