Infobesity in the Healthcare Industry: A Well-Balanced Diet of Predictive Governance is needed


With the rapid advances in healthcare technology, the movement to electronic health records, and the relentless accumulation of regulatory requirements, the shift from records management to information governance is increasingly becoming a necessary reality.

In a 2012 CGOC (Compliance, Governance and Oversight Counsel) Summit survey, it was found that on the average 1% of an organization’s data is subject to legal hold, 5% falls under regulatory retention requirements and 25% has business value. This means that 69% of an organization’s ESI is not needed and could be disposed of without impact to the organization. I would argue that for the healthcare industry, especially for covered entities with medical record stewardship, those retention percentages are somewhat higher, especially the regulatory retention requirements.

According to an April 9, 2013 article on ZDNet.com, by 2015, 80% of new healthcare information will be composed of unstructured information; information that’s much harder to classify and manage because it doesn’t conform to the “rows & columns” format used in the past. Examples of unstructured information include clinical notes, emails & attachments, scanned lab reports, office work documents, radiology images, SMS, and instant messages. Despite a push for more organization and process in managing unstructured data, healthcare organizations continue to binge on unstructured data with little regard to the overall health of their enterprises.

So how does this info-gluttony (the unrestricted saving of unstructured data because data storage is cheap and saving everything is just easier) affect the health of the organization? Obviously you’ll look terrible in horizontal stripes, but also finding specific information quickly (or at all) becomes nearly impossible, you’ll spend more on storage, data breaches could occur more often, litigation/eDiscovery expenses will rise, and you won’t want to go to your 15th high school reunion…

To combat this unstructured info-gain, we need an intelligent information governance solution, STAT! And that solution must include a defensible process to systematically dispose of information that’s no longer subject to regulatory or litigation hold requirements and no longer has business value.

To enable this information governance/defensible disposal Infobesity cure, healthcare information governance solutions must be able to extract meaning from all of this unstructured content, or in other words understand and differentiate content conceptually. Automated classification/categorization cannot accurately or consistently differentiate the meaning of electronic content by relying on simple rules or keywords alone. To accurately automate the categorization and management of unstructured content, a machine learning capability to “train by example” is a precondition. This ability to systematically derive meaning from unstructured content, combined with machine learning to accurately automate information governance, is something we call “Predictive Governance”.
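To make “train by example” a little more concrete, here is a minimal sketch in Python. It is only a toy stand-in: it learns word-count profiles from a handful of labeled examples and assigns new content to the most similar category, whereas real Predictive Governance products rely on far richer conceptual models. The category names, the example texts and the function names (`train`, `classify`) are all hypothetical and chosen purely for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Build one word-count profile per category from (text, category) pairs."""
    profiles = {}
    for text, category in examples:
        profiles.setdefault(category, Counter()).update(tokenize(text))
    return profiles


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(profiles, text):
    """Return the category whose learned profile is most similar to the text."""
    words = Counter(tokenize(text))
    return max(profiles, key=lambda category: cosine(words, profiles[category]))


# Hypothetical examples labeled by a human reviewer (illustration only).
EXAMPLES = [
    ("Patient presents with chest pain, ECG ordered, follow-up in two weeks", "clinical_record"),
    ("Radiology report: no fracture seen on the left wrist x-ray", "clinical_record"),
    ("Invoice for outpatient visit, insurance claim submitted to payer", "billing"),
    ("Payment received for claim, balance due from patient is zero", "billing"),
    ("Lunch menu for the cafeteria this week, pizza on Friday", "transitory"),
    ("Reminder: parking lot closed Friday for repaving", "transitory"),
]

profiles = train(EXAMPLES)
print(classify(profiles, "ECG results for patient with chest pain"))
```

In a human-supervised workflow, a reviewer would correct the misclassified items and feed them back into `train`, which is the iterative part of the idea.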

A side benefit of Predictive Governance (besides actually looking taller) is that previously lost organizational knowledge and business intelligence can be automatically compiled and made available throughout the organization.

Ask the Magic 8-Ball; “Is Predictive Defensible Disposal Possible?”


The Good Ole Days of Paper Shredding

In my early career, shred days (the scheduled annual activity where the company ordered all employees to wander through all their paper records to determine what should be disposed of) were commonplace. At the government contractor I worked for, we actually wheeled our boxes out to the parking lot to a very large truck that had huge industrial shredders in the back. Once the boxes of documents were shredded, we were told to walk them over to a second truck, a burn truck, where we, as the records custodians, would actually verify that all of our records were destroyed. These shred days were a way to actually collect, verify and yes, physically shred all the paper records that had gone beyond their retention period over the preceding year.

The Magic 8-Ball says Shred Days aren’t Defensible

Nowadays, this type of activity carries some negative connotations with it and is much more risky. Take for example the recent case of Rambus vs SK Hynix. In this case U.S. District Judge Ronald Whyte in San Jose reversed his own prior ruling from a 2009 case where he had originally issued a judgment against SK Hynix, awarding Rambus Inc. $397 million in a patent infringement case. In his reversal this year, Judge Whyte ruled that Rambus Inc. had spoliated documents in bad faith when it hosted company-wide “shred days” in 1998, 1999, and 2000. Judge Whyte found that Rambus could have reasonably foreseen litigation against Hynix as early as 1998, and that therefore Rambus engaged in willful spoliation during the three “shred days” (a finding of spoliation can be based on inadvertent destruction of evidence as well). Because of this recent spoliation ruling, the Judge reduced the prior Rambus award from $397 million to $215 million, a cost to Rambus of $182 million.

Another well-known example of sudden retention/disposition policy activity that caused unintended consequences is the Arthur Andersen/Enron example. During the Enron case, Enron’s accounting firm sent out the following email to some of its employees:

 

 

This email was a key reason why Arthur Andersen ceased to exist shortly after the case concluded. Arthur Andersen was charged with and found guilty of obstruction of justice for shredding the thousands of documents and deleting emails and company files that tied the firm to its audit of Enron. Less than 1 year after that email was sent, Arthur Andersen surrendered its CPA license on August 31, 2002, and 85,000 employees lost their jobs.

Learning from the Past – Defensible Disposal

These cases highlight the need for a true information governance process including a truly defensible disposal capability. In these instances, an information governance process would have been capturing, indexing, applying retention policies, protecting content on litigation hold and disposing of content beyond the retention schedule and not on legal hold… automatically, based on documented and approved legally defensible policies. A documented and approved process which is consistently followed and has proper safeguards goes a long way with the courts to show good faith intent to manage content and protect that content subject to anticipated litigation.
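As a rough illustration of the decision logic described above (legal hold always wins, then the retention schedule decides), here is a minimal Python sketch. The retention periods, category names and the `disposition` function are hypothetical placeholders, not legal guidance and not any particular product’s behavior; a real schedule would come from documented policies approved by counsel.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention schedule in days per category (illustration only).
RETENTION_DAYS = {
    "clinical_record": 365 * 7,
    "billing": 365 * 6,
    "transitory": 30,
}


@dataclass
class Item:
    name: str
    category: str
    created: date
    legal_hold: bool = False


def disposition(item, today):
    """Decide what to do with one item under the documented policy."""
    # Content on legal hold is never disposed of, whatever its age.
    if item.legal_hold:
        return "retain: legal hold"
    days = RETENTION_DAYS.get(item.category)
    # Unknown categories go to a human rather than being deleted silently.
    if days is None:
        return "review: no policy"
    if today - item.created > timedelta(days=days):
        return "dispose"
    return "retain: within retention period"


today = date(2013, 6, 1)
items = [
    Item("cafeteria-menu.docx", "transitory", date(2013, 1, 15)),
    Item("old-memo.msg", "transitory", date(2012, 1, 15), legal_hold=True),
    Item("claim-4471.pdf", "billing", date(2012, 3, 1)),
]
for item in items:
    print(item.name, "->", disposition(item, today))
```

Because every decision comes from the same documented rules, the output can also be logged as the audit trail that shows the process was followed consistently.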

To successfully automate the disposal of unneeded information in a consistently defensible manner, auto-categorization applications must have the ability to conceptually understand the meaning in unstructured content so that only content meeting your retention policies, regardless of language, is classified as subject to retention.

Taking Defensible Disposal to the Next Level – Predictive Disposition

A defensible disposal solution that can conceptually understand content meaning, and that incorporates an iterative “train by example” process in a human-supervised workflow, provides accurate predictive retention and disposition automation.

Moving away from manual, employee-based information governance to automated information retention and disposition with truly accurate (95 to 99%) and consistent meaning-based predictive information governance will provide the defensibility that organizations require today to keep their information repositories up to date.

Predicting the Future of Information Governance


Information Anarchy

Information growth is out of control. The compound annual growth rate for digital information is estimated to be 61.7%. According to a 2011 IDC study, 90% of all data created in the next decade will be of the unstructured variety. These facts are making it almost impossible for organizations to actually capture, manage, store, share and dispose of this data in any meaningful way that will benefit the organization.

Successful organizations run on and are dependent on information. But information is valuable to an organization only if you know where it is, what’s in it, and what is shareable or in other words… managed. In the past, organizations have relied on end-users to decide what should be kept, where and for how long. In fact 75% of data today is generated and controlled by individuals. In most cases this practice is ineffective and causes what many refer to as “covert or underground archiving”, the act of individuals keeping everything in their own unmanaged local archives. These underground archives effectively lock most of the organization’s information away, hidden from everyone else in the organization.

This growing mass of information has brought us to an inflection point; get control of your information to enable innovation, profit and growth, or continue down your current path of information anarchy and choke on your competitor’s dust.

 


 

Choosing the Right Path

How does an organization ensure this inflection point is navigated correctly? Information Governance. You must get control of all your information by employing the proven processes and technologies to allow you to create, store, find, share and dispose of information in an automated and intelligent manner.

An effective information governance process optimizes overall information value by ensuring the right information is retained and quickly available for business, regulatory, and legal requirements. This process reduces regulatory and legal risk, ensures needed data can be found quickly and is secured for litigation, reduces overall eDiscovery costs, and provides structure to unstructured information so that employees can be more productive.

Predicting the Future of Information Governance

Predictive Governance is the bridge across the inflection point. It combines machine-learning technology with human expertise and direction to automate your information governance tasks. Using this proven human-machine iterative training capability, Predictive Governance is able to accurately automate the concept-based categorization, data enrichment and management of all your enterprise data to reduce costs, reduce risks, enable information sharing and mitigate the strain of information overload.

Automating information governance so that all enterprise data is captured, granularly evaluated for legal requirements, regulatory compliance, or business value and stored or disposed of in a defensible manner is the only way for organizations to move to the next level of information governance.

Finding the Cure for the Healthcare Unstructured Data Problem


Healthcare information and records continue to grow with the introduction of new devices and expanding regulatory requirements such as The Affordable Care Act, The Health Insurance Portability and Accountability Act (HIPAA), and the Health Information Technology for Economic and Clinical Health Act (HITECH). In the past, healthcare records were made up of mostly paper forms or structured billing data; relatively easy to categorize, store, and manage. That trend has been changing as new technologies enable faster and more convenient ways to share and consume medical data.

According to an April 9, 2013 article on ZDNet.com, by 2015, 80% of new healthcare information will be composed of unstructured information; information that’s much harder to classify and manage because it doesn’t conform to the “rows & columns” format used in the past. Examples of unstructured information include clinical notes, emails & attachments, scanned lab reports, office work documents, radiology images, SMS, and instant messages.

Who or what is going to actually manage this growing mountain of unstructured information?

To ensure regulatory compliance and the confidentiality and security of this unstructured information, the healthcare industry will have to 1) hire a lot more professionals to manually categorize and manage it or 2) acquire technology to do it automatically.

Looking at the first solution, having people manually categorize and manage unstructured information would be prohibitively expensive, not to mention slow. It also exposes private patient data to even more individuals. That leaves the second solution: information governance technology. Because of the nature of unstructured information, a technology solution would have to:

  1. Recognize and work with hundreds of data formats
  2. Communicate with the most popular healthcare applications and data repositories
  3. Draw conceptual understanding from “free-form” content so that categorization can be accomplished at an extremely high accuracy rate
  4. Enable proper access security levels based on content
  5. Accurately retain information based on regulatory requirements
  6. Securely and permanently dispose of information when required

An exciting emerging information governance technology that can actually address the above requirements uses the same next-generation technology the legal industry has adopted: proactive information governance technology based on conceptual understanding of content, machine learning and iterative “train by example” capabilities.

 

The lifecycle of information


Organizations habitually over-retain information, especially unstructured electronic information, for all kinds of reasons. Many organizations simply have not addressed what to do with it, so they fall back on relying on individual employees to decide what should be kept, for how long, and what should be disposed of. On the opposite end of the spectrum, a minority of organizations have tried centralized enterprise content management systems and have found them difficult to use, so employees find ways around them and end up keeping huge amounts of data locally on their workstations, on removable media, in cloud accounts or on rogue SharePoint sites that are used as “data dumps” with little or no records management or IT supervision. Much of this information is transitory, expired, or of questionable business value. Because of this lack of management, information continues to accumulate. This information build-up raises the cost of storage as well as the risk associated with eDiscovery.

In reality, as information ages, its probability of re-use, and therefore its value, shrinks quickly. Fred Moore, Founder of Horison Information Strategies, wrote about this concept years ago.

Figure 1 below shows that as data ages, the probability of reuse goes down very quickly, even as the amount of saved data rises. Once data has aged 10 to 15 days, its probability of ever being looked at again approaches 1%, and as it continues to age that probability approaches but never quite reaches zero (Figure 1, red shading).

Contrast that with the possibility that a large part of any organizational data store has little or no business, legal or regulatory value. In fact the Compliance, Governance and Oversight Counsel (CGOC) conducted a survey in 2012 that showed that on the average, 1% of organizational data is subject to litigation hold, 5% is subject to regulatory retention and 25% had some business value (Figure 1, green shading). This means that approximately 69% of an organization’s data store has no business value and could be disposed of without legal, regulatory or business consequences.

The average employee creates, sends, receives and stores conservatively 20 MB of data per business day. This means that at the end of 15 days (about 11 business days), they have accumulated 220 MB of new data, at the end of 90 days, 1.26 GB of data and at the end of three years, 15.12 GB of data. So how much of this accumulated data needs to be retained? Again referring to Figure 1 below, the blue shaded area represents the information that probably has no legal, regulatory or business value according to the 2012 CGOC survey. At the end of three years, the amount of retained data from a single employee that could be disposed of without adverse effects to the organization is 10.43 GB. Now multiply that by the total number of employees and you are looking at some very large data stores.
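The arithmetic behind these numbers can be checked with a few lines of Python. This is a back-of-the-envelope sketch under my own assumptions: data accumulates only on business days, there are roughly 252 business days in a year, and 1 GB is counted as 1,000 MB. Under those assumptions 220 MB corresponds to 11 business days and 1.26 GB to 63 business days. The constant and function names are made up for the example.

```python
MB_PER_BUSINESS_DAY = 20       # per-employee figure used in the text
BUSINESS_DAYS_PER_YEAR = 252   # assumption: roughly 252 working days a year
NO_VALUE_SHARE = 0.69          # share with no legal, regulatory or business value (CGOC 2012)


def accumulated_gb(business_days):
    """Data accumulated by one employee, in decimal gigabytes."""
    return business_days * MB_PER_BUSINESS_DAY / 1000


three_years = accumulated_gb(3 * BUSINESS_DAYS_PER_YEAR)
disposable = three_years * NO_VALUE_SHARE

print(f"After 11 business days: {accumulated_gb(11):.2f} GB")
print(f"After 63 business days: {accumulated_gb(63):.2f} GB")
print(f"After three years:      {three_years:.2f} GB")
print(f"Disposable share:       {disposable:.2f} GB")
```

Multiplying `disposable` by the head count gives a quick estimate of how much an organization could remove from its data stores.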

Figure 1: The Lifecycle of data

The above lifecycle of data shows us that employees really don’t need all of the data they squirrel away (because its probability of re-use drops to 1% at around 15 days) and, based on the CGOC survey, approximately 69% of organizational data is not required for legal or regulatory retention and has no business value. The difficult piece of this whole process is how an organization can efficiently determine what data is not needed and dispose of it automatically…

As unstructured data volumes continue to grow, automatic categorization of data is quickly becoming the only way to get ahead of the data flood. Without accurate automated categorization, the ability to find the data you need, quickly, will never be realized. Even better, if data categorization can be based on the meaning of the content, not just a simple rule or keyword match, highly accurate categorization and therefore information governance is achievable.

Healthcare Information Governance Requires a New Urgency


From safeguarding the privacy of patient medical records to ensuring every staff member can rapidly locate emergency procedures, healthcare organizations have an ethical, legal, and commercial responsibility to protect and manage the information in their care. Inadequate information management processes can result in:

  • A breach of protected health information (PHI) costing millions of dollars and ruined reputations.
  • A situation where accreditation is jeopardized due to a team-member’s inability to demonstrate the location of a critical policy.
  • A premature release of information about a planned merger causing the deal to fail or incurring additional liability.

The benefits of effectively protecting and managing healthcare information are widely recognized but many organizations have struggled to implement effective information governance solutions. Complex technical, organizational, regulatory and cultural challenges have increased implementation risks and costs and have led to relatively high failure rates.  Ultimately, many of these challenges are related to information governance.

In January 2013, the U.S. Department of Health and Human Services published a set of modifications to the HIPAA privacy, security, enforcement and breach notification rules. These included:

  • Making business associates directly liable for data breaches
  • Clarifying and increasing the breach notification process and penalties
  • Strengthening limitations on data usage for marketing
  • Expanding patient rights to restrict the disclosure of data when they pay cash for care

Effective Healthcare Information Governance steps

Inadvertent or just plain sloppy non-compliance with regulatory requirements can cost your healthcare organization millions of dollars in regulatory fines and legal penalties. For those new to the healthcare information governance topic, below are some suggested steps that will help you move toward reduced risk by implementing more effective information governance processes:

  1. Map out all data and data sources within the enterprise
  2. Develop and/or refresh organization-wide information governance policies and processes
  3. Have your legal counsel review and approve all new and changed policies
  4. Educate all employees and partners, at least annually, on their specific responsibilities
  5. Limit data held exclusively by individual employees
  6. Audit all policies to ensure employee compliance
  7. Enforce penalties for non-compliance

Healthcare information is by nature heterogeneous. While administrative information systems are highly structured, some 80% of healthcare information is unstructured or free form.  Securing and managing large amounts of unstructured patient as well as business data is extremely difficult and costly without an information governance capability that allows you to recognize content immediately, classify content accurately, retain content appropriately and dispose of content defensibly.

Coming to Terms with Defensible Disposal; Part 1


Last week at LegalTech New York 2013 I had the opportunity to moderate a panel titled: “Defensible Disposal: If it doesn’t exist, I don’t have to review it…right?” with an impressive roster of panelists. They included: Bennett Borden, Partner, Chair eDiscovery & Information Governance Section, Williams Mullen; Clifton C. Dutton, Senior Vice President, Director of Strategy and eDiscovery, American International Group; John Rosenthal, Chair, eDiscovery and Information Management Practice, Winston & Strawn; and Dean Gonsowski, Associate General Counsel, Recommind Inc.

During the panel session it was agreed that organizations have been over-retaining ESI (which accounts for at least 95% of all data in organizations) even if it’s no longer needed for business or legal reasons. Another factor driving this over-retention of ESI was the fear of inadvertently deleting evidence, otherwise called spoliation. In fact an ESG survey published in December of 2012 showed that the “fear of the inability to furnish data requested as part of a legal or regulatory matter” was the highest ranked reason organizations chose not to dispose of ESI.

Other reasons cited included not having defined policies for managing and disposing of electronic information and, conversely, having defined retention policies that actually keep all data indefinitely (usually because of the fear of spoliation).

One of the principal information governance gaps most organizations haven’t yet addressed is the difference between “records” and “information”. Many organizations have “records” retention/disposition policies to manage those official company records required to be retained under regulatory or legal requirements. But those documents and files that fall under legal hold and regulatory requirements amount to approximately 6% of an organization’s retained electronic data (1% legal hold and 5% regulatory).

Another interesting survey published by Kahn Consulting in 2012 showed levels of employee understanding of their information governance-related responsibilities. In this survey only 21% of respondents had a good idea of what information needed to be retained/deleted and only 19% knew how  information should be retained or disposed of. In that same survey, only 15% of respondents had a general idea of their legal hold and eDiscovery responsibilities.

The above surveys highlight the fact that organizations aren’t disposing of information in a systematic process mainly because they aren’t managing their information, especially their electronic information and therefore don’t know what information to keep and what to dispose of.

An effective defensible disposal process is dependent on an effective information governance process. To know what can be deleted and when, an organization has to know what information needs to be kept and for how long based on regulatory, legal and business value reasons.

Over the coming weeks, I will address those defensible disposal questions and responses the LegalTech panel discussed. Stay tuned…

Do organizations really have formal information disposal processes…I think NOT!


Do organizations regularly dispose of information in a systematic, documented manner? If the answer is “sure we do”, do they do it via a standardized and documented process or “just leave it to the employees”?

If they don’t…who cares – storage is cheap!

When I ask customers if they have a formal information disposal process, 70 to 80 percent of the time the customer will answer “yes” but when pressed on their actual process, I almost always hear one of the following:

  1. We have mailbox limits, so employees have to delete emails when they reach their mailbox limit
  2. We tell our employees to delete content after 1, 2, or 3 years
  3. We store our records (almost always paper) at Iron Mountain and regularly send deletion requests

None of these answers rises to the level of an information governance and disposal process. Mailbox limits only force employees into stealth archiving, i.e. movement of content out of the organization’s direct control. Instructing employees to delete information without enforcement and auditing is as good as not telling them to do anything at all. And storing paper records at Iron Mountain does not address the 95%+ of the electronic data which resides in organizations.

Data center storage is not cheap. Sure, I can purchase 1 TB of external disk at a local electronics store for $150 but that 1 TB is not equal to 1 TB of storage in a corporate data center. It also doesn’t include annual support agreements, the cost of allocated floor space, the cost of power and cooling, or IT resource overhead including nightly backups. Besides, the cost of storage is not the biggest cost organizations who don’t actively manage their information face.

The astronomical costs arise when considering litigation and eDiscovery. A recent RAND survey highlighted the fact that it can cost $18,000 to review 1 GB of information for eDiscovery. And considering many legal cases include the collection and review of terabytes of information, you can imagine the average cost per case can be in the millions of dollars.
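To see how quickly that adds up, here is a small Python sketch that applies the $18,000-per-GB review figure cited above to a few collection sizes. The collection sizes and the `review_cost` helper are hypothetical, and the calculation assumes a flat per-GB cost, which real matters will not follow exactly.

```python
REVIEW_COST_PER_GB = 18_000  # USD per GB, the review figure cited in the text


def review_cost(gigabytes):
    """Estimated eDiscovery review cost in USD, assuming a flat per-GB rate."""
    return gigabytes * REVIEW_COST_PER_GB


# Hypothetical collection sizes: a single custodian, a small matter, one terabyte.
for gb in (1, 50, 1000):
    print(f"{gb:>5} GB -> ${review_cost(gb):,}")
```

Even under this simplified assumption, a one-terabyte collection lands in the tens of millions of dollars, which is why disposing of unneeded data before litigation arises matters so much.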

So what’s the answer? First, don’t assume information is cheap to keep. Data center storage and IT resources are not inexpensive, take human resources to keep up and running, and consume floor space. Second, information has legal risk and cost associated with it. The collection and review of information for responsiveness is time consuming and expensive. The legal risks associated with unmanaged information can be even more costly. Imagine your organization is sued. One of the first steps in responding to the suit is to find and secure all potentially responsive data. What would happen if you didn’t find all relevant data and it was later discovered you didn’t turn over some information that could have helped the other side’s case? The Judge can overturn an already decided case, issue an adverse inference, assign penalties etc. The withholding or destruction of evidence is never good and always costs the losing side a lot more.

The best strategy is to put policies, processes and automation in place to manage all electronic data as it occurs and to dispose of data deemed not required anymore. One solution is to put categorization software in place to index, understand and categorize content in real time by the conceptual meaning of the content.  Sophisticated categorization can also find, tag and automatically dispose of information that doesn’t need to be kept anymore.  Given the amount of information created daily, automating the process is the only definitive way to answer ‘yes we have a formal information disposal process’.

Knowledge Management is Dependent on Effective Information Governance


Last week I presented at the Janders Dean Legal Knowledge & Innovation Conference in Sydney, Australia. This conference is one of the leading knowledge management and technology forums for the legal industry in the world. The forum was extremely interesting with a great venue and agenda.

Much of the content was directed at knowledge management within law firms and corporate legal departments, i.e. how knowledge is created, collected, and shared within these organizations for maximum benefit and ROI.

The whole event was somewhat hair-raising for me in that I found out I was to travel to Sydney to speak at this forum the Thursday before the Monday I was to leave. It occurred to me on the Saturday before that I was to present at this forum and I had no idea what I was to speak on much less have the time to create an effective presentation. After looking at the agenda on-line I determined that 1) It was for the legal industry and 2) knowledge management was somehow involved.

That Saturday and Sunday I put together a presentation addressing what I thought would add to the discussion which included eDiscovery, Information Governance (because it’s the same as knowledge management, right?) and some local Australian precedents. As I landed in San Francisco on Monday to catch my flight to Sydney I noticed an email from the Janders Dean organizer asking me for my presentation so the forum laptop could be loaded and ready to go with all presentations. Thinking that for once I was ahead of the curve I happily replied to the email with my presentation.

Dreading the 15-hour flight in “Economy” I noticed the departure board at the airport was now saying my 10:30 pm flight to Sydney was delayed for 11 hours due to weather and would take off at 9 am Tuesday morning (by the way, as I boarded the next day, the crew admitted it was not weather, but an equipment problem in Chicago). As I was furiously burning up my laptop keyboard looking for a room for the night I got a very nice email from Janders Dean telling me the presentation I had sent off really didn’t hit the mark and was much too eDiscovery heavy…the audience is knowledge management professionals, not attorneys.

After getting the last available room in San Francisco (my 747 flight crew slept on the floor in the airport that night) I tried to put together something more “knowledge management (?)” focused and send it off before I got on my flight the next morning. Turns out the Janders Dean organizer (Justin North) was completely right in very politely telling me my first presentation attempt was not a fit. The forum was heavily weighted towards non-attorneys specializing in knowledge management.

The above description was a long-winded opening to allow me to get to my main point (and complain about my travel experiences), which is this: really effective knowledge management is dependent on effective information governance. The creation and dissemination of knowledge within an organization is impossible without the ability to create, store, and share useful information while disposing of useless information.

Content auto-categorization and indexing techniques are the first step in getting control of an organization’s information. If a system can conceptually understand and auto-categorize content as it occurs so that all content in the enterprise is searchable and managed via the correct retention periods including immediate deletion of useless information, then information is much more available to be turned into real knowledge within the organization.

Information Security in the Cloud


Information Governance managers as well as individuals need to be aware of possible risks when utilizing external cloud storage providers.

CNN has reported that Dropbox, the popular cloud-storage service, is investigating whether a security breach is to blame for a recent wave of spam e-mail sent to Dropbox users. Dropbox has stated that they haven’t had any reports of unauthorized activity within Dropbox accounts; the suspicion is that email addresses were taken to use for spamming purposes. Dropbox has roughly 50 million users who, according to the site, upload a billion files to the service every 48 hours. So far several users in Europe have reported spam from gambling sites sent to email addresses users created specifically for setting up Dropbox accounts.

This possible security breach brings up the question of how secure these cloud storage sites are. I for one use Dropbox and consider it a fantastic service, especially the desktop icon use model. Individuals and companies need to take the lead in ensuring their data is secure either by not utilizing these services or by securing their data before they upload it.

I always encrypt data before I upload it to any cloud storage service. I use two free encryption utilities, Kryptelite and Iron Key, both from Invsoftworks. Kryptelite allows you to encrypt files by simply dragging and dropping files onto the Kryptelite desktop icon. To decrypt the files once they’re encrypted, you must drag the encrypted file back onto the Kryptelite desktop icon and type in the file password. This means you cannot decrypt a file unless you have a running version of Kryptelite on the PC you are using at the time.

Iron Key allows you to create self-decrypting files which are completely stand-alone and can be decrypted anywhere by simply double-clicking on them and typing in the password.

Incorporating this additional encryption step into your utilization of cloud storage will add an additional layer of security beyond what the cloud storage providers are already doing.