Do organizations really have formal information disposal processes…I think NOT!


Do organizations regularly dispose of information in a systematic, documented manner? If the answer is “sure we do”, do they do it via a standardized and documented process or “just leave it to the employees”?

If they don’t…who cares – storage is cheap!

When I ask customers if they have a formal information disposal process, 70 to 80 percent of the time the customer will answer “yes” but when pressed on their actual process, I almost always hear one of the following:

1.    We have mailbox limits, so employees have to delete emails when they reach their mailbox limit
2.    We tell our employees to delete content after 1, 2, or 3 years
3.    We store our records (almost always paper) at Iron Mountain and regularly send deletion requests

None of these answers rises to the level of an information governance and disposal process. Mailbox limits only push employees into stealth archiving, i.e., moving content out of the organization’s direct control. Instructing employees to delete information without enforcement and auditing is as good as telling them nothing at all. And storing paper records at Iron Mountain does nothing for the 95%+ of organizational data that is electronic.

Data center storage is not cheap. Sure, I can purchase 1 TB of external disk at a local electronics store for $150, but that 1 TB is not equal to 1 TB of storage in a corporate data center. The $150 price also doesn’t include annual support agreements, the cost of allocated floor space, the cost of power and cooling, or IT resource overhead including nightly backups. Besides, storage is not the biggest cost facing organizations that don’t actively manage their information.

The astronomical costs arise when considering litigation and eDiscovery. A recent RAND survey found that it can cost $18,000 to review 1 GB of information for eDiscovery. Because many legal cases involve collecting and reviewing terabytes of information, the cost per case can run into the millions of dollars.
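To make that arithmetic concrete, here is a rough back-of-the-envelope sketch. It assumes the $18,000-per-GB figure cited above applies uniformly, which real matters will not, and it treats 1 TB as 1,000 GB. The function name is mine, for illustration only.

```python
# Review cost per gigabyte, taken from the RAND figure cited above.
COST_PER_GB_USD = 18_000


def review_cost_usd(gigabytes: float, cost_per_gb: float = COST_PER_GB_USD) -> float:
    """Estimate the eDiscovery review cost for a given data volume."""
    return gigabytes * cost_per_gb


# Assumption: 1 TB is treated as 1,000 GB.
for size_gb in (1, 100, 1_000):
    print(f"{size_gb:>5} GB -> ${review_cost_usd(size_gb):,.0f}")
```

Even if only a fraction of the collected data ever reaches human review, a single terabyte puts the estimate well into seven or eight figures.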

So what’s the answer? First, don’t assume information is cheap to keep. Data center storage and IT resources are expensive, require staff to keep running, and consume floor space. Second, information has legal risk and cost associated with it. The collection and review of information for responsiveness is time consuming and expensive. The legal risks associated with unmanaged information can be even more costly. Imagine your organization is sued. One of the first steps in responding to the suit is to find and secure all potentially responsive data. What would happen if you didn’t find all relevant data, and it was later discovered that you didn’t turn over some information that could have helped the other side’s case? The judge can overturn an already decided case, issue an adverse inference, assign penalties, and so on. The withholding or destruction of evidence is never good and always costs the losing side a lot more.

The best strategy is to put policies, processes and automation in place to manage all electronic data as it is created and to dispose of data that is no longer required. One solution is to put categorization software in place to index, understand and categorize content in real time by the conceptual meaning of the content. Sophisticated categorization can also find, tag and automatically dispose of information that doesn’t need to be kept anymore. Given the amount of information created daily, automating the process is the only definitive way to answer ‘yes we have a formal information disposal process’.
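As a minimal sketch of what "automated, documented disposal" could look like once content has been categorized: the categories, retention periods, and the Record fields below are all hypothetical, not taken from any particular product or regulation. The two points worth noticing are that a legal hold always overrides the schedule, and that every disposal decision produces an audit entry.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention schedule: category -> number of days to keep.
RETENTION_DAYS = {
    "contract": 7 * 365,
    "invoice": 5 * 365,
    "transitory": 90,
}


@dataclass
class Record:
    record_id: str
    category: str
    created: date
    legal_hold: bool = False


def is_disposable(record: Record, today: date) -> bool:
    """True when the retention period has lapsed and no legal hold applies."""
    if record.legal_hold:
        return False
    days = RETENTION_DAYS.get(record.category)
    if days is None:
        return False  # unknown category: keep it and leave it for human review
    return today - record.created > timedelta(days=days)


def disposal_run(records, today):
    """Return audit entries (record_id, category, reason) for disposable records."""
    return [
        (r.record_id, r.category, f"retention of {RETENTION_DAYS[r.category]} days elapsed")
        for r in records
        if is_disposable(r, today)
    ]


records = [
    Record("r1", "transitory", date(2023, 1, 10)),
    Record("r2", "contract", date(2023, 1, 10)),
    Record("r3", "transitory", date(2023, 1, 10), legal_hold=True),
    Record("r4", "uncategorized", date(2010, 1, 1)),
]
for entry in disposal_run(records, today=date(2024, 1, 10)):
    print(entry)
```

In this example only r1 is flagged: r2 is still inside its retention period, r3 is on legal hold, and r4 has no category, so it is kept rather than guessed at.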

Information Governance and Predictive Coding


Predictive coding, also known as computer-assisted coding or technology-assisted review, refers to software that uses machine learning algorithms to learn from records presented to it (usually by human attorneys) which types of content are potentially relevant to a given legal matter. After the attorneys provide a sufficient number of examples, the technology is given access to the entire potential corpus (records/data) to sort through it and find records that, based on its “learning”, are potentially relevant to the case.

This automation can dramatically reduce costs because computers, instead of attorneys, conduct the first-pass culling of potentially millions of records.
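The train-then-cull loop can be illustrated with a toy example. This is only a sketch: it assumes a simple word-count naive Bayes model as a stand-in for whatever algorithm a commercial predictive coding product actually uses, the four "attorney-coded" seed documents are invented, and it expects at least one relevant and one non-relevant example in the seed set.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def train(labeled_docs):
    """labeled_docs: list of (text, is_relevant) pairs coded by reviewers."""
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, label in labeled_docs:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocab


def relevance_score(model, text):
    """Log-odds that a document is relevant; positive means 'likely relevant'."""
    word_counts, doc_counts, vocab = model
    score = math.log(doc_counts[True] / doc_counts[False])
    totals = {label: sum(word_counts[label].values()) for label in (True, False)}
    for word in tokenize(text):
        if word not in vocab:
            continue  # words never seen in training carry no evidence
        # Add-one smoothing so a word unseen in one class does not zero out the score.
        p_rel = (word_counts[True][word] + 1) / (totals[True] + len(vocab))
        p_not = (word_counts[False][word] + 1) / (totals[False] + len(vocab))
        score += math.log(p_rel / p_not)
    return score


# Invented seed set standing in for the attorneys' training examples.
seed = [
    ("pricing agreement with distributor signed", True),
    ("revised distributor contract pricing terms", True),
    ("lunch menu for friday", False),
    ("office party parking reminder", False),
]
model = train(seed)

# First-pass cull: rank the unreviewed corpus, most likely relevant first.
corpus = ["draft pricing terms for new distributor", "friday parking lot closed"]
for doc in sorted(corpus, key=lambda d: relevance_score(model, d), reverse=True):
    print(f"{relevance_score(model, doc):+.2f}  {doc}")
```

The pricing document scores positive and the parking notice negative, so only the former would go forward to attorney review. The sketch also shows why the training set matters so much: the model can only recognize vocabulary it has been shown.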

Predictive coding has several very predictable dependencies that must be addressed before it can be accepted as a useful and dependable tool in the eDiscovery process. First, which documents/records are used to “train the system”, and who chooses them? This training selection will almost always be conducted by attorneys involved with the case.

The second dependency revolves around the number of documents used for the training. How many training documents make up a sample large enough for a dependable process?

And most importantly, do the parties have access to all potentially relevant documents in the case to draw the training documents from? Remember, potentially relevant documents can be stored anywhere. For predictive coding, or any other eDiscovery process, to be legally defensible, all existing case-related documents need to be available. This requirement highlights the need for effective information management by everyone in a given organization.

As the courts adopt, or at least experiment with, predictive coding, as Judge Peck did in Monique Da Silva Moore, et al., v. Publicis Groupe & MSL Group, Civ. No. 11-1279 (ALC)(AJP) (S.D.N.Y. February 24, 2012), an effective information management program will become key to the courts adopting this new technology.

The ROI of Information Management


Information, data, electronically stored information (ESI), records, documents, hard copy files, email, stuff. No matter what you call it, it’s all intellectual property that your organization pays individuals to produce, interpret, use and export to others. After people, it’s a company’s most valuable asset, and it has many CIOs, GCs and other responsible parties asking: What’s in that information, who controls it, and where is it stored?

In simplest terms, I believe that businesses exist to generate and use information to produce revenue and profit.  If you’re willing to go along with me and think of information in this way as a commodity, we must also ask: How much does it cost to generate all that information? And, what’s the return on investment (ROI) for all that information?

The vast majority of information in an organization is not managed, not indexed, not backed up and, as you probably know or could guess, is rarely, if ever, accessed. Consider for a minute all the data in your company that is not centrally managed and not easily available. This data includes backup tapes, share drives, employee hard disks, external disks, USB drives, CDs, DVDs, email attachments sent outside the organization and hardcopy documents hidden away in filing cabinets.

Here’s the bottom line: If your company can’t find information or doesn’t know what it contains, it is of little value. In fact, it’s valueless.

Now consider the amount of money the average company spends on an annual basis for the production, use and storage of information. These expenditures span:

  • Employee salaries. Most employees are in one way or another hired to produce, digest and act on information.
  • Employee training and day-to-day help-desk support.
  • Computers for each employee
  • Software
  • Email boxes
  • Share drives, storage
  • Backup systems
  • IT employees for data infrastructure support

In one way or another, companies exist to create and utilize information. So… do you know where all your information is and what’s in it? What’s your organization’s true ROI on the production and consumption of information across the entire organization? How much higher could it be if you had complete control of it?

As an example, I have approximately 14.5 GB of Word documents, PDFs, PowerPoint files, spreadsheets, and other types of files in different formats that I’ve either created or received from others. Until recently, I had 3.65 GB of emails in my email box both on the Exchange server and mirrored locally on my hard disk. Now that I have a 480 MB mailbox limit imposed on me, 3.45 GB of those emails are now on my local hard disk only.

How much real, valuable information is contained in the collective 18 GB on my laptop? The average number of pages of information contained in 1 GB is conservatively 10,000. So 18 GB of files equals approximately 180,000 pages of information for a single employee that is not easily accessible or searchable by my organization. Now also consider the millions of pages of hardcopy records existing in file cabinets, microfiche and long term storage all around the company.
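The page estimate is simple multiplication, sketched here using the sizes quoted above and the 10,000-pages-per-GB figure, which is itself a rough conservative average. The function name is just for illustration.

```python
# Conservative average quoted above; real page counts vary widely by file type.
PAGES_PER_GB = 10_000


def estimated_pages(gigabytes: float, pages_per_gb: int = PAGES_PER_GB) -> int:
    """Rough page count for a given volume of files."""
    return round(gigabytes * pages_per_gb)


files_gb = 14.5   # Word documents, PDFs, PowerPoint files, spreadsheets, etc.
email_gb = 3.65   # email, most of it now only on the local hard disk
total_gb = files_gb + email_gb

print(f"{total_gb:.2f} GB is roughly {estimated_pages(total_gb):,} pages")
```

Rounding the combined 18.15 GB down to 18 GB gives the 180,000-page figure used in the text.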

The main question is this: What could my organization do with quick and intelligent access to all of its employees’ information?

The more efficient your organization is in managing and using information, the higher the revenue, and hopefully the profit, per employee will be.

Organizations need to be able to “walk the fence” between not impeding the free flow of information generation and sharing, and having a way for the organization as a whole to find and use that information. Intelligent access to all information generated by an organization is key to effective information management.

Organizations spend huge sums of money to generate information…why not get your money’s worth? This future capability is the essence of true information management and much higher ROIs for your organization.