In my last blog, I discussed the concept of Defensible Disposal: getting rid of data that has no value in order to lower the cost and risk of eDiscovery as well as overall storage costs (IBM has been a leader in Defensible Disposal for several years). Custodians keep data because they might need to reuse some of the content later, or because they might have to produce it later for CYA reasons. I have been guilty of this over the years, and because of that I have a huge amount of old data on external disks that I will probably never, ever look at again. For example, I have over 500 GB of spreadsheets, presentations, PDFs, .wav files, MP3s, Word docs, URLs, etc. that I have saved for whatever reason over the years. Have I ever really reused any of that data? Maybe a couple of times, but in reality it just sits there.
This brings up the subject of the Data Lifecycle. Fred Moore, founder of Horison Information Strategies, wrote about this concept years ago, referring to the lifecycle of data and the probability that saved data will ever be reused or even looked at again. Fred created a graphic showing this lifecycle of data.
Figure 1: The Lifecycle of Data – Horison Information Strategies
The above chart shows that as data ages, the probability of reuse goes down very quickly while the amount of saved data rises. Once data has aged 90 days, its probability of reuse approaches 1%, and after one year it is well under 1%.
You’re probably asking yourself: so what? Storage is cheap, so what’s the big deal? It’s true, storage is cheap. I have 500 GB of storage available on my new company-supplied laptop. I have share drives available to me. And I have 1 TB of storage in my home office. I can buy a 1 TB external disk for approximately $100, so why not keep everything forever?
For organizations, it’s a question of storage but, more importantly, a question of legal risk and the cost of eDiscovery. Any existing data could be subject to litigation and is therefore reviewable. You may recall that in my last blog I mentioned a recent report from the RAND Institute for Civil Justice that discussed the costs of eDiscovery, including the estimate that reviewing records/files accounts for approximately 73% of every eDiscovery dollar spent. Saving everything because you might someday need to reuse or reference it drives the cost of eDiscovery way up.
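To put the 73% figure in perspective, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that review cost scales linearly with the volume of data that has to be reviewed; the function name and the dollar amounts are hypothetical and do not come from the RAND report.

```python
# Share of each eDiscovery dollar spent on review (the RAND estimate cited above).
REVIEW_SHARE = 0.73


def estimated_review_savings(total_spend, fraction_disposed, review_share=REVIEW_SHARE):
    """Estimate review dollars saved by disposing of a fraction of the data.

    Assumption (not from the source): review cost is proportional to the
    volume of data reviewed, so removing 30% of the data removes 30% of
    the review cost. Real matters will not be this tidy.
    """
    if not 0.0 <= fraction_disposed <= 1.0:
        raise ValueError("fraction_disposed must be between 0 and 1")
    review_cost = total_spend * review_share
    return review_cost * fraction_disposed


# Hypothetical matter: $1,000,000 total spend, half the data disposed of beforehand.
savings = estimated_review_savings(1_000_000, 0.5)
print(f"Estimated review savings: ${savings:,.0f}")
```

Under those assumptions, disposing of half the data ahead of time would trim roughly $365,000 from a $1,000,000 matter. The exact number matters less than the direction: the less you keep, the less there is to review.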
The key question to ask is: how do you get employees to delete stuff instead of keeping everything? In most organizations the culture has always been one of “save whatever you want until your hard disk and share drive are full”. This culture is extremely difficult to change quickly. One way is to force new behavior with technology. I know of a couple of companies that only allow files to be saved to a specific folder on the user’s desktop. For higher-level laptop users, as the user syncs to the organization’s infrastructure, all files saved to that folder are copied to the user’s share drive, where an information management application applies retention policies to the data on the share drive as well as to the laptop’s data folder.
In my opinion, this extreme process would not work in most organizations because of cultural expectations. So again we’re left with the question: how do you get employees to delete stuff?
Organizational cultures about data handling and retention have to be changed over time. This includes specific guidance during new employee orientation, employee training, and slow technology changes. An example could be reducing the amount of storage available to an employee on the share or home drive.
Another example could be some process changes to an employee’s workstation or laptop. Force the default storage target to be the “My Documents” folder. In Phase 1, all files would have to be saved to the “My Documents” folder first, but could then be moved anywhere after that.
Phase 2 could include a 90-day time limit on the “My Documents” folder, so that anything older than 90 days is automatically deleted (with litigation hold safeguards in place). Any file the user did not consider important enough to move would, by default, be treated as having little value and therefore “disposable”. Phase 3 could remove the ability to move files out of the “My Documents” folder (while still letting users create subfolders with no time limit), thereby ensuring a single place for discoverable data.
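To make the Phase 2 idea concrete, here is a minimal Python sketch of what a 90-day sweep with a litigation hold check might look like. Everything in it is an assumption for illustration: the function names `find_expired` and `sweep`, the use of file modification time as the measure of age, and a litigation hold represented as a simple set of file names. A real information management product would handle holds, auditing, and permissions very differently.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86_400


def find_expired(root, max_age_days=90, held=(), now=None):
    """Return top-level files in `root` older than `max_age_days`.

    Assumptions (hypothetical, for illustration only):
    - age is measured from the file's last modification time;
    - `held` is a collection of file names under litigation hold;
    - subfolders are skipped, mirroring the "no time limit" subfolders.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    held_names = set(held)
    expired = []
    for path in sorted(Path(root).iterdir()):
        if not path.is_file():
            continue  # leave subfolders and their contents alone
        if path.name in held_names:
            continue  # never touch anything under litigation hold
        if path.stat().st_mtime < cutoff:
            expired.append(path)
    return expired


def sweep(root, max_age_days=90, held=(), dry_run=True, now=None):
    """Delete expired files, or only report them when `dry_run` is True."""
    expired = find_expired(root, max_age_days, held, now)
    if not dry_run:
        for path in expired:
            path.unlink()
    return [path.name for path in expired]
```

Defaulting to `dry_run=True` is deliberate: the sweep only reports what it would delete until someone explicitly turns deletion on, which fits the slow, phased rollout described here.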
Again, this strategy needs to be a slow progression to minimize the perceived changes for the user population.
The point is that it’s an end-user problem, not necessarily an IT problem. End users have to be trained, gently pushed, and eventually forced to get rid of useless data…