The Lifecycle of Information – Updated

Organizations habitually over-retain information, especially unstructured electronic information, for all kinds of reasons. Many have simply not addressed what to do with it, so they fall back on individual employees to decide what should be kept, for how long, and what should be disposed of. At the opposite end of the spectrum, a minority of organizations have tried centralized enterprise content management systems and found them difficult to use. Employees find ways around them and end up keeping huge amounts of data locally on their workstations, on removable media, in cloud accounts or on rogue SharePoint sites, which are used as “data dumps” with little or no records management or IT supervision. Much of this information is transitory, expired, or of questionable business value. Because of this lack of management, information continues to accumulate. This build-up raises the cost of storage as well as the risk associated with eDiscovery.

In reality, as information ages, its probability of re-use, and therefore its value, shrinks quickly. Fred Moore, Founder of Horison Information Strategies, wrote about this concept years ago as the Lifecycle of Data. Figure 1 below shows that as data ages, the probability of reuse drops very quickly while the amount of saved data rises. Once data has aged 10 to 15 days, its probability of ever being looked at again approaches 1%, and as it continues to age that probability approaches but never quite reaches zero (Figure 1, blue shading).


Figure 1: The Lifecycle of Information

Contrast that with the possibility that a large part of any organizational data store has little or no business, legal or regulatory value. In fact, the Compliance, Governance and Oversight Council (CGOC) conducted a survey in 2012 that showed that, on average, 1% of organizational data is subject to litigation hold, 5% is subject to regulatory retention and 25% has some business value (Figure 2, green shading). This means that approximately 69% of an organization’s data store has no business value and could be disposed of without legal, regulatory or business consequences.

The average employee creates, sends, receives and stores, conservatively, 20 MB of data per business day. This means that at the end of 15 days (about 11 business days) they have accumulated 220 MB of new data, at the end of 90 days (about 63 business days) 1.26 GB, and at the end of three years (about 756 business days) 15.12 GB, if they don’t delete anything. So how much of this accumulated data needs to be retained? Referring again to Figure 2 below, the red shaded area represents the information that probably has no legal, regulatory or business value according to the 2012 CGOC survey. At the end of three years, the amount of retained data from a single employee that could be disposed of without adverse effects to the organization is 10.43 GB (69% of 15.12 GB). Now multiply that by the total number of employees and you are looking at some very large data stores.
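The arithmetic behind these figures can be checked with a short sketch. The 20 MB per day rate and the CGOC percentages come from the text above; the business-day counts (11, 63 and 756) are an assumption, chosen because they reproduce the quoted totals, and the 1,000-employee headcount is purely hypothetical.

```python
# Figures from the text: 20 MB per employee per working day, and the
# 2012 CGOC survey shares (1% legal hold, 5% regulatory, 25% business value).
MB_PER_BUSINESS_DAY = 20
VALUED_PERCENT = 1 + 5 + 25                 # data with some kind of value
DISPOSABLE_PERCENT = 100 - VALUED_PERCENT   # the remainder: 69

# Assumption: business-day counts that reproduce the article's totals
# (about 11 working days in 15 calendar days, 63 in 90, 252 per year).
BUSINESS_DAYS = {"15 days": 11, "90 days": 63, "3 years": 3 * 252}


def accumulated_mb(business_days: int) -> int:
    """Data one employee accumulates if nothing is ever deleted, in MB."""
    return MB_PER_BUSINESS_DAY * business_days


def disposable_gb(business_days: int) -> float:
    """Portion of that data with no legal, regulatory or business value, in GB."""
    return accumulated_mb(business_days) * DISPOSABLE_PERCENT / 100 / 1000


def organization_disposable_tb(employees: int, business_days: int) -> float:
    """Disposable data across a whole (hypothetical) workforce, in TB."""
    return disposable_gb(business_days) * employees / 1000


if __name__ == "__main__":
    for label, days in BUSINESS_DAYS.items():
        total_gb = accumulated_mb(days) / 1000
        print(f"{label}: {total_gb:.2f} GB kept, {disposable_gb(days):.2f} GB disposable")
    # Hypothetical headcount, for illustration only.
    print(f"1,000 employees, 3 years: {organization_disposable_tb(1000, 756):.1f} TB disposable")
```

Under these assumptions a single employee carries a little over 10 GB of disposable data after three years, and a 1,000-person organization roughly 10 TB.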


Figure 2: The Lifecycle of Information Value

The Lifecycle of Information Value graphic above shows us that employees really don’t need all of the data they squirrel away (because its probability of re-use drops to 1% at around 15 days) and, based on the CGOC survey, that approximately 69% of organizational data is not required for legal or regulatory retention and has no business value. The difficult piece of this whole process is how an organization can efficiently determine what data is not needed and dispose of it using automation (because employees probably won’t). As unstructured data volumes continue to grow, automatic categorization of data is quickly becoming the only realistic way to get ahead of the data flood. Without accurate automated categorization, the ability to quickly find the data you need will never be realized. Even better, if data categorization can be based on the value of the content, not just a simple rule or keyword match, highly accurate categorization, and therefore information governance, is achievable.


“Move to Manage” versus “Manage in Place”

Traditional approaches to information management are, generally speaking, no longer suited to today’s information management needs. The legacy “move-to-manage” premise is expensive, fraught with difficulties and at odds with modern data repositories that (a) are cloud-based, (b) have built-in governance tools, or (c) contain data that best resides in its native repository.

In reality, traditional records management and ECM systems only manage a small percentage of an organization’s total information. An implementation is often considered successful if it manages 5% of the information that exists. What about all the information not deemed a “record”?

Traditional archiving systems tend to capture everything and, for the most part, cause organizations to keep their archived information for much longer periods of time, or forever. Corporate data volumes and the data landscape have changed dramatically since archiving systems became widely adopted. Some organizations are discovering the high cost of getting their data out, while others are experiencing end-user productivity issues, incompatible stubs or shortcuts, and a lack of support for the modern interfaces through which users expect to access their information.

The unstructured data problem, along with the emerging reality of the cloud, has brought us to an inflection point: either continue to use decade-old, higher-cost and complex approaches to manage huge quantities of information, or proactively govern this information where it naturally resides in order to identify and organize it more effectively and to achieve the best possible outcomes for security, compliance, litigation response and innovation.

Today’s enterprise-ready hardware and storage solutions as well as scalable business productivity applications featuring built-in governance tools are both affordable and easily accessible. For forward-thinking organizations, there is no question that in-place information management is the most viable and cost-effective methodology for information management in the 21st century.

An Acaevo white paper on the subject can be downloaded here