[00:00:01] An introduction to a backfile characterization project. Hello. Thank you for joining us. This is What Counts, a podcast created by Trailblazer consultant. Here we highlight proven solutions developed through our experience working with companies across various industries. We talk about how you can apply these solutions to your company. We share our experience solving information management challenges like creating and implementing a records retention schedule, creating an asset data hierarchy, or helping with email management.
[00:00:29] This is Lee, and in this episode I will introduce a backfile characterization project. I'll go through what it is and why an organization might want to do such a project.
[00:00:42] First of all, every organization should strive to implement a comprehensive information governance program to meet business needs and address the legal and regulatory requirements associated with handling of business information.
[00:00:56] Among the many challenges facing organizations today is the proliferation of electronic information stored in email, databases, and electronic files. In addition, the paper backfile continues to expand, both through organic growth and through acquisitions.
[00:01:13] A couple key statistics brought to you by an Internet search of records management statistics.
[00:01:19] On average, documents are photocopied around 19 times. Remember that for later. And approximately 60% of office space is dedicated to paper file storage. This means that a lot of copies are floating around cubicles and offices, as well as being boxed up and sent to offsite storage.
[00:01:39] I found these and more statistics in a 2023 blog post by pdfreaderpro.com. Another statistic: around 83% of employees will recreate a document rather than spend time searching for it in their company's network. Again, just another situation showing uncontrolled information proliferation.
[00:02:01] I will interject that the time during COVID did help organizations move towards the use of electronic records and caused some organizations to realize just how much they depended on physical paper records. However, creating more electronic records doesn't necessarily mean that it's done in a controlled manner. Insert metadata tagging here, but that's another subject for another episode.
[00:02:25] Nonetheless, your information governance program should strive to get a handle on existing records, both electronic and paper, because the cost and risk of having unmanaged records is staggering. I won't go into any scare tactics by throwing out numbers related to fines, penalties, or the increased likelihood of a data breach. I'm going to assume you get what I'm talking about. Now, specifically, I want to talk about backfile records. You know, when you do a scanning project, you decide whether you want to go day forward or go backwards: the backfile. These are the records stored in a multitude of repositories that are not used on a daily basis. In fact, these records are most likely past their retention timeframe and should have been dispositioned some time ago. It happens. Large volumes of paper records and the lack of active controls governing the creation, use, and storage of electronic information cause this backfile information to grow and grow. And like I said earlier, documents are printed and photocopied around 19 times each, which adds to the amount of physical paper in storage as well.
[00:03:38] Laura and I had a brief conversation about this episode, and we compared some of the offsite storage box counts that we've seen at different organizations, and 200,000 boxes in storage is not far-fetched. From an electronic standpoint, more than five terabytes of data is also not far-fetched. Those numbers might be small for large corporations, but I wanted to be conservative. Still, that's a lot of backfile information. Double or even triple those numbers for large-scale, data-intensive organizations.
[00:04:12] Where is this backfile information typically found?
[00:04:15] What are these repositories that hold all this information? Well, you already heard me say offsite storage.
[00:04:22] Obviously that's for physical paper storage. You'd be surprised to find that many organizations have a couple of storage vendors holding their physical material. Sure, it may be necessary to use a different offsite storage vendor for different geographic locations, but there are organizations that have multiple offsite storage vendors for one location. Some even have different storage vendors and multiple storage pods. Yeah, like the moving pods packed with boxes of material.
[00:04:54] Each one of those locations is what I'm calling a repository for these particular physical records.
[00:05:01] Electronic records? Where could those be stored?
[00:05:04] Network drives, otherwise known as shared drives; personal network drives like OneDrive; cloud storage like Google Drive or Dropbox; SharePoint, or alternatives like Box, Huddle, Slack, Workplace from Meta, and Alfresco; enterprise content management systems; large active transactional systems; and even decommissioned transactional systems still hanging around.
[00:05:32] The list can go on and on when it comes to electronic records. I think you get the idea. I think you understand that there's a need to get some kind of control over where your organization stores this information. So an organization would initiate a backfile characterization project to develop a strategy and a plan for ensuring that the backfile records, in both electronic and paper format, can be accessed, maintained, or destroyed in accordance with information governance program guidelines, including the records retention schedule. Due to the large volumes of information we're talking about, the traditional approach of inventorying and classifying individual records to determine eligibility for destruction or other appropriate handling procedures is simply not feasible. Could you imagine? Maybe you could because you've had to do it. But could you imagine having to inventory and index 10,000 boxes, let alone 200,000 boxes? Honestly, I've been there, done that, and I don't want to ever do it again.
[00:06:39] From an electronic perspective, there are plenty of tools that can help get a handle on excess information.
[00:06:45] We have a homegrown inventory tool that, when pointed at a drive or location, will display all available metadata for the information and allow an individual to select what they want to do with each item, such as move, keep, or stage for deletion.
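To make the idea of such an inventory tool concrete, here is a minimal sketch in Python. It is not the homegrown tool mentioned in the episode; the names `InventoryItem`, `build_inventory`, and `set_disposition` are hypothetical, and the metadata is limited to what the file system reports (path, size, last-modified time).

```python
import os
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical set of actions a reviewer can choose for each item.
DISPOSITIONS = {"keep", "move", "stage_for_deletion"}


@dataclass
class InventoryItem:
    path: Path
    size_bytes: int
    modified: datetime
    disposition: str = "undecided"


def build_inventory(root):
    """Walk a drive or folder and collect basic file-system metadata."""
    items = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            stat = path.stat()
            items.append(
                InventoryItem(
                    path=path,
                    size_bytes=stat.st_size,
                    modified=datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc),
                )
            )
    # Sort by path so the listing is stable from run to run.
    return sorted(items, key=lambda item: item.path)


def set_disposition(item, disposition):
    """Record the reviewer's choice; nothing is moved or deleted here."""
    if disposition not in DISPOSITIONS:
        raise ValueError(f"unknown disposition: {disposition!r}")
    item.disposition = disposition
```

In this sketch, choosing a disposition only tags the item. Actually moving or deleting anything would be a separate, reviewed step.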
[00:07:01] However, nobody wants to go through the material item by item and decide, first, is it a record? Second, can I map it to the retention schedule so that I can determine how long I have to keep it or whether I can get rid of it? So a backfile characterization project is an alternative approach: characterize the backfile repositories to a level of detail that informs the decision-making process for determining the best approach to handling each backfile repository, using approaches that are field tested and supported by statistics.
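One way to picture a statistics-backed characterization is to review a random sample of a repository instead of every item, then estimate what share of the whole is past retention. The sketch below is an illustration only, not the method used in these projects. It assumes each sampled box has been assigned a "latest document year", that a single hypothetical retention period applies, and it uses a standard 95% Wilson score interval for the estimate.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


def characterize_sample(latest_years, retention_years, current_year):
    """Summarize how much of a sampled repository is past retention.

    latest_years: the most recent document year found in each sampled box
    (an assumed input produced by reviewers, not something the code derives).
    """
    n = len(latest_years)
    eligible = sum(1 for year in latest_years if current_year - year > retention_years)
    low, high = wilson_interval(eligible, n)
    return {
        "sampled": n,
        "eligible": eligible,
        "share": eligible / n,
        "ci_low": low,
        "ci_high": high,
    }
```

For example, `characterize_sample(years, retention_years=7, current_year=2024)` returns the sampled count, the number of boxes past the assumed seven-year period, and an interval for the share of the repository likely to be eligible. How wide an interval is acceptable depends on the organization's tolerance for risk.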
[00:07:39] The goal of the backfile characterization is to provide sufficient information regarding the content and age of the information contained in a specific repository to support defensible destruction. After this project is done, you have all the statistics, all the facts in front of you to make a decision. That's as far as I'm going to go in this episode. I'll get more help to dive deeper into how to accomplish this type of characterization, taking into consideration your organization's tolerance for risk and several other factors.
If you have any questions, please send us an email at [email protected] or look us up on the web at www.trailblazer.us.com.
[00:08:22] Thank you for listening and please tune in to our next episode. Also, if you liked this episode, please be a champion and share it with people in your social media network. As always, we appreciate you, the listeners. A special thanks goes to Jason Blake, who created our music.