June 17, 2024

00:19:46

Document Migration - E84

What Counts?

Show Notes

2024 Episode 84 – What are the important steps for matching documents to data to ensure accurate document migration? How do you find all the places where documents are stored to migrate the appropriate documents? Join Information Governance Consultants, Maura Dunn and Lee Karas, as they explain important steps for document migration. Each episode contains important information gained through our experience working with companies across various industries and we talk about how you can apply this experience to your company. Episode length 00:19:46.

Episode Transcript

[00:00:01] Speaker A: How do you even find the documents? Hello. Thank you for joining us. This is What Counts, a podcast created by Trailblazer Consulting. Here we highlight proven solutions developed through our experience working with companies across various industries, and we talk about how you can apply these solutions to your company. We share our experience solving information management challenges, like creating and implementing a records retention schedule, creating an asset data hierarchy, or helping with email management. This is Lee, and in this episode, Maura and I will talk about how to find those darn documents. Maura, how do you want to handle this?
[00:00:40] Speaker B: Well, it's a challenge, I think we've said before, because it's so easy for people to save multiple copies of documents. Some of our listeners might remember from early episodes, when we talked about email and attachments, how easy it is to proliferate the number of documents that are out in your company's servers, SharePoint sites, Google Drives. Wherever you're storing documents, it is highly likely that there are multiple copies in multiple places. Because if you leave it up to individuals to figure out where to put things, a couple of different things happen. First, no two people are going to do it the same way. Second, the same person is not going to do it the same way on more than one occasion unless they've thought through the process and the organizational structure, and that's been proven over and over again. It actually started from the world of library science and the idea that when you look into cataloging something, some things are straightforward, some aren't, and where there's variation in what's possible, people will pick a different answer every other time, something like that. So that's an inherent challenge in the way that we manage documents today, because "manage" is a very loose word for the way that people handle their documents.
So that's our first challenge, and that's kind of universal to documents. But the specific one that I want to talk about today in this episode is this: okay, we've got a database, and it's got data in it about these documents. They might be contracts, they might be business case documents for capital projects, they might be some kind of report that is produced on a regular basis, say quarterly reports. And you've got a database that contains some data about them. And now you're coming into, maybe it's a new system, or maybe you're just trying to impose some structure and discipline, and you want to better connect the data and the documents. So that was the angle I was going to take. What do you think?
[00:02:52] Speaker A: I think that's fine. I was just going to throw in another example, like HR documents. You have your HRIS system, and then everybody's got a copy of their own evaluation, let's put it that way, or the various documents that would be out there for that.
[00:03:08] Speaker B: Yeah, that's a great point, because it's not only the person who has their own copy, which every person you can expect will keep their own copies. The bigger challenge is when you have a supervisor and they've got copies of all their direct reports' reviews, and maybe it's a two-level supervisor, and that second person also has copies of all the reviews for their direct reports and the second level. And maybe there's been some back and forth, some collaboration, some drafts. Suddenly you've got 27 different versions of something. You've got documents in the supervisor's, say, OneDrive or Google Drive. You've got documents with the second-line managers; they're keeping them. Maybe they kept them on their hard drive because they were offline when they worked on it. And you've got everybody with their own. Okay, so what do we do now? We know this is a problem. We don't really have our arms around the problem.
So the first thing is to think about your stakeholders, and we've talked about that a lot in multiple episodes. Who's creating these documents and what are they doing with them? What's their role, who should be holding them, and what's this system that you're trying to match it up with? So if you think about the HR example, you've got an HR department that up until now has maybe been working in paper. They're printing out the reviews and getting people to sign them, and then they're putting them in a hard copy folder. Hard copies are relatively easy to take care of compared to electronic copies, but now you've also got this HRIS system that you want to use because it's going to manage, say, payroll or promotions or some statistical things where you have to be reporting on how many employees you have, making sure that you've got their payroll taxes right, making sure that you've got their benefits right. And you've got this system, but up until now, you haven't used your system for the reviews.
[00:05:12] Speaker A: You've done those separately, or it assigns appropriate training classes as next steps of.
[00:05:17] Speaker B: Moving people along could be training, which also might get you into a learning management system, which introduces a whole new set of documents that are probably in folders somewhere, although in this case probably electronic folders. And those have also got different requirements. You need to keep them for the employee, to say, I offered this training to these employees and they took what they took. But depending on the environment that you're in, the industry that you're in, you might also need to keep those trainings to prove that you managed your facility appropriately, that whenever you were doing maintenance or repairs or inspections, you had appropriately trained staff who were performing those. That means you're going to have to keep those training records longer than the staff, potentially.
So that's a great example, a hard one and a real-life one that infrastructure companies deal with all the time. So our first step is still: find your stakeholders. The second step is a hard step, it's a manual step: ask them, where are you putting things? And I know you remember, and we've mentioned before, so our loyal listeners might have heard about that company that we worked with. We asked them about their recruiting process, and we were trying to map it out, and it was a pretty simple process. They got resumes in, they sent them around to the interviewers, they hired people, and they thought, no problem, why do you want to start with that? And we said, oh, here's why: you're keeping these resumes in six places. And they were shocked, because everyone was just keeping them just in case, or because they forgot that they should get rid of them, or they just didn't know what to do. And the thing about resumes is that there are actually rules around how long you should keep them. And they have privacy information that is sensitive, and you need to be careful with it. And that was a small group that was doing that. Think about that in a bigger company. So ask the question, get the answer. We recently asked a company a question about tax forms that they were getting in from their suppliers through their process, which they were like, oh, it's all automated, we're using our ticketing system to make sure everything gets set up correctly. But again, they were saving some tax forms in multiple locations across the company. So that's the first thing: you found your stakeholders. Second thing: ask them where they're putting their documents. The third step is a tricky one. Everything about this is hard, I keep wanting to say that. So the third step is make a chart: all the locations, all the types of documents that are there.
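One way to keep the chart described here is as a plain table of rows, one per location and document type. The Python sketch below is only an illustration: the folder names, the record category, and the expected content types are all hypothetical stand-ins for what your stakeholders and your retention schedule would actually tell you. It groups the rows by document type and reports which expected types have not turned up in any location yet.

```python
from collections import defaultdict

# Hypothetical inventory rows: (location, record category, document type).
inventory = [
    ("//share/hr/training", "Training", "training materials"),
    ("//share/ops/safety", "Training", "training materials"),
    ("OneDrive/supervisor", "Training", "training materials"),
]

# Hypothetical expected content types per record category,
# as a retention schedule might list them.
expected = {
    "Training": {
        "training materials",
        "attendance sheets",
        "certificates of completion",
    },
}


def locations_by_type(rows):
    """Map (category, document type) to the set of locations holding it."""
    found = defaultdict(set)
    for location, category, doc_type in rows:
        found[(category, doc_type)].add(location)
    return found


def missing_types(rows, expected_types):
    """Return expected document types that appear in no location at all."""
    found = locations_by_type(rows)
    gaps = {}
    for category, types in expected_types.items():
        absent = sorted(t for t in types if (category, t) not in found)
        if absent:
            gaps[category] = absent
    return gaps


if __name__ == "__main__":
    for key, places in locations_by_type(inventory).items():
        print(key, "found in", len(places), "location(s)")
    print("Not found yet:", missing_types(inventory, expected))
```

A type that shows up in several locations is a prompt to ask which copy is the managed one; a type that shows up nowhere is a prompt to go back to the stakeholders.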
Then you go look at these locations and you say, okay, I expected to find training materials in this folder, and I found them in this folder and also in these three other folders. What I didn't find in any of these three folders were the training attendance forms, and I need those, because this is a company where I have to prove that I provided the appropriate training to manage the facility safely. So where are those? Then you do that over for every single location that somebody told you about, and also for the types of documents that you expect to find. So how do we do that? How do you know what type of document you're expecting to find? That's a rhetorical question, but you can answer it if you want.
[00:08:42] Speaker A: No, I'm good. Because this can keep spiraling. I keep wanting to say, well, go listen to our interview questions to figure out how to talk to someone. And then I keep going to finding out the folder structure. And so, yeah, there's a bunch of different episodes that I want to point to.
[00:09:01] Speaker B: Kind of, things are starting to come together, because we've been doing this for a while, and all the building blocks are needed now for these more advanced projects. So how you know this is, you look at your record categories, you look at your retention schedule, and you're saying, okay, I'm tackling HR documents and training materials. Look at my record category. What did I list as the sample documents or the likely content types that are found here? And if I'm looking at my locations that the stakeholders told me about, and I haven't found one of the key content types, I've got to go back and ask the question again. Okay, you told me all the training materials were here. I actually found them in three places, but what I didn't find yet were any certificates of completion or lists of attendees or, you know, proof that somebody went to the training. I didn't find those yet.
So where do you think those are? And it becomes a little bit of a search, a mystery, to try and track them down. So, okay, done all that. You have a giant chart that lists location and record category and document type. Now you've got to say, do we have duplicates? There are different programs that you can run to see if these are exact matches or near matches. You can look at, like, dates in the properties. You can look at file size. You can look at the last update date. You can look at versioning and naming to figure out if you've got a bunch of duplicates and which one is the right one. You may or may not want to do that. I mean, you want to do it, we all want to do it. But you may make a decision that, okay, you know what? I'm just going to move them all. I'm going to save all the versions, and I'm not going to try and de-dupe them, because it's just too much right now, and my risk is higher of losing something than it is of bringing over a duplicate. That's a judgment call, and you're going to have to base it on your particular situation. And as long as you document it and the reasons for it, and you have approval from the stakeholders and from whoever your approval chain is, then move forward.
[00:11:08] Speaker A: Can you also go to the source? Meaning, in your interview, as you were talking, you were talking to a manager, and if we're talking about the HR area, right, you started asking about all these different things. Can we go to HR? Can we go to the function and the particular process, find those documents there, and use that as the source, and forget about everyone else's?
[00:11:36] Speaker B: You can go to them and use theirs as the source to make sure that you have a complete set. You can't forget about everything.
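On the duplicate check mentioned above, exact matches are the easy case to automate. The Python sketch below is one possible approach, not a description of any particular program: it groups files by size first (cheap) and then by a SHA-256 hash of their contents (certain). The function names are made up for illustration, and near matches such as drafts or renamed versions with small edits would still need the date, version, and naming comparison described in the conversation.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file's contents in chunks so large files are not read at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root: Path) -> list[list[Path]]:
    """Return groups of files under root whose contents are byte-identical."""
    # Files of different sizes cannot be identical, so size is a cheap filter.
    by_size = defaultdict(list)
    for path in root.rglob("*"):
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue
        # Only hash files that share a size with at least one other file.
        by_hash = defaultdict(list)
        for path in same_size:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups


if __name__ == "__main__":
    for group in find_exact_duplicates(Path(".")):
        print("Identical copies:", [str(p) for p in group])
```

The output is a list to review, not a list to delete. Deciding which copy is the right one, or deciding to migrate every copy, remains the judgment call described above.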
Okay, then your risk is that you've got a set that you're managing in accordance with your policies and your retention schedule, and you've got uncontrolled documents out there that you don't know where they are. They represent a risk because they could, if they're sensitive information, and we're talking about HR records here, contain sensitive privacy information, and you don't want a breach that you didn't know about. Also, in the event of an audit or a litigation where you have to do e-discovery, you will have to go do a broader search, and then you'd have to produce those records, and that will undermine your story of, no, I have my set over here and I'm managing it. So it's not a good idea to just forget about it.
[00:12:25] Speaker A: Yeah, that makes sense.
[00:12:27] Speaker B: Okay. All right. So you've done all that. I do agree, though, that going to talk to the stakeholders is the path to approval of, I'm just going to bring them all, or I've decided this is the source, or whatever the decision is. But be comprehensive in what you're recommending. Now, we've got this database over here. Remember that we're trying to match things up, so we need to look at the documents. We now know a lot more about them than we did before. We know how old they are, we know how they're named, we know how they're organized, we know where they are. What are the key data points that we could find in our database that are going to point us to these documents? So if you have an LMS, a learning management system, you probably have training names, like the name of the training. You might also have dates. So you can use those two key pieces of data to look at your collection, your chart that tells you everything there is to know about your documents. Look for the training materials that have the matching name. Look for attendance sheets that have the matching date. If it's contract records, you found out that they're organized by contract number.
You can look for the contract number in your database. Or maybe you found out that they're organized by the vendor name, the supplier name, or the counterparty name. So look for that in your database. And basically, you're looking to match a couple of points so that you can get to, you know, it's one of those 80/20 rules. Can we match most of the documents to a record in the database, so that we can say these seven documents have something to do with this record in our database? And the goal here is not 100%, it's not perfection, because if we're looking for perfection, we will never finish any records management project. So the goal is: we've matched what we can, and we have exceptions left. We have documents that we can't figure out what they match with, and we probably have database records that we don't have any documents for. So, okay, let's take care of the ones that we matched. Let's go ahead and migrate them to the new area. We've got a new structure, we want to move them in, we've matched them up to the database records. Excellent. I'll come back to that in a second. Your exception bucket of unmatched documents and unmatched database records, depending how big it is, you could do a couple of different things. You could put somebody on it: say, you're just going to have to go look at every document and every record and figure out if they match. If it's a small batch, you can do that. The other thing you could do is treat this like any other back file system, where what we try and do is understand generally the record category and the age, and what's the maximum retention that we're going to need for these. And in this case, since we're trying to match it to database records, we might create a new record for the unmatched documents and say, these are the three key data points that we can find about these documents that don't match anything else.
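The matching step can be pictured as a lookup on a couple of key fields, with everything that fails the lookup going into an exception bucket. The Python sketch below assumes the document chart and the database export are both available as lists of dictionaries; the field names (`training_name`, `date`, `path`, `id`), the sample rows, and the function names are hypothetical. It matches on normalized key values and returns the matched documents per record plus the two exception lists.

```python
def normalize(value: str) -> str:
    """Lower-case and collapse whitespace so small naming differences still match."""
    return " ".join(value.lower().split())


def match_documents(documents, records, key_fields):
    """Match documents to database records on the given key fields.

    Returns (matched, unmatched_documents, unmatched_records), where matched
    maps a record id to the list of document paths that share its key.
    """
    # Index records by key. If two records share a key, the first one wins;
    # a real project would want to flag that collision for review.
    index = {}
    for record in records:
        key = tuple(normalize(record[field]) for field in key_fields)
        index.setdefault(key, record["id"])

    matched = {}
    unmatched_documents = []
    for doc in documents:
        key = tuple(normalize(doc[field]) for field in key_fields)
        record_id = index.get(key)
        if record_id is None:
            unmatched_documents.append(doc["path"])
        else:
            matched.setdefault(record_id, []).append(doc["path"])

    unmatched_records = [r["id"] for r in records if r["id"] not in matched]
    return matched, unmatched_documents, unmatched_records


# Hypothetical learning management system records and document chart rows.
records = [
    {"id": "LMS-1", "training_name": "Confined Space Entry", "date": "2023-03-14"},
    {"id": "LMS-2", "training_name": "Lockout Tagout", "date": "2023-05-02"},
]
documents = [
    {"path": "hr/training/confined_space_slides.pptx",
     "training_name": "confined space  entry", "date": "2023-03-14"},
    {"path": "ops/attendance_0314.pdf",
     "training_name": "Confined Space Entry", "date": "2023-03-14"},
    {"path": "misc/unknown_signin.pdf",
     "training_name": "Forklift Basics", "date": "2022-11-01"},
]

if __name__ == "__main__":
    matched, unmatched_docs, unmatched_recs = match_documents(
        documents, records, ("training_name", "date")
    )
    print("Matched:", matched)
    print("Documents with no record:", unmatched_docs)
    print("Records with no documents:", unmatched_recs)
```

Swapping the key fields for a contract number or a supplier name gives the contract variant. The two exception lists are the bucket that either gets a manual review or gets handled like a back file.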
Create those records, migrate those documents. Again, the key here is: document it, share the recommendation with all the key stakeholders and approvers, and then go ahead and move forward. You're following your principles, you're following your rules, and you're not going to get 100% unless you only have ten records. So, all right, let's do one more step, and then I have a sort of an overview thing. The one more step is, once you've migrated everything, you've got to do a validation: make sure it actually moved where you wanted it to move. And that's a typical IT task where, you know, they do counts. They had 700 documents here, there's now 700 documents over here. They had this list of properties, it matches up to this list of properties. They were matched against these database records, and they match against these database records. Typically it's a machine-driven thing where you can produce reports that show you what matches and what exceptions there are to that migration process. Those are the basic steps. Before I go into my caveat, anything you want to add, Lee?
[00:16:41] Speaker A: No, I think we need an "in summary."
[00:16:43] Speaker B: Okay, well, so the caveat, and maybe it is an "in summary," is user engagement. You'll notice that every step of the way there, we talked about stakeholders, we talked about asking questions, and we talked about involving people, because this can't be done in a vacuum. You, the records manager or the information governance person, you did not produce these records, you do not use them. You know very little about them. You know what your retention schedule says about them, you know what the policies say, but you don't know what the reality is. So you need to be working with the users all along this whole process. First, because that's going to make your life easier.
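The validation counts described above can be produced from two manifests, one exported before the move and one after. The Python sketch below assumes each manifest is a dictionary keyed by a document identifier, with the properties to compare as the value; that shape, the identifiers, the property names, and the function name are assumptions for illustration, not a description of any specific migration tool.

```python
def validate_migration(source, target):
    """Compare pre-migration and post-migration manifests.

    source and target map a document id to a dict of properties
    (for example size and matched database record id).
    """
    source_ids, target_ids = set(source), set(target)
    return {
        "source_count": len(source),
        "target_count": len(target),
        # In the source but never arrived in the target.
        "missing": sorted(source_ids - target_ids),
        # In the target but not part of the source set.
        "unexpected": sorted(target_ids - source_ids),
        # Arrived, but one or more properties changed along the way.
        "mismatched": sorted(
            doc_id
            for doc_id in source_ids & target_ids
            if source[doc_id] != target[doc_id]
        ),
    }


# Hypothetical manifests for three documents.
source_manifest = {
    "doc-1": {"size": 1024, "record_id": "LMS-1"},
    "doc-2": {"size": 2048, "record_id": "LMS-1"},
    "doc-3": {"size": 512, "record_id": "LMS-2"},
}
target_manifest = {
    "doc-1": {"size": 1024, "record_id": "LMS-1"},
    "doc-2": {"size": 2048, "record_id": "LMS-2"},  # record link changed
}

if __name__ == "__main__":
    report = validate_migration(source_manifest, target_manifest)
    for name, value in report.items():
        print(name, value)
```

Equal counts alone do not prove a clean migration, which is why the sketch also lists missing, unexpected, and mismatched documents as the exception report.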
You're going to find out more about the records, you're going to make sure that you're doing the right thing with them, you're going to meet all your requirements, and you'll have that shared responsibility and approval where it's not going to all come crashing down on you if you're wrong. The other kind of bonus for you is that migrating documents makes people mad sometimes, and they feel like they can't find what they need, and so they're going to start making their own copies, and you don't want that to happen. So the more that you engage with them and prepare them for, here's where they're going, here's how they're going to be organized, and get their input on the folder structure or the file names or the labeling or the keywords that they can search by, the happier they will be to use the new structure. They will be less likely to create their own shadow repositories full of copies of documents that you don't know about. So the user engagement is a bonus for you on both ends. You learn a lot about what you need to know about these documents to get them moved and handled correctly, and you avoid that rebellion that comes from unhappy users. So that is a very high-level view of the document migration world. It is one of those devil-is-in-the-details problems, because it's easy to do that with ten documents or 100 documents. It's hard to do it with 100,000 documents, but it can be done. You just have to apply the same principles.
[00:19:03] Speaker A: That was a lot of stuff, Maura. That seems like a hard task, that's for sure. If you have any questions, please send us an email at [email protected] or look us up on the web at www.trailblazer.us.com. Thank you for listening, and please tune in to our next episode. Also, if you like this episode, please be a champion and share it with people in your social media network. As always, we appreciate you, the listeners. Special thanks go to Jason Blake, who created our music.
[00:19:35] Speaker B: Thanks, everyone. And yes, it was a hard task, Lee, but we're not in this for easy things. That's not why we picked the exciting world of information governance.
