Data mapping and strategy

Updated: Sep 6, 2020

Many of you belong to organizations that have had varying levels of success implementing governance solutions. If you tried with SharePoint, your results were probably just as mixed. Often, the governance problems with older versions arose because user adoption was so fast that ungoverned sites proliferated too easily.


Microsoft Advanced Data Governance is a fairly recent set of capabilities in Microsoft O365 that truly makes governance practical and reasonable. Deployments can still benefit from the viral nature of cloud adoption, but you still need to spend the time building a strategy, planning taxonomy and labeling structures, and defining best-practice work processes. That viral adoption can create panic in the planning process.

“Now that we have over 2,000 sites and 4,000 users in our roll-out, let’s begin planning how we are going to design this thing.”

The real requirement here is for accelerators in the strategy planning and design process so that you get on the train while it is still in the station. One of the strongest accelerators is examining your existing content to help quantify and qualify your design decisions. By existing content, I mean stuff on shared drives, in emails, in older versions of SharePoint, Documentum, Google cloud, Box, or even existing database structures. Examining it requires software that builds an index of your unstructured content so that you can begin to map and analyze it. You have spent decades creating this valuable content and you should use it to your benefit.
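To make the idea of an index concrete, here is a minimal sketch in Python. It is not the commercial indexing software discussed in this post. It only assumes that a shared drive is reachable as a local directory, and the function names `build_inventory` and `summarize_by_extension` are my own, hypothetical choices for illustration.

```python
import os
from collections import Counter
from pathlib import Path


def build_inventory(root):
    """Walk a directory tree and record basic metadata for every file."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                stat = path.stat()
            except OSError:
                continue  # unreadable or vanished file: skip it
            records.append({
                "path": str(path),
                "extension": path.suffix.lower(),
                "size_bytes": stat.st_size,
                "modified": stat.st_mtime,  # seconds since the epoch
            })
    return records


def summarize_by_extension(records):
    """Count files and total bytes per file extension."""
    counts = Counter()
    sizes = Counter()
    for rec in records:
        counts[rec["extension"]] += 1
        sizes[rec["extension"]] += rec["size_bytes"]
    return {ext: {"files": counts[ext], "bytes": sizes[ext]} for ext in counts}
```

Calling `summarize_by_extension(build_inventory(some_share_path))` gives a first rough picture of what kinds of files live on a share. A real index would also capture file contents, owners, and permissions, which this sketch leaves out.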


Data Driven Strategy


Microsoft has ventured into this “Compliance Check” space with some introductory analysis tools that can look for keywords and regular expressions. It is an excellent first step, and they are showing that they see opportunity to analyze content before building an O365 strategy. Please ask if you’d like more information on this Compliance Workshop offering from Microsoft, or if you want, you can go broader and deeper with an additional set of complementary tools.


Building a proper index lets you mock up all your content in O365 to see how information structures work. It takes a small fraction of the time it would take to try to migrate content directly. Indexing tools can plow through roughly a terabyte of content per day with a simple installation. You can then try out your queries, analytics, classifiers, regexes, and labels on your real content, and start extracting the intelligence you need to build your strategy.
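As a rough illustration of trying out regexes and keywords on real content, the sketch below counts pattern matches in a piece of extracted text. The three patterns and the label names are hypothetical examples of my own, not rules from Microsoft or from any indexing product, and real rules would need tuning against false positives.

```python
import re

# Hypothetical patterns for illustration only; real rules need tuning and validation.
PATTERNS = {
    "us_ssn_like": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email_address": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "keyword_confidential": re.compile(r"\bconfidential\b", re.IGNORECASE),
}


def scan_text(text):
    """Return the number of matches for each pattern in a piece of text."""
    return {name: len(pattern.findall(text)) for name, pattern in PATTERNS.items()}


def suggest_labels(text):
    """Return a sorted list of pattern names that matched at least once."""
    return sorted(name for name, hits in scan_text(text).items() if hits)
```

Running `suggest_labels` over a sample of documents from each share shows quickly which candidate rules fire often, which never fire, and which need refining before they are configured anywhere.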



This data-driven process can quickly accelerate the traditional interview process and provide accurate transparency into the things we normally just guess at. Design your system based on the evidence of your real content rather than exclusively on the interview process and expert advice. A data map and inventory or content assessment using data-discovery software will quickly provide you the following design accelerators (taken from this recent post by Atle Skjekkeland).


Step 1: Vision


Establish an information management vision which aligns with your business objectives. What are the goals, functions, and drivers for our O365 environment? Organizations often have a high-level notion that security risk mitigation, enhanced collaboration, or records compliance are important drivers. Recognize that the evidence for these notions is already at your fingertips. You should be asking the following types of questions of your existing content:

  • How do you know what technical issues, risks, retention categories, security gaps, or file management practices are more important than others? Can you be specific? Do you know them all? 

  • Are you storing emails, multimedia, images, engineering drawings, or other compound documents that need special treatment? Does that require additional O365 or migration capabilities?

  • Where is content left out in the open, handled carelessly, or where was it taken from when that last breach occurred?

A strategy describes how the ends (goals) will be achieved by the means (resources) and should be based on a realistic analysis of both of those. The evidence for that analysis lies in your existing shared drives, email platforms, and old SharePoint sites. 


Step 2: Critical Success Factors


Determine critical success factors for achieving your information management vision. How compliant are you with what you have done in the past? This is true of legacy ECM as well as legacy SharePoint. It is hard to know where you are going, and whether you are making progress, if you don’t know where you are. You should be asking the following types of questions of your existing content:

  • Have users been copying content or moving content into the RM repository? Have they been sending documents or links in emails? 

  • Have you accounted for all records categories, or are there clusters of uncaptured business value in unprotected locations? 

  • What level of accuracy are we achieving with our point-forward human classification effort?

An audit of current progress is quantifiable and will help make better decisions, but only if it does not take years to complete. The evidence lies in your existing shared drives, email platforms, and old SharePoint sites.


Step 3: Use Case Requirements


Determine use cases that need to be supported by specific site functionality. What data needs to be covered when you plan, prioritize and deploy sites? You should be asking the following types of questions of your existing content:


  • What site functionality should I plan for given existing hot topics, frequent discussions, active versioning, near-duplicate clusters, or unexplained duplication? One reason we find duplication between two groups is that they lack the ability to collaborate on content.

  • What other IG use cases need addressing? What kind of personal data risk do we face on shared drives from CCPA, GDPR, or other recent regulations?

  • Is there content available that may need to be accessed for other IG imperatives such as a divestiture, a security breach, a legal hold, a FOIA request, or a data subject access request?

A true data map covers many IG use cases and can get you to the right data quickly. It may or may not need to be migrated to O365 for remediation, but either way, you can find out what needs to be done. The evidence lies in your existing shared drives, email platforms, and old SharePoint sites.
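The duplication question above can be explored with very little code. This is a minimal sketch that finds exact duplicates only, by hashing file contents. Near-duplicate clustering needs other techniques (for example, comparing overlapping text fragments) and is not shown. The function names and the in-memory `documents` mapping are assumptions made for illustration.

```python
import hashlib
from collections import defaultdict


def content_hash(data):
    """Return a SHA-256 hex digest of a document's bytes."""
    return hashlib.sha256(data).hexdigest()


def find_exact_duplicates(documents):
    """Group document paths that share identical content.

    `documents` maps a path to the raw bytes of that file.
    """
    groups = defaultdict(list)
    for path, data in documents.items():
        groups[content_hash(data)].append(path)
    # Keep only hashes seen more than once, with paths sorted for readability.
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]
```

When the same file turns up in the folders of two different groups, that may be a hint that those groups need a shared site rather than separate copies.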


Step 4: Blueprint


Determine the taxonomy foundation for success. What labels and naming conventions will enhance user adoption? When users first begin to use the new environment, they will be looking for something familiar. If your labels, terms, metadata, and folder structures are foreign to them, adoption will be slower. You should be asking the following types of questions of your existing content:

  • When do users use names, nicknames, or acronyms to create or find information?

  • How and in what format are dates used to sort, classify, or act upon information?

  • Which departments should have the biggest say in determining ownership for labels and terms?

A new environment doesn’t have to be a new experience if there are familiar signposts. The evidence lies in your existing shared drives, email platforms, and old SharePoint sites.
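The question about date formats is one that file names alone can often answer. Here is a small sketch that tallies which date styles appear in a list of file names. The three formats are hypothetical examples, and the last one is deliberately ambiguous because the pattern cannot tell day-first from month-first dates.

```python
import re
from collections import Counter

# Hypothetical date styles for illustration; add the ones your users actually use.
DATE_FORMATS = {
    "YYYY-MM-DD": re.compile(r"(?<!\d)\d{4}-\d{2}-\d{2}(?!\d)"),
    "YYYYMMDD": re.compile(r"(?<!\d)(?:19|20)\d{6}(?!\d)"),
    "NN-NN-YYYY": re.compile(r"(?<!\d)\d{2}-\d{2}-\d{4}(?!\d)"),
}


def tally_date_formats(filenames):
    """Count how many file names contain each date style."""
    tally = Counter()
    for name in filenames:
        for label, pattern in DATE_FORMATS.items():
            if pattern.search(name):
                tally[label] += 1
    return tally
```

If one style clearly dominates, that is a reasonable candidate for the naming convention in the new environment, since it is already familiar to users.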


Step 5: Plan


Establish a plan with quick wins. Where should we start our deployment and what is next? With O365, deployment can be a random, ad hoc activity. To govern it well, you want to make sure everything is tied down from a governance perspective. The deployment plan will require several resources. You should be asking the following types of questions of your existing content:

  • What content types reside on what shares, storage devices, or platforms, and how much effort will be involved in moving that data to O365? 

  • Of the 40,000 SharePoint sites I already have, which ones have no unexpired records in them? 

  • Which of my 6,000 shares have accounting data versus HR data, and where should I start?

  • Which workgroup should move to the new system first to provide the greatest benefit?

Informed decisions about when these new capabilities will reach which users need to be based on an eyes-wide-open view of reality. The evidence lies in your existing shared drives, email platforms, and old SharePoint sites.
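A question like "which shares hold accounting data versus HR data" can be roughed out from file names before any deeper classification. The sketch below assumes you already have a list of (share, file name) pairs. The keyword lists and function names are hypothetical and would be replaced by terms drawn from your own content.

```python
from collections import Counter, defaultdict

# Hypothetical keyword lists; a real deployment would derive these from the content itself.
CATEGORY_KEYWORDS = {
    "accounting": ("invoice", "ledger", "budget"),
    "hr": ("payroll", "resume", "personnel"),
}


def categorize(filename):
    """Return the first category whose keyword appears in the file name, else 'other'."""
    lowered = filename.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(word in lowered for word in keywords):
            return category
    return "other"


def profile_shares(files):
    """Count files per category for each share.

    `files` is an iterable of (share_name, filename) pairs.
    """
    profile = defaultdict(Counter)
    for share, filename in files:
        profile[share][categorize(filename)] += 1
    return {share: dict(counts) for share, counts in profile.items()}
```

Sorting the resulting profile by the share with the most records-relevant content is one simple way to pick a starting point for deployment.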


Step 6: Business Case


Determine the business benefits of change. How much will the enterprise really benefit from a move to ADG in O365? Time and motion studies, surveys, and interviews are useful for prioritizing and quantifying the benefits, but they can take a long time to complete. You should be asking:

  • How many Personnel Action Forms and other HR documents were processed last year? 

  • What content should we consider populating our O365 environments with, and how long will it take?

  • How many users actually create proposals and specifications and is there a high season? 

  • How quickly are different types of content growing and how does that impact storage requirements?

A comprehensive cost justification analysis holds greater weight if the numbers used are based on actual data rather than just perceptions. The evidence lies in your existing shared drives, email platforms, and old SharePoint sites.
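As one example of basing numbers on actual data, the sketch below totals stored bytes by year of last modification. It assumes each record is a dictionary with a `modified` timestamp (seconds since the epoch) and a `size_bytes` value. Both field names are my own assumptions, and last-modified year is only a proxy for when content was created.

```python
from collections import Counter
from datetime import datetime, timezone


def bytes_by_year(records):
    """Sum file sizes by the year of last modification (UTC)."""
    totals = Counter()
    for rec in records:
        year = datetime.fromtimestamp(rec["modified"], tz=timezone.utc).year
        totals[year] += rec["size_bytes"]
    return dict(sorted(totals.items()))
```

Comparing the totals from one year to the next gives a rough growth trend per content type or per share, which can feed the storage side of the business case.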

If you build an index of all your pre-O365 ADG content, the answers to these and many more questions are available to you as evidence to support your design decisions.

Equally important, and I will expand on this in my next blog entry, you now have the capability to drastically improve the migration effort to populate content into O365 by:

  • Removing unnecessary content (eTrash or ROT) to save massive amounts of time in migration

  • Excluding content from migration that already should have expired

  • Quickly and easily developing KQL and classification elements to be configured into your O365 ADG environment

  • Applying and validating metadata and labels to content to ease the load on O365 dealing with legacy migrated data

  • Grouping and clustering content into site-appropriate packages, even though the content comes from a wide variety of platforms. 
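The first two items in that list can be sketched as a simple filter over indexed records. This is an illustration under stated assumptions, not a retention engine: the junk extensions and the seven-year cutoff are hypothetical placeholders, and each record is assumed to be a dictionary with `extension` and `modified` (seconds since the epoch) fields.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical rules for illustration; real periods come from your retention schedule.
JUNK_EXTENSIONS = {".tmp", ".bak", ".log"}
MAX_AGE = timedelta(days=7 * 365)


def is_rot(record, now):
    """Flag a record as ROT if it is a junk file type or older than MAX_AGE."""
    modified = datetime.fromtimestamp(record["modified"], tz=timezone.utc)
    return record["extension"] in JUNK_EXTENSIONS or (now - modified) > MAX_AGE


def split_for_migration(records, now=None):
    """Return (migrate, exclude) lists of records."""
    now = now or datetime.now(timezone.utc)
    migrate, exclude = [], []
    for rec in records:
        (exclude if is_rot(rec, now) else migrate).append(rec)
    return migrate, exclude
```

The size of the `exclude` list relative to the total is a quick estimate of how much migration effort could be avoided, although anything flagged this way should be reviewed before it is actually disposed of.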

Immediate access to all this information will help inform your decisions to move to O365 and allow you to achieve accurate results commensurate with the urgency and importance of O365. Our experience and use of advanced IG indexing capabilities and software should be an important part of your program. Please reach out if you are interested in hearing more about our indexing and strategy offerings for O365. 

© Infotechtion