From healthcare to employee intranets, to financial services and non-profits, one of the most common challenges a growing organization faces is data sprawl. This is the phenomenon where data that should live in one place spreads across multiple applications and documents. For example, an organization may write part of its customer data (name, address, etc.) in a sales database application, part in an accounting spreadsheet, and part in a project dossier document. Worse, it is not uncommon for the same field (e.g. address) to appear in more than one of these places.
What are the dangers of data sprawl?
This redundancy leads to inefficiency and invites human error, for instance when an address is updated in one place but not in another. Multiply this by the large number of data fields an organization typically accumulates for a customer, and for many other entities, and the organization has a big challenge on its hands.
General Data Protection Regulation (GDPR) compliance is another critical consideration. GDPR is a broad set of rules and rights protecting the personal data of individuals in the EU. It takes effect on May 25, 2018, with fines of up to 4% of worldwide annual revenue for the most serious violations. Research by Citrix cites data sprawl as one of the major obstacles to GDPR compliance.
How can we manage these risks?
The following steps can be taken to mitigate data sprawl. It's a good idea to set up this effort as a project in a project tracking system such as Jira, Trello, or Asana.
Step 1. Create a document listing all data items currently being captured across all applications, spreadsheets, and documents. A spreadsheet works well but is not a must.
Step 2. Add a column to the document from step 1 that lists the multiple possible names for any data item. For example, a data item labeled "Address" in one app can be labeled "Addr." in another.
Step 3. Work with your organization's stakeholders to prioritize the highest-value data items to simplify. Simplification can mean assigning only one writer of the data item or even deleting the data item. The team should also record which stakeholders benefit from each data item.
Then, for each high-priority data item, determine whether there is more than one app/sheet/doc that the data item is manually written to. If so:
Step 4.1 Determine with stakeholders which app/sheet/doc should be the only one the data item is written to. The rest should only read from that source.
Step 4.2 Work with stakeholders to create the action items that will get the data item to the desired state – one app/sheet/doc that gets the manual write, the rest automatically reading from that source.
Step 4.3 Determine whether the data item has a different name across apps/sheets/docs. If so, determine with stakeholders what the official name should be, and the action items needed to modify the apps/sheets/docs to use the official name. Further, define naming conventions with stakeholders, for example, "always use 'University', not 'U' or 'Univ'". Place the results in an official naming conventions reference document.
Step 4.4 Create a project ticket, including:
- Tasks as defined in the previous steps
- Acceptance criteria
- The final arbiter, i.e. the stakeholder who is most impacted by the change
Step 4.5 Execute the project ticket
Step 4.6 With stakeholders, perform an impact assessment on the change. Confirm that automatic data reads are working properly and that stakeholders have been properly notified of any name changes.
Step 4.7 If this is the first iteration, perform a retrospective with stakeholders and tweak the process if need be.
Step 4.8 Go back to Step 4.1 and repeat for every remaining high-priority data item.
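If the inventory from Steps 1 and 2 is kept in a structured form, the check that precedes Step 4.1 (is this data item manually written in more than one place?) can be automated. The following is a minimal Python sketch under stated assumptions, not a prescribed implementation: the `Entry` record, the `ALIASES` table, and the source names such as `sales_db` are all hypothetical and only illustrate the idea.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One row of the Step 1 inventory (hypothetical structure)."""
    source: str         # app, sheet, or doc where the label appears
    label: str          # name used in that source, e.g. "Addr."
    manual_write: bool  # True if people type the value into this source


# Step 2: map each known label (lowercased) to one official name.
# These mappings are illustrative assumptions.
ALIASES = {
    "address": "Address",
    "addr.": "Address",
    "customer name": "Customer Name",
    "name": "Customer Name",
}


def canonical(label: str) -> str:
    """Return the official name for a label, or the trimmed label if unmapped."""
    cleaned = label.strip()
    return ALIASES.get(cleaned.lower(), cleaned)


def multi_writer_items(entries):
    """Group entries by official name; keep items manually written in 2+ sources."""
    writers = defaultdict(set)
    for entry in entries:
        if entry.manual_write:
            writers[canonical(entry.label)].add(entry.source)
    return {item: sorted(srcs) for item, srcs in writers.items() if len(srcs) > 1}


inventory = [
    Entry("sales_db", "Address", True),
    Entry("accounting_sheet", "Addr.", True),
    Entry("project_dossier", "Address", False),  # reads only, not a writer
    Entry("sales_db", "Name", True),
]

print(multi_writer_items(inventory))
# {'Address': ['accounting_sheet', 'sales_db']}
```

Each item in the result is a candidate for Steps 4.1 through 4.6: pick one of the listed sources as the single writer and make the others read from it.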
The above steps will go a long way toward giving you better control over your data. Keep in mind that while a data item can be redundantly written across documents and applications, it can sometimes be redundantly written within a single document or application as well.
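Redundancy inside a single spreadsheet can be spotted the same way, by comparing column headers against the naming conventions document. Below is a small Python sketch assuming the conventions are stored as a mapping from each official name to its known variants; the `CONVENTIONS` table and the sample headers are hypothetical.

```python
from collections import defaultdict

# Hypothetical naming conventions: official name -> lowercase variants.
CONVENTIONS = {
    "Address": {"address", "addr.", "addr"},
    "University": {"university", "univ", "u"},
}


def official_name(label: str) -> str:
    """Return the official name for a header, or the trimmed header if unknown."""
    key = label.strip().lower()
    for official, variants in CONVENTIONS.items():
        if key in variants:
            return official
    return label.strip()


def duplicate_columns(headers):
    """Return official names that appear under more than one column header."""
    seen = defaultdict(list)
    for header in headers:
        seen[official_name(header)].append(header)
    return {name: cols for name, cols in seen.items() if len(cols) > 1}


# Column headers of one (made-up) spreadsheet.
headers = ["Customer", "Address", "Invoice No.", "Addr.", "Univ"]

print(duplicate_columns(headers))
# {'Address': ['Address', 'Addr.']}
```

Here "Address" and "Addr." are flagged as the same data item captured twice in one sheet, while "Univ" is recognized but not flagged because it appears only once.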
If you’d like to learn more about managing data sprawl in your organization, please contact us. At Mediacurrent, we've invested in Data and Business Analytics personnel and processes to make digital transformation a reality for our clients. We’re happy to chat with you!