Data sprawl is the uncontrolled growth of the data an organization produces. It is natural for data to get larger over time (e.g. newer cameras have better resolution), but some data growth is unnatural and excessive. The good news is that data sprawl is preventable with a little bit of know-how and effort.
Unchecked data sprawl has two negative consequences: cost and security.
Cost of Data Sprawl
While consumer storage is cheap these days (a one-terabyte hard drive costs around $60), keep in mind that enterprise-grade storage is expensive. High-end servers use high-speed, high-transfer-rate, high-reliability drives that often cost 5-10 times as much as the lower-grade drives available at your nearest office supply store. In addition, data has to be backed up, which means additional storage, especially if there is a requirement for long-term retention. The longer the retention period, the greater the cost. In a typical enterprise, it can take 2-3 times more storage to properly back up and retain data to meet retention policies.
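To make the multipliers above concrete, here is a rough back-of-the-envelope sketch in Python. It rests on assumptions that go beyond the figures quoted above: the function name and parameters are hypothetical, backup and retention storage is treated as an extra 2-3 times the primary capacity on top of the primary copy, and backup storage is priced at the same enterprise rate as primary storage. Real pricing will vary.

```python
def storage_cost_range(primary_tb,
                       consumer_price_per_tb=60.0,
                       enterprise_multiplier=(5, 10),
                       backup_multiplier=(2, 3)):
    """Return a (low, high) dollar estimate for storing primary_tb terabytes.

    Assumptions (illustrative only):
      - enterprise drives cost 5-10x the consumer price per terabyte
      - backups and retention need an extra 2-3x the primary capacity
      - backup capacity is priced like primary enterprise capacity
    """
    base = primary_tb * consumer_price_per_tb
    # Total capacity = primary copy (1x) + backup/retention copies (2-3x).
    low = base * enterprise_multiplier[0] * (1 + backup_multiplier[0])
    high = base * enterprise_multiplier[1] * (1 + backup_multiplier[1])
    return low, high


if __name__ == "__main__":
    low, high = storage_cost_range(10)
    print(f"10 TB of primary data: ${low:,.0f} to ${high:,.0f}")
```

Under these assumptions, 10 TB of primary data that would cost about $600 on consumer drives comes out at somewhere between $9,000 and $24,000 once enterprise hardware and backups are included.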
Security and Data Sprawl
The second concern with data sprawl is security. As people copy data from centralized storage onto local media and cloud storage for portability, that data becomes harder to track and control. When employees leave, their data may not be properly purged from their personal phones, online accounts, or cloud storage accounts. As the cloud makes it easier for non-technical personnel to implement Shadow IT projects, data is extended beyond the corporate boundary, where it may not be secured or properly retained. Sometimes data exists outside the corporate boundary without the knowledge of managers, IT personnel, or compliance officers.
Now that we know some of the negative effects of data sprawl, what can we do?
For starters, educate your employees about data sprawl. If they understand that the resources required to store, back up, and retain data are costly, they may make a greater effort to reduce data sprawl. Ask them to favor common sense over personal convenience. For example, I have had people e-mail me high-resolution photographs of computer error codes that were fewer than five words long. Instead of a 5 KB file, they sent a 5 MB file, roughly a thousand times larger! This is convenient for them, but wasteful of the organization’s resources.
Give users tools that help them achieve their goals while avoiding data sprawl, and teach them how to use those tools. I once met with an organization about an Exchange server whose database had grown out of proportion. Users were sending large attachments to each other via internal e-mail groups. Group members were editing the file and re-sending it to the same group. Several revisions of the file were stored in every group member’s mailbox! Amazingly, a local network share was available, but the group was just following the lead of one user who had brought the e-mail file-sharing habit over from a previous employer. A little training made a big difference.
Create written policies to prevent Shadow IT. Dictate where data can be stored and how it should be classified and protected, and train your employees to avoid duplicate and superfluous data. Give copies of the policies to all your employees and ask them to adhere to them. It’s good to trust, but always verify: behavior analysis and data loss prevention (DLP) software is a great way to back up policies with technical controls and to ensure that sensitive data is not being mishandled or stored irresponsibly.
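As a small illustration of the "verify" step, the following Python sketch scans a folder tree for files with identical content and estimates how much space the extra copies consume. It is not a substitute for DLP software; it is a minimal example built on the standard library only, the function names are hypothetical, and it assumes every file under the folder is readable by whoever runs it.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_duplicates(root):
    """Map each content digest to the files sharing it (two or more only)."""
    by_digest = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    return {digest: sorted(paths)
            for digest, paths in by_digest.items()
            if len(paths) > 1}


def wasted_bytes(duplicates):
    """Bytes used by redundant copies, keeping one copy of each file."""
    return sum(paths[0].stat().st_size * (len(paths) - 1)
               for paths in duplicates.values())


if __name__ == "__main__":
    import sys
    # Scan the folder given on the command line, or the current folder.
    dups = find_duplicates(sys.argv[1] if len(sys.argv) > 1 else ".")
    for paths in dups.values():
        print("Identical content:", ", ".join(str(p) for p in paths))
    print("Space used by extra copies:", wasted_bytes(dups), "bytes")
```

Hashing file content rather than comparing names catches copies that were renamed along the way. It will not catch near-duplicates, such as successive revisions of the same document.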
With some effort and oversight, organizations can better manage their data, which helps reduce storage costs, the risk of regulatory fines, and the stress and embarrassment of data breaches and data loss.