Discovery and Classification: Make or Break your Data Governance
Michael Queenan, CEO and co-founder of Nephos Technologies, explains the importance of discovering and classifying your data to create an effective data governance strategy.
Since the implementation of GDPR in 2018, effective data governance has been essential. With the real possibility that the UK will introduce a new data protection law following Brexit, businesses must ensure their houses are in order and be ready to adapt quickly to remain compliant with any new legislation.
Much more than a tick-box exercise, an effective data governance strategy brings many benefits. However, in the rush to ensure compliance, especially when new legislation is imposed, many businesses fail to lay the foundations required to get the most out of their data governance. This is a big mistake. These foundations are the groundwork that will transform processes, decision making, and performance – and they should not be overlooked.
Before embarking on a data governance strategy or project, every organization must ask itself one question: what data do we hold, and where is it stored? Put bluntly, how can you efficiently use, manage or protect your data assets unless you know what and where they are?
Whilst this may seem like an obvious point, it is surprising how many organizations do not know the answer. This task has undoubtedly been made more difficult by the growth of outsourced cloud computing, ‘as-a-service’ tools, and the general complexity of modern technology infrastructure. However, without the insight that the answer to that question unlocks, it is nearly impossible to derive business value from data, and very difficult to establish good data governance.
Fortunately, two easy steps can transform your data governance strategy and set you on the road to success. The first is data discovery: locating all your data and knowing where it is stored is crucial.
This requires an effective software solution with the ability to connect to data sources of any type and to identify data assets, wherever they reside. Without this capability, data governance projects can be seriously undermined from the outset. For instance, a security or privacy breach relating to an unsecured data asset brings a range of potentially serious governance and regulatory implications.
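As a rough illustration of what discovery produces, the sketch below walks a single file tree and builds an inventory of what it finds. It is a minimal example under stated assumptions, not how any particular discovery product works: the function names `build_inventory` and `summarize_by_type` are hypothetical, and a real tool would also connect to databases, cloud storage and SaaS applications.

```python
from collections import Counter
from pathlib import Path


def build_inventory(root):
    """Walk `root` and return one record per file found (hypothetical helper)."""
    inventory = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            inventory.append({
                "location": str(path),
                # Files without an extension are grouped under "(none)".
                "type": path.suffix.lower() or "(none)",
                "size_bytes": path.stat().st_size,
            })
    return inventory


def summarize_by_type(inventory):
    """Count discovered assets per file type."""
    return Counter(record["type"] for record in inventory)
```

Even an inventory this simple answers the two opening questions, what is held and where, for the one location it scans.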
Many of the software solutions used to aid discovery will apply advanced analytics during the process. This allows patterns to be detected that can give valuable insight into your data and answer highly specific questions about your business.
The second, but equally important, step is data classification. This process accurately identifies each asset and, as a result, allows the appropriate levels of protection to be applied. Ideally, organizations will apply intuitive classification types based on well-understood and defined rule sets, such as the GDPR and CCPA sensitivity of Personal Information (PI) and Personally Identifiable Information (PII).
However, this is often the downfall of effective data governance. Many organizations don’t understand the difference between PI and PII. As a result, data gets wrongly classified, the correct level of protection is not applied, and businesses remain none the wiser about what data is specific and relevant to the business.
The difference between the two is simple, but can easily be mixed up – essentially, a specific person cannot be identified from PI data, but PII data can be used to verify a particular identity. As a result, PI data isn’t generally responsible for governance violations, but some tools fail to filter this data out. This makes the process more difficult for users, who then have to manually isolate the more sensitive PII data to ensure it is classified correctly and protected in the right way.
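The working distinction described above can be sketched as a simple rule set. This is an illustrative example only: the field names in `PII_FIELDS` and `PI_FIELDS` are hypothetical, and a real rule set would be derived from the regulations that apply to the organization rather than from a hard-coded list.

```python
# Hypothetical rule sets, for illustration only.
PII_FIELDS = {"full_name", "email", "passport_number"}
PI_FIELDS = {"age_band", "postcode_district", "job_title"}


def classify_field(name):
    """Label a single field as PII, PI or unclassified."""
    key = name.strip().lower()
    if key in PII_FIELDS:
        return "PII"
    if key in PI_FIELDS:
        return "PI"
    return "unclassified"


def classify_dataset(field_names):
    """A dataset takes the most sensitive label found among its fields."""
    labels = {classify_field(name) for name in field_names}
    for label in ("PII", "PI"):
        if label in labels:
            return label
    return "unclassified"
```

Labelling a dataset by its most sensitive field means the more sensitive PII data is isolated automatically rather than by hand.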
It is only after both these stages have been completed that organizations should start making decisions about the level of security that should be applied to protect each particular data set. At this point, some teams apply security technologies, such as least privilege management, to every dataset they own – irrespective of its categorization.
Although zero-trust controls of this kind are an effective way to ensure strong security and protection, they are a very expensive option if applied to all datasets. Such high-security solutions should be saved for the most important data. This will allow organizations to benefit from the necessary data protection in a cost-effective manner.
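One way to picture this tiered approach is a lookup from classification to controls. The sketch below is an assumption-laden example: the control names and the `plan_protection` function are hypothetical, and the right controls for each tier are a decision for the organization.

```python
# Hypothetical mapping from classification label to security controls.
CONTROLS_BY_CLASSIFICATION = {
    "PII": ["encryption at rest", "least-privilege access", "access logging"],
    "PI": ["encryption at rest", "role-based access"],
    "unclassified": ["baseline access controls"],
}


def plan_protection(datasets):
    """Map each dataset name to the controls for its classification.

    `datasets` is a dict of dataset name -> classification label.
    Unknown labels raise an error so that nothing is silently under-protected.
    """
    plan = {}
    for name, label in datasets.items():
        if label not in CONTROLS_BY_CLASSIFICATION:
            raise ValueError(f"Unknown classification {label!r} for {name!r}")
        plan[name] = list(CONTROLS_BY_CLASSIFICATION[label])
    return plan
```

Only the datasets labelled as most sensitive receive the most expensive controls, which is the cost saving the approach relies on.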
Some organizations choose to identify their most important data based on which business functions create and hold the most sensitive data, such as the finance and HR departments. This works to a certain extent, but it disregards the likely possibility that other parts of the business also produce important data, even if not in the same quantity. Taking this approach can result in potentially serious governance blind spots.
To avoid this, organizations should take the time to ensure that effective discovery and classification take place, which will give them a much better view of where to allocate their security and data protection budgets. This approach will create a strong and effective data governance strategy that will ensure robust compliance as well as deliver the necessary insight and intelligence to derive business value and drive innovation.
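A blind spot of this kind is easy to surface once datasets have been classified. The sketch below assumes each dataset record carries a `department` and a `classification` field. Both field names and the `find_blind_spots` function are hypothetical.

```python
def find_blind_spots(datasets, assumed_sensitive_departments):
    """Return names of PII datasets owned by departments outside the assumed list."""
    return sorted(
        record["name"]
        for record in datasets
        if record["classification"] == "PII"
        and record["department"] not in assumed_sensitive_departments
    )


# Illustrative records only, not real data.
example_datasets = [
    {"name": "payroll", "department": "finance", "classification": "PII"},
    {"name": "event_signups", "department": "marketing", "classification": "PII"},
    {"name": "site_traffic", "department": "marketing", "classification": "unclassified"},
]
blind_spots = find_blind_spots(example_datasets, {"finance", "hr"})
```

In this made-up example the marketing sign-up list is the dataset that a department-based approach would miss.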