The need for governance in big data


By Tomas Muller

Big data is becoming an everyday part of doing business, and this new area of information and analytics brings with it new challenges and risks that call for appropriate governance measures. Governance is vital for protecting the integrity of the information itself, as well as the company, its future and its shareholders.

Big data is most commonly distinguished from ordinary or small data by the three Vs: volume, velocity and variety. Some argue that additional ‘V terms’, such as veracity and value, belong in the definition. Whatever definition is applied, the specific nature of the three original Vs has led, and will continue to lead, to big data governance challenges. The need for governance of big data business analytics is being expressed across all industries and around the globe by all parties involved. It is vital that the pre-crisis trend of ignorance and lack of accountability and competency is prevented from taking hold in this new area of work. The hard lessons of failing to address risks thoroughly and in a timely manner have been felt internationally at all levels of society. Now that big data is starting to permeate all parts of business, it is crucially important that it is properly governed. As a result, information governance is becoming a discipline in its own right within many organisations.

Big data governance will need many facets if it is to be effective. Primary concerns include data integrity and quality, together with the challenge of devising and implementing the processes and strategies needed to govern big data properly.

Volume is currently the biggest and most troublesome challenge that most companies are trying to get to grips with. In the past, the cost and technical complexity of storing large amounts of data limited how much data a corporation could store and analyse. Although advances in technology have eased these constraints, companies still face the challenge of how to store big data in practice and how to ensure that all the additional data is maintained to the same standards of data quality as in the pre-big-data stage. Proper governance needs to be applied to ensure that data quality is maintained even as volumes increase.

The quality of data often varies with the information governance that was applied at its original source, and safeguards and monitoring should be put in place to protect against any shortfalls in this area. Even if an organisation’s own information governance programme is effective and thorough, this does not guarantee the validity and integrity of contributing external data sources. When newer, less structured data sources are combined with established systems, it can be difficult to keep track of the integrity of the information. Multiple sources are often merged quickly to meet business deadlines, and poor-quality sources can end up contaminating the big data set, leaving the entire set untrustworthy. All subsequent findings and analysis based on this polluted data will be inaccurate as well. This is why proper governance of big data is absolutely essential. The long-term cost to the business, to investor returns and to a company’s reputation can far exceed the time and money invested in ensuring that big data is properly governed.
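One way to picture such a safeguard is a simple quality gate that checks each incoming source before it is merged. The Python sketch below is only an illustration under assumed conditions: the function names (`assess`, `merge_with_gate`), the two checks (completeness and duplicate keys), the 95% threshold and the sample sources are all hypothetical, and a real governance programme would define its own rules.

```python
from dataclasses import dataclass


@dataclass
class QualityReport:
    source: str
    total: int
    complete: int    # records with every required field populated
    duplicates: int  # records whose key repeats within the same batch

    @property
    def completeness(self) -> float:
        return self.complete / self.total if self.total else 0.0


def assess(source, records, required, key):
    """Count complete records and duplicate keys in one source batch."""
    seen = set()
    complete = duplicates = 0
    for rec in records:
        if all(rec.get(field) not in (None, "") for field in required):
            complete += 1
        if rec.get(key) in seen:
            duplicates += 1
        seen.add(rec.get(key))
    return QualityReport(source, len(records), complete, duplicates)


def merge_with_gate(batches, required, key, min_completeness=0.95):
    """Merge only sources that pass the gate; quarantine the rest for review."""
    merged, quarantined, reports = [], {}, []
    for source, records in batches.items():
        report = assess(source, records, required, key)
        reports.append(report)
        if report.completeness >= min_completeness and report.duplicates == 0:
            # Tag each record with its origin so lineage can be traced later.
            merged.extend({**rec, "_source": source} for rec in records)
        else:
            quarantined[source] = records
    return merged, quarantined, reports


# Hypothetical sample data: one established system, one less structured feed.
batches = {
    "crm": [
        {"id": 1, "email": "a@example.com"},
        {"id": 2, "email": "b@example.com"},
    ],
    "social_feed": [
        {"id": 3, "email": ""},
        {"id": 3, "email": "c@example.com"},
    ],
}
merged, quarantined, reports = merge_with_gate(
    batches, required=["id", "email"], key="id"
)
```

In this sketch the poor-quality feed is held back rather than silently merged, and each merged record keeps a note of its source, so an untrustworthy input cannot contaminate the combined set unnoticed.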

The third V, variety, is an attribute that also makes data governance challenging. The explosive growth in the amount of unstructured data that organisations need to generate, organise and store, currently in particular data sourced via web and social media channels, means that handling the many forms of big data is complex and becoming more so over time. Managing the transition from the traditional structured types of data that companies are used to handling to new unstructured forms is an important and necessary part of ensuring that organisation-wide data quality and governance are adequately maintained.


Photo Accreditation – Ron Mader