Artificial Intelligence, Machine Learning, Data Science, Business Intelligence – all familiar jargon in the world today, and all in trend if you look at any career path. However, underlying all of it, a beast rears its head – Data. Most people who work with data would attest that understanding data and making it usable takes up the bulk of their time.
Let’s face it: whether an organisation has grown organically or inorganically, there will always be some data that is not straightforward. The odd customer you had to service when you were just starting up, for whom you tweaked your data capture; the operational issue you needed to solve quickly lest you risk losing customers, which left you with a new offbeat data source; the business logic that had to be implemented to integrate an acquisition with your operational platforms – all cogent cases to explain why the data is in the state it is. After all, business comes first.
Unfortunately, when the same firms try to implement new-age machine learning models, or even something as seemingly simple as visualisations on top of a data warehouse, they run into so many data quality issues that the project goes into development hell. Naturally, the business lines want results, and the analysts who may have just joined the organisation have no idea why that business logic was put in place several years ago – yet across the businesses it is treated as gospel.
Enter the shiny new Chief Data Officer, who with good intentions tries to lead the battle but, without proper governance of data assets within the firm, can get nowhere. Acknowledging this, firms are now beginning to understand that navigating through data isn’t as easy as it sounds, and the data office is becoming part of the business lines rather than merging with technology.
Be that as it may, implementing a data strategy for the long-term future of the firm is not a short engagement, yet the businesses want some immediate results. For someone brought in for this purpose, the lowest-hanging fruit would be to clean and match the data across business lines so everyone has the same view. Naturally, you wouldn’t want your sales head quoting a sales number that your operations head doesn’t agree with.
Let’s understand why this happens in the first place. Across the different touchpoints an organisation goes through to service its customers, there are numerous data capture mechanisms, none of which accounts for the entire lifecycle. For instance, the system that raises purchase orders may not tie in with the system that actually handles the physical inventory. Therein comes a break or a gap in the data, which then slowly and steadily turns into data quality issues.
To remedy this, many organisations spend significant time and effort reconciling their data sources, whether operational or financial. In today’s age of, well, the jargon in the first line, there are several rule-matching and learning algorithms that can do this for you.
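To make the idea concrete, here is a minimal sketch of a rule-based reconciliation for the purchase order and inventory example above. Everything in it is an assumption made for illustration: the `reconcile` function, the `Break` record, the sample order numbers and the matching rule itself (same key, quantities within a tolerance). A real reconciliation would need rules agreed with the business.

```python
from collections import namedtuple

# Hypothetical record describing one discrepancy between the two sources.
Break = namedtuple("Break", ["key", "issue", "left", "right"])


def reconcile(left, right, tolerance=0):
    """Compare two {key: quantity} sources and return the breaks between them.

    A key present in only one source is reported as missing in the other.
    A key present in both is a break only if the quantities differ by more
    than `tolerance`.
    """
    breaks = []
    for key in sorted(set(left) | set(right)):
        l_qty, r_qty = left.get(key), right.get(key)
        if l_qty is None:
            breaks.append(Break(key, "missing_in_left", l_qty, r_qty))
        elif r_qty is None:
            breaks.append(Break(key, "missing_in_right", l_qty, r_qty))
        elif abs(l_qty - r_qty) > tolerance:
            breaks.append(Break(key, "quantity_mismatch", l_qty, r_qty))
    return breaks


# Illustrative data only: quantities ordered vs. quantities physically received.
purchase_orders = {"PO-001": 100, "PO-002": 50, "PO-003": 75}
inventory_receipts = {"PO-001": 100, "PO-002": 45, "PO-004": 20}

breaks = reconcile(purchase_orders, inventory_receipts)
for b in breaks:
    print(b.key, b.issue, b.left, b.right)
```

The value is less in the code than in the list of breaks it produces: each one points to a place where two systems disagree and a control may be missing.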
The effort you save when a program does this can be reinvested across the business, so that a seemingly non-essential activity yields larger productivity gains. It also sets you up for success in whatever you do next with the aforementioned data, now sanitised. So if you’re new to it all and want to implement a data solution, a platform or even basic visualisation, first take a long hard look at your data landscape, identify the points where it breaks and check whether you have adequate controls on them. Invest in small, efficient processes that help you reconcile data across your landscape and improve data quality. Then step on to build the future – model away!