Let me start by saying that this is not an article about big data. Big data, whose sources are typically external to your organization, is a topic of its own. Many of the concepts and approaches discussed here will certainly apply to your big data initiatives, but they are not the focus of this article.
Clients ask me all the time: why? Why do we need to upgrade? We are happy with the way the software is working. If it ain't broke, don't fix it!
The move to the cloud is fully in force, and the variety of data that organizations integrate across hybrid environments has multiplied. Nearly three-quarters of respondents who integrate data in hybrid and cloud environments cited poor data quality in cloud services, restricted API access, and company security and compliance policies as being among the key problems in their implementations. The largest issue of all was a lack of knowledge and skills within their organizations' IT departments on how to integrate with cloud services.
Data Generation, Analysis, and Usage - Current Scenario
The last decade has seen an exponential increase in the data generated from traditional as well as non-traditional sources. An International Data Corporation (IDC) report says that the data generated in the year 2020 alone will be a staggering 40 zettabytes, a 50-fold growth from 2010. Data generation has increased to 2.5 quintillion bytes per day, and with the advent of innovations like the Internet of Things, it is poised to grow even more rapidly. This surge in data generation, coupled with a growing ability to store the various types of data being generated, has resulted in a vast repository of data that is now available for analysis.
The objective of most MDM Hub projects is to establish a trusted source of master data. In addition to the right vendor, an appropriate partner, and an efficient implementation plan, it is very important to come up with the right integration strategy. An organization's existing ecosystem will typically consist of many different source systems, and integrating all of them with the new MDM system becomes a huge task in itself. Any small misstep here leads to delays, cost overruns, and substandard or missing data. In turn, sponsors lose confidence in the MDM hub as a trusted source to be integrated with downstream applications. Net result: the ROI will seem a lot less attractive.
Data Quality - Overview
Corporations have started to realize that the data they have accumulated over the years is an invaluable asset for the business. That data is analyzed, and business strategies are devised based on the outcome of the analytics. The accuracy of the predictions, and hence the success of the business, depends on the quality of the data upon which the analytics is performed. It therefore becomes all the more important for the business to manage data as a strategic asset so that its benefits can be fully exploited.
Data Quality is the buzzword of the digital age.
This blog post covers some of the core concepts of Data Quality assessment, how the perception of data quality has changed over the years, and the use of data quality tools to parse, standardize, and cleanse data.
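As an illustration of the kind of parsing, standardization, and cleansing a data quality tool performs, here is a minimal sketch in Python. The attribute (a US phone number) and the target format are assumptions for the example, not taken from any specific tool:

```python
import re

def standardize_phone(raw):
    """Parse a free-form US phone string and standardize it to (NNN) NNN-NNNN.

    Returns None when the value cannot be cleansed to 10 digits,
    flagging the record for manual review.
    """
    digits = re.sub(r"\D", "", raw or "")          # parse: keep digits only
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]                        # cleanse: drop country code
    if len(digits) != 10:
        return None                                # cannot be standardized
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"

print(standardize_phone("555.123.4567"))    # (555) 123-4567
print(standardize_phone("1-555-123-4567"))  # (555) 123-4567
print(standardize_phone("12345"))           # None
```

Commercial tools apply the same parse-standardize-cleanse pipeline, but with far richer reference data (country rules, name and address dictionaries) than this toy function.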
Master Data Management (MDM) is no longer a "fast follower" initiative but is now a generally accepted part of any information management program. Many enterprises have well-established MDM programs, and many more are at the beginning stages of implementation. To be successful with MDM, you need continuous insight into the master data itself and how it is being used; otherwise it is impossible to truly manage it. An MDM dashboard is an effective tool for obtaining these insights.
Data Quality - Assessment
Data quality assessment is the process of understanding the characteristics of data attributes, such as data types, data patterns, and existing values. It also involves scoring an attribute against specific constraints. For example, get the count of records for which the attribute value is NULL, or find the count of records for which a date attribute does not fit a specified date pattern.
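The two checks described above can be sketched in plain Python. The sample records, attribute names, and expected date pattern below are hypothetical; a real assessment would run these checks against source tables:

```python
from datetime import datetime

# Hypothetical sample records; in practice these come from a source system.
records = [
    {"customer_id": 1, "email": "a@example.com", "signup_date": "2020-01-15"},
    {"customer_id": 2, "email": None,            "signup_date": "15/01/2020"},
    {"customer_id": 3, "email": "c@example.com", "signup_date": None},
]

def null_count(rows, attr):
    """Count records where the attribute value is NULL (None)."""
    return sum(1 for r in rows if r[attr] is None)

def date_pattern_violations(rows, attr, fmt="%Y-%m-%d"):
    """Count non-NULL values that do not fit the specified date pattern."""
    bad = 0
    for r in rows:
        value = r[attr]
        if value is None:
            continue  # NULLs are scored separately by null_count
        try:
            datetime.strptime(value, fmt)
        except ValueError:
            bad += 1
    return bad

print(null_count(records, "email"))                     # 1
print(date_pattern_violations(records, "signup_date"))  # 1
```

Each count can be divided by the total record count to turn a raw violation tally into a quality score for the attribute.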