FROM DATA TO INSIGHT WITH MDW
Frequency of data silos
Nowadays, many companies use several operational applications across the organization: for example, a CRM for the sales department and an ERP for the finance department. Most of the time, each of these applications has its own underlying database, and part of its information is also stored in another application's database. In our experience, this information very often also ends up in heterogeneous files and databases that business analysts use to create ad hoc analyses. In some organizations, the same data are duplicated dozens of times, and each copy has its own life cycle.
Figure 1: Data silos in a company
Data silos harm productivity
Do not misunderstand us. An organization is always divided into departments, and it is natural for each department to want to keep control of its data because of its own goals, priorities, and responsibilities. The problem with data silos is that they harm productivity. They also create confusion, as it is unclear where the most recent version of the data is located. Data silos can also pose a problem when we start integrating data from systems that were not originally intended to work together.
Figure 2: Several database applications can contain the same data
Data silos waste resources
Integrating these different systems and databases is a challenging task. As a company specializing in data management, we have worked on many projects in which companies invest substantial resources and money to obtain an integrated data model, yet still run the risk of having to rework a non-negligible part of the project whenever a new application or database is added. One way to avoid this difficulty is to create a shared data model that respects the good old pattern: a single version of the truth.
The shared data model, also known as the Common Data Model, can help organizations simplify their solution architecture and gain efficiency and clarity when accessing the latest version of the data. The value and usage of this model in the company are thus easy to understand.
Figure 3: Common Data Model
Common Data Model as a solution
The Common Data Model (CDM) is a shared data model that brings together the common data shared between applications and data sources. In Microsoft's words, it is the shared data language for business and analytical applications. It consists of a set of standardized, extensible data schemas published by Microsoft and its partners to enable data consistency across applications and business processes.
Microsoft, SAP, and Adobe Systems have partnered to create the Open Data Initiative to deliver a single data model that provides business insights based on behavioral, transactional, financial, and operational data. Each application, including Dynamics 365, Office 365, Power Platform, Adobe Experience Platform, SAP ERP, and SAP BW, can work with the CDM.
The CDM includes over 340 standardized and predefined schemas, including:
Entities
Attributes
Semantic metadata
Relationships
The schemas represent commonly used concepts and activities such as Accounts, Contacts, and Resources.
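To make these four building blocks more concrete, here is a minimal Python sketch of how entities, attributes, semantic metadata, and relationships fit together. It is only an illustration: the class names, the `relationship_is_valid` helper, and the sample attributes are our own assumptions and are not taken from the published CDM schemas.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Attribute:
    name: str
    data_type: str
    description: str = ""  # stands in for semantic metadata (illustrative only)


@dataclass
class Entity:
    name: str
    attributes: List[Attribute] = field(default_factory=list)

    def attribute_names(self) -> List[str]:
        return [attribute.name for attribute in self.attributes]


@dataclass
class Relationship:
    from_entity: str
    from_attribute: str
    to_entity: str
    to_attribute: str


def relationship_is_valid(rel: Relationship, entities: Dict[str, Entity]) -> bool:
    """Check that both ends of a relationship point at existing attributes."""
    source = entities.get(rel.from_entity)
    target = entities.get(rel.to_entity)
    if source is None or target is None:
        return False
    return (rel.from_attribute in source.attribute_names()
            and rel.to_attribute in target.attribute_names())


# Hypothetical, heavily simplified entities for illustration.
account = Entity("Account", [
    Attribute("accountId", "guid", "Unique identifier of the account"),
    Attribute("name", "string", "Company or business name"),
])
contact = Entity("Contact", [
    Attribute("contactId", "guid", "Unique identifier of the contact"),
    Attribute("fullName", "string", "First and last name combined"),
    Attribute("parentAccountId", "guid", "Account the contact belongs to"),
])
entities = {entity.name: entity for entity in (account, contact)}
contact_to_account = Relationship("Contact", "parentAccountId", "Account", "accountId")

print(relationship_is_valid(contact_to_account, entities))
```

The point of the sketch is that an application reusing a shared definition of Account and Contact does not have to invent its own keys, names, and links between the two.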
The CDM reference entities are categorized into subject areas (see diagram below) for easy discoverability and are published in the public GitHub repository with supporting documentation (visit https://github.com/Microsoft/CDM). These entities have evolved over the past few decades with thousands of hours of investment from Microsoft as well as the partner ecosystem (SAP and Adobe Systems).
Figure 5: Common Data Model entities
Where to start?
There are several ways to start working with the Common Data Model (CDM), but as consultants, we often observe two recurring patterns when we begin working on a new project.
Dataverse application development
The first pattern is reusing the entity definitions available in the CDM instead of building a new data model for the application. The CDM can be used by various applications and services, including Microsoft Dataverse (previously known as Common Data Service), Dynamics 365, Microsoft Power Platform, and Azure, thus ensuring that all services can access the same data.
By using Dataverse, you can start application development using the CDM with built-in business logic, security, and integration. There is no need to consider a new data model for your new app, as the CDM structures your data in a standard format so that you can use, share, and analyze it more easily.
Integration with Microsoft Power Apps
Concretely, if you need to create entities for your new application, you can pick the ones you need in Power Apps:
Figure 6: Power Apps Studio – Add entities to your app.
Afterwards, the entities become available based on the selected categories.
Figure 7: Power Apps Studio – Mapping attributes
The CDM also provides an easy and customizable business database, which can be used to power your new business applications. In Power Apps Studio, you can point to the CDM by adding it as a new data source in your app.
Figure 8: Power Apps Studio – Connect to existing CDM.
Integration with Power BI and Azure Data Lake Storage Gen2
The second pattern that we often see with our clients is Power BI and the Azure data stack interacting with the CDM. As already mentioned, the CDM provides a shared data layer, and this concretely translates into a specific data file structure stored in Azure Data Lake. The CDM definitions are open and available to any service or application that wants to use them.
Figure 9: Common Data Model architecture with Azure services
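A CDM folder has the following appearance: a metadata file (model.json) at the root, typically accompanied by one subfolder per entity that holds the entity's data files.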
You can check the definition of all these files at the following link: https://docs.microsoft.com/en-us/common-data-model/data-lake. Keep in mind that correctly modeling the CDM folder is one of the main pillars of the CDM architecture.
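As a rough illustration of such a folder, the Python sketch below writes a tiny CDM-style folder to a temporary directory: one model.json metadata file plus one data file for a single entity. It is a simplified sketch under our own assumptions. The `write_cdm_folder` helper, the sample entity, and the relative partition location are hypothetical (in a real data lake, partition locations are storage URLs), so refer to the linked documentation for the authoritative file definitions.

```python
import json
import tempfile
from pathlib import Path


def write_cdm_folder(root: Path) -> Path:
    """Create a tiny CDM-style folder under `root` and return the model.json path."""
    entity_dir = root / "Account"
    entity_dir.mkdir(parents=True, exist_ok=True)

    # The data file has no header row in this sketch:
    # column names and types are described in model.json instead.
    partition = entity_dir / "Account-part-0.csv"
    partition.write_text("1,Contoso\n2,Fabrikam\n", encoding="utf-8")

    model = {
        "name": "SalesDemo",  # hypothetical model name
        "version": "1.0",
        "entities": [
            {
                "$type": "LocalEntity",
                "name": "Account",
                "attributes": [
                    {"name": "accountId", "dataType": "int64"},
                    {"name": "name", "dataType": "string"},
                ],
                "partitions": [
                    {
                        "name": "Account-part-0",
                        # Simplification: a relative path instead of a storage URL.
                        "location": partition.relative_to(root).as_posix(),
                    }
                ],
            }
        ],
    }
    model_path = root / "model.json"
    model_path.write_text(json.dumps(model, indent=2), encoding="utf-8")
    return model_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        write_cdm_folder(Path(tmp))
        for path in sorted(Path(tmp).rglob("*")):
            if path.is_file():
                print(path.relative_to(tmp).as_posix())
```

Keeping the metadata next to the data is what makes the folder self-describing: a consumer only needs model.json to know which entities exist and where their files are.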
This data file structure can be viewed as a rich semantic data layer for applications such as Power BI and the Azure data stack. You can leverage the data-preparation capabilities of these components using:
– Power BI dataflows, which ingest key analytics data directly from the CDM folder;
– Azure Data Factory, which supports native inline CDM datasets (as source and destination);
– Azure Databricks, which formats and prepares the data for the later steps and then writes it back to the data lake;
– Azure Machine Learning, which trains and publishes a machine learning model that can be accessed by Power BI or other applications.
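What all these consumers have in common is that they rely on the folder's metadata to interpret the data files. The Python sketch below illustrates that idea in a few lines, using an in-memory model and file instead of a real data lake. The `read_entity` function, the `CASTS` table, and the sample data are hypothetical and only stand in for what the services listed above do natively.

```python
import csv
import io
from typing import Callable, Dict, List, TextIO

# Assumed mapping from metadata type names to Python converters (illustrative).
CASTS = {
    "int64": int,
    "double": float,
    "string": str,
    "boolean": lambda value: value.lower() == "true",
}


def read_entity(model: dict, entity_name: str,
                open_partition: Callable[[str], TextIO]) -> List[Dict[str, object]]:
    """Read all partitions of one entity, naming and typing columns from the metadata."""
    entity = next(e for e in model["entities"] if e["name"] == entity_name)
    columns = [(a["name"], CASTS.get(a["dataType"], str)) for a in entity["attributes"]]
    rows = []
    for partition in entity.get("partitions", []):
        with open_partition(partition["location"]) as handle:
            for record in csv.reader(handle):
                rows.append({name: cast(value)
                             for (name, cast), value in zip(columns, record)})
    return rows


# Hypothetical metadata and data, kept in memory for the example.
MODEL = {
    "name": "SalesDemo",
    "entities": [{
        "name": "Account",
        "attributes": [
            {"name": "accountId", "dataType": "int64"},
            {"name": "name", "dataType": "string"},
            {"name": "revenue", "dataType": "double"},
        ],
        "partitions": [{"name": "part-0", "location": "Account/part-0.csv"}],
    }],
}
FILES = {"Account/part-0.csv": "1,Contoso,1250.5\n2,Fabrikam,310.0\n"}

accounts = read_entity(MODEL, "Account", lambda location: io.StringIO(FILES[location]))
print(accounts)
```

Passing `open_partition` as a function keeps the sketch independent of any storage SDK: in a real project it would be replaced by whatever client reads files from the data lake.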
Did you know?
You need to design the CDM correctly to benefit from all the existing accelerators. At the time of writing, for instance, seven Microsoft Industry Solution Accelerators are available: Healthcare, Financial Services (including Banking and Insurance), Manufacturing, Media and Entertainment, Nonprofit, Automotive, and Education (including Higher Education and K-12). You can access the updated list at the following link: https://docs.microsoft.com/en-us/common-data-model/industry-accelerators.
Power BI dataflows are both a data producer and data consumer of the CDM. Power BI dataflows can write data in CDM folders in Data Lake Storage Gen2 and read data in the CDM folder format. Power BI dataflows also offer an experience to map your data to CDM standard entities through the mapping transformation in Power Query Online.
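Conceptually, mapping to a standard entity means renaming source columns to the attribute names of that entity and dropping what does not belong to it. The Python sketch below shows this idea outside of Power Query. The source column names, the `CRM_TO_ACCOUNT` mapping, and the `map_to_entity` helper are hypothetical, and the target attribute names are only meant to resemble those of a standard Account entity.

```python
from typing import Dict, Iterable, List

# Hypothetical mapping: source column -> attribute name of the target entity.
CRM_TO_ACCOUNT = {
    "cust_id": "accountId",
    "company": "name",
    "phone": "telephone1",
}


def map_to_entity(rows: Iterable[Dict[str, object]],
                  mapping: Dict[str, str]) -> List[Dict[str, object]]:
    """Rename mapped columns and drop everything that is not part of the mapping."""
    mapped = []
    for row in rows:
        missing = [column for column in mapping if column not in row]
        if missing:
            raise KeyError(f"source row is missing columns: {missing}")
        mapped.append({target: row[source] for source, target in mapping.items()})
    return mapped


crm_rows = [
    {"cust_id": 7, "company": "Contoso", "phone": "555-0100", "segment": "SMB"},
]
print(map_to_entity(crm_rows, CRM_TO_ACCOUNT))
```

Failing loudly on missing columns is a deliberate choice in this sketch: a silent gap in a shared model would spread to every application that consumes it.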
How can we help you?
As already mentioned, correctly modeling the CDM folder is one of the main pillars of master data management. To achieve good results and generate a successful project, your company will have to:
1. Create a map of the concerned systems;
2. Democratize this notion of data excellence within your organization;
3. Define a roadmap, roles, and responsibilities (governance);
4. Start the implementation of such a project while keeping a holistic view;
5. Have the necessary skills and/or be accompanied by a specialized partner.
At MDW, as a triple Microsoft Gold Partner, we are a group of data engineers with extensive expertise in cloud technologies; we help our clients define their cloud strategy, design their modern data warehouse architecture, and implement all the necessary bricks.
Would you like some advice to help you implement a common data strategy for your organization? Go ahead and make an appointment for a free consultation call when it best suits you.