Technology World: Master Data Management

Friday, July 17, 2020

What is Build Match Group (BMG) in Informatica MDM?

Are you looking for details about the Build Match Group (BMG) process used in Informatica MDM? Would you also like to know when the BMG process is executed and how to control this behavior? If so, you have reached the right place. In this article, we will discuss the BMG process in detail.

What is the Build Match Group (BMG) Process?

The Build Match Group (BMG) process removes redundant matching records from the match set prior to the consolidation process. It is an important part of the matching process and plays a vital role in Informatica MDM jobs.

How does the Build Match Group process remove records?

Let's assume that the BMG indicator is on. In that case, when we run a match job, it removes one of the symmetric matches from the manual match pairs.

For example, consider the match pairs below:

Pair 1: 'Bob Paul' is matched with 'Robert Paul' with match rule number 3

Pair 2: 'Robert Paul' is matched with 'Bob Paul' with match rule number 5

As we know, AUTOMERGE_IND is set to 1 for matching pairs when the records are matched through auto-merge rules. If all the records are matched through manual match rules, the BMG process takes effect. However, if some records are matched through an auto-merge rule and others through a manual merge rule, then one of the symmetric match entries is removed from the match table.
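
To make the idea of symmetric matches concrete, here is a minimal Python sketch. It is not Informatica's implementation: the MatchPair type, the function name, and the choice to keep the first pair encountered are assumptions made purely for illustration.

```python
from typing import NamedTuple


class MatchPair(NamedTuple):
    """Hypothetical stand-in for one row of a match table."""
    record: str
    matched_record: str
    rule_no: int


def drop_symmetric_pairs(pairs):
    """Keep one pair for each unordered {record, matched_record} combination.

    Assumption: the first pair seen is kept. Which entry the real BMG
    process keeps is not stated in this article.
    """
    seen = set()
    kept = []
    for pair in pairs:
        key = frozenset((pair.record, pair.matched_record))
        if key in seen:
            continue  # symmetric duplicate of a pair that was already kept
        seen.add(key)
        kept.append(pair)
    return kept


pairs = [
    MatchPair("Bob Paul", "Robert Paul", 3),
    MatchPair("Robert Paul", "Bob Paul", 5),
]
remaining = drop_symmetric_pairs(pairs)
```

With the two pairs from the example, only one entry remains, because both pairs describe the same two records.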

When does the BMG process get executed?

The BMG process is executed during two jobs.

1. During the Match Job: The BMG process is triggered during the match process if we enable the 'BMG on match indicator' property.

2. During the Merge Job: The BMG process is always executed during the merge job. There is no option to turn it ON or OFF for the merge job.

What is the impact of the BMG process on manually matched records?

The BMG process has no impact on manually matched records. It is applicable only to auto-merge records, i.e. records for which AUTOMERGE_IND is 1 in the <BASE_OBJECT>_MTCH table, and the Base Object must also be enabled for the BMG process.

How to enable the Base Object for the BMG process?

To enable the Base Object for the BMG process, we need to update the BMG_ON_MATCH_IND field in the C_REPOS_TABLE table. If the value of BMG_ON_MATCH_IND is 1, BMG is ON for the given table; if the value is 0, BMG is OFF.

Here is a sample SQL statement to update this field:

update C_REPOS_TABLE set BMG_ON_MATCH_IND=1 where table_name='<TABLE_NAME>'

Important note: After making the above change, clear the cache and restart the application server.
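
If the update needs to be scripted, a bind variable avoids pasting the table name into the SQL text. The sketch below is only an illustration: it uses Python's built-in sqlite3 module with a stand-in C_REPOS_TABLE so that it runs without an MDM schema, and the table name C_BO_PARTY and the function name are assumptions. Against a real ORS you would use your own database driver and connection.

```python
import sqlite3

UPDATE_BMG_SQL = (
    "update C_REPOS_TABLE "
    "set BMG_ON_MATCH_IND = :bmg_on "
    "where table_name = :table_name"
)


def set_bmg_on_match(conn, table_name, enabled):
    """Set BMG_ON_MATCH_IND to 1 (ON) or 0 (OFF) for one base object table."""
    cur = conn.execute(
        UPDATE_BMG_SQL,
        {"bmg_on": 1 if enabled else 0, "table_name": table_name},
    )
    conn.commit()
    return cur.rowcount  # number of repository rows that were updated


# Stand-in for the repository table (assumption: only the two columns used here).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table C_REPOS_TABLE (table_name text, BMG_ON_MATCH_IND integer)"
)
conn.execute("insert into C_REPOS_TABLE values ('C_BO_PARTY', 0)")

updated = set_bmg_on_match(conn, "C_BO_PARTY", True)
value = conn.execute(
    "select BMG_ON_MATCH_IND from C_REPOS_TABLE where table_name = 'C_BO_PARTY'"
).fetchone()[0]
```

Checking the returned row count is a cheap way to catch a misspelled table name, since an update that matches no rows does not raise an error.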

Thursday, July 9, 2020

Best Practices for Elastic Search in Informatica MDM

Elastic Search, a search engine based on the Lucene library, is used in Informatica MDM to provide Google-like free-text search as well as fuzzy search similar to match engine search. In this article, we will look at the best practices to follow in order to implement Elastic Search successfully with the Informatica MDM solution.

Introduction

It is vital to follow best practices while integrating Elastic Search with Informatica MDM. Even a minor misconfiguration may lead to an expensive performance cost. The best practices provided here help to achieve not only better performance but also better search results.

Elastic Search Best Practices

Here are the details of the best practices.

1. Indexing Job Execution

If we enable the searchable property for Base Object tables, including lookup tables, then we need to run the indexing job for the lookup tables first, followed by the indexing jobs for the remaining Base Object tables.

2. Indexing Job execution for all tables

If we have configured the searchable property for parent and child tables, e.g. the Party table, the Party Phone table, etc., then we need to run an indexing job for all of these tables. First run the indexing job for the Party table, and then run the jobs for the child tables.

3. Facets configuration
Facets are used for pre-emptive grouping of the records. We should use a limited number of facet fields, as facets have an adverse impact on the performance of the search functionality. We also have to make sure that the fields we configure as facets have low entropy, i.e. a small set of unique values.

4. Unused Business Entities
If there are unused Business Entities with searchable properties, delete them, as they cause performance issues for indexing and load jobs.

5. Index Auto commit property
We need to increase the value of the auto-commit property and keep it at an optimum value based on the environment configuration. The property es.index.refresh.interval can be used to set it.

6. Indexing jobs in parallel

We should try to avoid running indexing jobs in parallel as that may cause resource exhaustion.

7. Running load jobs in parallel

If we have configured the searchable property on multiple tables, such as the Party and Address tables, then do not run the load jobs for these tables in parallel. An indexing job is executed during each load job, so parallel load jobs may exhaust resources and cause the jobs to fail.

8. Deleting indexes

The CleanTable API does not delete the indexes; we need to delete them manually if required. To delete the indexes, use the curl command to call the Elastic Search APIs. As of now, there is no Informatica API to handle this use case.
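
Elastic Search deletes an index through an HTTP DELETE request on the index name, which is what a curl call with -X DELETE does. As a hedged sketch, the same request can be prepared in Python with the standard library. The host, port, and index name below are hypothetical placeholders, and the request is only built, not sent, because deleting an index cannot be undone.

```python
import urllib.request


def build_delete_index_request(es_base_url, index_name):
    """Build (but do not send) an HTTP DELETE request for one index."""
    url = f"{es_base_url.rstrip('/')}/{index_name}"
    return urllib.request.Request(url, method="DELETE")


# Hypothetical host and index name; replace with the values of your environment.
request = build_delete_index_request("http://localhost:9200", "sample-index")

# To actually delete the index, uncomment the next line. This is irreversible.
# urllib.request.urlopen(request)
```

Secured clusters will also need credentials or certificates on the request, which are left out here.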

9. Limiting the number of searchable fields for Business Entities

There are limits on how many searchable fields we should use for an Elastic Search document. By default, Elastic Search allows 50 nested fields. Apart from that, there is a limit on the amount of data that can be sent in an Elastic Search REST call. The limit is 104857600 bytes (100 MB). So make sure that only a small number of searchable columns are configured for the Business Entities.
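
To get a feeling for the request size limit, the following Python sketch measures a document once it is serialized as JSON and compares it with the limit quoted above. The function names and the sample document are assumptions for illustration; real index requests are produced by MDM, not by this code.

```python
import json

MAX_REQUEST_BYTES = 104857600  # limit quoted above, equal to 100 * 1024 * 1024


def payload_size_bytes(document):
    """Return the size of a document once serialized as UTF-8 JSON."""
    return len(json.dumps(document).encode("utf-8"))


def fits_in_one_request(document, limit=MAX_REQUEST_BYTES):
    """Return True if the serialized document does not exceed the limit."""
    return payload_size_bytes(document) <= limit


sample = {"fullNm": "Sample Customer 123", "fstNm": "KC First Name"}
size = payload_size_bytes(sample)
ok = fits_in_one_request(sample)
```

Every additional searchable column adds to the serialized size of each document, which is why keeping the number of searchable columns small helps to stay under the limit.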

Monday, June 22, 2020

What is the future of Master Data Management?

At present, Master Data Management (MDM) has become a core project in many organizations. Industries such as banking, healthcare, insurance, telecommunication, manufacturing, and logistics have realized that, with the implementation of MDM, businesses can achieve better growth in a competitive market. In this article, we will explore the future of Master Data Management. So let's start.

A. MDM with Cloud Solution

MDM vendors such as Informatica, Reltio, and IBM provide cloud solutions. However, the companies using these solutions criticize the growing cost of the cloud and the lack of control over it. The initial cost of a cloud implementation is lower than that of an in-house MDM implementation. But data is a growing asset, which leads to more usage over time. The cost of a cloud solution is directly proportional to usage, and hence the cost of a cloud MDM solution increases drastically over the years. The infrastructure is owned and managed by the product vendor, so we need to rely on the vendor for infrastructure issues. These issues include, but are not limited to, quarterly or monthly upgrades, server maintenance, emergency bug fixes, server crashes, and major product releases.

Even with these concerns, companies are still moving forward with cloud MDM solutions because the cloud provides more sustainability. The recent pandemic has shown that businesses with cloud implementations survive better than those with in-house solutions. There is no doubt that cloud solutions will be used by all applications in the near future.

B. Artificial Intelligence and MDM

Artificial Intelligence (AI) is a buzzword in the current market. An MDM solution with AI components will survive better than one without. In recent releases, Informatica MDM has used AI features for small components in the data steward user interfaces. This tells us that MDM solutions have started taking the AI aspect more seriously. Many business intelligence applications are used to capture, store, access, and analyze data to assist business users in making better decisions. AI combined with business intelligence will create another world, and MDM will be part of it.

There is great scope for improvement in MDM solutions. AI can be used to extract and transport data from the source to the landing area and from the landing area to the MDM system. This will reduce development, testing, performance tuning, and deployment time. Cleansing and standardization rely heavily on manual configuration. If AI is leveraged, this manual effort can be reduced to a great extent. Another area where AI can be used is customer matching. Currently, many vendors use their proprietary match engines to identify and match customer records. Identifying and matching is an iterative process that can take from a few months to a few years; in some cases it is a never-ending process. If AI is used to identify and match the records, it will help business users as well as stakeholders achieve their business goals.

C. Smart MDM and User Interface

The user interfaces (UI) of MDM solutions are developed with technologies that are more stable. As a result, the new features and smartness that come with newer versions are hardly ever replicated in these interfaces. In many cases, we have noticed that decade-old source code in the MDM user interface has never been touched. Most technologies such as HTML5, JavaScript, Spring, Java, Python, R2, etc. are evolving at a great pace. The day is not far off when these technologies will improve themselves through the use of better infrastructure and intelligence. If these user interfaces are to survive in the global market, they need to build smartness into the applications. End users are capable of handling these advanced features in their daily routine work.

The end goal of these smart features is to make the end-user experience not just better but the best. The main challenge in the current environment is that these user interfaces are not self-explanatory. We have to spend much of our time training business users. The UI can be improved to accept voice and touch commands, and in some cases the UI should be smart enough to make its own decisions. This will improve productivity and ultimately profitability.

D. Quicker and Simpler

With the development of data processing technologies, we are able to process data much faster than before. However, we see that it takes anywhere from a day to a month to perform the initial data load from the source systems to the MDM system, depending on the volume of the data. This is the situation while dealing with gigabytes or terabytes of data. What will happen if we need to handle exabytes, zettabytes, or yottabytes of data in the future? We need to think now about handling the future growth of data within the stipulated time. Thirty days is going to cost heavily, as the value of time is growing at a fast pace. The value of one hour in the future will be higher than the value of one hour now.

Most of the underlying technologies, such as databases and JVMs, are not improving their processing speed as fast as expected. MDM is heavily dependent on these technologies. If the underlying technologies improve over time, MDM solutions will improve automatically; otherwise, MDM vendors will need to come up with their own underlying technologies in order to sustain themselves in the future.

E. Increase in Cost - Increase in Value

Due to advancements in technologies such as AI and cloud computing, the cost of MDM solutions will go up, as they will require extensive data and time for research and solutions. Having said that, the increased cost can be justified by the increase in the value of the smart MDM solution.

With the smart MDM approach, we will be creating sustainable, profitable, and future-proof solutions that will benefit end customers as well as businesses. Smart MDM is not far off!

Saturday, June 6, 2020

Top 10 new features in the Informatica MDM 10.4

Are you looking for an article that provides detailed information about the new features in Informatica MDM 10.4? Would you also like to know which components have changed in MDM 10.4? If so, you have reached the right place. In this article, we will discuss the new features introduced in the Informatica MDM Hub, the Provisioning Tool, and Customer 360 or Entity 360.

A. MDM Hub features

1. MDM hub login screen

When a user accesses the MDM Hub URL https://<server name>:<port>/cmx, a .jar file is downloaded instead of a JNLP file. Once the user double-clicks the .jar file, it opens the login page. From MDM 10.4 onwards, you can provide the connect URL on the login page, so you do not have to download a .jar file for each environment.

B. Provisioning Tool features

2. Match Rule Sets in Provisioning Tool
The new Match Rule Sets feature is introduced in the Provisioning Tool. Using this feature, we can perform match tuning with the help of business users.

3. ElasticSearch customization

We can customize ElasticSearch properties such as Tokenizers, Token Filters, Character filters, and Analyzer using the Provisioning Tool.

4. Hyperlink configuration

Prior to MDM 10.4, there was no provision to configure hyperlinks in the Entity 360 or Customer 360 applications. With this latest upgrade, we can configure hyperlinks for fields such as Email, Web, etc.

C. E360/Customer 360 features

5. New Hierarchy views

With MDM 10.4, the Hierarchy configuration has changed, and the look and feel of the hierarchy is different from earlier versions.

6. Chart Components

More controls are provided in MDM 10.4 for the configuration of charts in Entity and Customer 360 applications.

7. Multiple Task handling
With the newer version of Informatica MDM, you can assign, claim, or edit multiple tasks in a single request.

8. Find and Replace
The update operation in the Entity 360 and Customer 360 applications has become easier. We can update multiple records in a single request using the Find and Replace functionality.

9. Bulk Data Import

The bulk import functionality has been improved by adding artificial intelligence to it. The mapping of source and target fields is done automatically using artificial intelligence.

10. Ad hoc Matching
Using Ad hoc matching, we can match records dynamically and create a golden copy from them.

Monday, June 1, 2020

Informatica MDM - Sample requests using Business Entity Services

In this article, we will look at various sample requests using Informatica Business Entity Services. The sample requests are prepared using PKEY_SRC_OBJECT and ROWID_OBJECT values. The server name, port, and other parameters need to be updated for your project.

1: Get request using PKEY_SRC_OBJECT
https://localhost:8080/cmx/cs/orcl-CMX_ORS/Customer/SFA:PKEY_000001?systemName=SFA&depth=5&suppressLinks=true

2: Get request using 'query' parameter
https://localhost:8080/cmx/cs/orcl-CMX_ORS/Customer?systemName=SFA&depth=2&suppressLinks=true&action=query&filter=CustomerAdditionalInformation.xslpTaskId={{SFATaskNumber}}
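
Since the server name, port, ORS, and keys differ per project, it can help to assemble the URL from its parts instead of editing it by hand. The following Python sketch rebuilds the URL of the first sample request; the function name and its parameter names are assumptions for illustration, not part of Business Entity Services.

```python
from urllib.parse import quote, urlencode


def build_read_url(base_url, ors_id, entity, system_name, pkey_src_object,
                   depth=5, suppress_links=True):
    """Build a read URL that identifies the record by PKEY_SRC_OBJECT."""
    # The record is addressed as <source system>:<PKEY_SRC_OBJECT>.
    source_key = quote(f"{system_name}:{pkey_src_object}", safe=":")
    query = urlencode({
        "systemName": system_name,
        "depth": depth,
        "suppressLinks": str(suppress_links).lower(),
    })
    return f"{base_url}/cmx/cs/{ors_id}/{entity}/{source_key}?{query}"


url = build_read_url(
    "https://localhost:8080", "orcl-CMX_ORS", "Customer", "SFA", "PKEY_000001"
)
```

The quote and urlencode calls take care of characters that are not allowed in a URL, which matters when source keys contain spaces or other special characters.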

3: Create a new customer
Endpoint: https://localhost:8080/cmx/cs/orcl-CMX_ORS/Customer?systemName=SFA&depth=2&suppressLinks=true

Request Body:

{
"key":{ "sourceKey": "PKEY_00000021" },
"fullNm": "Sample Customer 123",
"fstNm": "KC First Name",
"lstNm": "KC Last Name",
"partyType":"",

"CustomerAltIdSMPLCustId":
{ "item": [
{
"key":{ "sourceKey": "PKEY_0000002~SMPLID" },
"SMPLCustId":"123"
}
]
},

"CustomerAltIdSMPLCustUniqId":
{ "item": [
{
"key":{ "sourceKey": "PKEY_0000002~SMPLUniqId" },
"SMPLCustUniqId":"123"
}
]
},

"CustomerAltIdContractId":
{ "item": [
{
"key":{ "sourceKey": "PKEY_0000002~CNTRID" },
"ContractId":"123"
}
]
},

"PartyAddress":
{ "item": [
{
"key":{ "sourceKey": "PKEY_0000002-SHIP" },
"xaddrtyp":"Shipping",
"xaddrln1": "123 ST",
"xaddrln2": "",
"xcity": "New York",
"xcntrycd": "US",
"xstprvnc": "NY",
"xpstlcd": "10101"
}
]
},

"CustomerAlternateNames":
{ "item": [
{ "key":{ "sourceKey": "PKEY_0000002~FORMAL" },
"altNm":"FORMAL_1234",
"xprtyAltNameType":"Does Business As (FORMAL)"
}
]
},

"CustomerPhone":
{ "item": [
{ "key":{ "sourceKey": "PKEY_0000002~PHONE" },
"phnNum":"123456890",
"phnTypeCd":"Telephone"
}
]
},

"CustomerPartyParentRel":
{ "item": [
{ "key":{ "sourceKey": "PKEY_000001~PKEY_0000002" },

"CustomerPartyParent" : {
"key":{ "sourceKey": "PKEY_000001" }
}
}
]
}
}
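
Every child section in the body above has the same shape: an "item" list of records, each with its own source key. As a hedged sketch, the body can be assembled in Python as shown below. The helper names are hypothetical, sending the request as an HTTP POST is an assumption, authentication is omitted, and the request is only built, not sent.

```python
import json
import urllib.request


def child_items(*items):
    """Wrap child records in the {"item": [...]} shape used in the sample body."""
    return {"item": list(items)}


def build_create_request(url, body):
    """Build (but do not send) a JSON request; POST is an assumption here."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# A shortened version of the sample body with one child section.
body = {
    "key": {"sourceKey": "PKEY_00000021"},
    "fullNm": "Sample Customer 123",
    "CustomerPhone": child_items(
        {
            "key": {"sourceKey": "PKEY_0000002~PHONE"},
            "phnNum": "123456890",
            "phnTypeCd": "Telephone",
        }
    ),
}
request = build_create_request(
    "https://localhost:8080/cmx/cs/orcl-CMX_ORS/Customer"
    "?systemName=SFA&depth=2&suppressLinks=true",
    body,
)
```

Building the body from Python dictionaries and serializing it with json.dumps guarantees well-formed JSON, which is easy to break when a long body is edited by hand.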

4: Update customer using ROWID_OBJECT
Endpoint: https://localhost:8080/cmx/cs/orcl-CMX_ORS/Customer/rowId?systemName=SFA&suppressLinks=true&depth=2

Request Body:

{
"rowidObject":{{SFDCMDMID}},
"fullNm": "Integration Update Test 890",
"fstNm": "Sample first Name 890",
"lstNm": "Sample last Name 890",
"AccntStsCode": {
"accntstscd": "CUST"
},
"$original": {
"fullNm": "fullNm",
"fstNm": "fstNm",
"lstNm": "lstNm",
"AccntStsCode": {
"accntstscd": "CUST"
}
},
"CustomerAltIdTaxId": {
"$original":{"item":[null]},
"item": [
{
"key": {
"sourceKey": "PKEY_0000003~TAXID"
},
"taxId": "tax01"
}
]
},
"CustomerAltIdABC_CORPOfficeNumber": {
"$original":{"item":[null]},
"item": [
{
"key": {
"sourceKey": "PKEY_0000003~ABC_CORPRTLSNUM"
},
"ABC_CORPOfficeNum": "ABC_CORPOffice01",
"xaltIdType": "ABC_CORP Office Number"
}
]
},

"CustomerAltIdGovLicNum": {
"$original":{"item":[null]},
"item": [
{
"key": {
"sourceKey": "PKEY_0000003~GOVLICID"
},
"GovLicNum": "tc001",
"xaltIdType": "Gov License Number"
}
]
},

"CustomerPartyParentRel": {
"$original":{"item":[null]},
"item": [
{
"key": {
"sourceKey": "PKEY_0000003~HeadQuarter"
},
"xprtyPrnt":8
}
]
}
}

5: Update customer using PKEY_SRC_OBJECT
Endpoint URL: https://localhost:8080/cmx/cs/orcl-CMX_ORS/Customer?systemName=SFA&suppressLinks=true&depth=2

Request Body:

{
"key": {
"sourceKey": "PKEY_0000003"
},
"fullNm": "Integration Update Test 6",
"fstNm": "Sample first Name 6",
"lstNm": "Sample last Name 6",
"xhastbcclcnsflg": "1",
"xnsleffctvdt": "2015-08-19T00:05:31.630+05:30",
"xnslpblshddt": "2015-08-19T00:05:31.630+05:30",
"AccntStsCode": {
"accntstscd": "CUST"
},
"$original": {
"fullNm": "fullNm",
"fstNm": "fstNm",
"lstNm": "lstNm",
"AccntStsCode": {
"accntstscd": "CUST"
}
},
"CustomerAltIdTaxId": {
"item": [
{
"key": {
"sourceKey": "PKEY_0000003~TAXID"
},
"taxId": "tax01"
}
]
},
"CustomerAltIdABC_CORPOfficeNumber": {
"item": [
{
"key": {
"sourceKey": "PKEY_0000003~ABC_CORPRTLSNUM"
},
"ABC_CORPRtlSoreNum": "ABC_CORPOffice01"
}
]
},
"CustomerAltIdGovLicNum": {
"item": [
{
"key": {
"sourceKey": "PKEY_0000003~GOVLICID"
},
"GovLicNum": "tc001"
}
]
},

"CustomerPartyParentRel": {
"item": [
{
"key": {
"sourceKey": "PKEY_0000003~HeadQuarter"
},
"CustomerPartyParent": {
"key": {
"sourceKey": "PKEY_0000005"
}
}
}
]
}
}

Wednesday, April 1, 2020

Top 5 indicators in the Informatica MDM

Are you looking for details about the different types of indicators used in the Informatica Master Data Management (MDM) system? Are you also interested in knowing the valid values for these indicators and what those values mean? If so, you have reached the right place. In this article, we will look at different indicators such as HUB_STATE_IND, CONSOLIDATION_IND, etc. and their values in detail.

Indicators in the Informatica MDM:
Informatica MDM maintains several types of indicators that are used during internal MDM processing. The indicators maintained in the MDM system are:

1. HUB_STATE_IND
2. CONSOLIDATION_IND
3. DIRTY_IND
4. DELETED_IND
5. AUTOMERGE_IND

A) HUB_STATE_IND indicator
This field is present in the BO and XREF tables. It indicates whether the record is in the active, pending, or deleted state.

Value	Meaning
1	Active Record
0	Pending Record
-1	Deleted (inactive) Record

B) CONSOLIDATION_IND indicator

This field is present in the BO table. It indicates whether the record has gone through the match process or not.

Value	Meaning
4	The new record (Unmerged record)
3	The record is queued for the match process
2	The record has gone through the match process and is ready for consolidation (queued for the merge process)
1	Consolidated or Golden record
9	The record is on hold. Normally a data steward puts records on hold

C) DIRTY_IND indicator

This field is present in the BO table, but it is no longer in use. It was used for the tokenization process in earlier releases. Now the <BO>_DRTY table is used for the tokenization process instead of this field. The valid values for this field are 0 and 1: 0 means the record is up to date, and 1 means the record is new or updated and needs to be tokenized.

D) DELETED_IND indicator

This field is present in the BO and XREF tables. It is reserved for future use.

E) AUTOMERGE_IND indicator

This field is present in the MTCH and HMRG tables. The valid values are 0 and 1.

Value	Meaning
1	Records are queued for auto-merge
0	Records are queued for manual merge
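
When indicator values are read in integration code or in ad hoc analysis scripts, named constants are easier to follow than bare numbers. The Python sketch below mirrors the HUB_STATE_IND and AUTOMERGE_IND tables in this article; the class names, member names, and helper function are assumptions for illustration and are not part of any Informatica API.

```python
from enum import IntEnum


class HubState(IntEnum):
    """HUB_STATE_IND values as listed in this article."""
    ACTIVE = 1
    PENDING = 0
    INACTIVE = -1


class AutoMerge(IntEnum):
    """AUTOMERGE_IND values as listed in this article."""
    AUTO_MERGE = 1
    MANUAL_MERGE = 0


def describe_match_row(automerge_ind):
    """Translate an AUTOMERGE_IND value into a readable label."""
    return AutoMerge(automerge_ind).name.replace("_", " ").lower()


label = describe_match_row(1)
state = HubState(-1)
```

An unknown value, for example AutoMerge(7), raises a ValueError, so unexpected data is noticed instead of being silently mislabeled.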

Monday, February 3, 2020

Informatica MDM - Important SQL Query: How to pull all the records from HMRG Table

There are business use cases in which you may need to analyze data from the HMRG table, i.e. the History of Merge table. Assuming that you know the match rule number and the match rule set name, you can use the query below to pull the records for that match rule from the HMRG table.

select *
from cmx_ors.c_bo_party_hmrg
where rowid_match_rule in (
    select rowid_match_rule
    from cmx_ors.c_repos_match_rule
    where rowid_match_set in (
        select rowid_match_set
        from cmx_ors.c_repos_match_set
        where match_set_name='ORG_IDL'
    )
    and rule_no=1
)

In this query,
rule_no=1 is the rule number from the MDM Hub for which we are looking for information
match_set_name='ORG_IDL' is the match set name from the MDM Hub under which rule_no=1 is present.

The above query returns all records that satisfy the condition. We can join the result with the parent Party table to fetch other business attributes as per business needs.
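
To see how the nested lookups narrow down the result, the same query can be tried against a tiny stand-in schema. The Python sketch below uses the built-in sqlite3 module with an attached in-memory database named cmx_ors. The three tables contain only the columns needed by the query plus a simplified rowid_object column, and all row values are invented for illustration. The rule number and match set name are passed as bind variables instead of literals.

```python
import sqlite3

HMRG_BY_RULE_SQL = """
select h.*
from cmx_ors.c_bo_party_hmrg h
where h.rowid_match_rule in (
    select r.rowid_match_rule
    from cmx_ors.c_repos_match_rule r
    where r.rule_no = :rule_no
      and r.rowid_match_set in (
          select s.rowid_match_set
          from cmx_ors.c_repos_match_set s
          where s.match_set_name = :match_set_name
      )
)
"""


def load_demo_schema():
    """Create simplified stand-ins for the three tables used by the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("attach database ':memory:' as cmx_ors")
    conn.executescript("""
        create table cmx_ors.c_repos_match_set (rowid_match_set text, match_set_name text);
        create table cmx_ors.c_repos_match_rule (rowid_match_rule text, rowid_match_set text, rule_no integer);
        create table cmx_ors.c_bo_party_hmrg (rowid_object text, rowid_match_rule text);
        insert into cmx_ors.c_repos_match_set values ('SET1', 'ORG_IDL'), ('SET2', 'OTHER');
        insert into cmx_ors.c_repos_match_rule values ('RULE1', 'SET1', 1), ('RULE2', 'SET1', 2), ('RULE3', 'SET2', 1);
        insert into cmx_ors.c_bo_party_hmrg values ('100', 'RULE1'), ('200', 'RULE2'), ('300', 'RULE3');
    """)
    return conn


conn = load_demo_schema()
rows = conn.execute(
    HMRG_BY_RULE_SQL, {"rule_no": 1, "match_set_name": "ORG_IDL"}
).fetchall()
```

Only the row merged by rule number 1 of the ORG_IDL match set is returned. The row merged by rule number 1 of the other match set is filtered out, which shows why both conditions are needed.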

Informatica MDM - The differences between Subject Area based IDD Application and Entity 360 application

Are you looking for details about the differences between the Subject Area based IDD application (aka Legacy IDD application) and the Entity 360 application? Would you like to know the limitations of the Entity 360 application as well as its great features? If so, you have reached the right place. In this article, we will discuss the differences between the Subject Area based IDD application and the Entity 360 application.

Category	IDD Data View	IDD Business Entity
Customization	Use of IDD User Exits 1. Integral part of the IDD configuration 2. Easy to implement customization 3. Easy to deploy as a component of IDD 4. No separate resource configuration required; resources allocated to IDD are used for User Exits 5. Error handling follows MDM standard practice; no additional handling is required 6. No additional security required, as it is an integral part of the IDD application	No User Exit support 1. Need to write external services (RESTful or SOAP-based web services) 2. Additional effort is required to build, implement, and deploy these external services 3. For scalability and high availability of the external services, additional dedicated servers are required 4. Need to apply and maintain security, as these services are external to the IDD Business Entity application 5. Extra error handling is required to follow MDM standard practice 6. Extra configuration is required to call the external services 7. Dedicated resources need to be allocated to handle user requests
Fuzzy Search	Extended search functionality uses the MDM Match Engine to achieve fuzzy search	Elastic Search uses a Synonym properties file to achieve fuzzy search. Note: We need to maintain fuzzy keywords in the Synonym file in order for fuzzy search to work.
Data Import template	IDD Data View provides a feature to import data. It is a very helpful tool when the business would like to import bulk data on an as-needed basis. There is no need to create or update records manually	Does not support the bulk import template. Bulk volumes of data need to be created or updated manually
Unmerge functionality	Supports both Tree unmerge and Linear unmerge. Note: During Tree unmerge, the unmerged records get separated from the group. During Linear unmerge, the children records of the unmerged record remain associated.	Supports only Tree unmerge
Report	Easy to integrate reports in the IDD application using Jasper Reports	Jaspersoft reports work on the Home page only if the report is the only component on the Home page.
Workflow	If the IDD application includes workflows, we must generate the business entity schema as a requirement for Data Director to manage the workflow tasks. However, we need to migrate to business entities	The business entity schema is generated as part of the Business Entity application publish event using the Provisioning Tool.
Both Entity and IDD Data Views - Hybrid mode	Informatica recommends that you use the Hybrid mode only on a temporary basis
Manual Override of matched record	Manual override of a value in the Matches view is allowed	Manual override of a value in the Matching Records view is not allowed
Hierarchy View	Hierarchy relationships can be configured to show in a section to show duplicate hierarchy records.	The Hierarchy view does NOT permit the following actions: · Finding a duplicate entity · Initiating a merge · Sharing a bookmark URL
Limitations	1. In the task inbox on the Home page, you cannot filter tasks by the creation date. 2. When you export search results that are based on a timeline, the export process ignores the timeline and exports all data.	1. The Cross Reference page and the Merge Preview page have pagination issues. 2. In the search results, some rows are empty. The rows represent records that are filtered out because the user does not have permission to view the records. 3. When a user role does not include the create and read privileges for a business entity, users with this role can still view the tasks associated with the business entity. 4. In the History view, the timescale labels in the Options menu do not appear correctly initially. 5. In the Hierarchy view, business entities in the Relationships tab of the history do not open in Business Entity view. 6. In the Timeline view, you cannot open the relationship records that appear on the Relationships tab. 7. In the Hierarchy view, in the Entity Details dialog box, when you click More Details, the dialog box closes without opening the selected business entity. 8. In the Matching Records view, when you merge records, the system can appear unresponsive. 9. If you delete a record and then search for the record, the ROWID of the deleted record still displays. 10. In the History view, when you try to view an event detail, an error might occur.

DronaBlog