DronaBlog
Wednesday, August 8, 2018
Informatica Master Data Management - MDM - Quiz - 4
Q1. Which is not Correct regarding Landing tables?
A. A Single landing table could receive data from different source systems.
B. A Staging table is mapped to only one Landing table.
C. Landing tables do not have system columns.
D. Delta Detection is not a Landing table property.
Q2. Which feature is supported by Informatica Data Director?
A. Task oriented workflow capability.
B. A mechanism for hiding (masking) information based on security roles.
C. Localization of the Lookup display values.
D. All choices are correct.
Q3. When you select View Rejects from the batch job log, you can see the reason why each record was rejected.
A. True
B. False
Q4. Which metadata table is used to track the changes to a base object?
A. C_baseObjectName_HXRF
B. C_baseObjectName_HCTL
C. C_baseObjectName_HIST
D. All are correct
Q5. When performing data analysis which one of the following would you look for?
A. The availability of primary keys.
B. Which fields can come from each source.
C. Data Cardinality.
D. All the choices are correct.
Sample SOAP UI Request and Response for RecalculateBO and RecalculateBVT
Are you looking for sample SOAP UI requests for RecalculateBO and RecalculateBVT? Are you also interested in knowing the request and response structure with elements in it? If so, then this article provides all this information.
Sample executeBatchRecalculateBo request
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:urn="urn:siperian.api">
<soapenv:Header/>
<soapenv:Body>
<urn:executeBatchRecalculateBo>
<urn:username>xxxx</urn:username>
<urn:password>
<urn:password>yyyy</urn:password>
<urn:encrypted>false</urn:encrypted>
</urn:password>
<urn:orsId>localhost-orcl-CMX_ORS</urn:orsId>
<urn:tableName>C_ADDRESS</urn:tableName>
<urn:rowidObjectTable>TMP_ADDRESS_RECALCBO_1</urn:rowidObjectTable>
</urn:executeBatchRecalculateBo>
</soapenv:Body>
</soapenv:Envelope>
Sample executeBatchRecalculateBo response
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<soapenv:Body>
<executeBatchRecalculateBoReturn xmlns="urn:siperian.api">
<message>Succeeded</message>
<retCode>0</retCode>
<jobRunStatus>0</jobRunStatus>
</executeBatchRecalculateBoReturn>
</soapenv:Body>
</soapenv:Envelope>
Sample executeBatchRecalculateBvt request
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:urn="urn:siperian.api">
<soapenv:Header/>
<soapenv:Body>
<urn:executeBatchRecalculateBvt>
<urn:username>xxx</urn:username>
<urn:password>
<urn:password>yyyy</urn:password>
<urn:encrypted>false</urn:encrypted>
</urn:password>
<urn:orsId>localhost-orcl-CMX_ORS</urn:orsId>
<urn:tableName>C_ADDRESS</urn:tableName>
<urn:rowidObject>120001</urn:rowidObject>
</urn:executeBatchRecalculateBvt>
</soapenv:Body>
</soapenv:Envelope>
Sample executeBatchRecalculateBvt response
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<soapenv:Body>
<executeBatchRecalculateBvtReturn xmlns="urn:siperian.api">
<message>Succeeded</message>
<retCode>0</retCode>
<jobRunStatus>0</jobRunStatus>
</executeBatchRecalculateBvtReturn>
</soapenv:Body>
</soapenv:Envelope>
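The request and response structures above can also be built and read programmatically. The following is a minimal Python sketch, assuming only the element names and the urn:siperian.api namespace shown in the samples. The helper names build_recalculate_bo_request and parse_return are hypothetical, and nothing is sent over the network because the samples do not show an endpoint URL.

```python
import xml.etree.ElementTree as ET

SOAPENV = "http://schemas.xmlsoap.org/soap/envelope/"
SIF = "urn:siperian.api"


def build_recalculate_bo_request(username, password, ors_id, table_name, rowid_table=None):
    """Build an executeBatchRecalculateBo envelope shaped like the sample request."""
    ET.register_namespace("soapenv", SOAPENV)
    ET.register_namespace("urn", SIF)
    envelope = ET.Element(f"{{{SOAPENV}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAPENV}}}Header")
    body = ET.SubElement(envelope, f"{{{SOAPENV}}}Body")
    call = ET.SubElement(body, f"{{{SIF}}}executeBatchRecalculateBo")
    ET.SubElement(call, f"{{{SIF}}}username").text = username
    # The sample nests the password value and the encrypted flag in an outer <password>.
    pwd = ET.SubElement(call, f"{{{SIF}}}password")
    ET.SubElement(pwd, f"{{{SIF}}}password").text = password
    ET.SubElement(pwd, f"{{{SIF}}}encrypted").text = "false"
    ET.SubElement(call, f"{{{SIF}}}orsId").text = ors_id
    ET.SubElement(call, f"{{{SIF}}}tableName").text = table_name
    if rowid_table is not None:
        ET.SubElement(call, f"{{{SIF}}}rowidObjectTable").text = rowid_table
    return ET.tostring(envelope, encoding="unicode")


def parse_return(response_xml, operation):
    """Read message, retCode and jobRunStatus from the <operation>Return element."""
    root = ET.fromstring(response_xml)
    ret = root.find(f".//{{{SIF}}}{operation}Return")
    # Strip the namespace part of each child tag, e.g. "{urn:siperian.api}retCode".
    return {child.tag.split("}")[1]: child.text for child in ret}


request_xml = build_recalculate_bo_request(
    "xxxx", "yyyy", "localhost-orcl-CMX_ORS", "C_ADDRESS", "TMP_ADDRESS_RECALCBO_1"
)
```

Calling parse_return with a response body and "executeBatchRecalculateBvt" as the operation reads the RecalculateBVT response in the same way, since both responses share the same three child elements.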
Tuesday, August 7, 2018
Top 10 questions about Informatica MDM - Synchronization and RecalculateBO Jobs
Would you like to know about the Synchronize and RecalculateBO jobs in Informatica MDM? Are you also interested in the difference between RecalculateBO and RecalculateBVT? If so, this article answers these questions and highlights several features related to RecalculateBO.
Q1: What are the conditions under which the Synchronize Job appears in the Batch Viewer for a Base Object in MDM?
Answer: The Synchronize job appears in the Batch Viewer under the following condition:
- Trust is enabled on a previously untrusted column.
Note that no change made to a Staging Table causes the Synchronize job to appear. For instance, adding a column that is trusted in the Base Object to a Staging Table does not trigger it.
Q2: Which changes to Trust on a Base Object do NOT cause the Synchronize job to appear?
Answer:
- Enable trust on an untrusted column > release lock (the Synchronize job appears) > disable trust on the same column > release lock (the Synchronize job no longer appears)
- Change max trust value on a trusted column
- Disable trust on a column
- Enable Validation on a column
- Disable Validation on a column
Q3: What causes the synchronize job to not become active when the Trust column is modified?
Answer:
When an existing Trust column is modified by adding another Source system, the Trust does not work as expected because there are missing CTL entries for the newly added Trusted source.
To enable the Synchronize job, follow these steps:
- Run the SQL script: UPDATE C_REPOS_COLUMN SET DIRTY_CTL_IND=1 WHERE ROWID_COLUMN='<rowid_column>';
- After the script is committed, run the job from the Console.
- Refresh the Console.
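The update in the first step can be sketched with a bound parameter instead of a hand-edited literal. This is only an illustration: it runs against an in-memory SQLite stand-in with a two-column C_REPOS_COLUMN table and made-up ROWID_COLUMN values, not against a real ORS schema, and the function name mark_ctl_dirty is hypothetical.

```python
import sqlite3


def mark_ctl_dirty(conn, rowid_column):
    """Set DIRTY_CTL_IND=1 for one C_REPOS_COLUMN row, commit, and return rows updated."""
    cur = conn.execute(
        "UPDATE C_REPOS_COLUMN SET DIRTY_CTL_IND = 1 WHERE ROWID_COLUMN = ?",
        (rowid_column,),
    )
    conn.commit()  # the job should only be run after the change is committed
    return cur.rowcount


# Stand-in table with made-up rows, only so the sketch runs end to end.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE C_REPOS_COLUMN (ROWID_COLUMN TEXT PRIMARY KEY, DIRTY_CTL_IND INTEGER)"
)
conn.executemany("INSERT INTO C_REPOS_COLUMN VALUES (?, ?)", [("COL.1", 0), ("COL.2", 0)])
updated = mark_ctl_dirty(conn, "COL.1")
```

Checking that exactly one row was updated before running the job is a cheap guard against a mistyped ROWID_COLUMN value.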
Q5: What is the difference between "Recalculate_BO" and "Recalculate_BVT" in MDM?
Answer:
1) The Recalculate_BO job is used to recalculate the entire Base Object or a subset of its records.
- With the ROWID_OBJECT_TABLE parameter: recalculates the Base Object records identified by the ROWID_OBJECT column of that table.
- Without the ROWID_OBJECT_TABLE parameter: recalculates all the records in the Base Object in batches of MATCH_BATCH_SIZE or one fourth of the total number of records in the table, whichever is less.
2) The Recalculate_BVT job is used to recalculate a single record.
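The batch size rule for a Recalculate_BO run without ROWID_OBJECT_TABLE can be worked through with a small calculation. The sketch below assumes integer division for "one fourth of the total number of records"; the function name and the sample numbers are illustrative only.

```python
def recalc_batch_size(total_records, match_batch_size):
    """Smaller of MATCH_BATCH_SIZE and one fourth of the record count."""
    # Integer division is an assumption; the source does not say how the quarter is rounded.
    return min(match_batch_size, total_records // 4)


# 1,000,000 records: a quarter is 250,000, so MATCH_BATCH_SIZE (100,000) wins.
large_table = recalc_batch_size(1_000_000, 100_000)
# 200,000 records: a quarter is 50,000, which is below MATCH_BATCH_SIZE.
small_table = recalc_batch_size(200_000, 100_000)
```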
Q6: Which API is faster "executeBatchRecalculateBvt SIF API" or "executeBatchRecalculateBo SIF API"?
Answer:
- The executeBatchRecalculateBo API is usually faster when the BVT must be recalculated for multiple records.
- The executeBatchRecalculateBvt API is faster when the BVT must be recalculated for a single record.
Q7: Is the "<BO>_VXR" table impacted when a record is changed in MDM?
Answer: Yes. When a DELETE, PUT, MERGE, or UNMERGE operation is performed, Master Data Management (MDM) recalculates the Best Version of Truth (BVT) on the existing active records.
Q8: Is it enough to run the Synchronize job after adding trust to a new column in MDM?
Answer: No. The Synchronize job only corrects the <BASE_OBJECT>_CTL table. We also need to run executeBatchRecalculateBo after adding trust to a column.
Q9: When should we run "Revalidate" jobs in MDM?
Answer:
- If the validation rules are modified in the Base Object, then run the Revalidate job.
- We have to manually run this job from the Batch Viewer.
- The Revalidate job is enabled only when a column's validation has been modified after the initial load and before the merge job is run.
Q10: What is the behavior of the Revalidate job across MDM tables?
Answer:
- The Revalidate Base Object will check and/or change the trust score.
- It only calculates the trust score according to validation rules.
- It will be used during the recalculate Best Version of Truth (BVT) job.
- When recalculate BVT is run, the records in the Base Object may/may not change. It depends on the trust score during that time.
Monday, August 6, 2018
Important and Useful Unix commands
Are you looking for an article that lists the important commands used in daily Unix activities? This article provides a consolidated list of Unix commands.
Introduction
In this article we have listed Unix commands for files and directories, compressed files, and manipulating data.
File and Directory
Command | Details
cat | Displays file contents
cd | Changes the current directory to another directory
chgrp | Changes file group
chmod | Changes permissions
cp | Copies source file into destination
file | Determines file type
find | Finds files
grep | Searches files for regular expressions
head | Displays first few lines of a file
ln | Creates a link to oldname (a soft link with the -s option)
ls | Lists directory contents and information about files
mkdir | Creates a new directory dirname
more | Displays data in paginated form
mv | Moves (renames) oldname to newname
pwd | Prints current working directory
rm | Removes (deletes) filename
rmdir | Deletes an existing directory provided it is empty
tail | Prints last few lines in a file
touch | Updates access and modification time of a file
vi | Opens a file in the vi editor to view or edit its content
Compressed Files
Command | Details
compress | Compresses files
gunzip | Helps uncompress gzipped files
gzip | GNU alternative compression method
uncompress | Helps uncompress files
unzip | List, test and extract compressed files in a ZIP archive
zcat | Cat a compressed file
zcmp | Compares compressed files
zdiff | Compares compressed files
zmore | File perusal filter for crt viewing of compressed text
apropos | Locates commands by keyword lookup
info | Displays command information pages online
man | Displays manual pages online
whatis | Searches the whatis database for complete words
yelp | GNOME help viewer
Manipulating Data
Command | Details
awk | Pattern scanning and processing language
cmp | Compares the contents of two files
comm | Compares sorted data
cut | Cuts out selected fields of each line of a file
diff | Differential file comparator
expand | Expands tabs to spaces
join | Joins files on some common field
perl | Data manipulation language
sed | Stream text editor
sort | Sorts file data
split | Splits file into smaller files
tr | Translates characters
uniq | Reports repeated lines in a file
wc | Counts words, lines, and characters
vi | Opens vi text editor
vim | Opens vim text editor
fmt | Simple text formatter
spell | Checks text for spelling errors
ispell | Checks text for spelling errors
emacs | GNU project Emacs
ex, edit | Line editor
Informatica MDM - Match Rule Tuning
Would you like to know how to perform Informatica MDM match rule tuning? Are you also interested in knowing what steps are involved in match rule tuning? If so, this article will help you understand them.
Introduction
The Informatica MDM match tuning is an iterative process. It involves the following steps: data profiling, data standardization, defining the fuzzy match key, tuning the fuzzy match process and database tuning.
Activities
The activities mentioned below are needed to perform the match rule tuning in the Informatica MDM.
Activity | Details
Data Profiling | Getting the right data in for the match: data investigation, data accuracy, and data completeness.
Data Standardization | Cleansing and standardization
Define the Fuzzy Match Key | Fuzzy match keys (the columns that need to be matched) with the key width.
Fuzzy Match Process | How to use the following: 1) Key width 2) Match level 3) Search level 4) Cleanse server log 5) Dynamic Match Threshold (DMAT) 6) Filters 7) Subtype Matching 8) Match Only Previous Rowid Object option 9) Configure match threads 10) Enable Light Weight Matching (LWM)
Database Tuning | 1) Analyze tables 2) Create indexes 3) Configure Match_Batch_Size 4) Analyze STRP table
Data Profiling
- You need to perform analysis of the data on which the match will be performed. You should also analyze the quality of data across all fields.
- Share the result of the data analysis with business users and get inputs about what attributes need to be considered for the matching process.
- Identify fields which you think can provide better matches, e.g. SSN, TAXID etc.
- The next step is to determine filter criteria which are normally used on exact columns such as COUNTRY=US. This will be helpful for achieving better performance.
- You need to also determine the completeness of data. For example, if the country code is valued in only 50% of the records, it may not be a good candidate as an exact column.
- You need to verify the percentage of data accuracy, e.g. the gender field should only contain gender values.
- It is always a good idea to analyze data using the pattern mechanism.
- Finally, determine the type of match population to use, e.g. USA.
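Two of the profiling checks above, field completeness and pattern analysis, can be sketched in a few lines of Python. This is an illustration under stated assumptions: records are plain dictionaries with string values, the sample rows are made up, and the A/9 pattern notation (letters become A, digits become 9) is one common convention rather than anything prescribed by Informatica MDM.

```python
import re
from collections import Counter


def completeness(records, field):
    """Percentage of records where the field has a non-empty value."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if (r.get(field) or "").strip())
    return 100.0 * filled / len(records)


def pattern(value):
    """Reduce a value to its shape: letters become A, digits become 9."""
    return re.sub(r"[0-9]", "9", re.sub(r"[A-Za-z]", "A", value))


def pattern_counts(records, field):
    """Count how often each value shape occurs for a field, skipping empty values."""
    return Counter(pattern(r[field]) for r in records if r.get(field))


# Made-up sample rows for illustration only.
records = [
    {"COUNTRY": "US", "SSN": "123-45-6789"},
    {"COUNTRY": "", "SSN": "987654321"},
    {"COUNTRY": None, "SSN": "111-22-3333"},
    {"COUNTRY": "US", "SSN": ""},
]
country_pct = completeness(records, "COUNTRY")
ssn_shapes = pattern_counts(records, "SSN")
```

Here COUNTRY is valued in only half of the rows, which is the situation described above where a field may be a poor choice as an exact column, and the SSN shapes show two competing formats that standardization would need to reconcile.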
Data Standardization
- Determine the cleansing rules to standardize data, for example, Street and St. to ST.
- Use data standardization tools such as Address Doctor, Trillium, or any other third-party tool.
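A cleansing rule such as the Street and St. to ST example can be sketched as a small regular-expression rule table. This is a hypothetical illustration, not how a cleanse function is configured in Informatica MDM: the RULES list, the standardize function, and the choice to upper-case the whole value are all assumptions.

```python
import re

# Hypothetical rule table: compiled pattern -> standard form.
RULES = [
    (re.compile(r"\b(?:STREET|ST)\b\.?", re.IGNORECASE), "ST"),
]


def standardize(address):
    """Upper-case the value, apply each rule, and collapse repeated whitespace."""
    out = address.upper()
    for rule, replacement in RULES:
        out = rule.sub(replacement, out)
    return " ".join(out.split())


examples = [standardize(a) for a in ("12 Main Street", "12  Main St.", "12 main st")]
```

All three inputs reduce to the same value, which is the point of standardizing before the match: variants of one address stop looking like different data.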
Determine the Fuzzy Match Key
The basic rules for defining the fuzzy match key are:
- OrganizationName: If the data contains the organization names or both organization names and the person's name
- PersonName: If the data contains person names only
- AddressPart1: If the data contains only the address
Tuning the Potential match candidates
a) Key Width:
- For fewer SSA indexes, reduce the key width to ‘Preferred’
- For more match candidates, use the key width ‘Extended’
b) Search Level:
- For fewer SSA ranges, use the search level ‘Narrow’
- For more candidates to match, use the search level ‘Exhaustive’
- Use the ‘Typical’ search level for business data
- To match the most candidates, use the search level ‘Extreme’. Note that it comes with a performance cost.
c) Match Level:
- For records that are highly similar and which should be considered for a match, use match level as ‘Conservative’
- Use match level as ‘Typical’ for most matches
- Match level ‘Loose’ is better for manual matches to ensure that tighter rules have not missed any potential matches
d) Define the fuzzy match key columns that have more unique data, e.g. the Person Name or the Organization Name
e) Data in the fuzzy match key columns should not contain nulls. Records with null keys (SSA_KEY is K$$$$$$$) all become potential match candidates for each other.
Use the range query below and review the SSA_DATA column for all the qualifying candidates:
SELECT DISTINCT ROWID_OBJECT, DATA_COUNT, SSA_DATA, DATA_ROW
FROM C_PARTY_STRP
WHERE SSA_KEY BETWEEN 'YBJ>$$$$' AND 'YBLVZZZZ'
AND INVALID_IND = 0
ORDER BY ROWID_OBJECT, DATA_ROW
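The range query can be tried out with bound range limits, so the same statement can be reused for each range under review. The sketch below runs against an in-memory SQLite stand-in with made-up rows; the reduced C_PARTY_STRP column list and the function name candidates_in_range are assumptions, and string ordering on a real database may depend on its collation settings.

```python
import sqlite3

RANGE_QUERY = """
SELECT DISTINCT ROWID_OBJECT, DATA_COUNT, SSA_DATA, DATA_ROW
FROM C_PARTY_STRP
WHERE SSA_KEY BETWEEN ? AND ?
  AND INVALID_IND = 0
ORDER BY ROWID_OBJECT, DATA_ROW
"""


def candidates_in_range(conn, low_key, high_key):
    """Return the valid STRP rows whose SSA_KEY falls inside the given range."""
    return conn.execute(RANGE_QUERY, (low_key, high_key)).fetchall()


# Stand-in table with made-up rows, only so the sketch runs end to end.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE C_PARTY_STRP (ROWID_OBJECT TEXT, SSA_KEY TEXT, SSA_DATA TEXT, "
    "DATA_COUNT INTEGER, DATA_ROW INTEGER, INVALID_IND INTEGER)"
)
conn.executemany(
    "INSERT INTO C_PARTY_STRP VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("1001", "YBK12345", "data-a", 1, 0, 0),  # inside the range and valid
        ("1002", "YBK99999", "data-b", 1, 0, 1),  # inside the range but invalidated
        ("1003", "YBM00000", "data-c", 1, 0, 0),  # above the range
        ("1004", "K$$$$$$$", "data-d", 1, 0, 0),  # null key, outside the range
    ],
)
result = candidates_in_range(conn, "YBJ>$$$$", "YBLVZZZZ")
```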
Cleanse Server logs
Cleanse server logs help to identify long-running ranges. These long-running ranges normally have more candidates to match. Isolate such ranges by looking into the cleanse server logs. Normally, production is a multi-threaded environment, so determine the ranges for each of these threads. Analyze which thread is taking more time, take those records out of the matching process, and re-run the match job.