Technology World

Monday, July 8, 2019

Top 12 Interesting features of Java 10

Would you like to know what is new in Java 10, including Application Class-Data Sharing, the experimental Java-based JIT compiler, and time-based release versioning? If so, you have reached the right place. In this article, we will walk through the new features in Java 10.

Java 10 features

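Java 10 was released in March 2018, only six months after Java 9, and is the first feature release of the Java SE platform delivered under the six-month release cadence. Its features are enhancements in several functional areas, such as garbage collection, compilation, and local variable type inference.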

- Local-Variable Type Inference

- Application Class-Data Sharing

- Consolidate the JDK Forest into a Single Repository

- Garbage-Collector Interface

- Parallel Full GC for G1

- Thread-Local Handshakes

- Remove the Native-Header Generation Tool (javah)

- Additional Unicode Language-Tag Extensions

- Heap Allocation on Alternative Memory Devices

- Experimental Java-Based JIT Compiler

- Root Certificates

- Time-Based Release Versioning

1. Local Variable Type Inference

Java now allows var-style declarations. We can declare a local variable without specifying its type, as long as the declaration has an initializer. The compiler infers the type from the initializer expression, and the variable remains statically typed.

For example:

var str = "Welcome to Java 10";

// is equivalent to

String str = "Welcome to Java 10";

In the first statement, the type of str is inferred from the type of the assigned value, which is String.
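As a slightly larger illustration, here is a minimal sketch showing var with a few common local declarations. The class name VarDemo and the method totalLength are made up for this example and are not part of any library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class VarDemo {
    // Sums the lengths of the given words; each local type is inferred by the compiler.
    static int totalLength(List<String> words) {
        var total = 0;               // inferred as int
        for (var word : words) {     // inferred as String
            total += word.length();
        }
        return total;
    }

    public static void main(String[] args) {
        var str = "Welcome to Java 10";                  // inferred as String
        var names = new ArrayList<String>();             // inferred as ArrayList<String>
        names.add(str);
        names.add("var");
        var lengths = Map.of("greeting", str.length());  // inferred as Map<String, Integer>

        System.out.println(totalLength(names));
        System.out.println(lengths.get("greeting"));

        // The following declarations do not compile, because there is
        // nothing from which a type could be inferred:
        // var x;          // no initializer
        // var y = null;   // null alone has no usable type
    }
}
```

Note that var works only for local variables (including loop variables). Fields, method parameters, and return types still need an explicit type.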

2. Application Class-Data Sharing

The main goal of this feature is to improve startup time and memory footprint by extending the existing Class-Data Sharing ("CDS") feature so that application classes can also be placed in the shared archive.

Goals:

- Reduces the footprint by sharing common class metadata across different Java processes.

- Improves start-up time.

- Allows the built-in system class loader, the built-in platform class loader, and custom class loaders to load archived classes.

3. Consolidate the JDK Forest into a Single Repository

This feature is all about housekeeping. It combines the numerous repositories of the JDK forest into a single repository to simplify development.

4. Garbage-Collector Interface

This feature introduces a common garbage collector interface that improves the source code isolation of the different garbage collectors. It allows alternative collectors to be integrated quickly and easily. The main goal is to provide better modularity for the HotSpot internal GC code.

5. Parallel Full GC for G1

This feature of Java 10 improves G1 worst-case latencies by making the full GC parallel.

Before Java 10, the full GC for G1 used a single-threaded mark-sweep-compact algorithm.

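6. Thread-Local Handshakes

This feature makes it possible to execute a callback on individual threads without performing a global VM safepoint, so a single thread can be stopped instead of all threads or none, which improves performance. A handshake operation is a callback that is executed for each Java thread while that thread is in a safepoint-safe state. The callback is executed either by the thread itself or by the VM thread while the thread is kept in a blocked state.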

7. Remove the Native-Header Generation Tool (javah)

It focuses on housekeeping. This feature removes the javah tool from the JDK. The tool has been superseded by functionality in javac (the -h option, available since JDK 8), which can write native header files at the time that Java source code is compiled, thereby eliminating the need for a separate tool.

8. Additional Unicode Language-Tag Extensions

This feature enhances java.util.Locale and related APIs to implement additional Unicode extensions of BCP 47 language tags. This JEP implements more of the extensions specified in the latest LDML specification in the relevant JDK classes.

This feature will add support for the following additional extensions:

i. cu (currency type)

ii. fw (first day of the week)

iii. rg (region override)

iv. tz (time zone)
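To make the extensions above more concrete, here is a minimal sketch that reads them back from a language tag. The class name LocaleExtensionDemo, the helper extension, and the tag value are chosen for illustration only. Locale.forLanguageTag and getUnicodeLocaleType are standard java.util.Locale methods, and the last two lookups are expected to honor the extensions only on JDK 10 or later.

```java
import java.time.temporal.WeekFields;
import java.util.Currency;
import java.util.Locale;

class LocaleExtensionDemo {
    // Returns the value of a Unicode extension key (for example "cu" or "fw"),
    // or null when the language tag does not carry that key.
    static String extension(String languageTag, String key) {
        return Locale.forLanguageTag(languageTag).getUnicodeLocaleType(key);
    }

    public static void main(String[] args) {
        // Illustrative tag: US English, currency overridden to euro, week starting on Monday.
        String tag = "en-US-u-cu-eur-fw-mon";
        Locale locale = Locale.forLanguageTag(tag);

        System.out.println(extension(tag, "cu"));
        System.out.println(extension(tag, "fw"));

        // On JDK 10 and later, these related APIs are expected to take the
        // cu and fw extensions into account.
        System.out.println(Currency.getInstance(locale));
        System.out.println(WeekFields.of(locale).getFirstDayOfWeek());
    }
}
```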

9. Heap Allocation on Alternative Memory Devices

This feature enables the HotSpot VM to allocate the Java object heap on an alternative memory device, such as an NV-DIMM, specified by the user through the -XX:AllocateHeapAt=<path> option.

For example, in a multi-JVM environment, this feature makes it possible to have lower-priority processes use the NV-DIMM memory and to allocate only the higher-priority processes to DRAM.

10. Experimental Java-Based JIT Compiler

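It enables Graal to be used as an experimental JIT compiler on the Linux/x64 platform. Graal is a JIT compiler written in Java, and it is the basis of the experimental Ahead-of-Time (AOT) compiler introduced in JDK 9. It can be enabled with the JVM options -XX:+UnlockExperimentalVMOptions -XX:+UseJVMCICompiler.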

11. Root Certificates

This feature provides root Certification Authority (CA) certificates in the JDK.

This makes OpenJDK builds more attractive to developers, because TLS connections work out of the box with the default trust store. The aim of this feature is also to reduce the differences between the OpenJDK and Oracle JDK builds.

12. Time-Based Release Versioning

Unlike earlier releases, which shipped when their major features were ready and were often delayed, the new time-based model delivers a feature release every six months. There are also Long-Term Support (LTS) releases, which are aimed mainly at enterprise customers.
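The new version-string scheme is also visible in the Runtime.Version API, which gained the feature(), interim(), update(), and patch() accessors in Java 10. The sketch below is only an illustration. The class name VersionDemo and the helper summarize are made up for this example.

```java
class VersionDemo {
    // Parses a version string and returns its feature.interim.update components.
    static String summarize(String versionString) {
        Runtime.Version version = Runtime.Version.parse(versionString);
        return version.feature() + "." + version.interim() + "." + version.update();
    }

    public static void main(String[] args) {
        // A hand-written version string, used here only as sample input.
        System.out.println(summarize("10.0.1"));

        // The version of the JVM that is running this program.
        Runtime.Version current = Runtime.version();
        System.out.println("Running on feature release " + current.feature());
    }
}
```

The feature counter is incremented with every six-month release, and the update counter is incremented for the security and bug-fix updates in between.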

What are the differences between IDD Data View and IDD Business Entity applications

Would you like to know the differences between the legacy IDD (Data View) application and the Entity 360 (Business Entity) application? Are you also interested in the open issues with both applications? If so, you have reached the right place. In this article, we will have a detailed discussion about the differences and features of both applications. These differences are based on Informatica MDM 10.3 HF1.

Category	IDD Data View	IDD Business Entity
Customization	Use of IDD User Exits 1. Integral part of the IDD configuration 2. Easy to implement customizations 3. Easy to deploy as a component of IDD 4. No separate resource configuration required; resources allocated to IDD are used for User Exits 5. Error handling follows MDM standard practice, so no additional handling is required 6. No additional security required, as User Exits are an integral part of the IDD application	No User Exit support 1. Need to write external services (RESTful or SOAP-based web services) 2. Additional effort is required to build, implement, and deploy these external services 3. For scalability and high availability of the external services, additional dedicated servers are required 4. Need to apply and maintain security, as these services are external to the IDD Business Entity application 5. Extra error handling is required to follow MDM standard practice 6. Extra configuration is required to call the external services 7. Dedicated resources need to be allocated to handle user requests
Data Import template	IDD Data View provides a data import feature. It is a very helpful tool when the business would like to import bulk data on an as-needed basis. There is no need to create or update records manually	Does not support a bulk import template. Bulk volumes of data need to be created or updated manually
Unmerge functionality	Supports both tree unmerge and linear unmerge. Note: During a tree unmerge, the unmerged records get separated from the group. During a linear unmerge, the child records of the unmerged record remain associated.	Supports only tree unmerge
Report	Easy to integrate reports in the IDD application using Jasper Reports	Jaspersoft reports work on the Home page only if the report is the only component on the Home page.
Workflow	If the IDD application includes workflows, we must generate the business entity schema as a requirement for Data Director to manage the workflow tasks. However, we need to migrate to business entities	The business entity schema is generated as part of the Business Entity application publish event using the Provisioning Tool.
Both Entity and IDD Data Views - Hybrid mode	Informatica recommends using Hybrid mode only on a temporary basis
Manual override of matched record	Manual override of a value in the Matches view is allowed	Manual override of a value in the Matching Records view is not allowed
Hierarchy View	Hierarchy relationships can be configured to show in a section to show duplicate hierarchy records.	The Hierarchy view does NOT permit the following actions: · Finding a duplicate entity. · Initiating a merge. · Sharing a bookmark URL
Limitations	1. In the task inbox on the Home page, you cannot filter tasks by the creation date. 2. When you export search results that are based on a timeline, the export process ignores the timeline and exports all data.	1. The Cross Reference page and the Merge Preview page have pagination issues. 2. In the search results, some rows are empty. The rows represent records that are filtered out because the user does not have permission to view the records. 3. When a user role does not include the create and read privileges for a business entity, users with this role can still view the tasks associated with the business entity. 4. In the History view, the timescale labels in the Options menu do not appear correctly initially. 5. In the Hierarchy view, business entities in the Relationships tab of the history do not open in Business Entity view. 6. In the Timeline view, you cannot open the relationship records that appear on the Relationships tab. 7. In the Hierarchy view, in the Entity Details dialog box, when you click More Details, the dialog box closes without opening the selected business entity. 8. In the Matching Records view, when you merge records, the system can appear unresponsive. 9. If you delete a record and then search for the record, the ROWID of the deleted record still displays. 10. In the History view, when you try to view event detail, an error might occur.

Tuesday, July 2, 2019

Important stats related SQL queries with details of determining Tablespace

This article on the Oracle database explains how we can monitor tablespaces during SQL execution. We will also see how to find out which SQL statements are currently executing and how to get the SQL ID associated with each SQL statement.

Tablespaces in the Oracle database

A tablespace is a logical storage unit in the Oracle database, and a database consists of one or more tablespaces. Each database table belongs to a tablespace. Normally, the Oracle DBA has alerts set up to monitor tablespaces, so that SQL statements or jobs based on SQL statements do not fail due to tablespace issues. However, as developers, we can also monitor tablespaces using the SQL statement below -

SELECT B.TABLESPACE_NAME, TBS_SIZE SIZE_MB, A.FREE_SPACE FREE_MB
FROM (SELECT TABLESPACE_NAME, ROUND(SUM(BYTES)/1024/1024, 2) AS FREE_SPACE
      FROM DBA_FREE_SPACE
      GROUP BY TABLESPACE_NAME) A,
     (SELECT TABLESPACE_NAME, SUM(BYTES)/1024/1024 AS TBS_SIZE
      FROM DBA_DATA_FILES
      GROUP BY TABLESPACE_NAME
      UNION
      SELECT TABLESPACE_NAME, SUM(BYTES)/1024/1024 TBS_SIZE
      FROM DBA_TEMP_FILES
      GROUP BY TABLESPACE_NAME) B
WHERE A.TABLESPACE_NAME(+) = B.TABLESPACE_NAME;

To determine the used and free space (in GB) in the temporary tablespaces, use the SQL below -

select tablespace_name, SUM(bytes_used/1024/1024/1024) "Temp_Used", sum(bytes_free/1024/1024/1024) "Temp_Free"
from v$temp_space_header where tablespace_name like '%TEMP%' group by tablespace_name;

Determine currently running Active DB Sessions

As an Oracle developer or application developer, we can use the query below to determine the active DB sessions in the Oracle database -

SELECT * FROM GV$SESSION WHERE STATUS = 'ACTIVE' AND TYPE <> 'BACKGROUND' AND USERNAME = 'TEST_SCHEMA_ID';

This query will return important details e.g. USERNAME, STATUS, database MACHINE, SERVICE_NAME, SQL_ID etc.

Determine which SQL statement is running in Database

Before determining the SQL statement, we need to determine the active session associated with it, which can be done with the help of the section above - 'Determine currently running Active DB Sessions'.

From the result of that query, get the SQL_ID associated with the active session. Once we have the SQL_ID, execute the SQL statement below to determine the query which is running in the database -

SELECT * FROM GV$SQL WHERE SQL_ID = '61f38skxkw8hc';

Here '61f38skxkw8hc' is an example value for SQL_ID.

Get Explain Plan
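To get the execution plan of a statement whose plan has been captured in the AWR, execute the statement below. For a statement that is still in the cursor cache, dbms_xplan.display_cursor can be used instead -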

select * from TABLE(dbms_xplan.display_awr('61f38skxkw8hc'));

Get Start and End time of SQL Query

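The first query returns the load times and the cumulative elapsed and CPU time (in microseconds) of a statement in the cursor cache. The second query returns the average elapsed time per execution for a given AWR snapshot; NULLIF avoids a divide-by-zero error when a statement has no completed executions in the snapshot.

select sql_id, first_load_time, last_load_time, elapsed_time, cpu_time from v$sql where sql_text like 'with /* slow */ rws as (%';

select sql_id, elapsed_time_delta/NULLIF(executions_delta, 0) avg_elapsed
from sys.dba_hist_sqlstat
where snap_id = :snap;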

Getting SQL Event details

SELECT SNAP_ID, SAMPLE_TIME, SESSION_ID, USER_ID, SQL_ID, SQL_OPNAME,
TO_CHAR(A.SQL_EXEC_START, 'MM/DD/YYYY HH24:MI:SS'), SESSION_STATE,
EVENT, TIME_WAITED, PROGRAM, A.MACHINE FROM DBA_HIST_ACTIVE_SESS_HISTORY A
WHERE
A.SAMPLE_TIME BETWEEN TO_DATE('03/10/2018 09:00:00', 'MM/DD/YYYY HH24:MI:SS')
AND TO_DATE('03/11/2018 6:00:00', 'MM/DD/YYYY HH24:MI:SS')
AND SQL_ID = 'XXX'
AND EVENT IS NOT NULL
AND EVENT <> 'cell single block physical read'
ORDER BY SAMPLE_TIME ASC;

Monday, May 20, 2019

Overview of Informatica Customer 360i

Would you like to know what Informatica Customer 360i is? Are you also interested in the capabilities of the Informatica Customer 360i application? If so, then refer to this article. It also highlights the underlying architecture of Informatica Customer 360i.

What is Informatica Customer 360i

Informatica acquired AllSight, an artificial-intelligence-enabled customer insight company, on Feb 28, 2019. AllSight Inc., a startup, had a product named AllSight Intelligent 360. After the acquisition by Informatica, the product is called Informatica Customer 360 Insights (Customer 360i). It is powered by the CLAIRE engine (Cloud-scale AI-powered Real-time Engine). CLAIRE uses artificial intelligence (AI) and machine-learning techniques powered by enterprise-wide data and metadata. It helps to significantly boost the productivity of all managers and users of data across the organization.

Capabilities of Informatica Customer 360i

1. It connects data of any type
2. It has capabilities to manage billions of records across all data sources
3. The customer data linkages can be easily resolved
4. With the help of Customer 360i, we can create relationships using advanced machine learning algorithms
5. Using Natural Language Processing we can provide additional customer attributes from unstructured data.
6. Relationships, households, and complex B2B hierarchies can be easily visualized with the product, using a graph data store
7. It has capabilities to present multiple perspectives of the customer based on unique users and use case context

The architecture of Informatica Customer 360i

1. Customer 360 Insights is built on a big data technology stack.
2. The technologies used are Spark, Apache Hadoop, in-memory data stores, graph stores, and columnar stores.
3. Data scientists can use the R and Python languages with Informatica Customer 360i for flexibility.
4. It uses a microservices architecture to achieve scalability for deployment and redeployment of functionality.
5. It also uses the SaaS deployment model, which helps to simplify and accelerate implementation.

Use of Informatica Customer 360i

1. Informatica Customer 360i can be used for customer engagement
2. It works on structured and unstructured data sources
3. It will help enterprises to create the relationship between master, transaction, interaction, and reference data
4. These relationships will help to discover rich, personalized behavioral insights.
5. These insights can be used across the enterprise to connect customer interactions in real time and ensure the delivery of the next best action.
6. This new solution automates and simplifies profile and relationship unification
7. It also scales AI across transactions and interactions in the business data.

Customer Intelligence evolution

Application Centric
1. Fix data quality
2. De-duplication in the business data

Master Data Driven
1. Resolve duplicate records from multiple stores
2. Manages master data
3. Fix data quality in the enterprise system data

Customer Intelligence Empowered
1. Match customer entities
2. Enrich data with derived intelligence
3. Provide multiple unique customer views

Sunday, May 19, 2019

Details about Informatica MDM metadata or infrastructure tables

You might have come across the terms metadata tables or infrastructure tables during your Informatica MDM project implementation. What are these infrastructure tables? What is the significance of these tables? How can we access and use them? If you have these questions and would like to know more, then you have reached the right place. In this article, we will explore the infrastructure tables that get generated during the Base Object, Stage, and Landing table configuration. So let's start.

Introduction:

The MDM infrastructure tables are a core part of Informatica MDM. These tables are created whenever we configure the basic tables, such as Base Object (BO), Stage, and Landing tables, along with properties such as raw retention and delta detection on the Stage table or match and merge settings on the Base Object table.

What are the MDM infrastructure tables?

Assume that we create Landing table, Stage table, and Base Object table as C_L_PARTY, C_S_SALES_PARTY, and C_B_PARTY respectively. Also assume that we configure raw retention, delta detection, tokenization, match and merge rule as well. After doing all these configurations at table level the supporting tables are created.

1. Tables at Landing table level: There is no infrastructure table created at the Landing table level.
2. Tables at Staging table level: The tables created at the Staging table level are

C_S_SALES_PARTY_RAW

C_S_SALES_PARTY_PRL

C_S_SALES_PARTY_OPL

C_S_SALES_PARTY_REJ

Each of these tables has its own purpose and is used during MDM batch job execution.

3. Tables at Base Object table level: 14 supporting infrastructure tables are created.

C_B_PARTY_MTCH

C_B_PARTY_HIST

C_B_PARTY_XREF

C_B_PARTY_HXRF

C_B_PARTY_DRTY

C_B_PARTY_CTL

C_B_PARTY_HMRG

C_B_PARTY_HCTL

C_B_PARTY_EMI

C_B_PARTY_EMO

C_B_PARTY_VXR

C_B_PARTY_HVXR

C_B_PARTY_VCT

C_B_PARTY_STRP

What is the need of the MDM infrastructure tables?

An Informatica MDM implementation involves various processes, such as Stage, Load, Tokenization, Match, and Merge. During each process, data is transferred from a source table to a target table, and it is manipulated with the help of the supporting tables. For example, during the stage job, data is transferred from the Landing tables to the Staging tables. During this transfer, the landing data is maintained in the _PRL and _RAW tables. The _PRL table data is used to determine the delta of the source records, which is a subprocess of the stage job.

Similar cases are involved in the load job as well as the tokenization job. These infrastructure tables play a vital role in an Informatica MDM implementation.
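The idea behind delta detection can be sketched in a few lines of Java. This is only a conceptual illustration under simplified assumptions, not how Informatica MDM implements it internally: the class DeltaDetectionSketch, the method delta, and the representation of a row as a key and a single value are all hypothetical. The previous map plays the role of the earlier landing snapshot (what the _PRL table keeps), and the current map plays the role of the new landing data.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class DeltaDetectionSketch {
    // Returns the rows of `current` that are new, or whose value differs
    // from the value stored for the same key in `previous`.
    static Map<String, String> delta(Map<String, String> previous,
                                     Map<String, String> current) {
        Map<String, String> changed = new LinkedHashMap<>();
        for (Map.Entry<String, String> row : current.entrySet()) {
            String before = previous.get(row.getKey());  // null when the key is new
            if (!row.getValue().equals(before)) {
                changed.put(row.getKey(), row.getValue());
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        // Hypothetical landing data: primary key -> party name.
        Map<String, String> previousLoad = Map.of("1", "Alice", "2", "Bob");
        Map<String, String> currentLoad = Map.of("1", "Alice", "2", "Robert", "3", "Carol");

        // Only the changed row (2) and the new row (3) need to be staged.
        System.out.println(delta(previousLoad, currentLoad));
    }
}
```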

Relationship between Landing table and the Base Object table

The load job loads data from the Stage table to a Base Object.
There is still a dependency on the Landing table data to handle rejections.
The batch job will try to pull the source table record in order to insert it into the reject table.
If the Landing table is missing the corresponding record, then the reject table will have an entry stating that the source table entry was not found.
If the Landing table is huge and performance issues occur in the load job during rejection handling, then assess the environment to see whether a custom index should be added on the Landing table.

Is it ok to modify the existing structure of the MDM infrastructure tables?

Informatica strongly recommends that you do not modify the structure of these tables, as they are designed for internal processing purposes only. If you modify these tables, metadata validation may report errors.


Friday, February 22, 2019

Informatica Master Data Management (MDM) Architecture Overview

Are you looking for information about the Informatica Master Data Management architecture? Would you also like to know which components are involved? If so, then you have reached the right place. In this article, we will explore the Informatica MDM architecture in detail. We will also look at the upstream and downstream systems involved.

MDM Architecture Overview

As we know, Master Data Management i.e. MDM is a solution for mastering business information. MDM involves several processes with the help of which we can achieve uniformity, accuracy, and consistency in the business data. Such business-critical data can be used for better process management and for achieving the organization's goals. With the help of the MDM solution, we can carry out the data governance practices very effectively.

If we look at the big picture of the MDM architecture, we can see that there are basically three layers. The first layer is the source systems, the second layer is the MDM implementation, and the third layer is consumption.

The source system layer includes operational systems and third-party data providers. This layer may include multiple sources on different platforms, such as Siebel, Oracle, SAP, Acxiom, or D&B. The data from the source systems is not pushed to the MDM layer directly. In order to push data from a source system to the MDM layer, we normally use an ETL layer (here ETL stands for Extract, Transform, and Load). This data push may happen in batch mode, real-time mode, or near-real-time mode.

Once data has entered the MDM landing tables, data cleansing happens first. Data standardization rules are applied to enrich the business data. The cleansed and standardized data is loaded into the staging tables. To achieve data integrity, constraints are enforced on the Base Object table while loading data from the staging table to the Base Object table. This is not the end of the process; the actual processing work starts after this. Even though the cleansed and standardized records are loaded in the MDM system, there will be duplicate and near-duplicate (fuzzy) records in the system. The next process, the match process, identifies such records based on the business criteria and rules developed during the data quality analysis phase. These duplicate records are consolidated to make a golden copy of each record. The golden copies of records may hold relations among them, e.g. Manager and Employee relationships or Organization and Branch relationships. These relationships can also be maintained in the MDM system in the form of hierarchies.
Data stewardship helps to keep the golden copies of records in a consistent state and to enforce controls on the create and update processes through the user interface which comes with the Informatica MDM product, e.g. Informatica Data Director or the Customer 360 application.

All these MDM features, such as data modeling, data quality, identifying duplicates, consolidating records, maintaining hierarchies, and workflows, must not come at the expense of security; hence MDM also comes with built-in role-based security. If required, we can integrate the organization's existing security features, such as LDAP, for authentication. However, authorization of MDM components needs to happen in the MDM Hub as per the role-based mechanism. One of the great things about Informatica MDM is that it keeps the MDM configurations in sync with the help of metadata.

Okay, we have created golden copies of records in the MDM Hub. What do we do with this data? After the successful implementation of an MDM solution, the golden copies of records are available to consumers. There could be a third-party application which consumes data directly from MDM. However, in most cases, the data is pushed from MDM to the consuming systems through the ETL layer, just like the data loading from source systems to MDM. This could happen in batch mode, real time, or near real time. There are a few other types of systems, such as analytical or reporting systems, which consume the data for different purposes. Analytical consuming systems, such as data warehouses, data marts, or portal dashboards, use these golden records to analyze the data and come up with better organization growth plans. On the other hand, reporting consuming systems, such as business intelligence or corporate performance management, help to produce reports to achieve effectiveness in business processes and to achieve business goals.

MDM Architecture - Deep Dive

We now have a basic idea of where Informatica MDM fits in the enterprise application landscape. Now is the time to dive deep into the MDM system architecture. Informatica MDM has three major components: the Hub Store, the MDM Hub, and the Services Integration Framework.

The Hub Store is a database component where business data is stored and consolidated. The Hub Store is based on an underlying database, which can be Microsoft SQL Server, Oracle, or DB2. It contains information about all of the databases that are part of your Informatica MDM Hub implementation. It has two parts: one is the Master Database and the second is the Operational Reference Store, also known as the ORS. We will use the term ORS quite often during our lectures as well as during real-time MDM implementation.

What is this Master Database component? The Master Database is a database schema which maintains the most critical configuration details of the Informatica MDM Hub. It includes the user accounts created using the Users section of the MDM Hub, as well as security configuration such as usernames and encrypted passwords for database schema users and application users. The Master Database also maintains the registry of Operational Reference Stores, e.g. if you register 3 ORSs, then those 3 entries will be present in the Master Database. The default name of the Master Database is CMX_SYSTEM. In normal practice, we use the term CMX_SYSTEM as often as ORS.

We know about registering databases, creating users, providing tool access, message queues, and security providers. Where are these configurations maintained? Yes, you are right, this information is also persisted in the Master Database. The Master Database stores the connection settings and properties for each Operational Reference Store registered through the MDM Hub. In other words, we can access and manage multiple Operational Reference Stores from one Master Database.

An important thing to remember is that for a given Informatica MDM Hub environment we can have only one Master Database, i.e. only one CMX_SYSTEM for one MDM environment.

Okay, if the Master Database maintains configuration details, then where is the business data stored? The answer is that business data is stored in the Operational Reference Store. Let's understand what else the Operational Reference Store contains. Along with business data, the ORS also maintains the rules for processing the master data. If you remember match columns and match rule sets, all such rules are stored in the ORS. It also stores additional information such as BVT, tokens, and data lineage, along with history. It has repository tables, which start with C_REPOS, and repository archive tables, which start with C_REPAR, which hold all this information. Do you want to know the default name of the ORS? It is CMX_ORS, but you can name it whatever you like, because in the end it is a database schema name. Unlike the Master Database, there is no restriction on the number of ORSs in a given MDM Hub environment. However, configuring many ORSs in the MDM Hub can adversely impact your MDM environment, so use them wisely.

An important thing to remember about the ORS is that we cannot associate a single Operational Reference Store with multiple Master Databases. The Master Database also stores site-level information, such as the number of incorrect log-in attempts allowed before a user account is locked out.

The next important component in the MDM architecture is the application server. Informatica MDM supports three application servers: JBoss, WebLogic, and WebSphere. Irrespective of the application server you are using, the components which get installed on it remain the same. We normally install the Process Server and the Hub Server on the application server. Let's understand a little more about the Process Server. It is Java code (to be specific, a Java servlet) that cleanses the data and also processes batch jobs. Prior to MDM 9.7, we used to call it the cleanse server instead of the Process Server. Why? Because only cleansing used to happen on the cleanse server. But now both data cleansing and batch job processing happen on the Process Server, hence the name. We can configure multiple Process Servers for better performance. Apart from that, we can configure a Process Server in 3 different modes: Batch mode, Online mode, and Batch and Online mode. We can choose the mode as per our business need. On the other hand, the Hub Server is used for core and common services, which include security, access, and session management.

Monday, January 28, 2019

Important Informatica MDM Interview Questions and Answers - Part IV

Are you looking for Informatica MDM interview questions and answers? Are you also looking for an explanation of various concepts in MDM? If yes, then refer to this article, where we explain various MDM concepts in the form of interview questions and answers. This article will be helpful for an Informatica MDM interview. In this article, we will focus on questions and answers about cleanse functions in Informatica MDM.

Q1: What is the use of cleanse function in MDM?

Answer: The Informatica MDM Hub is used for data enrichment and consolidation. In order to be enriched, data has to go through cleansing and standardization. Cleansing is a process through which source data is cleansed of nuisance characters or words, invalid data, or repeated words. Cleansing also helps to achieve standardization, e.g. converting Limited, Ltd., Lmtd, and Lt to the standard word LTD.
In the Informatica MDM Hub, cleansing is achieved during the stage process, while moving data from the landing table to the staging table. We need to install and configure the cleanse engine before running the stage job.
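The standardization described above can be illustrated with a small plain-Java sketch. It does not use the Informatica cleanse framework or its API. The class CleanseSketch, the method cleanse, and the lookup table are hypothetical names chosen for this example, and the word list simply mirrors the LTD example from the answer.

```java
import java.util.Locale;
import java.util.Map;

class CleanseSketch {
    // Hypothetical cleanse list: known variants mapped to the standard word.
    private static final Map<String, String> STANDARD_FORMS = Map.of(
            "LIMITED", "LTD",
            "LTD.", "LTD",
            "LMTD", "LTD",
            "LT", "LTD");

    // Trims the input, collapses repeated whitespace, upper-cases every word,
    // and replaces known variants with their standard form.
    static String cleanse(String raw) {
        StringBuilder cleansed = new StringBuilder();
        for (String token : raw.trim().toUpperCase(Locale.ROOT).split("\\s+")) {
            if (cleansed.length() > 0) {
                cleansed.append(' ');
            }
            cleansed.append(STANDARD_FORMS.getOrDefault(token, token));
        }
        return cleansed.toString();
    }

    public static void main(String[] args) {
        System.out.println(cleanse("  Acme   Limited "));
        System.out.println(cleanse("Acme Ltd."));
    }
}
```

Both calls in main produce the same standardized value, which is what makes the records comparable later during the match process.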

Q2: What are the cleanse functions you have used in your projects so far?

Answer: This is one of the common questions asked during an Informatica MDM interview.

In order to achieve data cleansing, we normally use inbuilt cleanse functions such as Concatenation, Trim, Uppercase, and regular expression functions.
For complex operations where IF-ELSE conditions need to be handled, we use a graph function. We also use the Cleanse List function to achieve cleansing and standardization.
There are several scenarios where the inbuilt cleanse functions do not satisfy the business requirement. In such cases, we build a custom Java cleanse function, e.g. for determining the length of a String or the index of a character in a given String.


Q3: How to read the database using cleanse function?

Answer: The Read Database function is used to perform a lookup and get values from a database table. While using the Read Database cleanse function, we need to connect to the database and pass the table name and the column name on which we need to perform the lookup.

Normally, Read database cleanse function is used if we need to populate values in staging table by reading database table.

Q4: What is the Graph Cleanse function and how to create it?

Answer: MDM hub comes with various types of inbuilt cleanse function such as Data Conversion, General Processing, Geographic, Logic Functions, Math Functions, Misc Functions, Noise Functions, and String Functions. However, there are some business scenarios where these inbuilt functions do not meet the requirements.

In such cases, we can combine inbuilt cleanse functions to create our own cleanse function. In order to create such a function, we need to use the Graph Cleanse Function. Using a Graph Cleanse Function, we can implement IF-ELSE or CASE statement scenarios.

Q5: Have you created a custom Java Cleanse function? If yes, what was use case?

Answer: The Informatica MDM Hub comes with inbuilt cleanse functions. We can build custom complex functions by combining these inbuilt functions to cleanse and standardize the business data. There are some business cases where neither the inbuilt cleanse functions nor custom complex cleanse functions satisfy the business needs. In such cases, we need to create custom Java cleanse functions. Informatica MDM provides a Java framework for creating custom Java cleanse functions.

Business use case: Determining the geocode of a given address.
Assume that your business would like to determine the geocode of a given physical address. We have two options here:
a) Buy an Address Doctor license from Informatica and populate the coordinates for the address
b) Build custom logic using the Google Geocoding API (Free) with no extra license money

If we choose option b), we do not need to pay for determining the geocode of an address. In order to implement such custom logic, we need to write a custom Java cleanse function.
