Technology World

Thursday, July 3, 2025

Understanding the Consolidation Indicator in Informatica IDMC Customer 360 MDM SaaS

The Consolidation Indicator (CI) field in Informatica IDMC Customer 360 MDM SaaS tracks the state of each data record as it moves through the matching and merging process. That process ultimately produces a "best version of truth" for each unique entity.

The CI field can take on four distinct values, each signifying a different stage in the data consolidation lifecycle:

Match Dirty: This is the initial state for data records when they are first loaded or updated within MDM SaaS. It indicates that the record is new or has been modified and needs to be processed for potential matches.
Match Index: After the indexing job runs, the CI value transitions from "Match Dirty" to "Match Index." In this state, records are prepared to participate in the matching process. If re-indexing is required for any reason, the CI value can be reset back to "Match Dirty."
Matched: A record receives the "Matched" CI value once it has gone through the match and merge process. This applies whether a matching candidate was found or not.
Consolidated: This is the final and most desirable state for a merged record. "Consolidated" signifies that a unique and accurate "best version of truth" record has been successfully created.
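
The four states above can be read as a small state machine. The following is a minimal sketch for illustration only, not an Informatica API: the names CI, ALLOWED, and advance are hypothetical, and the Matched to Consolidated edge is an assumption based on the order in which the states are listed. Resets caused by record updates are not modeled.

```python
from enum import Enum


class CI(Enum):
    MATCH_DIRTY = "Match Dirty"
    MATCH_INDEX = "Match Index"
    MATCHED = "Matched"
    CONSOLIDATED = "Consolidated"


# Transitions as read from the descriptions above (hypothetical model).
# MATCHED -> CONSOLIDATED is an assumption based on the listed order.
ALLOWED = {
    CI.MATCH_DIRTY: {CI.MATCH_INDEX},              # indexing job ran
    CI.MATCH_INDEX: {CI.MATCH_DIRTY, CI.MATCHED},  # re-index, or match and merge ran
    CI.MATCHED: {CI.CONSOLIDATED},
    CI.CONSOLIDATED: set(),                        # updates are not modeled here
}


def advance(current: CI, target: CI) -> CI:
    """Return target if the move is modeled, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"{current.value} -> {target.value} is not modeled")
    return target


state = CI.MATCH_DIRTY                   # record was just loaded
state = advance(state, CI.MATCH_INDEX)   # after the indexing job
state = advance(state, CI.MATCHED)       # after match and merge
print(state.value)
```

Keeping the transitions in one table makes it easy to spot a record whose CI history does not follow the expected path.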

Beyond these core values, there are several important aspects related to the CI field:

Accept as Unique: If the "accept as unique" option is enabled for a record that undergoes the match and merge process but doesn't find any matching records, its state will change from "Match Index" to "Matched." This allows the record to be treated as unique even without a direct match.
XREF Record Updates: When an XREF (cross-reference) record is updated, its CI value automatically reverts to "Match Index." This ensures that the updated XREF record is re-evaluated through the match and merge process.
CI Field Location: A key distinction in MDM SaaS compared to on-premise MDM is the placement of the CI field. In MDM SaaS, the CI field is located at the XREF level, not the business entity level. This granular placement provides more precise control over data consolidation.
Extracting Consolidated Records: To extract only the consolidated records, a two-step extraction process is necessary. This involves creating one extract for business entity records and another for XREF records. These two extracts can then be joined to filter and retrieve only the consolidated values.
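
The two-extract join described above could look roughly like this once both extracts are loaded into memory. This is a sketch under stated assumptions: the column names (business_id, ci, source, full_name) are hypothetical and real extracts will differ, and treating an entity as consolidated only when all of its XREF rows are "Consolidated" is one possible reading of the filter.

```python
# Hypothetical rows from the two extracts; real column names will differ.
business_entities = [
    {"business_id": "BE1", "full_name": "Ana Ruiz"},
    {"business_id": "BE2", "full_name": "Raj Patel"},
]
xrefs = [
    {"business_id": "BE1", "source": "CRM", "ci": "Consolidated"},
    {"business_id": "BE1", "source": "ERP", "ci": "Consolidated"},
    {"business_id": "BE2", "source": "CRM", "ci": "Match Index"},
]


def consolidated_entities(entities, xref_rows):
    """Keep entities whose XREF rows are all in the Consolidated state."""
    states = {}
    for row in xref_rows:
        # Collect the distinct CI values seen for each business entity.
        states.setdefault(row["business_id"], set()).add(row["ci"])
    return [e for e in entities if states.get(e["business_id"]) == {"Consolidated"}]


for entity in consolidated_entities(business_entities, xrefs):
    print(entity["business_id"], entity["full_name"])
```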

In summary, the Consolidation Indicator is a fundamental component of Informatica IDMC Customer 360 MDM SaaS, providing clear visibility into the data consolidation journey and enabling robust data management practices.

Tuesday, June 3, 2025

Introduction to Reltio Master Data Management

In today’s digital economy, data is the foundation of successful business operations. However, with data pouring in from countless sources — sales platforms, marketing systems, customer service channels, and more — many organizations struggle with fragmented, inconsistent, or outdated information. This is where Master Data Management (MDM) comes in, and Reltio is a leading player in this space.

Reltio Master Data Management is a modern, cloud-native MDM platform designed to help enterprises consolidate, cleanse, and unify their critical data assets. By creating a single, trusted source of truth, Reltio enables businesses to drive better decisions, improve customer experiences, and enhance compliance.

A Brief History of Reltio

Reltio was founded in 2011 by Manish Sood, a data industry veteran who saw the limitations of legacy MDM systems firsthand. With its headquarters in Redwood City, California, Reltio set out to build a next-generation MDM platform, designed for the cloud and for the demands of modern data-driven enterprises.

Since then, Reltio has grown rapidly, attracting investment from major venture capital firms and building a customer base across Fortune 500 organizations in healthcare, life sciences, financial services, retail, and other sectors. Its platform is recognized by industry analysts (such as Gartner and Forrester) for its innovation, scalability, and business value.

Why Reltio?

Unlike traditional on-premise MDM solutions, Reltio offers a cloud-first, API-driven architecture that supports real-time data processing and integration. Here are some standout features that make Reltio a compelling choice:

Multi-Domain MDM: Manage customer, product, supplier, and location data in one place
Cloud-native & scalable: Handles high-volume, high-velocity data seamlessly
Data quality & governance: Cleansing, validation, survivorship, lineage tracking
Graph technology: Discover and leverage entity relationships with connected graph models
API-first & real-time: Modern integration to power digital ecosystems

Detailed Business Use Cases with Attributes

Let’s look at some practical business use cases where Reltio is especially valuable, with examples of typical attributes managed in each:

1️⃣ Customer 360 for Financial Services

Use Case: A bank needs to create a unified customer profile to improve onboarding, risk assessment, and personalized product offerings.

Typical attributes managed:

Name
Address
Social Security Number / National ID
Date of Birth
Contact numbers
Email addresses
Account numbers
KYC documents
Risk rating
Credit score
Relationships to other customers or accounts (beneficiaries, joint account holders)

Business value:

Improved compliance (KYC/AML)
Better fraud detection
Personalized cross-selling opportunities

2️⃣ Product 360 for Retail & E-commerce

Use Case: A global retailer needs a single view of its products across all sales channels, to drive consistency in pricing, promotions, and supply chain.

Typical attributes managed:

SKU
Product name
Description
Brand
Price
Categories
Product images
Inventory levels
Supplier details
Related products / bundles

Business value:

Faster time-to-market for new products
Accurate inventory planning
Seamless omnichannel experience

3️⃣ Healthcare Provider 360

Use Case: A healthcare network needs to manage consistent information about its providers (doctors, specialists, clinics) to streamline referrals and claims processing.

Typical attributes managed:

Provider name
NPI (National Provider Identifier)
Specialty
License details
Affiliated hospitals
Availability
Contact information
Insurance acceptance
Certifications

Business value:

Reduced claim rejections
Improved care coordination
Enhanced provider search tools for patients

4️⃣ Supplier 360 for Manufacturing

Use Case: A manufacturer wants to manage supplier information globally to optimize procurement, quality, and compliance.

Typical attributes managed:

Supplier name
Tax ID
Supplier location
Product categories supplied
Pricing agreements
Contracts
Quality certifications
Risk assessments
Relationship hierarchy (parent/subsidiary)

Business value:

Reduced supplier risk
Consolidated spend
Better contract compliance

Typical Industries Benefiting from Reltio

Retail & E-commerce — better product and customer data for omnichannel
Financial Services — single customer view for compliance and fraud
Healthcare — provider and patient data management
Life Sciences — compliance and product data governance
Manufacturing — supplier and product data optimization

Conclusion

Master Data Management is no longer a “nice-to-have” — it’s a business imperative. Reltio’s modern, flexible, and scalable approach helps enterprises build a trustworthy data foundation to thrive in the digital era.

With its rich history of innovation, strong multi-domain capabilities, and focus on real-time, API-driven architecture, Reltio is well positioned to support modern businesses as they navigate increasingly complex data challenges.

If you’re exploring a future-ready MDM solution to unify and unleash the power of your data, Reltio is absolutely worth a closer look.

Wednesday, March 26, 2025

Understanding NULL Handling in Informatica MDM: Allow NULL Update vs. Apply NULL Values

NULL handling in Informatica MDM plays a crucial role in data consolidation and survivorship. Two key properties that determine how NULL values are managed are Allow NULL Update and Apply NULL Values. Let’s break them down:

1. Allow NULL Update on the Staging Table

This property controls whether a NULL value can overwrite a non-NULL value during a load job.

Enabled: A non-NULL value in a column can be updated to NULL.
Disabled: Prevents NULL updates, retaining existing non-NULL values.
Behavior in Cross-Referenced (XREF) Records:
- If a Base Object has a single XREF, a NULL can overwrite a non-NULL value.
- For multiple XREFs, NULL updates are managed based on the Allow NULL Update setting.
- To maintain consistency across single and multi-XREF records, a user exit can be implemented.

2. Apply NULL Values on the Base Object

This property determines how NULL values are treated during the consolidation process.

By Default (Disabled):
- NULL values are automatically downgraded, ensuring non-NULL values survive.
When Enabled:
- NULL values are treated normally with trust scores.
- NULLs may overwrite non-NULL values during put-operations or consolidations.
- Higher trust scores allow NULL values to survive in the Base Object.

3. Comparison: Allow NULL Update vs. Apply NULL Values

4. How MDM Determines NULL Survivorship

For each XREF column, MDM follows these steps:

Identify the source stage table:
- If the XREF record has a non-null STG_ROWID_TABLE, use it.
- If not, use ROWID_SYSTEM to find the source stage table.
If only one source stage table exists:
- Use the Allow NULL Update setting of that table.
If multiple source stage tables exist:
- If all have the same setting, use it.
- If inconsistent, refer to the Apply NULL Values setting on the Base Object.
If no stage table is found, use the Apply NULL Values setting on the Base Object.
If Allow NULL Update is false, the trust score of NULL values is significantly downgraded, reducing the likelihood of NULLs surviving.
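
The decision steps above can be condensed into one function. This is an illustrative sketch of the logic as described in this post, not Informatica code; the function name effective_null_setting and its parameters are hypothetical.

```python
from typing import Sequence


def effective_null_setting(stage_allow_null: Sequence[bool], apply_null_values: bool) -> bool:
    """Return the NULL setting that applies to one XREF column.

    stage_allow_null: the Allow NULL Update setting of each source stage table found.
    apply_null_values: the Apply NULL Values setting on the Base Object.
    """
    distinct = set(stage_allow_null)
    if len(distinct) == 1:
        # One stage table, or several that agree: use the shared setting.
        return distinct.pop()
    # No stage table found, or the settings conflict: use the Base Object setting.
    return apply_null_values


print(effective_null_setting([True], apply_null_values=False))
print(effective_null_setting([True, False], apply_null_values=False))
```

A result of False corresponds to the case where the trust score of NULL values is downgraded.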

5. Operations Affected by NULL Handling

All operations involving Best Version of Truth (BVT) calculation follow these rules, including:

Load/Put/CleansePut
Merge/Unmerge
Recalculate BVT
Revalidate

By understanding these settings, you can better manage data integrity and ensure accurate MDM processing!

Wednesday, November 13, 2024

Understanding Survivorship in Informatica IDMC - Customer 360 SaaS

In Informatica IDMC - Customer 360 SaaS, survivorship is a critical concept that determines which data from multiple sources should be retained when records are merged or updated. It's a set of rules and strategies designed to ensure data accuracy, consistency, and reliability.

Key Concepts

Source Ranking:
- Assigning Trust: Each source system is assigned a rank based on its reliability and data quality.
- Prioritizing Data: Higher-ranked sources are considered more trustworthy and their data takes precedence.
- Example: If you have two sources, "HR" and "Sales," with HR being more reliable, you might assign it a rank of 1 and Sales a rank of 2. When a conflict arises, data from HR would be prioritized.
Survivorship Rules:
- Defining the Rules: These rules dictate how conflicts between field values from different sources are resolved.
- Common Rule Types:
  - Maximum: Selects the maximum value.
  - Minimum: Selects the minimum value.
  - Decay: Considers the trust level and decay rate of a source over time.
  - Custom: Allows for more complex rules based on specific business requirements.
- Example: For a "Customer Address" field, a decay rule might be applied, giving more weight to recent updates from a trusted source.

Source Last Updated Date:
- Resolving Ties: When multiple sources have the same trust level and ranking, the source with the most recent update is prioritized.
- Example: If two sources, both ranked equally, provide different values for a "Phone Number" field, the value from the source with the latest update would be chosen.
Block Survivorship:
- Grouping Fields: Allows you to treat a group of related fields as a single unit.
- Preserving Consistency: When a block survives, all fields within the block are retained together.
- Example: A "Customer Address" block might include "Street," "City," "State," and "ZIP Code." If the block survives from one source, all these fields are retained.
Deduplication Criteria:
- Identifying Duplicates: Defines the conditions for identifying duplicate records.
- Resolving Duplicates: Determines how to merge duplicate records, often based on survivorship rules.
- Example: You might deduplicate customers based on a combination of "First Name," "Last Name," and "Email Address."
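
Source ranking combined with the last-updated tie-break can be sketched in a few lines. This is a simplified illustration, not how Customer 360 evaluates rules internally; the field names (rank, updated, value) and the sample data are hypothetical, and a lower rank number is assumed to mean a more trusted source, as in the HR and Sales example above.

```python
from datetime import date

# Hypothetical candidate values for one field; lower rank means more trusted.
candidates = [
    {"source": "Sales", "rank": 2, "updated": date(2024, 11, 1), "value": "555-0100"},
    {"source": "HR", "rank": 1, "updated": date(2024, 6, 15), "value": "555-0199"},
]


def surviving_value(rows):
    """Pick the best-ranked row; break rank ties with the latest update date."""
    best = min(rows, key=lambda r: (r["rank"], -r["updated"].toordinal()))
    return best["value"]


print(surviving_value(candidates))
```

Here the HR value survives even though Sales was updated more recently, because recency is only consulted when ranks are equal.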

Practical Example: Customer Data Merge

Imagine you have two source systems: "HR" and "Sales." Both systems have customer data, but there are inconsistencies and missing information.

Source Ranking: HR is ranked higher than Sales.
Survivorship Rules:
- For "Name," the maximum value is chosen.
- For "Address," the most recent update from the higher-ranked source is selected.
- For "Phone Number," a decay rule is applied, giving more weight to recent updates.
Block Survivorship: The "Address" block is treated as a unit.

If a customer record exists in both systems with conflicting data, the merge process would:

Keep the maximum "Name" value across HR and Sales, as the rule above specifies.
Use the most recent "Address" from HR.
Select the "Phone Number" with the highest trust score, considering recency.

Effective Survivorship Configuration

Clear Understanding of Data Sources: Assess the reliability and quality of each source.
Prioritize Critical Fields: Focus on configuring survivorship rules for fields that are essential to business operations.
Consider Data Quality and Consistency: Analyze data quality issues and inconsistencies to optimize survivorship rules.
Regular Review and Refinement: Continuously monitor and adjust survivorship configurations as data sources and business requirements evolve.
Test Thoroughly: Implement a robust testing strategy to validate survivorship behavior and identify potential issues.

By carefully configuring survivorship rules, you can ensure that your master data is accurate, consistent, and reliable, enabling better decision-making and improved business processes.


Wednesday, October 30, 2024

What is a Glue Job in AWS?

An AWS Glue job is a managed ETL (Extract, Transform, Load) job used to process data in AWS. AWS Glue makes it easy to discover, prepare, and integrate data from various sources for analytics, machine learning, and application development.

How AWS Glue Jobs Work

AWS Glue jobs let you process large datasets using Apache Spark or small tasks with Python Shell scripts. The main workflow includes:

Data Extraction: Reading data from sources like Amazon S3, RDS, Redshift, etc.
Data Transformation: Applying transformations to clean, enrich, or format the data.
Data Loading: Writing the transformed data back to storage or analytical services.

Sample Glue Job Code

Below is an example of a Glue job script written in Python that reads data from an Amazon S3 bucket, applies a simple transformation, and writes the result back to another S3 bucket. This script uses the glueContext object, which is part of Glue’s Python API for Spark.

import sys

from awsglue.transforms import *  # provides Filter
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.dynamicframe import DynamicFrame
from awsglue.job import Job  # required for Job(glueContext) below
from pyspark.context import SparkContext

# Initialize Glue context
args = getResolvedOptions(sys.argv, ['JOB_NAME'])
sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args['JOB_NAME'], args)

# Step 1: Read JSON data from S3 into a DynamicFrame
source_data = glueContext.create_dynamic_frame.from_options(
    connection_type='s3',
    connection_options={'paths': ['s3://source-bucket/path/to/data']},
    format='json'
)

# Step 2: Apply transformation (keep rows where 'age' > 30)
filtered_data = Filter.apply(frame=source_data, f=lambda row: row['age'] > 30)

# Step 3: Write transformed data back to S3 as Parquet
output = glueContext.write_dynamic_frame.from_options(
    frame=filtered_data,
    connection_type='s3',
    connection_options={'path': 's3://target-bucket/path/to/output'},
    format='parquet'
)

# Commit the job
job.commit()

Explanation of the Code

Initialization: Sets up the Glue job context, which provides the Spark session and AWS Glue API.
Data Extraction: Reads JSON data from the source S3 bucket into a DynamicFrame, which is a Glue-specific data structure for Spark.
Transformation: Filters records to include only those where the age field is greater than 30.
Data Loading: Writes the transformed data back to an S3 bucket in Parquet format, which is optimized for analytics.
Commit: Completes the job.
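
The filter condition can be tried locally before it goes into a job. The sketch below runs on plain Python dictionaries and does not use the Glue runtime; the function name is_over_30 and the sample rows are hypothetical. It also guards against a missing or null age, since comparing None with 30 raises a TypeError in plain Python.

```python
def is_over_30(row: dict) -> bool:
    """Same condition as the Filter step, tolerant of a missing or null age."""
    age = row.get("age")
    return age is not None and age > 30


# Hypothetical sample rows, including the edge cases worth checking.
sample_rows = [
    {"name": "a", "age": 42},
    {"name": "b", "age": 30},
    {"name": "c", "age": None},
    {"name": "d"},
]

kept = [row["name"] for row in sample_rows if is_over_30(row)]
print(kept)
```

A named function like this could be passed as the f argument in place of the lambda, which keeps the condition testable on its own.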

Features of AWS Glue Jobs

Job Scheduling and Triggers: AWS Glue jobs can run on a schedule, on-demand, or based on events.
Serverless and Scalable: Glue jobs scale automatically with the volume of data and remove the need to manage infrastructure.
Data Catalog Integration: Glue jobs can leverage the Glue Data Catalog, a central repository for storing metadata about data sources.

AWS Glue jobs streamline data engineering tasks and are widely used in AWS-based data pipelines for data analytics and machine learning projects.


Tuesday, September 24, 2024

Dynatrace: An Overview

Dynatrace, a leading provider of software intelligence, offers a powerful platform designed to monitor, analyze, and optimize the performance of complex applications and infrastructure. With its advanced AI capabilities, Dynatrace provides comprehensive insights into the behavior of applications, enabling organizations to proactively identify and resolve performance issues.

Key Features of Dynatrace

AI-Powered Automation: Dynatrace's AI engine, Davis, automatically discovers and maps application dependencies, eliminating the need for manual configuration.
Real User Monitoring (RUM): Gain deep insights into the user experience by tracking performance metrics from the end-user perspective.
Synthetic Monitoring: Simulate user interactions to proactively identify performance bottlenecks and ensure application availability.
Infrastructure Monitoring: Monitor the health and performance of your underlying infrastructure, including servers, networks, and databases.
Application Performance Management (APM): Gain visibility into the performance of your applications, from the frontend to the backend.
Cloud Monitoring: Monitor the performance of applications running in cloud environments, including AWS, Azure, and GCP.

Benefits of Using Dynatrace

Improved Application Performance: Identify and address performance bottlenecks before they impact users.
Enhanced User Experience: Deliver faster and more reliable applications to improve customer satisfaction.
Reduced Mean Time to Repair (MTTR): Quickly diagnose and resolve issues, minimizing downtime.
Proactive Problem Resolution: Predict potential problems and take preventive measures.
Cost Optimization: Identify opportunities to optimize resource utilization and reduce costs.

Disadvantages of Dynatrace

Steep Learning Curve: Dynatrace can be complex to set up and configure, especially for large and complex environments.
High Cost: Dynatrace can be expensive, particularly for organizations with extensive monitoring needs.
Limited Customization: While Dynatrace offers a high degree of automation, customization options can be limited in some areas.

Major Dynatrace Consumers

Dynatrace is used by a wide range of organizations across various industries, including:

Technology Companies: Software developers, cloud providers, and IT service providers.
Financial Services: Banks, insurance companies, and investment firms.
Healthcare: Hospitals, pharmaceutical companies, and healthcare providers.
Retail: E-commerce companies, brick-and-mortar retailers, and supply chain management organizations.
Government: Government agencies and public sector organizations.

Dynatrace offers a comprehensive platform for monitoring and optimizing application performance. While it can be complex and expensive, the benefits in terms of improved user experience, reduced downtime, and cost optimization can make it a valuable investment for organizations seeking to ensure the reliability and performance of their applications.

Saturday, September 21, 2024

Informatica IDMC Match and Merge Process: A Comprehensive Guide

The Match and Merge process in Informatica Intelligent Data Management Cloud (IDMC) plays a critical role in Master Data Management (MDM) by unifying and consolidating duplicate records to create a “golden record” or a single, authoritative view of the data. This functionality is particularly important for Customer 360 applications, but it also extends to other domains like product, supplier, and financial data.

In this article, we’ll break down the core concepts, the configuration details, and the Cloud Application Integration processes involved in implementing Match and Merge within Informatica IDMC.

1. Key Concepts in Match and Merge

a. Match Process:

• Matching refers to identifying duplicate or similar records in your data set. It uses a combination of deterministic (exact match) and probabilistic (fuzzy match) algorithms to compare records based on pre-configured matching rules.

• The process involves evaluating multiple attributes (such as name, email, address) and calculating a “match score” to determine if two or more records are duplicates.

• Match Rule: A match rule is a set of criteria used to identify duplicates. These rules consist of one or more conditions that define how specific fields (attributes) are compared.

• Match Path: When matching hierarchical or relational data (like a customer with their addresses), the match path defines how related records are considered for matching.

b. Merge Process:

• Merging involves consolidating the matched records into a single record. This process is guided by survivorship rules that determine which data elements to keep from the duplicate records.

• The goal is to create a golden record, which is an authoritative version of the data that represents the most accurate, complete, and up-to-date information.

c. Survivorship Rules:

• Survivorship rules govern how to prioritize values from different duplicate records when merging. They can be configured to pick values based on data quality, recency, completeness, or by source system hierarchy.

• Common strategies include: most recent value, most complete value, best source, or custom rules.

d. Consolidation Indicator:

• A flag or status in the IDMC system that indicates whether a record is a consolidated master record or if it is a duplicate that has been merged into a golden record.

2. Configuration of Match and Merge in Informatica IDMC

To configure Match and Merge in Informatica IDMC, there are several steps that involve setting up match rules, survivorship strategies, and managing workflows in the cloud interface.

a. Creating Match Rules

Match rules are at the core of the matching process and determine how potential duplicates are identified. In IDMC, these rules can be created and configured through the Business 360 Console interface.

• Exact Match Rules: These rules compare records using a simple “equals” condition. For instance, an exact match rule could check if the first name and last name fields are identical in two records.

• Fuzzy Match Rules: Fuzzy match rules, often based on probabilistic algorithms, allow for minor variations in the data (e.g., typos, abbreviations). These are ideal for matching names or addresses where slight inconsistencies are common.

• Algorithms like Levenshtein distance, Soundex, or Double Metaphone can be used.

• Weighted Matching: For more sophisticated matching, each field can be assigned a weight, indicating its importance in determining a match. For example, an email match might have more weight than a phone number match.

• Thresholds: A match rule also defines a threshold score, which determines the cutoff point for when two records should be considered a match. If the total match score exceeds the threshold, the records are considered potential duplicates.
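
Weighted matching with a threshold can be illustrated outside IDMC with a short script. This is only a sketch of the idea, not how the Business 360 Console evaluates rules: the weights, the threshold, and the field names are hypothetical, and Python's difflib.SequenceMatcher stands in for the fuzzy algorithms named above (it is not Levenshtein, Soundex, or Double Metaphone).

```python
from difflib import SequenceMatcher

# Hypothetical field weights (summing to 1) and match threshold.
WEIGHTS = {"email": 0.5, "name": 0.3, "phone": 0.2}
THRESHOLD = 0.85


def similarity(a: str, b: str) -> float:
    """Fuzzy similarity in [0, 1] on trimmed, lower-cased strings."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def match_score(rec_a: dict, rec_b: dict) -> float:
    """Weighted sum of per-field similarities."""
    return sum(weight * similarity(rec_a[field], rec_b[field])
               for field, weight in WEIGHTS.items())


def is_match(rec_a: dict, rec_b: dict) -> bool:
    """Two records are potential duplicates when the score exceeds the threshold."""
    return match_score(rec_a, rec_b) > THRESHOLD


a = {"email": "ana.ruiz@example.com", "name": "Ana Ruiz", "phone": "555-0100"}
b = {"email": "ana.ruiz@example.com", "name": "Anna Ruiz", "phone": "555-0100"}
print(round(match_score(a, b), 3), is_match(a, b))
```

Because email carries the largest weight here, a small variation in the name alone is unlikely to push an otherwise identical pair below the threshold.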

b. Configuring Survivorship Rules

Survivorship rules are essential for determining which values will be retained when records are merged.

• Most Recent: Retain values from the record with the most recent update.

• Most Complete: Choose values from the record that has the most complete set of information (fewest nulls or missing fields).

• Source-based: Give preference to certain systems of record (e.g., CRM system over a marketing database).

• Custom Rules: Custom survivorship logic can be defined using scripts or expression languages to meet specific business needs.
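
As one concrete case, the "Most Complete" strategy can be sketched as counting populated fields and keeping the record with the highest count. The helper names and sample records are hypothetical, and treating blank strings as missing is an assumption.

```python
def completeness(record: dict) -> int:
    """Count fields that are neither None nor blank."""
    return sum(1 for value in record.values()
               if value is not None and str(value).strip() != "")


def most_complete(records):
    """Return the record with the most populated fields (first one wins ties)."""
    return max(records, key=completeness)


# Hypothetical duplicates of the same customer.
duplicates = [
    {"name": "Ana Ruiz", "email": None, "phone": ""},
    {"name": "Ana Ruiz", "email": "ana@example.com", "phone": "  "},
    {"name": "A. Ruiz", "email": "ana@example.com", "phone": "555-0100"},
]

print(most_complete(duplicates))
```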

c. Defining Merge Strategies

• The merge strategy defines how records are consolidated once a match is identified. This could be a hard merge (where duplicate records are permanently deleted and only the golden record remains) or a soft merge (where records are logically linked, but both are retained for audit and tracking purposes).

3. Cloud Application Integration in Match and Merge

In Informatica IDMC, Cloud Application Integration (CAI) is used to automate and orchestrate the match and merge processes. Cloud Application Integration allows you to create sophisticated workflows for real-time, event-driven, or batch-driven match and merge operations.

a. Key Components of CAI

• Processes and Services: CAI provides prebuilt processes or custom-built processes that handle events (e.g., new records created) and trigger match and merge jobs.

• Business Process Management: You can orchestrate the entire customer data flow by using CAI to manage how and when records are matched and merged based on predefined criteria or user input.

• Real-Time Integration: CAI supports real-time matching, where data coming in from different systems (e.g., CRM, e-commerce platforms) is automatically deduplicated and consolidated into the master record as soon as it is ingested into IDMC.

b. Steps for Cloud Application Integration

1. Triggering Match Process: CAI workflows can be set up to initiate the match process when new data is imported, updated, or synchronized from external sources. For example, a batch of customer records from a CRM system can trigger the match job.

2. Handling Match Results: Once potential matches are identified, CAI workflows can determine whether to automatically merge the records or send them for manual review.

3. Merge Execution: If the match job identifies duplicate records, CAI can trigger a merge process based on predefined merge strategies and survivorship rules.

4. Data Stewardship Involvement: In more complex scenarios, CAI can notify data stewards when manual intervention is required (e.g., for borderline matches that need human review).
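
The decision in steps 2 to 4 (merge automatically, send to a steward, or treat as distinct) boils down to routing on the match score. The sketch below shows only that decision logic in Python; it is not a CAI process definition, and both cutoff values and the route labels are hypothetical.

```python
# Hypothetical cutoffs; real values come from tuning against your own data.
AUTO_MERGE_AT = 0.92
REVIEW_AT = 0.75


def route(score: float) -> str:
    """Decide what happens to a candidate pair based on its match score."""
    if score >= AUTO_MERGE_AT:
        return "auto_merge"       # confident duplicate: merge without review
    if score >= REVIEW_AT:
        return "steward_review"   # borderline: notify a data steward
    return "no_match"             # treat as distinct records


for score in (0.97, 0.80, 0.40):
    print(score, route(score))
```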

c. Automating Matching and Merging with Real-Time Updates

CAI can integrate with external systems using connectors to keep master data up to date across different environments. For example:

• New customer records from an e-commerce platform can be automatically compared with existing records in IDMC to determine if they represent new customers or duplicates.

• Based on the match results, CAI can trigger a workflow that either updates the master record or adds a new record to the system.

4. Best Practices for Match and Merge in Informatica IDMC

• Define Clear Match Rules: Start with exact match rules for critical fields (such as customer ID) and add fuzzy rules for fields prone to variations (e.g., name and address).

• Test Match Thresholds: Experiment with match scores and thresholds to fine-tune the balance between over-merging (false positives) and under-merging (false negatives).

• Monitor Performance: Match and merge operations can be resource-intensive, especially with large datasets. Use IDMC’s built-in monitoring tools to track the performance and optimize configurations.

• Data Stewardship: Set up workflows that allow data stewards to review borderline cases or suspicious matches to ensure high data quality.
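
Threshold testing can be made concrete by counting false positives and false negatives on a small set of pairs that a data steward has already labeled. The data and the evaluate helper below are hypothetical and only illustrate the trade-off between over-merging and under-merging.

```python
# Hypothetical labeled pairs: (match score, True if a steward confirmed a duplicate).
labeled_pairs = [
    (0.95, True), (0.90, True), (0.82, False),
    (0.80, True), (0.60, False), (0.40, False),
]


def evaluate(threshold: float, pairs):
    """Return (false_positives, false_negatives) when merging pairs scoring above threshold."""
    false_positives = sum(1 for score, dup in pairs if score > threshold and not dup)
    false_negatives = sum(1 for score, dup in pairs if score <= threshold and dup)
    return false_positives, false_negatives


for threshold in (0.5, 0.7, 0.85):
    print(threshold, evaluate(threshold, labeled_pairs))
```

In this sample, raising the threshold removes the over-merges but starts to miss a true duplicate, which is the balance the threshold tests are meant to expose.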

The Match and Merge process in Informatica IDMC provides a robust framework for deduplicating and consolidating customer data, ensuring that organizations can achieve a 360-degree view of their customers. However, to get the most value from this functionality, it’s essential to configure match rules, survivorship logic, and cloud workflows thoughtfully. By leveraging Informatica IDMC’s Cloud Application Integration features, organizations can automate and streamline their data unification processes while ensuring high-quality, reliable, and accurate customer records.


DronaBlog
