Sunday, 14 June 2015

Lookup Cache Types in Informatica


To boost the performance of Lookup Transformation , we mention to cache the lookup data , so it can directly look into cache file instead of connecting to source and then reading data from there for each row -

Lookup can generate a cache file which persists for the duration of the session, or even created as a permanent named cache. This means that the data required for the lookup is read from its source only once. Informatica creates an index on the lookup cache minimizing the processing time for calls to the cache. 

Static Lookup :
In case of static cache ,the Integration Service does not update the cache while it processes the transformation. This is why it is called as Static. In Static Cache when the Lookup condition is true it return value from lookup table else returns Null or Default value. In Static Cache the important thing is that you cannot insert or update the cache.
Static Cache Lookup

Static Cache is normally used while referring to reference Data or when Data source is not a target for the mapping.
Example : Referring to Dept table to get Dept Name based on the DEPT no for Each employee.

Dynamic Lookup:

In Dynamic Cache we can insert or update rows in the cache when we pass the rows. Normally while using Target  as the lookup source to identify if new record is already there in Target or not we use this Dynamic cache.When a new record comes and it is not present in Cache (means missing in Target as well) , then it will insert this new record in Cache as well . So the dynamic cache is synchronized with the target with each processed row.
Dynamic Cache Lookup
Example : If there are 2 or more entries of the same customer on the same day in Source systems then while loading to the target we want the Lookup Cache to be refreshed dynamically(Insert/Update) and see the latest data in Cache.

Non Persistent Cache:

By default, the Informatica Server uses a non-persistent cache when you enable caching in a Lookup transformation. The Informatica Server deletes the cache files at the end of a session. The next time you run the session, the Informatica Server builds the memory cache from the database

Persistent Cache:

Persistent lookups are used when there is a need to save and reuse the existing cache files. Persistent cache is used when a lookup table does not change between session runs. The first time the Informatica Server runs a session using a persistent lookup cache; it saves the cache files to disk instead of deleting them on session completion. The next time the Informatica Server runs the session which calls the same persistent lookup, it builds the memory cache from the cache files, eliminating the time required to read the lookup table.  If the lookup table changes occasionally, you can override session properties to recache the lookup from the database.

Advantages:

  • Informatica Server uses existing cache files for subsequent lookups, eliminating the time required to built the cache again
  • Cache file can be shared
  • Persistent lookup cache can be used for same multiple lookup call with in the same mapping / different mappings

Disadvantages:

  • Cache needs to be rebuilt if lookup table changes occasionally
  • Cache needs to be rebuilt if lookup transformation is changed or data movement code is changed or database connection information is changed
  • Lookup SQL override, has to be same in all lookups using same persistent cache file, else the mapping will fail
Note: In case there are multiple interdependent jobs in a workflow, which uses same persistent lookup cache, it is recommended to have dummy jobs for persistent cache file generation in the beginning of workflow. This eases the support and maintenance and jobs execution time. As a practice, please ensure to delete the existing cache file prior to re-building the cache. Informatica internally handles deletion of existing cache files; it is still preferred to have the UNIX script for deletion. (For e.g.: In case session for building lookup cache is disabled or is not executed, then the subsequent session will use the same cache file available on the server. The addition of UNIX scripts job will ensure files are deleted; this can be done through command task 
Courtesy:Techtricks

No comments:

Post a Comment