A highly motivated and ambitious individual able to give timely and accurate advice, guidance, support and training to team members and individuals. Having the ability to work with the minimum of supervision whilst leading a team. Having a proven ability to lead by example, consistently hit targets, improves best practices and organizes time efficiently.
Thursday, 18 June 2015
Transformations that supports Sorted Input
1)Aggregator
2)Joiner
3)Look up
2)Joiner
3)Look up
Stop vs Abort
Stop:
- The reading process will immediately be killed by using Stop command
- It does not have time out period
Abort:
- It gives a time out period of 60 seconds to the Informatica server.
- Otherwise it kills the DTM process.
- The reading process will immediately be killed by using Stop command
- It does not have time out period
Abort:
- It gives a time out period of 60 seconds to the Informatica server.
- Otherwise it kills the DTM process.
List of output files during a session run
1. Session log
2. Badfile
3. Errors log
4. Workflow log
2. Badfile
3. Errors log
4. Workflow log
Mapping, Session, Worklet, Workflow, Mapplet
Mapping : Represents flow of data from source to destination
Session : A set of instructions which describes the data movement from source to destination
Worklet : Set of tasks
Workflow : A set of instructions that specifies the way to execute the tasks to Informatica
Mapplet : Used for creation and configuration of a group of transformations
Session : A set of instructions which describes the data movement from source to destination
Worklet : Set of tasks
Workflow : A set of instructions that specifies the way to execute the tasks to Informatica
Mapplet : Used for creation and configuration of a group of transformations
factless fact table
It can view the number of occurring events
Example :
- A number of accidents occurred in a month.
ETL Architect Sample Questions
Depending on who's doing the interview and how serious they are:
-Describe advantages of the CIF architecture versus the bus architecture with conformed dimensions. Which would fit best in our environment given [some parameters they give you] and why
-Describe snowflaking
-Describe factless fact tables.
-Draw a star schema of our business
-Describe common optimization techniques applied at the data model level
-How do you handle data rejects in a warehouse architecture?
-Describe common techniques for loading from the staging area to the warehouse when you only have a small window.
-How do you load type 1 dimensions
-How do you load type 2 dimensions, and how would you load it given our [insert business particularity]
-How would you model unbalanced hierarchies
-How would you model cyclic relations
-What major elements would you include in an audit model?
-How would you implement traceability?
Wednesday, 17 June 2015
Load Duplicates and Unique values in separate targets
Sunday, 14 June 2015
Send Mail From Shell Script
Here we will see simple bash script to send emails using the mail command in linux operating system.
#!/bin/bash TO_ADDRESS="recipient@domain.com" FROM_ADDRESS="sender" SUBJECT="Mail Server Hosting Demo" BODY="This is a linux mail system. Linux is one of the email operating systems which can be used to send and receive emails." echo ${BODY}| mail -s ${SUBJECT} ${TO_ADDRESS} -- -r ${FROM_ADDRESS}
We will enhance the above script to attach files, to read body from a file and specifying a list of users in CC. The enhanced mail script is shown below:
#!/bin/bash TO_ADDRESS="recipient@domain.com" FROM_ADDRESS="sender" SUBJECT="linux mail send attachment example" BODY_FILE="script.dat" ATTACHMENT_FILE="logfile.txt" CC_LIST="user1@gmail.com;user2@yahoomail.com;user3@earthlink.com;user4@cheetahmail.com" uuencode ${ATTACHMENT_FILE} | mail -s ${SUBJECT} -c ${CC_LIST} ${TO_ADDRESS} -- -r ${FROM_ADDRESS} < ${BODY_FILE}
Tar Command
The syntax of tar command is
The options of tar command are:
tar [options] [Archive file] [files list]
The options of tar command are:
c : creates a tar file. v : verbose. Displays the files information. f : Specify the tar file name. r : updates the tar file with new files. x : Extracts files from the archive (tar file). t : view contents of tar file. z : Specify the tar command to create a tar file using gzip in unix. j : uses bzip2 to create the tar file.
1. Creating a tar file
Let see a sample example by archiving all the files in my current directory. The ls -l command displays the files and directories in the current directory.
We see how to tar all these files using the -c option with the tar command. This is shown below:
Let see a sample example by archiving all the files in my current directory. The ls -l command displays the files and directories in the current directory.
> ls -l drwxr-xr-x 2 user group 4096 Aug 8 03:23 debian -rw-r--r-- 1 user group 174 Aug 2 23:39 file -rw-r--r-- 1 user group 0 Aug 8 03:22 linux_server.bat -rw-r--r-- 1 user group 76 Aug 2 02:21 test.sh -rw-r--r-- 1 user group 0 Aug 8 03:22 unix_distro
We see how to tar all these files using the -c option with the tar command. This is shown below:
> tar -cvf archive.tar * debian/ file linux_server.bat test.sh unix_distro > ls archive.tar debian file linux_server.bat test.sh unix_distro
2. Printing the contents of tar file
We have created the tar file and we dont know whether it contains the actual files or not. To view the contents of the tar file use the -t option as
We have created the tar file and we dont know whether it contains the actual files or not. To view the contents of the tar file use the -t option as
> tar -tvf archive.tar drwxr-xr-x user/group 0 2012-08-08 03:23:07 debian/ -rw-r--r-- user/group 174 2012-08-02 23:39:51 file -rw-r--r-- user/group 0 2012-08-08 03:22:19 linux_server.bat -rw-r--r-- user/group 76 2012-08-02 02:21:32 test.sh -rw-r--r-- user/group 0 2012-08-08 03:22:09 unix_distro
3. Updating the tar file with new contents.
You can add new files to the existing archive (tar) file using the -r option.
You can add new files to the existing archive (tar) file using the -r option.
>touch red-hat-linux.dat >tar -rvf archive.tar red-hat-linux.dat red-hat-linux.dat >tar -tvf archive.tar drwxr-xr-x pcenter/pcenter 0 2012-08-08 03:23:07 debian/ -rw-r--r-- pcenter/pcenter 174 2012-08-02 23:39:51 file -rw-r--r-- pcenter/pcenter 0 2012-08-08 03:22:19 linux_server.bat -rw-r--r-- pcenter/pcenter 76 2012-08-02 02:21:32 test.sh -rw-r--r-- pcenter/pcenter 0 2012-08-08 03:22:09 unix_distro -rw-r--r-- pcenter/pcenter 0 2012-08-08 04:00:00 red-hat-linux.dat
4. Extracting the contents of tar file
In the first example, we have created the archive file. Now we will see how to extract the set of files from the archive. To extract the contents of the tar file use the -x option.
In the first example, we have created the archive file. Now we will see how to extract the set of files from the archive. To extract the contents of the tar file use the -x option.
> tar -xvf archive.tar debian/ file linux_server.bat test.sh unix_distro
5. Creating compressed tar file
So far we have created a uncompressed tar file in the above examples. We can create a compressed tar file using the gzip
So far we have created a uncompressed tar file in the above examples. We can create a compressed tar file using the gzip
Compressing files using gzip > tar -zcvf new_tar_file.tar.gz *
Find Command
How to find for a file using name?
find -name "sum.txt" ./bkp/sum.txt ./sum.txt
-i can be used for case insensitive file names.
How to remove files which contain the name "java".
This will delete all the files which have the word “java" in the file name in the current directory and sub-directories.
find -name "*java*" -exec rm -r {} \;
This will delete all the files which have the word “java" in the file name in the current directory and sub-directories.
How to find for a file in the current directory only?
find -maxdepth 1 -name "sum.txt"
Sed Command
Sed is a Stream Editor used for modifying the files in unix (or linux). Whenever you want to make changes to the file automatically, sed comes in handy to do this. Most people never learn its power; they just simply use sed to replace text. You can do many things apart from replacing text with sed.
Consider the below text file as an input.
1. Replacing or substituting string
Sed command is mostly used to replace the text in a file. The below simple sed command replaces the word "unix" with "linux" in the file.
Here the "s" specifies the substitution operation. The "/" are delimiters. The "unix" is the search pattern and the "linux" is the replacement string.
By default, the sed command replaces the first occurrence of the pattern in each line and it won't replace the second, third...occurrence in the line.
2. Replacing the nth occurrence of a pattern in a line.
Use the /1, /2 etc flags to replace the first, second occurrence of a pattern in a line. The below command replaces the second occurrence of the word "unix" with "linux" in a line.
3. Replacing all the occurrence of the pattern in a line.
The substitute flag /g (global replacement) specifies the sed command to replace all the occurrences of the string in the line.
4. Replacing from nth occurrence to all occurrences in a line.
Use the combination of /1, /2 etc and /g to replace all the patterns from the nth occurrence of a pattern in a line. The following sed command replaces the third, fourth, fifth... "unix" word with "linux" word in a line.
Replacing string on a specific line number.
You can restrict the sed command to replace the string on a specific line number. An example is
courtesy:folkstalk
Consider the below text file as an input.
>cat file.txt unix is great os. unix is opensource. unix is free os. learn operating system. unixlinux which one you choose.
Sed Command Examples
1. Replacing or substituting string
Sed command is mostly used to replace the text in a file. The below simple sed command replaces the word "unix" with "linux" in the file.
>sed 's/unix/linux/' file.txt linux is great os. unix is opensource. unix is free os. learn operating system. linuxlinux which one you choose.
Here the "s" specifies the substitution operation. The "/" are delimiters. The "unix" is the search pattern and the "linux" is the replacement string.
By default, the sed command replaces the first occurrence of the pattern in each line and it won't replace the second, third...occurrence in the line.
2. Replacing the nth occurrence of a pattern in a line.
Use the /1, /2 etc flags to replace the first, second occurrence of a pattern in a line. The below command replaces the second occurrence of the word "unix" with "linux" in a line.
>sed 's/unix/linux/2' file.txt unix is great os. linux is opensource. unix is free os. learn operating system. unixlinux which one you choose.
3. Replacing all the occurrence of the pattern in a line.
The substitute flag /g (global replacement) specifies the sed command to replace all the occurrences of the string in the line.
>sed 's/unix/linux/g' file.txt linux is great os. linux is opensource. linux is free os. learn operating system. linuxlinux which one you choose.
4. Replacing from nth occurrence to all occurrences in a line.
Use the combination of /1, /2 etc and /g to replace all the patterns from the nth occurrence of a pattern in a line. The following sed command replaces the third, fourth, fifth... "unix" word with "linux" word in a line.
>sed 's/unix/linux/3g' file.txt unix is great os. unix is opensource. linux is free os. learn operating system. unixlinux which one you choose.
Replacing string on a specific line number.
You can restrict the sed command to replace the string on a specific line number. An example is
>sed '3 s/unix/linux/' file.txt unix is great os. unix is opensource. unix is free os. learn operating system. linuxlinux which one you choose.
Replacing string on a range of lines.
You can specify a range of line numbers to the sed command for replacing a string.
Here the sed command replaces the lines with range from 1 to 3.
You can specify a range of line numbers to the sed command for replacing a string.
>sed '1,3 s/unix/linux/' file.txt linux is great os. unix is opensource. unix is free os. learn operating system. linuxlinux which one you choose.
Here the sed command replaces the lines with range from 1 to 3.
Deleting lines.
You can delete the lines a file by specifying the line number or a range or numbers.
You can delete the lines a file by specifying the line number or a range or numbers.
>sed '2 d' file.txt >sed '5,$ d' file.txt
Target Update Override
One can perform updates in the target in mapping using 2 approaches:
- Using a UPDATE STRATEGY transformation
- Configuring Session and defining properties to "Treat all rows as UPDATE"
But both these approach works only when there is a Primary Key defined in the target table. The session will fail when there will not be any key defined in the target.
Alternatively, if you want to perform TARGET updates, with putting condition on COLUMNs not included as PRIMARY KEY in target table, it will not be feasible with out changing the Keys on target table. i.e. If PRIMARY KEY is defined on COLUMN_A in TABLE_A and you want to Update TABLE_A with conditions on or including COLUMN_B
In such cases the UPDATE_OVERRIDE property of TARGET Object can be useful.
- How to access UPDATE_OVERRIDE property?
- Go to Mapping
- Double Clink on the concerned TARGET or edit the Target
- Click on Properties Tab
- The second Transformation Attribute is the property we are looking for
- Syntax for UPDATE_OVERIDE SQL:
UDATE <TARGET TABLE>
SET <COLUMN NAME to be updated> = :TU.<TARGET COLUMN PORT NAME (As in Designer)>
SET <COLUMN NAME to be updated> = :TU.<TARGET COLUMN PORT NAME (As in Designer)>
, [Other columns need to be updated]
WHERE <COLUMN NAME to be treated as KEY > = :TU.<corresponding TARGET COLUMN PORT NAME (As in Designer)>
AND [other conditions]
WHERE <COLUMN NAME to be treated as KEY > = :TU.<corresponding TARGET COLUMN PORT NAME (As in Designer)>
AND [other conditions]
- Example:
UPDATE EMPL_POST_HIST
SET POST = :TU.POST
, UPDATE_DATE = :TU.UPDATE_DATE
WHERE EMPL = :TU.EMPL
Following is an example showing Problem statement and step by step Instructions, of two approaches, of suggested solution.
Problem Scenario:
- Target table has not defined Primary key
- Informatica target object has no Keys - Since the Database table has not Keys the Imported Object in Informatica will not have any Key in the Target definition
- Original Mapping : The mapping created using this very Target definition
- Defining update logic: configuring session for UPDATE Logic
- Configure session for Treat All Source Row as "UPDATE"
- Configure in the session target property to UPDATE as UPDATE - optional
- The error you will get while execution - the session will fail
Solution:
Approach 1: Using Update in the Target
- Go to properties tab of Target in Mapping:
- Edit the "Update Override" attribute - generate SQL and add where Clause with the column name on which you want to update:
- Please note that you need to remove the very column, used in WHERE clause from the SET Clause list
- Go to Workflow and refresh changes in the mapping and save them
- Execute the Workflow
- Now it has successfully updated the rows:
Approach 2: Forcing Keys in the Target object definition in designer
- Edit the Target in the target designer
- Define your Primary key column. Please note that in this case the Database need not have keys defined on the same table:
- Save the changes
- Execute the workflow
- The workflow will update the records in the target table:
PMCMD Command Usage in Informatica
Informatica provides four built-in command line programs or utilities to interact with the informatica features. They are:
1. Scheduling the workflow
The pmcmd command syntax for scheduling the workflow is shown below:
You cannot specify the scheduling options here. This command just schedules the workflow for the next run.
2. Start workflow
The following pmcmd command starts the specified workflow:
3. Stop workflow
Pmcmd command to stop the infromatica workflow is shown below:
4. Start workflow from a task
You can start the workflow from a specified task. This is shown below:
5. Stopping a task.
The following pmcmd command stops the specified task instance:
- infacmd
- infasetup
- pmcmd
- pmrep
- Start workflows.
- Start workflow from a specific task.
- Stop, Abort workflows and Sessions.
- Schedule the workflows.
1. Scheduling the workflow
The pmcmd command syntax for scheduling the workflow is shown below:
pmcmd scheduleworkflow -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name -w workflow-name
You cannot specify the scheduling options here. This command just schedules the workflow for the next run.
2. Start workflow
The following pmcmd command starts the specified workflow:
pmcmd startworkflow -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name -w workflow-name
3. Stop workflow
Pmcmd command to stop the infromatica workflow is shown below:
pmcmd stopworkflow -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name -w workflow-name
4. Start workflow from a task
You can start the workflow from a specified task. This is shown below:
pmcmd startask -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name -w workflow-name -startfrom task-name
5. Stopping a task.
The following pmcmd command stops the specified task instance:
pmcmd stoptask -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name -w workflow-name task-namecourtesy:folkstalk
6. Aborting workflow and task.
The following pmcmd commands are used to abort workflow and task in a workflow:
pmcmd abortworkflow -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name -w workflow-name pmcmd aborttask -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name -w workflow-name task-name
Subscribe to:
Posts (Atom)