January 2015 – Umesh Rakhe

Convert String To Date

Posted on January 16, 2015 by Umesh

In this post I will describe how to convert a string to a date in Talend, using a variety of date strings as examples.

Converting a simple string with a consistent format: "MM/dd/yyyy HH:mm"

12/21/2000 0:00

We will convert the date above using the "MM/dd/yyyy HH:mm" pattern (HH is the 24-hour field, which matches an hour of 0). For that we will use the built-in function below from the TalendDate routine.

TalendDate.parseDate("MM/dd/yyyy HH:mm", "12/21/2000 0:00")

The function returns a Date object. If you print it, the output is:

Thu Dec 21 00:00:00 IST 2000

If you want this date in another format, use the function below.

TalendDate.formatDate("dd-MMM-yyyy", TalendDate.parseDate("MM/dd/yyyy HH:mm", "12/21/2000 0:00"))

TalendDate.formatDate(pattern, date) returns the date as a string: "21-Dec-2000".
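For readers who want to try the parse-then-format step outside Talend, here is a minimal plain-Java sketch. It assumes that TalendDate.parseDate and TalendDate.formatDate behave like java.text.SimpleDateFormat with the same patterns; the class name ParseThenFormatSketch and the helper reformat are made up for this illustration.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

class ParseThenFormatSketch {
    // Hypothetical helper: parse text with inPattern, then render it with outPattern.
    static String reformat(String inPattern, String outPattern, String text) {
        try {
            // Locale.ENGLISH keeps month abbreviations such as "Dec" independent of the machine locale.
            Date parsed = new SimpleDateFormat(inPattern, Locale.ENGLISH).parse(text);
            return new SimpleDateFormat(outPattern, Locale.ENGLISH).format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse: " + text, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(reformat("MM/dd/yyyy HH:mm", "dd-MMM-yyyy", "12/21/2000 0:00"));
    }
}
```

Parsing and formatting both use the default time zone here, so the calendar date does not shift between the two steps.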

We can parse the strings below, whose month and day fields vary in width, using the same method.

12/21/2000 0:00
5/11/2007 0:00
5/1/2009 0:00
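The following sketch shows why one pattern is enough for these values: when numeric fields are separated by literals such as "/", java.text.SimpleDateFormat accepts one-digit as well as two-digit months and days. It assumes TalendDate.parseDate parses the same way; VariableWidthSketch and normalize are names invented for the example.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class VariableWidthSketch {
    // Parse every input with the same pattern and return the dates as yyyy-MM-dd strings.
    static List<String> normalize(List<String> inputs) {
        SimpleDateFormat in = new SimpleDateFormat("MM/dd/yyyy HH:mm", Locale.ENGLISH);
        SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd", Locale.ENGLISH);
        List<String> result = new ArrayList<>();
        for (String text : inputs) {
            try {
                result.add(out.format(in.parse(text)));
            } catch (ParseException e) {
                throw new IllegalArgumentException("Cannot parse: " + text, e);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> inputs = new ArrayList<>();
        inputs.add("12/21/2000 0:00");
        inputs.add("5/11/2007 0:00");
        inputs.add("5/1/2009 0:00");
        System.out.println(normalize(inputs));
    }
}
```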

Converting heterogeneously formatted strings to dates. Sample date strings are as follows.

2014/12/21
20140214
2014/12/13
2014/12/23
20141201

We will write some Java code to remove the "/" characters. The code below replaces "/" with the empty string "", and the parse function then converts the result using the given format.

TalendDate.parseDate("yyyyMMdd", InputString.replaceAll("/", ""))
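The same normalise-then-parse idea can be sketched in plain Java as follows. This assumes SimpleDateFormat as a stand-in for TalendDate.parseDate; MixedFormatSketch and toIsoDate are hypothetical names.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;

class MixedFormatSketch {
    // Strip the "/" separators so both "2014/12/21" and "20141221" match yyyyMMdd.
    static String toIsoDate(String input) {
        String digitsOnly = input.replace("/", "");
        SimpleDateFormat in = new SimpleDateFormat("yyyyMMdd", Locale.ENGLISH);
        in.setLenient(false); // reject impossible dates instead of rolling them over
        try {
            return new SimpleDateFormat("yyyy-MM-dd", Locale.ENGLISH).format(in.parse(digitsOnly));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse: " + input, e);
        }
    }

    public static void main(String[] args) {
        for (String s : new String[] {"2014/12/21", "20140214", "2014/12/13", "2014/12/23", "20141201"}) {
            System.out.println(s + " -> " + toIsoDate(s));
        }
    }
}
```

String.replace is used instead of replaceAll because "/" is a literal here and no regular expression is needed; both give the same result for this input.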

Converting dates with timestamps.

Input String: "2014-11-14T10:41:34-08:00"
Format: "yyyy-MM-dd'T'HH:mm:ssXXX"

TalendDate.parseDate("yyyy-MM-dd'T'HH:mm:ssXXX", "2014-11-14T10:41:34-08:00")

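Input String: "2013-09-03T21:54:32.027+02:00"
Format: "yyyy-MM-dd'T'HH:mm:ss.SSSXXX"

Use XXX for the offset rather than X followed by a literal ":00"; the literal form only matches offsets whose minutes are 00, so it would fail on a value such as +05:30.

TalendDate.parseDate("yyyy-MM-dd'T'HH:mm:ss.SSSXXX", "2013-09-03T21:54:32.027+02:00")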
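To check what instant an offset timestamp really represents, the sketch below parses both sample strings and prints them in UTC. It assumes SimpleDateFormat semantics for the XXX pattern letter (ISO 8601 offset such as -08:00); OffsetSketch and toUtc are illustrative names only.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class OffsetSketch {
    // Parse an ISO 8601 style timestamp with an offset and render the same instant in UTC.
    static String toUtc(String pattern, String text) {
        try {
            Date instant = new SimpleDateFormat(pattern, Locale.ENGLISH).parse(text);
            SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS", Locale.ENGLISH);
            out.setTimeZone(TimeZone.getTimeZone("UTC"));
            return out.format(instant);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse: " + text, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toUtc("yyyy-MM-dd'T'HH:mm:ssXXX", "2014-11-14T10:41:34-08:00"));
        System.out.println(toUtc("yyyy-MM-dd'T'HH:mm:ss.SSSXXX", "2013-09-03T21:54:32.027+02:00"));
    }
}
```

Because the offset is part of the input, the parsed instant does not depend on the time zone of the machine running the job.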

Input String: "Tue May 08 00:00:00 CEST 2012"
Format: "EEE MMM dd HH:mm:ss zzz yyyy"

TalendDate.parseDateLocale("EEE MMM dd HH:mm:ss zzz yyyy", "Tue May 08 00:00:00 CEST 2012", "EN")

Input String: "30 Aug 2011 07:06:00"
Format: "dd MMM yyyy HH:mm:ss"

TalendDate.parseDateLocale("dd MMM yyyy HH:mm:ss", "30 Aug 2011 07:06:00", "EN")
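The locale argument matters because month and day names are locale-specific text. Here is a small plain-Java sketch of the same idea, assuming that passing "EN" to parseDateLocale corresponds to building a SimpleDateFormat with Locale.ENGLISH; LocaleParseSketch and parseEnglish are hypothetical names.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

class LocaleParseSketch {
    // Parse text containing English month names, whatever the default locale of the JVM is.
    static String parseEnglish(String pattern, String text) {
        try {
            Date parsed = new SimpleDateFormat(pattern, Locale.ENGLISH).parse(text);
            return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ENGLISH).format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse: " + text, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseEnglish("dd MMM yyyy HH:mm:ss", "30 Aug 2011 07:06:00"));
    }
}
```

Without an explicit locale, a JVM whose default locale is not English may not recognise "Aug" and the parse can fail.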

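Input String: "24/02/2015 23:15:37.250000000"
Format: "dd/MM/yyyy HH:mm:ss.SSS"

The pattern letter S means milliseconds, so a nine-digit fraction parsed with SimpleDateFormat-style rules is read as 250000000 milliseconds and pushes the result forward by almost three days. Keep only the first three fractional digits (the first 23 characters of the string) before parsing:

System.out.println(TalendDate.parseDate("dd/MM/yyyy HH:mm:ss.SSS", "24/02/2015 23:15:37.250000000".substring(0, 23)));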
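The effect of a long fractional part can be checked with the sketch below, which compares parsing the full nine-digit fraction against parsing a value trimmed to milliseconds. It assumes lenient java.text.SimpleDateFormat behaviour as a stand-in for TalendDate.parseDate; FractionTrimSketch and its methods are names made up for the example, and UTC is used only to keep the output independent of the local time zone.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

class FractionTrimSketch {
    private static SimpleDateFormat utc(String pattern) {
        SimpleDateFormat f = new SimpleDateFormat(pattern, Locale.ENGLISH);
        f.setTimeZone(TimeZone.getTimeZone("UTC"));
        return f;
    }

    // Parse the text exactly as given with the supplied pattern.
    static String parseAsIs(String pattern, String text) {
        try {
            return utc("yyyy-MM-dd HH:mm:ss.SSS").format(utc(pattern).parse(text));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse: " + text, e);
        }
    }

    // Keep "dd/MM/yyyy HH:mm:ss.SSS" (23 characters) and drop the remaining fraction digits.
    static String parseTrimmed(String text) {
        String trimmed = text.length() > 23 ? text.substring(0, 23) : text;
        return parseAsIs("dd/MM/yyyy HH:mm:ss.SSS", trimmed);
    }

    public static void main(String[] args) {
        String input = "24/02/2015 23:15:37.250000000";
        System.out.println("untrimmed: " + parseAsIs("dd/MM/yyyy HH:mm:ss.SSSS", input));
        System.out.println("trimmed:   " + parseTrimmed(input));
    }
}
```

Trimming discards sub-millisecond precision, which java.util.Date cannot hold anyway.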

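Parsing a date-time string with an AM/PM marker

Input String: "12/20/2012 10:02 PM"
Format String: "MM/dd/yyyy hh:mm a"

Use hh (the 12-hour field) together with the a marker; with HH (the 24-hour field) the hour is taken as is and the PM marker does not shift it.

System.out.println(TalendDate.parseDate("MM/dd/yyyy hh:mm a", "12/20/2012 10:02 PM"));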
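A quick way to confirm that the AM/PM marker is honoured is to print the parsed value on a 24-hour clock, as in this sketch. It assumes SimpleDateFormat semantics for the hh and a pattern letters; AmPmSketch and to24Hour are illustrative names.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

class AmPmSketch {
    // Parse with the given pattern and show the result on a 24-hour clock.
    static String to24Hour(String pattern, String text) {
        try {
            Date parsed = new SimpleDateFormat(pattern, Locale.ENGLISH).parse(text);
            return new SimpleDateFormat("yyyy-MM-dd HH:mm", Locale.ENGLISH).format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse: " + text, e);
        }
    }

    public static void main(String[] args) {
        // hh + a: the PM marker moves 10:02 to the evening.
        System.out.println(to24Hour("MM/dd/yyyy hh:mm a", "12/20/2012 10:02 PM"));
        // HH + a: shown for comparison; compare the hour with the line above.
        System.out.println(to24Hour("MM/dd/yyyy HH:mm a", "12/20/2012 10:02 PM"));
    }
}
```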

If you have a format that is not listed here, please send it to us and we will include it in the list.

Keep visiting this page for newer formats.

Read XML with Optional Elements

Posted on January 9, 2015 by Umesh

In this post I will describe how to parse XML with an optional element.

We will use the source XML file below, which has details for three customers along with their awards; <CUSTOMERAWARDS> is an optional XML element.

We will parse this file using the tXMLMap component. First of all, add a tFileInputXML component and configure it as below.

Assign the source file path.
Create a single column in the schema, named CUSTOMERS, with the "Document" data type.
Set the Loop XPath query to "/CUSTOMERS".
In the Mapping section, set the XPath query to ".".
Select the Get Nodes check box.

Add a tXMLMap component, connect it to the tFileInputXML component using a Main link, and create the source tree structure as shown in the image.

Note: You can create the sub-elements manually, or populate them from an XSD file or from the repository.

Add two outputs and drag and drop the relevant source columns to each output (refer to the image).

Click the first output's "set loop function" shortcut menu, add one sequence, and select the customerid XPath as its xpath; see the image for more details.

The first output is now ready. Configure the second output by following the same steps and selecting the customerawards XPath; see the image for more details.

Add a tLogRow for each output and execute the job; you will see output like that shown below. Notice that no awards are extracted for customer ID 1236, while the awards for customer IDs 1234 and 1235 are extracted completely.
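Outside Talend, the handling of an optional element can be sketched with the JDK's DOM parser. The sample XML from the post is not reproduced here, so the document below is hypothetical: only CUSTOMERS, CUSTOMERAWARDS, and the customer IDs 1234 to 1236 come from the post, while CUSTOMER, CUSTOMERID, AWARD, and the award names are assumptions made for the illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class OptionalElementSketch {
    // Hypothetical sample: customer 1236 has no CUSTOMERAWARDS element at all.
    static final String SAMPLE =
        "<CUSTOMERS>"
        + "<CUSTOMER><CUSTOMERID>1234</CUSTOMERID>"
        + "<CUSTOMERAWARDS><AWARD>Gold</AWARD><AWARD>Silver</AWARD></CUSTOMERAWARDS></CUSTOMER>"
        + "<CUSTOMER><CUSTOMERID>1235</CUSTOMERID>"
        + "<CUSTOMERAWARDS><AWARD>Bronze</AWARD></CUSTOMERAWARDS></CUSTOMER>"
        + "<CUSTOMER><CUSTOMERID>1236</CUSTOMERID></CUSTOMER>"
        + "</CUSTOMERS>";

    // Map each customer ID to its awards; a missing optional element yields an empty list.
    static Map<String, List<String>> awardsByCustomer(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, List<String>> result = new LinkedHashMap<>();
            NodeList customers = doc.getElementsByTagName("CUSTOMER");
            for (int i = 0; i < customers.getLength(); i++) {
                Element customer = (Element) customers.item(i);
                String id = customer.getElementsByTagName("CUSTOMERID").item(0).getTextContent().trim();
                List<String> awards = new ArrayList<>();
                NodeList awardNodes = customer.getElementsByTagName("AWARD");
                for (int j = 0; j < awardNodes.getLength(); j++) {
                    awards.add(awardNodes.item(j).getTextContent().trim());
                }
                result.put(id, awards);
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot read XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(awardsByCustomer(SAMPLE));
    }
}
```

The two result shapes mirror the two tXMLMap outputs: one entry per customer, and zero or more award rows per customer.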

Difference between tMap and tJoin

Posted on January 9, 2015 by Umesh

tMap is a frequently used component for joins and lookups, and it is also used for a variety of other operations and transformations, whereas tJoin is used for joins and lookups only.

tMap	tJoin
Accepts more than one input: one is the main flow and the rest are lookups.	Accepts only two inputs: one main and one lookup.
Can have more than one output.	Has two default outputs: "Main" and "Inner join reject".
Supports the "Inner Join" and "Left Outer Join" join models.	Offers only an inner join.
Offers three match models: Unique Match, First Match, and All Matches.	Defaults to Unique Match.
Can store lookup data on disk while processing.	Doesn't offer this feature.
Can filter data using a filter expression.	Doesn't offer this feature.
Lets you write a transformation for each column using the expression builder.	Doesn't offer this feature.
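As a reminder of what the two join models in the table mean, here is a plain-Java sketch (not Talend code) of an inner join with a reject list and a left outer join over a main flow and a lookup. The data and the names JoinSketch, innerJoin, and leftOuterJoin are invented for the illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class JoinSketch {
    // Inner join: unmatched main rows go to the rejects list instead of the output.
    static List<String> innerJoin(List<String[]> main, Map<String, String> lookup, List<String> rejects) {
        List<String> out = new ArrayList<>();
        for (String[] row : main) {
            String match = lookup.get(row[0]);
            if (match != null) {
                out.add(row[0] + "|" + row[1] + "|" + match);
            } else {
                rejects.add(row[0] + "|" + row[1]);
            }
        }
        return out;
    }

    // Left outer join: every main row is kept; the lookup column is empty when nothing matches.
    static List<String> leftOuterJoin(List<String[]> main, Map<String, String> lookup) {
        List<String> out = new ArrayList<>();
        for (String[] row : main) {
            String match = lookup.get(row[0]);
            out.add(row[0] + "|" + row[1] + "|" + (match == null ? "" : match));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> main = new ArrayList<>();
        main.add(new String[] {"1", "Asha"});
        main.add(new String[] {"2", "Ravi"});
        Map<String, String> lookup = new LinkedHashMap<>();
        lookup.put("1", "Pune");

        List<String> rejects = new ArrayList<>();
        System.out.println("inner:   " + innerJoin(main, lookup, rejects));
        System.out.println("rejects: " + rejects);
        System.out.println("outer:   " + leftOuterJoin(main, lookup));
    }
}
```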

Split Rows to Columns

Posted on January 9, 2015 by Umesh

In this post I will describe how to split rows into columns. We will use the sample below as input records.

Input Rows.

Expected Output.

Create a job, add a tFixedFlowInput component, enter the input above using the "Use inline content" option, and create the schema as shown in the image.

Add a tPivotToColumnsDelimited component, connect it to the tFixedFlowInput component with a Main connection, and configure it as shown in the image below.

Configurations :

Pivot Column =”Type”

Aggregation column=”Value”

Aggregation Function =”last”

Group by “ID” and “Name” column.
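To make the configuration above concrete, the sketch below reproduces the same pivot logic in plain Java: group by ID and Name, turn each distinct Type into a column, and keep the last Value seen for each cell. The sample rows and the names PivotSketch and pivot are hypothetical, since the input from the post is not reproduced here, and the ";" delimiter is an arbitrary choice.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class PivotSketch {
    // Each row is {ID, Name, Type, Value}; returns delimited lines with a header row.
    static List<String> pivot(List<String[]> rows) {
        Map<String, Map<String, String>> groups = new LinkedHashMap<>();
        Set<String> pivotValues = new LinkedHashSet<>(); // distinct Type values, in first-seen order
        for (String[] r : rows) {
            String groupKey = r[0] + ";" + r[1];
            pivotValues.add(r[2]);
            // put() overwrites earlier values, which gives the "last" aggregation.
            groups.computeIfAbsent(groupKey, k -> new HashMap<>()).put(r[2], r[3]);
        }
        List<String> lines = new ArrayList<>();
        lines.add("ID;Name;" + String.join(";", pivotValues));
        for (Map.Entry<String, Map<String, String>> group : groups.entrySet()) {
            StringBuilder line = new StringBuilder(group.getKey());
            for (String type : pivotValues) {
                line.append(';').append(group.getValue().getOrDefault(type, ""));
            }
            lines.add(line.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"1", "Asha", "Phone", "111"});
        rows.add(new String[] {"1", "Asha", "Email", "a@example.com"});
        rows.add(new String[] {"2", "Ravi", "Phone", "222"});
        rows.add(new String[] {"1", "Asha", "Phone", "333"});
        pivot(rows).forEach(System.out::println);
    }
}
```

The number of output columns depends on how many distinct Type values appear in the data, which is why the output schema is not known in advance.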

The rest of the configuration is for the output file, where our output will be written. To read the output file we could use a delimited-file input component, but for a quick review I'll use tFileInputFullRow.

Add a tFileInputFullRow component below the tFixedFlowInput component and connect them with an "On Subjob Ok" trigger. Provide the path of the previously created file and the rest of the details.

Add a tLogRow, connect it to the tFileInputFullRow component, and execute the job; you will get the expected output above on the console.

Final Job Design.

This component creates as many columns as there are distinct pivot values in your input, so if the downstream steps expect a fixed schema, it will complicate further processing.

Split large XML into multiple XML

Posted on January 2, 2015 by Umesh

In this post, I will describe how to split a large XML file into several XML files.

Here is our sample XML file (which is not huge, just a sample).

We expect three XML files from the sample XML, so let's start by creating metadata for the sample file.

Once you have created the metadata, you can drag and drop the schema onto the job designer. For this scenario we will choose the tFileInputXML component.

Now add a tXMLMap component, link tFileInputXML to tAdvancedFileOutputXML, and then configure tAdvancedFileOutputXML as shown in the image.

Now we have mapped our source columns to the output columns, but the job will write all the rows to a single file. To create a file for each row, configure the tAdvancedFileOutputXML component in the Advanced settings of its Component tab: use the "Split output in several files" option with a value of "1". With this setting a new file is created for each row.
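The effect of the split setting can be pictured with a short plain-Java sketch that groups rows into chunks of a fixed size, one chunk per output file. This is only an illustration of the idea, not how the component is implemented; SplitSketch, split, and the row values are made-up names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SplitSketch {
    // Group rows into consecutive chunks; each chunk stands for the content of one output file.
    static List<List<String>> split(List<String> rows, int rowsPerFile) {
        if (rowsPerFile < 1) {
            throw new IllegalArgumentException("rowsPerFile must be at least 1");
        }
        List<List<String>> files = new ArrayList<>();
        for (int start = 0; start < rows.size(); start += rowsPerFile) {
            int end = Math.min(start + rowsPerFile, rows.size());
            files.add(new ArrayList<>(rows.subList(start, end)));
        }
        return files;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("row1", "row2", "row3");
        // With a value of 1, three rows produce three chunks.
        System.out.println(split(rows, 1));
    }
}
```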