The table below lists some of the common differences between the tJava, tJavaRow and tJavaFlex components.
|Feature||tJava||tJavaRow||tJavaFlex|
|Integrates your custom Java code||Yes||Yes||Yes|
|Executed first, but only once, in the subjob||Yes||No||–|
|Requires an input flow||No||Yes||No|
|Requires an output flow||No||If an output schema is defined||If an output schema is defined|
|Can be used as the start of the job||Yes||No||Yes|
|Can be used as a separate subjob||Yes||No||Yes|
|Accepts a Main flow or an Iterate flow||Both||Only Main||Both|
|Has three Java code parts (start, main, end)||No||No||Yes|
|Auto-propagates data||No||No||Yes|
tJava: this component can be used as a trigger component, at the start or at the end of the job.
tJavaRow: this component requires a main input flow, so it can be used at the end of a subjob but not at its start.
tJavaFlex: this component combines the capabilities of tJava and tJavaRow. You can use it for row generation, at the start of a job, at the end of a subjob, or as an individual subjob. It can also auto-propagate data.
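To make the three code parts of tJavaFlex more concrete, here is a minimal plain-Java sketch of how start, main and end code typically wrap around a row loop. The class and method names are hypothetical and exist only for illustration; the code Talend actually generates looks different.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: shows where start, main and end code would sit
// relative to a row loop. Names are hypothetical, not Talend API.
final class FlexPartsSketch {

    static List<String> generateRows(int count) {
        // --- Start code: runs once; declares state and opens the loop ---
        List<String> rows = new ArrayList<>();
        int processed = 0;
        for (int i = 1; i <= count; i++) {

            // --- Main code: runs once for every row ---
            rows.add("row-" + i);
            processed++;

        // --- End code: closes the loop, then runs once ---
        }
        rows.add("total=" + processed);
        return rows;
    }

    public static void main(String[] args) {
        generateRows(3).forEach(System.out::println);
    }
}
```

The point of the split is that state declared in the start part stays visible in the main and end parts, which is what lets one component both generate rows and summarise them.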
File processing is a day-to-day task in the ETL world, and source files often need validation of their format, headers, footers, column names, data types and so on. The tSchemaComplianceCheck component can do most of this validation, for example:
- Length checking.
- Date pattern/format.
- Data Types
However, it does not validate the number of columns or the column sequence, so we have to handle that with Java code. In this post I will describe how to validate column names and their sequence.
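Before building the job, the check itself can be sketched in plain Java. The helper below is a hypothetical illustration (it is not part of Talend): it compares a header line against an expected column list and reports the first position where the name, the order or the column count differs.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: validates column names, their order and their count.
final class HeaderCheck {

    /**
     * Returns -1 when the header matches the expected columns exactly,
     * otherwise the zero-based index of the first column that differs
     * (or of the first missing/extra column).
     */
    static int firstMismatch(String headerLine, List<String> expected, String delimiterRegex) {
        // The -1 limit keeps trailing empty columns instead of dropping them.
        String[] actual = headerLine.split(delimiterRegex, -1);
        int common = Math.min(actual.length, expected.size());
        for (int i = 0; i < common; i++) {
            if (!actual[i].trim().equals(expected.get(i))) {
                return i;
            }
        }
        return actual.length == expected.size() ? -1 : common;
    }

    public static void main(String[] args) {
        List<String> expected = Arrays.asList("id", "name", "city");
        System.out.println(firstMismatch("id;name;city", expected, ";"));
        System.out.println(firstMismatch("id;city;name", expected, ";"));
    }
}
```

The job described next takes a simpler route and compares the whole header line at once; a column-level check like this one is useful when you also want to report which column is wrong.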
This is our final job design.
Let's start by adding the first component to the job designer. Add tFileList and configure it to pick up the expected files.
Add a tFileInputFullRow component and configure it as shown in the screen below.
- Add tMap and connect it with a main link from the tFileInputFullRow component.
- Add tFixedFlowInput and connect it with a lookup link to tMap, then configure it as follows.
Note: if your reference header row is stored in a file or a database, you can use that instead of tFixedFlowInput.
- Configure tMap as follows.
- Make an inner join between your reference line and the main input line.
- Add two outputs and name them "matching" and "reject" respectively.
- In the settings of the reject output, set "catch lookup inner join reject" to true.
- Add source line to both the flows.
See image for more details.
- Add tJava next to tMap and connect with “matching” flow.
- Add another tJava next to tMap and connect with “reject” flow.
- Add tFileInputDelimited and connect it to the first tJava using an "iterate" link.
- Configure tFileInputDelimited as shown in the image below.
Add a tLogRow component to see the output from the file.
The whole subjob is executed for each file, and a file is read only if its first line matches the reference header row.
You can connect the reject row to log the rejected files, depending on your requirements.
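The routing that tMap performs here can be summarised in a few lines of plain Java. This is only a sketch of the logic, with a hypothetical class name and the two output names from the job used as return values; it reads the first line of a source and compares it with the reference header as a whole.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Sketch of the inner-join logic: first line equals reference -> "matching",
// anything else (including an empty source) -> "reject".
final class HeaderRouting {

    static String route(Reader source, String referenceHeader) {
        try (BufferedReader reader = new BufferedReader(source)) {
            String firstLine = reader.readLine(); // null when the source is empty
            return referenceHeader.equals(firstLine) ? "matching" : "reject";
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // In a real job the Reader would come from each file found by tFileList,
        // for example via java.nio.file.Files.newBufferedReader(path).
        System.out.println(route(new StringReader("id;name;city\n1;Ann;Pune\n"), "id;name;city"));
        System.out.println(route(new StringReader("id;city;name\n1;Pune;Ann\n"), "id;name;city"));
    }
}
```

Because the comparison is on the full line, a change in column order, a renamed column or an extra column all end up in the reject flow.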
Loop Start Date through End Date using tLoop
In this post I will describe how to loop from a start date through an end date. For that we will use the tLoop component, which gives us two loop options: a "for" loop and a "while" loop.
Write the code below in tJava.
java.util.Date start_date = TalendDate.parseDate("yyyy-MM-dd", "2015-01-01");
java.util.Date end_date = TalendDate.parseDate("yyyy-MM-dd", "2015-01-10");
long l = TalendDate.diffDate(end_date, start_date);
The code above parses two dates: the first is start_date and the second is end_date.
It then calculates the number of days between them with the TalendDate.diffDate() method, which returns the number of days as a long. That value is stored in the variable "l" and then assigned to the "context.days" context variable.
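For readers who want to try the arithmetic outside Talend, here is a minimal plain-Java sketch of the same day-difference calculation. The class and method names are hypothetical, and the dates are parsed in UTC (an assumption made here so that daylight-saving changes cannot affect the result).

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the parseDate/diffDate pair used in the job.
final class DayDiffSketch {

    static Date parseUtc(String pattern, String text) {
        SimpleDateFormat format = new SimpleDateFormat(pattern);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            return format.parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse: " + text, e);
        }
    }

    // Whole days from start to end (end minus start).
    static long daysBetween(Date end, Date start) {
        return TimeUnit.MILLISECONDS.toDays(end.getTime() - start.getTime());
    }

    public static void main(String[] args) {
        Date start = parseUtc("yyyy-MM-dd", "2015-01-01");
        Date end = parseUtc("yyyy-MM-dd", "2015-01-10");
        System.out.println(daysBetween(end, start));
    }
}
```

Note that the difference between 2015-01-01 and 2015-01-10 is 9 days, so a loop that should also print the end date has to run from 0 through 9 inclusive.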
Drop a tLoop component next to the tJava and link it with an "OnSubjobOk" trigger, then configure tLoop as follows.
Add a tJava component next to tLoop and link it with an "iterate" flow. tLoop has two global variables which can be used in the flow for calculation or manipulation.
Here are those variables.
We will use CURRENT_VALUE to get each day from the start date through the end date. To print each day on the console we will use the addDate method from the TalendDate routine. The code adds the current value from the flow to start_date, which moves the start date forward by that many days.
After the job runs you will see the output below on the console.
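The loop body can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the original job code: the for loop stands in for tLoop running from 0 through the day difference inclusive, the loop variable stands in for CURRENT_VALUE (which in a job would typically be read from globalMap, for example with a key like "tLoop_1_CURRENT_VALUE" if the component is named tLoop_1), and Calendar.add stands in for TalendDate.addDate(start_date, n, "dd").

```java
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Calendar;
import java.util.List;
import java.util.TimeZone;

// Hypothetical stand-in for the tLoop + tJava pair described above.
final class DateLoopSketch {

    static List<String> datesFrom(int year, int month, int day, int dayCount) {
        TimeZone utc = TimeZone.getTimeZone("UTC");
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd");
        format.setTimeZone(utc);

        List<String> dates = new ArrayList<>();
        // currentValue plays the role of tLoop's CURRENT_VALUE.
        for (int currentValue = 0; currentValue <= dayCount; currentValue++) {
            Calendar calendar = Calendar.getInstance(utc);
            calendar.clear();
            calendar.set(year, month - 1, day); // Calendar months are zero-based
            calendar.add(Calendar.DAY_OF_MONTH, currentValue);
            dates.add(format.format(calendar.getTime()));
        }
        return dates;
    }

    public static void main(String[] args) {
        datesFrom(2015, 1, 1, 9).forEach(System.out::println);
    }
}
```

Adding the offset to the original start date on every iteration, instead of mutating a running date, keeps each iteration independent of the previous one.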
In this post I will describe how to convert a string to a date in Talend. I will use various date strings to demonstrate.
- Converting a simple string with a consistent format: "MM/dd/yyyy HH:mm"
We will convert a date string such as "12/21/2000 0:00" using the "MM/dd/yyyy HH:mm" format. HH is the 24-hour field, which suits an hour value of 0; hh is the 12-hour field and runs from 1 to 12. For the conversion we will use the built-in function below from the TalendDate routine.
TalendDate.parseDate("MM/dd/yyyy HH:mm", "12/21/2000 0:00")
The function returns a Date object. If you print it, the output looks like this (the time zone shown depends on your machine):
Thu Dec 21 00:00:00 IST 2000
If you want this date in another format, use the function below.
TalendDate.formatDate("dd-MMM-yyyy", TalendDate.parseDate("MM/dd/yyyy HH:mm", "12/21/2000 0:00"))
TalendDate.formatDate(pattern, date) returns the date as a string: "21-Dec-2000".
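The parse-then-format round trip can be tried outside Talend with java.text.SimpleDateFormat, which uses the same pattern letters. The sketch below is a plain-Java stand-in with a hypothetical class name; Locale.ENGLISH is fixed here (an assumption) so that the month abbreviation does not depend on the machine's language settings.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical stand-in for TalendDate.parseDate followed by TalendDate.formatDate.
final class ParseFormatSketch {

    static String reformat(String inputPattern, String text, String outputPattern) {
        try {
            Date parsed = new SimpleDateFormat(inputPattern, Locale.ENGLISH).parse(text);
            return new SimpleDateFormat(outputPattern, Locale.ENGLISH).format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse: " + text, e);
        }
    }

    public static void main(String[] args) {
        // prints 21-Dec-2000
        System.out.println(reformat("MM/dd/yyyy HH:mm", "12/21/2000 0:00", "dd-MMM-yyyy"));
    }
}
```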
The same method can parse strings whose formatting is not consistent.
- Converting heterogeneously formatted strings to dates. Sample string dates are as follows.
We will write some Java code that replaces "/" with an empty string "" so that every input has the same shape, and then the parseDate function converts it using the given format.
TalendDate.parseDate("yyyyMMdd", InputString.replaceAll("/", ""))
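A minimal plain-Java sketch of this normalisation step follows. The sample inputs used in it ("2015/01/05", "2015/0105", "20150105") are hypothetical values chosen for illustration, and the class name is made up; strict (non-lenient) parsing is switched on here as an extra safeguard so that impossible dates are rejected instead of being rolled over.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical helper: strips slashes, then parses the remaining digits as yyyyMMdd.
final class SlashNormalizer {

    static Date parseCompact(String input) {
        String digitsOnly = input.replaceAll("/", "");
        SimpleDateFormat format = new SimpleDateFormat("yyyyMMdd");
        format.setLenient(false); // reject values such as month 13
        try {
            return format.parse(digitsOnly);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse: " + input, e);
        }
    }

    public static void main(String[] args) {
        // All three hypothetical inputs describe the same day.
        System.out.println(parseCompact("2015/01/05"));
        System.out.println(parseCompact("2015/0105"));
        System.out.println(parseCompact("20150105"));
    }
}
```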
Converting dates with a time stamp.
- Input String: "2014-11-14T10:41:34-08:00"
- Format: "yyyy-MM-dd'T'HH:mm:ssXXX"
- Input String: "2013-09-03T21:54:32.027+02:00"
- Format: "yyyy-MM-dd'T'HH:mm:ss.SSSXXX" (XXX reads the whole "+02:00" offset; a pattern ending in a literal ":00" would only work for whole-hour offsets)
- Input String: "Tue May 08 00:00:00 CEST 2012"
- Format: "EEE MMM dd HH:mm:ss zzz yyyy"
TalendDate.parseDateLocale("EEE MMM dd HH:mm:ss zzz yyyy", "Tue May 08 00:00:00 CEST 2012", "EN")
- Input String: "30 Aug 2011 07:06:00"
- Format: "dd MMM yyyy HH:mm:ss"
TalendDate.parseDateLocale("dd MMM yyyy HH:mm:ss", "30 Aug 2011 07:06:00", "EN")
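The offset and month-name formats above can be checked with a short plain-Java sketch built on java.text.SimpleDateFormat. The class and method names are hypothetical. The first method converts a string carrying a UTC offset to UTC, which makes the effect of the XXX field visible; the second parses an English month name with an explicit Locale.ENGLISH, which is assumed here to correspond to the "EN" argument of parseDateLocale.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helpers illustrating offset-aware and locale-aware parsing.
final class OffsetParseSketch {

    // Parses a string that carries its own offset and renders it in UTC.
    static String toUtc(String pattern, String text) {
        try {
            Date parsed = new SimpleDateFormat(pattern, Locale.ENGLISH).parse(text);
            SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS", Locale.ENGLISH);
            out.setTimeZone(TimeZone.getTimeZone("UTC"));
            return out.format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse: " + text, e);
        }
    }

    // Parses English month/day names regardless of the machine's default locale.
    static String reformatEnglish(String inputPattern, String text, String outputPattern) {
        try {
            Date parsed = new SimpleDateFormat(inputPattern, Locale.ENGLISH).parse(text);
            return new SimpleDateFormat(outputPattern, Locale.ENGLISH).format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse: " + text, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toUtc("yyyy-MM-dd'T'HH:mm:ssXXX", "2014-11-14T10:41:34-08:00"));
        System.out.println(toUtc("yyyy-MM-dd'T'HH:mm:ss.SSSXXX", "2013-09-03T21:54:32.027+02:00"));
        System.out.println(reformatEnglish("dd MMM yyyy HH:mm:ss", "30 Aug 2011 07:06:00", "yyyy-MM-dd HH:mm:ss"));
    }
}
```

Without an explicit locale, a pattern containing MMM or EEE is matched against month and day names in the machine's default language, so "Aug" may fail to parse on a non-English system.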
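- Input String: "24/02/2015 23:15:37.250000000"
- Format: "dd/MM/yyyy HH:mm:ss.SSS", applied to the first 23 characters of the input. The S field counts milliseconds, so nine fractional digits parsed as they are would be read as 250000000 milliseconds rather than 0.25 seconds.
System.out.println(TalendDate.parseDate("dd/MM/yyyy HH:mm:ss.SSS", "24/02/2015 23:15:37.250000000".substring(0, 23)));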
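One caveat with long fractions is worth a sketch: in java.text.SimpleDateFormat patterns the S field means milliseconds, so digits beyond the third are not treated as smaller units. A cautious approach, shown below in plain Java with hypothetical names, is to cut the fraction down to three digits before parsing. The sketch assumes the input carries at least three fractional digits.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical helper: trims a nanosecond-style fraction to milliseconds.
final class FractionSketch {

    // Keeps at most three digits after the last dot; shorter fractions are returned unchanged.
    static String truncateToMillis(String text) {
        int dot = text.lastIndexOf('.');
        if (dot < 0 || text.length() - dot - 1 <= 3) {
            return text;
        }
        return text.substring(0, dot + 4);
    }

    static Date parse(String text) {
        SimpleDateFormat format = new SimpleDateFormat("dd/MM/yyyy HH:mm:ss.SSS");
        format.setLenient(false); // fail loudly instead of rolling values over
        try {
            return format.parse(truncateToMillis(text));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse: " + text, e);
        }
    }

    public static void main(String[] args) {
        Date parsed = parse("24/02/2015 23:15:37.250000000");
        System.out.println(new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS").format(parsed));
    }
}
```

If the digits beyond milliseconds matter, java.util.Date cannot hold them anyway; in that case a type with nanosecond precision such as java.time.LocalDateTime is the better fit.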
Parsing a date-time string with an AM/PM marker
- Input String: "12/20/2012 10:02 PM"
- Format String: "MM/dd/yyyy hh:mm a" (hh is the 12-hour field that pairs with the AM/PM marker a; HH is the 24-hour field)
System.out.println(TalendDate.parseDate("MM/dd/yyyy hh:mm a", "12/20/2012 10:02 PM"));
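To see why the 12-hour field matters with an AM/PM marker, here is a small plain-Java sketch (hypothetical class name, java.text.SimpleDateFormat standing in for TalendDate) that parses with "hh:mm a" and prints the result on a 24-hour clock.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical helper: converts a 12-hour AM/PM string to a 24-hour representation.
final class AmPmSketch {

    static String to24Hour(String text) {
        try {
            // hh = hour in AM/PM (1-12), a = AM/PM marker
            Date parsed = new SimpleDateFormat("MM/dd/yyyy hh:mm a", Locale.ENGLISH).parse(text);
            // HH = hour of day (0-23)
            return new SimpleDateFormat("yyyy-MM-dd HH:mm", Locale.ENGLISH).format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse: " + text, e);
        }
    }

    public static void main(String[] args) {
        // prints 2012-12-20 22:02
        System.out.println(to24Hour("12/20/2012 10:02 PM"));
    }
}
```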
If you have a format that is not listed here, please send it to us and we will add it to the list.
Keep visiting this page for newer formats.