Monthly Archives: September 2013
A JobScript is a simple text file with the .jobscript extension. The Talend API uses this file to generate a Talend Job. In a JobScript you can define components, schemas, transformations, connections between components, and everything else that can be done in the Talend Job designer.
Note: The JobScript feature is not available in Talend Open Studio.
A JobScript looks like plain text with a JSON-like structure. If you are familiar with JSON, JobScript is easy to understand. The Talend Help Center has a good explanation of JobScript; read it once so that you have a good understanding of JobScript and the terminology we will use.
We will write a JobScript that creates a job for loading CSV data into SQL Server with transformations.
Our job will have the following components.
In the screen below you can see a sample JobScript. It follows a fixed hierarchy: it starts with the basic settings of the job, then the job parameters and the components, and ends with the component connections. I have marked these as numbered blocks; there are 3 blocks in total, which I am going to explain in detail.
Talend is a great code generator with 200+ connectors, which gives you the ability to transform data and move it from one system to another. In my experience Talend is a good fit for mid-size organisations that have to process a few MB of data, not GBs or TBs, because it lacks parallel processing, a generic schema load model, and batch processing features. There are some components and features that Talend claims will give you parallel and batch processing, but they fail at a certain level. Anyway, we are not going to discuss Talend itself. Instead we will discuss how to automate Talend Job creation, rather than creating hundreds of jobs by hand for hundreds of metadata definitions.
I am an ETL developer, and I was assigned the task of creating one such job: a generic data loader whose metadata is stored in SQL database tables. The job uses these tables to create the schema, apply transformations, and then load the data into SQL; in short, a dynamic schema using Talend.
I thought it was a great idea, and since Talend has a Dynamic Schema feature, I expected it could be done in a few days. But when I started working on it, it became a nightmare, so I finally dropped the idea of Dynamic Schema for the following reasons.
- The Reject connector does not work.
- You cannot apply custom transformations during the load.
- You have to apply transformations using SQL.
- Your file must have a header row.
- All fields are loaded with the string data type.
- You have to change the data types on the SQL side.
- There is no escape character support.
- The SQL table must exist before the load starts.
- Log management does not work.
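As a rough illustration of what a metadata-driven loader is expected to do, here is a minimal Python sketch (not Talend code). Column names, types, and transformations come from metadata instead of a header row, escape characters are honoured, and bad rows go to a reject list. The `COLUMNS` layout, the `load_rows` helper, and the sample data are all hypothetical; in the real job the metadata lives in SQL tables.

```python
import csv
import io
from datetime import datetime

# Hypothetical metadata; a real loader would read this from SQL tables.
# Each entry is (column name, converter applied to the raw string).
COLUMNS = [
    ("id", int),
    ("name", lambda s: s.strip().upper()),  # custom transformation during load
    ("joined", lambda s: datetime.strptime(s, "%Y-%m-%d").date()),
]


def load_rows(text, columns, escapechar="\\"):
    """Parse headerless CSV text into typed dicts; return (good, rejected)."""
    good, rejected = [], []
    reader = csv.reader(io.StringIO(text), escapechar=escapechar)
    for record_no, raw in enumerate(reader, start=1):
        try:
            if len(raw) != len(columns):
                raise ValueError(
                    "expected %d fields, got %d" % (len(columns), len(raw))
                )
            good.append(
                {name: conv(value) for (name, conv), value in zip(columns, raw)}
            )
        except ValueError as exc:
            # Keep the failing record and the reason, like a reject flow.
            rejected.append((record_no, raw, str(exc)))
    return good, rejected


# No header row; the second record has a bad id, the third an escaped comma.
sample = "1, alice ,2013-09-01\nx,bob,2013-09-02\n2,carol\\, jr,2013-09-03\n"
good, rejected = load_rows(sample, COLUMNS)
```

Here `good` holds typed records and `rejected` holds the record number, the raw fields, and the error message for each row that failed conversion.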
I communicated with Talend through the help center and Talend Forge, and after a long time I found a solution, which is not Dynamic Schema but dynamic job creation using JobScript. Yes, JobScript: it is a JSON-like structure with nodes and child nodes, properties with values, components and settings, connections, context, routines, and much more. Everything can be defined using JobScript.
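The idea of dynamic job creation can be sketched in a few lines of Python: loop over the metadata and render one nested text definition per feed. This is only an illustration of the approach under stated assumptions. The block names (`job`, `csv_input`, `sql_output`, `connection`), the property names, and the brace layout are made-up placeholders and are not real JobScript syntax; the actual grammar is described in the Talend Help Center.

```python
# Hypothetical metadata for two feeds; a real setup would read it from SQL tables.
FEEDS = {
    "customers": [("id", "Integer"), ("name", "String")],
    "orders": [("order_id", "Integer"), ("amount", "Double")],
}


def render(block, indent=0):
    """Render a (name, properties, children) block as indented nested text."""
    name, props, children = block
    pad = "  " * indent
    lines = [pad + name + " {"]
    for key, value in props.items():
        lines.append(f'{pad}  {key}: "{value}"')
    for child in children:
        lines.extend(render(child, indent + 1))
    lines.append(pad + "}")
    return lines


def job_definition(feed, columns):
    """Build one placeholder job definition from a feed's column metadata."""
    schema = [("column", {"name": n, "type": t}, []) for n, t in columns]
    job = (
        "job",
        {"name": "load_" + feed},
        [
            ("csv_input", {"file": feed + ".csv"}, schema),
            ("sql_output", {"table": feed}, []),
            ("connection", {"from": "csv_input", "to": "sql_output"}, []),
        ],
    )
    return "\n".join(render(job))


# One generated definition per metadata entry instead of one hand-built job each.
definitions = {feed: job_definition(feed, cols) for feed, cols in FEEDS.items()}
print(definitions["customers"])
```

Adding a new feed then means adding a metadata entry, not designing another job by hand.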
In the next post I will explain what JobScript is exactly, its basic structure, and the basic things needed to create a JobScript.