Blog Archives

Read XML with Nested Loops

In this post, I will describe how to parse XML with nested loops in Talend, using the XML below as the source.

The source XML contains a list of items from a retail store, with the <item> node repeated for each item. Inside each <item> node there are nested <batter> and <topping> nodes. Our task is to read all the items and send each nested loop to its own flow.

Image of source XML.

Sample XML file with Nested Loops

Normally you would first create metadata for your XML file; in this post, however, I use an XSD to populate the source schema in tXMLMap.

Create a new job, add a tFileInputXML component, and configure it as shown in the image.

Image of tFileInputXML component.

Set the loop XPath query to “/items”, since this is the root node of the XML file. Then create a single column with the “Document” data type, as shown in the image.

Set tFileInputXML for Nested loops

Add a tXMLMap component and connect it to the tFileInputXML component using a “Main” link.

Open tXMLMap, right-click the “items” node on the source (input) side, then select “Import from file”.

Provide the XSD file; tXMLMap will automatically create all the sub-nodes, as in the image below.

Set the batter and topping elements as loop elements: right-click each one and select “As loop element”.

Configure tXMLMap

Create two outputs, “batter” and “topping”, then drag the respective source columns to each output.

For each output, click the “set loop function” box. This opens a window where you add a new row and select the matching loop path: for the “topping” output select the “topping” loop, and likewise the “batter” loop for the “batter” output.
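To make the effect of the two loop paths concrete, here is a rough Python sketch of the same idea outside Talend: each output loops over a different nested element and emits one row per element. The sample XML, the attribute names, and the rows_for_loop helper are assumptions for illustration only; the real source file appears in this post only as an image.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; not the actual file used in the post.
SAMPLE = """
<items>
  <item id="0001" name="Cake">
    <batter id="1001">Regular</batter>
    <batter id="1002">Chocolate</batter>
    <topping id="5001">None</topping>
    <topping id="5002">Glazed</topping>
    <topping id="5005">Sugar</topping>
  </item>
  <item id="0002" name="Raised">
    <batter id="1001">Regular</batter>
    <topping id="5001">None</topping>
  </item>
</items>
"""


def rows_for_loop(root, loop_tag):
    """Emit one row per <loop_tag> element, carrying the parent item's id."""
    rows = []
    for item in root.findall("item"):
        for node in item.findall(loop_tag):
            rows.append((item.get("id"), node.get("id"), node.text))
    return rows


root = ET.fromstring(SAMPLE.strip())

# One flow per loop element, like the two tXMLMap outputs.
batter_rows = rows_for_loop(root, "batter")
topping_rows = rows_for_loop(root, "topping")

for row in batter_rows:
    print("batter ", row)
for row in topping_rows:
    print("topping", row)
```

The point of the sketch is that the row count of each flow follows its own loop element, which is what choosing a loop path per output achieves in tXMLMap.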

Your final settings should look like the image below.

Configuration tXMLMap output with loop path

We are now ready to get the result. Add a tLogRow component for each output and execute the job; it shows the results seen in the image.

Nested XML loop parse Output

 

Read XML with Optional Elements

In this post, I will describe how to parse XML with an optional element.

We will use the source XML file below, which holds details for three customers along with their awards; <CUSTOMERAWARDS> is an optional XML element.

Sample XML file

We will parse this file using the tXMLMap component. First of all, add a tFileInputXML component and configure it as follows.

  • Assign the source file path.
  • Create a single column named CUSTOMERS with the “Document” data type in the schema.
  • Set the loop XPath query to “/CUSTOMERS”.
  • In the Mapping section, set the XPath query to “.”.
  • Select the Get Nodes check box.
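As a rough analogy for what these settings do, the Python sketch below loops once on the root element and keeps the node itself (as XML) in a single column rather than extracting its text. This is only an illustration of the idea; the sample XML and the child element names are assumptions, not the post's actual file, and the code does not reproduce Talend's internals.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; element names below CUSTOMERS are assumptions.
SAMPLE = (
    "<CUSTOMERS>"
    "<CUSTOMER><CUSTOMERID>1234</CUSTOMERID></CUSTOMER>"
    "<CUSTOMER><CUSTOMERID>1235</CUSTOMERID></CUSTOMER>"
    "</CUSTOMERS>"
)

root = ET.fromstring(SAMPLE)

# Loop XPath "/CUSTOMERS" matches the root element once, so one row results.
loop_nodes = [root] if root.tag == "CUSTOMERS" else []

# Mapping XPath "." with Get Nodes: keep the whole node, serialized as XML,
# in the single "CUSTOMERS" column instead of reading its text content.
rows = [
    {"CUSTOMERS": ET.tostring(node, encoding="unicode")}
    for node in loop_nodes
]

print(len(rows), "row(s)")
print(rows[0]["CUSTOMERS"])
```

Passing the whole document downstream in one column is what lets tXMLMap do the actual looping later.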

Add a tXMLMap component, connect it to the tFileInputXML component using a Main link, and create the source tree structure as shown in the image.

Note: You can create the sub-elements manually, or populate them from an XSD file or from the repository.

Add two outputs, then drag and drop the relevant source columns to each output (refer to the image).

tXMLMap Configuration

Click the first output's “set loop function” button, add one sequence, and select the customerid XPath as its loop path; see the image for more details.

tXMLMap First Output

Our first output is ready. Now configure the second output by following the same steps, this time selecting the customerawards XPath; see the image for more details.

tXMLMap Second Output

Add a tLogRow for each output and execute the job; you will see output like the image below. Note that no awards are extracted for customer id 1236, while the awards for customer ids 1234 and 1235 are extracted completely.
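The same behaviour can be sketched in plain Python: one flow loops on customers, the other loops on the optional awards element, so a customer without awards simply contributes no rows to the second flow. The XML layout and the names CUSTOMER, CUSTOMERID, and AWARD are assumptions made for this sketch; only <CUSTOMERS>, <CUSTOMERAWARDS>, and the three customer ids come from the post.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; customer 1236 has no <CUSTOMERAWARDS> element.
SAMPLE = """
<CUSTOMERS>
  <CUSTOMER>
    <CUSTOMERID>1234</CUSTOMERID>
    <CUSTOMERAWARDS><AWARD>Gold</AWARD></CUSTOMERAWARDS>
    <CUSTOMERAWARDS><AWARD>Loyalty</AWARD></CUSTOMERAWARDS>
  </CUSTOMER>
  <CUSTOMER>
    <CUSTOMERID>1235</CUSTOMERID>
    <CUSTOMERAWARDS><AWARD>Silver</AWARD></CUSTOMERAWARDS>
  </CUSTOMER>
  <CUSTOMER>
    <CUSTOMERID>1236</CUSTOMERID>
  </CUSTOMER>
</CUSTOMERS>
"""

root = ET.fromstring(SAMPLE.strip())

customer_rows = []  # first output: loops on each customer
award_rows = []     # second output: loops on each optional awards element

for customer in root.findall("CUSTOMER"):
    customer_id = customer.findtext("CUSTOMERID")
    customer_rows.append(customer_id)
    # findall returns an empty list when the optional element is absent,
    # so no award rows are produced for that customer.
    for awards in customer.findall("CUSTOMERAWARDS"):
        award_rows.append((customer_id, awards.findtext("AWARD")))

print(customer_rows)
print(award_rows)
```

Because the awards flow has its own loop, a missing optional element does not drop the customer from the first flow.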

Output

Split large XML into multiple XML

In this post, I will describe how to split a large XML file into several XML files.

Here is our sample XML file (it is not huge, just a sample).

Split Xml Talend

Source XML

We expect three XML files from the sample XML, so let's start by creating metadata for this sample file.

Once you have created the metadata, you can drag and drop the schema onto the job designer. For this scenario we will choose the tFileInputXML component.

Now add a tAdvancedFileOutputXML component, link tFileInputXML to it, and configure it as shown in the image.

tAdvancedFileOutputXML Mapping

We have now mapped our source columns to the output columns, but this would write all the rows to a single file. To create a file for each row, configure the tAdvancedFileOutputXML component in the advanced settings of the Component tab: enable the “Split output in several files” option with a value of “1”. This creates a new file for each row.
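For comparison, splitting with a row limit of 1 can be sketched in Python as follows. This is a minimal illustration, not what Talend generates: the sample XML, the split_xml helper, and the out_N.xml file names are all assumptions made for the example.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Hypothetical sample data with three records.
SAMPLE = """
<employees>
  <employee><id>1</id><name>Asha</name></employee>
  <employee><id>2</id><name>Ben</name></employee>
  <employee><id>3</id><name>Chen</name></employee>
</employees>
"""


def split_xml(xml_text, out_dir, rows_per_file=1):
    """Write the root's child records into files of rows_per_file records."""
    root = ET.fromstring(xml_text.strip())
    records = list(root)
    paths = []
    for start in range(0, len(records), rows_per_file):
        # Each output file gets its own copy of the root element.
        chunk_root = ET.Element(root.tag, dict(root.attrib))
        chunk_root.extend(records[start:start + rows_per_file])
        # File naming here is an assumption, not Talend's convention.
        path = Path(out_dir) / f"out_{start // rows_per_file}.xml"
        ET.ElementTree(chunk_root).write(
            str(path), encoding="utf-8", xml_declaration=True
        )
        paths.append(path)
    return paths


out_dir = tempfile.mkdtemp()
files = split_xml(SAMPLE, out_dir, rows_per_file=1)
for f in files:
    print(f.name)
```

With rows_per_file set to 1, the three records in the sample end up in three separate files, mirroring the setting described above.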

tAdvancedFileOutputXML Setting

After it runs, the job creates three files at the specified path, as shown below.

Output Xml Files

And here is the final output.

Output Xml Files
