Showing posts with label xml parsing. Show all posts

Friday, 15 April 2011

Reading/Parsing RSS feed using ROME

ROME is an open source library for parsing, generating and publishing RSS and Atom feeds. With ROME you can parse any available RSS or Atom feed without worrying about its format or version. The core library depends on the JDOM XML parser.
Atom is another kind of feed along similar lines to RSS, but it differs in some aspects such as its protocol and payloads.
RSS is a method for sharing and publishing content, which may be anything from news to small pieces of information. The main component is XML: with XML you can share your content on the web, and you are equally free to pick up what you like from others.

Why use Rome instead of other available readers

The Rome project started with the motivation of ‘ESCAPE’ where each letter stands for:
E – Easy to use. Just give it a URL and forget about the feed’s type and version; you get the output in the format you like.
S – Simple. Simple structure. The complications are all hidden from developers.
C – Complete. It handles all the versions of RSS and Atom feeds.
A – Abstract. It provides abstraction over various syndication specifications.
P – Powerful. Don’t worry about the format; let Rome handle it.
E – Extensible. It uses a simple pluggable architecture to allow future extension to new formats.

Dependency

ROME needs the following:
J2SE 1.4+, JDOM 1.0, and the jar files rome-0.8.jar, purl-org-content-0.3.jar and jdom.jar

Using Rome to read a Syndication Feed

Assuming you have all the required jar files, we will start by reading an RSS feed. ROME represents syndication feeds (RSS and Atom) as instances of the com.sun.syndication.feed.synd.SyndFeed interface.
ROME includes parsers that turn syndication feeds into SyndFeed instances. The SyndFeedInput class manages these parsers and picks the correct one for the feed being processed. The developer does not need to worry about selecting the right parser: SyndFeedInput takes care of it by peeking at the structure of the feed. All it takes to read a syndication feed using ROME are the following 2 lines of code:
SyndFeedInput input = new SyndFeedInput();
SyndFeed feed = input.build (new XmlReader (feedUrl));
With the SyndFeed object in hand, getting the details of the feed is simple.

The sample code is as follows.
import java.net.URL;
import java.util.Iterator;

import com.sun.syndication.feed.synd.SyndEntry;
import com.sun.syndication.feed.synd.SyndFeed;
import com.sun.syndication.io.SyndFeedInput;
import com.sun.syndication.io.XmlReader;

/**
 * @author Hanumant Shikhare
 */
public class Reader {

public static void main(String[] args) throws Exception {

URL url = new URL("http://viralpatel.net/blogs/feed");
XmlReader reader = null;

try {

reader = new XmlReader(url);
SyndFeed feed = new SyndFeedInput().build(reader);
System.out.println("Feed Title: " + feed.getTitle());

for (Iterator i = feed.getEntries().iterator(); i.hasNext();) {
SyndEntry entry = (SyndEntry) i.next();
System.out.println(entry.getTitle());
}
} finally {
if (reader != null)
reader.close();
}
}
}

Understanding the Program

Initialize the URL object with the RSS or Atom feed URL. Then create an XmlReader object, passing the URL object as its constructor argument. Finally, obtain the SyndFeed object by calling the build(reader) method of SyndFeedInput, which takes the XmlReader as its argument.

References

https://rome.dev.java.net/
http://www.intertwingly.net/wiki/pie/Rss20AndAtom10Compared
http://www.rss-specifications.com

Tuesday, 15 March 2011

JDOM vs SAX and DOM

DOM represents a document tree fully held in memory. It is a large API designed to perform almost every conceivable XML task. It also must have the same API across multiple languages. Because of those constraints, DOM does not always come naturally to Java developers who expect typical Java capabilities such as method overloading, the use of standard Java object types, and simple set and get methods. DOM also requires lots of processing power and memory, making it intractable for many lightweight Web applications and programs.

SAX does not hold a document tree in memory. Instead, it presents a view of the document as a sequence of events. For example, it reports every time it encounters a begin tag and an end tag. That approach makes it a lightweight API that is good for fast reading. However, the event-view of a document is not intuitive to many of today's server-side, object oriented Java developers. SAX also does not support modifying the document, nor does it allow random access to the document.

JDOM attempts to incorporate the best of DOM and SAX. It's a lightweight API designed to perform quickly in a small-memory footprint. JDOM also provides a full document view with random access but, surprisingly, it does not require the entire document to be in memory. The API allows for future flyweight implementations that load information only when needed. Additionally, JDOM supports easy document modification through standard constructors and normal set methods.

JDOM tutorial : Introduction


JDOM is an open source API designed to represent an XML document and its contents to the typical Java developer in an intuitive and straightforward way. As the name indicates, JDOM is Java optimized. It behaves like Java, it uses Java collections, and it provides a low-cost entry point for using XML. JDOM users don't need to have tremendous expertise in XML to be productive and get their jobs done.
JDOM interoperates well with existing standards such as the Simple API for XML (SAX) and the Document Object Model (DOM). However, it's more than a simple abstraction above those APIs. JDOM takes the best concepts from existing APIs and creates a new set of classes and interfaces that provide, in the words of one JDOM user, "the interface I expected when I first looked at org.w3c.dom." JDOM can read from existing DOM and SAX sources, and can output to DOM- and SAX-receiving components. That ability enables JDOM to interoperate seamlessly with existing program components built against SAX or DOM.
JDOM has been made available under an Apache-style, open source license. That license is among the least restrictive software licenses available, enabling developers to use JDOM in creating products without requiring them to release their own products as open source. It is the license model used by the Apache Project, which created the Apache server. In addition to making the software free, being open source enables the API to take contributions from some of the best Java and XML minds in the industry and to adapt quickly to new standards as they evolve.

History of JDOM

The JDOM API was developed by Jason Hunter and Brett McLaughlin in March 2000. It is now maintained by the JDOM project at http://www.jdom.org/, where you can download the latest version of the JDOM libraries and source files.
The JDOM API was developed to provide a fast and robust API for processing XML documents. It is designed specifically for the Java platform: it uses the built-in String support of the Java language and makes use of the Java 2 collection classes wherever possible, which helps it perform well.

Downloading JDOM API

The JDOM API is distributed from its official website at http://www.jdom.org/, where you can get the latest source and binary versions.
At the time of writing, the current version of JDOM is 1.1.1, which can be downloaded from http://www.jdom.org/downloads/source.html

Sunday, 6 March 2011

Parsing an XML Document with XPath

J2SE 5.0 provides the javax.xml.xpath package for querying an XML document with the XML Path Language (XPath), as an alternative to walking it with the DOM or SAX APIs. The JDOM org.jdom.xpath.XPath class also has methods to select XML document node(s) with an XPath expression, which consists of a location path identifying a node or a list of nodes.

Selecting nodes with an XPath expression is often more convenient than using the DOM getter methods, because an XPath expression can select an element node directly, whereas node lists retrieved with the getter methods have to be iterated over to retrieve the values of element nodes.

XPath - A Query Language for XML

Let us see how XPath can be used to query the various pieces of data in an XML document. Consider the following simple XML file,

<employees>    
<employee id = "001">
<name>Johny</name>
</employee>
<employee>
<name>Williams</name>
</employee>
</employees>


The above XML file represents a collection of Employee instances, each represented by an <employee> tag. The <employee> elements share a common root tag, <employees>. Loosely speaking, the terms tag, element and node are often used interchangeably in XML. An XML document is nothing but a collection of properly organised, well-formed tags. It can contain a mixture of several commonly used node types such as Element, Attribute and Text.

For example, in the above employees.xml, <employees>, <employee> and <name> are examples of 'Elements'. 'Attributes' represent a property of an element; in our example XML document, that is the 'id' attribute of the <employee> element. A 'Text' node in an XML document represents any textual content. For example, 'Johny' and 'Williams' are 'Text'.

XPath uses simple expressions to query or select a portion of information from a XML Document. For instance, if we want to get the name of the first employee, then we can frame an expression like this,


/employees/employee[1]/name

The above expression can be interpreted like this: starting from the root of the XML document (represented by the leading '/'), descend to the <employees> element, then to the first <employee> element, selected by employee[1], and finally retrieve the value of its <name> element. As seen, the XML document is traversed hierarchically to retrieve the information. Multiple elements with the same name can be addressed using an array-like notation, but unlike Java arrays the position starts at 1, not 0. To select an attribute, prefix the attribute name with the '@' sign. For example, the following XPath expression queries the 'id' value of the second employee (in the sample file above only the first employee carries an 'id', so the result there would be empty),


/employees/employee[2]/@id
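
To make the two expressions concrete, here is a minimal sketch that evaluates them with the JDK's javax.xml.xpath API against an inline copy of the employees document. The class name XPathIndexDemo and the helper select() are made up for this illustration, and the id "002" on the second employee is an assumption added so that the attribute query returns something.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class XPathIndexDemo {

    // Inline copy of the employees document. The id "002" is an assumption
    // added for this sketch; the sample file gives only the first employee an id.
    static final String XML =
        "<employees>"
      + "<employee id=\"001\"><name>Johny</name></employee>"
      + "<employee id=\"002\"><name>Williams</name></employee>"
      + "</employees>";

    // Evaluates the expression against XML and returns its string value.
    static String select(String expression) {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            return xPath.evaluate(expression, new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Positions count from 1, so [1] is the first employee.
        System.out.println(select("/employees/employee[1]/name")); // prints Johny
        // '@' selects an attribute of the second employee.
        System.out.println(select("/employees/employee[2]/@id"));  // prints 002
        // There is no position 0, so the result is an empty string.
        System.out.println(select("/employees/employee[0]/name").isEmpty()); // prints true
    }
}
```

An expression that matches nothing does not fail; evaluated as a string it simply yields an empty string, which is worth remembering when an index is off by one.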


Java and XPath


An easy-to-use Java XPath API is available for accessing XML data. It ships with the standard JDK distribution in the javax.xml.xpath package. All we have to do is use the XPathFactory, XPath and XPathExpression classes and interfaces.

XPathFactory follows the standard Factory pattern for creating XPath objects. An XPath object provides an environment for compiling expressions; the compiled form is encapsulated in an XPathExpression, which can then be evaluated to get the desired results. Following is the code snippet,

XPathFactory xPathFactory = XPathFactory.newInstance();
// To get an instance of the XPathFactory object itself.
XPath xPath = xPathFactory.newXPath();
// Create an instance of XPath from the factory class.
String expression = "SomeXPathExpression";
XPathExpression xPathExpression = xPath.compile(expression);
// Compile the expression to get a XPathExpression object.
Object result = xPathExpression.evaluate(xmlDocument);
// Evaluate the expression against the XML Document to get the result.



Sample Application


The following section provides a sample application that demonstrates the use of XPath in Java applications. It selects the value of an element, the value of an attribute, and a node-set (an element containing multiple elements) by compiling and executing different expressions.

1) projects.xml

Here is an XML file called 'projects.xml' which contains structured information about various projects. The <project> element has an attribute called 'id' and nested elements <name>, <start-date> and <end-date>. The structure of the XML file is given below.

<?xml version="1.0" encoding="UTF-8"?>
<projects>

<project id = "BP001">
<name>Banking Project</name>
<start-date>Jan 10 1999</start-date>
<end-date>Jan 10 2003</end-date>
</project>
<project id = "TP001">
<name>Telecommunication Project</name>
<start-date>March 20 1999</start-date>
<end-date>July 30 2004</end-date>
</project>
<project id = "PP001">
<name>Portal Project</name>
<start-date>Dec 10 1998</start-date>
<end-date>March 10 2006</end-date>
</project>

</projects>

2) XPathReader.java

Now, let us write a simple Java application that reads various pieces of information from the XML document. Following is the Java source that does the job of parsing the XML document.


package com.javabeat.tips.xpath;
import java.io.IOException;
import javax.xml.XMLConstants;
import javax.xml.namespace.QName;
import javax.xml.parsers.*;
import javax.xml.xpath.*;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;
public class XPathReader {

private String xmlFile;
private Document xmlDocument;
private XPath xPath;

public XPathReader(String xmlFile) {
this.xmlFile = xmlFile;
initObjects();
}

private void initObjects(){
try {
xmlDocument = DocumentBuilderFactory.
newInstance().newDocumentBuilder().
parse(xmlFile);
xPath = XPathFactory.newInstance().
newXPath();
} catch (IOException ex) {
ex.printStackTrace();
} catch (SAXException ex) {
ex.printStackTrace();
} catch (ParserConfigurationException ex) {
ex.printStackTrace();
}
}

public Object read(String expression,
QName returnType){
try {
XPathExpression xPathExpression =
xPath.compile(expression);
return xPathExpression.evaluate
(xmlDocument, returnType);
} catch (XPathExpressionException ex) {
ex.printStackTrace();
return null;
}
}
}


The constructor of this class is passed the XML file from which the information has to be read. It immediately calls initObjects(), which initializes the XML Document and the XPath objects. A Document representation of the XML file is created by calling the DocumentBuilder.parse() method. Then, a new XPath object is created by calling the XPathFactory.newXPath() method.

Client applications can then call the XPathReader.read() method, passing the expression to be evaluated and the return type of the expression. The return type is a QName, which in XML terms stands for Qualified Name. The standard XPath data types are String, Number, Boolean, Node and NodeSet, which are represented as constants in XPathConstants, namely XPathConstants.STRING, XPathConstants.NUMBER, XPathConstants.BOOLEAN, XPathConstants.NODE and XPathConstants.NODESET. Hence, the return type after evaluating an expression should be one of these data types. Within the read() method, the expression is compiled using the XPath.compile() method, which returns an XPathExpression, and the compiled expression is evaluated using the XPathExpression.evaluate() method.
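
How the requested return type changes the Java object that comes back can be sketched as follows. This is only an illustration under assumptions: the class XPathReturnTypesDemo, the helper eval() and the shortened two-project document are invented here and are not part of the sample application.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathReturnTypesDemo {

    // Shortened, hypothetical version of projects.xml, kept inline for the sketch.
    static final String XML =
        "<projects>"
      + "<project id=\"BP001\"><name>Banking Project</name></project>"
      + "<project id=\"TP001\"><name>Telecommunication Project</name></project>"
      + "</projects>";

    // Compiles and evaluates the expression, asking for the given return type.
    static Object eval(String expression, QName returnType) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            return XPathFactory.newInstance().newXPath()
                    .compile(expression).evaluate(doc, returnType);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // STRING comes back as java.lang.String.
        String id = (String) eval("/projects/project[1]/@id", XPathConstants.STRING);
        // NUMBER comes back as java.lang.Double.
        Double count = (Double) eval("count(/projects/project)", XPathConstants.NUMBER);
        // BOOLEAN comes back as java.lang.Boolean.
        Boolean hasTp = (Boolean) eval("boolean(/projects/project[@id='TP001'])",
                XPathConstants.BOOLEAN);
        // NODESET comes back as org.w3c.dom.NodeList.
        NodeList names = (NodeList) eval("/projects/project/name", XPathConstants.NODESET);

        System.out.println(id + " " + count + " " + hasTp + " " + names.getLength());
    }
}
```

Because evaluate() returns Object, the caller has to cast to the Java type that matches the constant it passed; a mismatch only shows up at run time as a ClassCastException.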

3) XPathReaderTest.java

package com.javabeat.tips.xpath;
import javax.xml.xpath.XPathConstants;
import org.w3c.dom.*;
public class XPathReaderTest {

public XPathReaderTest() {
}

public static void main(String[] args){

XPathReader reader = new XPathReader(
"src\\com\\javabeat\\tips\\xpath\\projects.xml");

// To get a xml attribute.
String expression = "/projects/project[1]/@id";
System.out.println(reader.read(expression,
XPathConstants.STRING) + "\n");

// To get a child element's value.
expression = "/projects/project[2]/name";
System.out.println(reader.read(expression,
XPathConstants.STRING) + "\n");

// To get an entire node
expression = "/projects/project[3]";
NodeList thirdProject = (NodeList)reader.read(expression,
XPathConstants.NODESET);
traverse(thirdProject);
}

private static void traverse(NodeList rootNode){
for(int index = 0; index < rootNode.getLength();
index ++){
Node aNode = rootNode.item(index);
if (aNode.getNodeType() == Node.ELEMENT_NODE){
NodeList childNodes = aNode.getChildNodes();
if (childNodes.getLength() > 0){
System.out.println("Node Name-->" +
aNode.getNodeName() +
" , Node Value-->" +
aNode.getTextContent());
}
traverse(aNode.getChildNodes());
}
}
}
}


This test application uses the XPathReader class by creating its instance and then calls the XPathReader.read() method by passing different expressions and return types. As we see, the third expression tries to retrieve an entire node-set by passing the return type as XPathConstants.NODESET. Since a node-set contains a collection of nodes which in turn can contain some other nodes, a Recursive Traversal is made on the node-set to get the name and the value of the node by calling the Node.getNodeName() and Node.getTextContent() methods. The following would be the expected output for the above sample client application.

Output for the above program


BP001
Telecommunication Project
Node Name-->project , Node Value-->
Portal Project
Dec 10 1998
March 10 2006

Node Name-->name , Node Value-->Portal Project
Node Name-->start-date , Node Value-->Dec 10 1998
Node Name-->end-date , Node Value-->March 10 2006



Saturday, 5 March 2011

Xml parsing using SAX

SAX stands for Simple API for XML. Using SAX with JAXP allows developers to traverse XML data sequentially, one element at a time, using a delegation event model. Each time an element of the XML structure is encountered, an event is triggered. Developers write event handlers to define custom processing for the events they deem important.

The program SAXParserExample.java parses an XML document and prints it on the console.
The following XML file is used:

 

<?xml version="1.0" encoding="UTF-8"?>
<Personnel>
<Employee type="permanent">
<Name>Seagull</Name>
<Id>3674</Id>
<Age>34</Age>
</Employee>
<Employee type="contract">
<Name>Robin</Name>
<Id>3675</Id>
<Age>25</Age>
</Employee>
<Employee type="permanent">
<Name>Crow</Name>
<Id>3676</Id>
<Age>28</Age>
</Employee>
</Personnel>


SAX parsing is event based. As the SAX parser reads an XML document, it calls the corresponding handler method every time it encounters a tag.

When it encounters a start tag it calls this method:
    public void startElement(String uri,..

When it encounters an end tag it calls this method:
    public void endElement(String uri,...
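
To see the order in which these callbacks fire, here is a minimal sketch that records every start and end event for a small document. The class SaxEventLogDemo and its events() helper are hypothetical names used only for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxEventLogDemo {

    // Parses the XML string and returns the start/end events in the order received.
    static List<String> events(String xml) {
        final List<String> log = new ArrayList<String>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes attributes) {
                log.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end:" + qName);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML", e);
        }
        return log;
    }

    public static void main(String[] args) {
        // prints [start:Employee, start:Name, end:Name, end:Employee]
        System.out.println(events(
                "<Employee type=\"permanent\"><Name>Seagull</Name></Employee>"));
    }
}
```

The events arrive strictly in document order, which is why a handler has to keep its own context (for example, the employee currently being built) between callbacks.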

Like the DOM example, this program also parses the XML file, creates a list of employees and prints it to the console. The steps involved are:


  • Create a Sax parser and parse the xml
  • In the event handler create the employee object
  • Print out the data

Basically the class extends DefaultHandler to listen for callback events, and we register this handler with the SAX parser so that it notifies us of them. We are only interested in the start, end and character events.
In the start event we reset the temporary text value, and if the element is Employee we create a new instance of the Employee object.
In the end event, if the node is Employee then we know we are at the end of the employee node and we add the Employee object to the list. If it is any other node like Name/Id/Age we call the corresponding method (setName/setId/setAge) on the Employee object.
In the character event we store the data in a temp string variable.


a) Create a Sax Parser and parse the xml


private void parseDocument() {

//get a factory
SAXParserFactory spf = SAXParserFactory.newInstance();
try {

//get a new instance of parser
SAXParser sp = spf.newSAXParser();

//parse the file and also register this class for call backs
sp.parse("employees.xml", this);

}catch(SAXException se) {
se.printStackTrace();
}catch(ParserConfigurationException pce) {
pce.printStackTrace();
}catch (IOException ie) {
ie.printStackTrace();
}
}

b) In the event handlers create the Employee object and call the corresponding setter methods.


//Event Handlers
public void startElement(String uri, String localName, String qName,
Attributes attributes) throws SAXException {
//reset
tempVal = "";
if(qName.equalsIgnoreCase("Employee")) {
//create a new instance of employee
tempEmp = new Employee();
tempEmp.setType(attributes.getValue("type"));
}
}


public void characters(char[] ch, int start, int length) throws SAXException {
//append rather than overwrite: the parser may deliver one element's text in several chunks
tempVal += new String(ch,start,length);
}

public void endElement(String uri, String localName,
String qName) throws SAXException {

if(qName.equalsIgnoreCase("Employee")) {
//add it to the list
myEmpls.add(tempEmp);

}else if (qName.equalsIgnoreCase("Name")) {
tempEmp.setName(tempVal);
}else if (qName.equalsIgnoreCase("Id")) {
tempEmp.setId(Integer.parseInt(tempVal));
}else if (qName.equalsIgnoreCase("Age")) {
tempEmp.setAge(Integer.parseInt(tempVal));
}

}
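
One detail of the character event deserves care: a SAX parser is allowed to deliver the text of a single element through several characters() calls, so a handler that keeps only the last chunk can lose part of the value. Below is a minimal sketch of collecting the text in a StringBuilder instead. The class SaxTextBufferDemo and the helper leafValues() are hypothetical names for this illustration and are not part of SAXParserExample.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxTextBufferDemo {

    // Returns "elementName=text" for every element that directly contains text.
    static List<String> leafValues(String xml) {
        final List<String> values = new ArrayList<String>();
        final StringBuilder buffer = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes attributes) {
                buffer.setLength(0); // start collecting text for the new element
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                buffer.append(ch, start, length); // append every chunk we are given
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                String text = buffer.toString().trim();
                if (!text.isEmpty()) {
                    values.add(qName + "=" + text);
                }
                buffer.setLength(0); // drop whitespace that follows the end tag
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML", e);
        }
        return values;
    }

    public static void main(String[] args) {
        // Text containing an entity reference is a typical case where a parser
        // may split the value into several chunks.
        System.out.println(leafValues(
                "<Employee><Name>Tom &amp; Jerry</Name><Id>3674</Id></Employee>"));
    }
}
```

Whether a given parser actually splits a value depends on its implementation and buffer sizes, so appending is the safe choice either way.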



c) Iterating and printing.

private void printData(){

System.out.println("No of Employees '" + myEmpls.size() + "'.");

Iterator it = myEmpls.iterator();
while(it.hasNext()) {
System.out.println(it.next().toString());
}
}


Listing the full program:


import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;

import org.xml.sax.helpers.DefaultHandler;

public class SAXParserExample extends DefaultHandler{

List myEmpls;

private String tempVal;

//to maintain context
private Employee tempEmp;


public SAXParserExample(){
myEmpls = new ArrayList();
}

public void runExample() {
parseDocument();
printData();
}

private void parseDocument() {

//get a factory
SAXParserFactory spf = SAXParserFactory.newInstance();
try {

//get a new instance of parser
SAXParser sp = spf.newSAXParser();

//parse the file and also register this class for call backs
sp.parse("employees.xml", this);

}catch(SAXException se) {
se.printStackTrace();
}catch(ParserConfigurationException pce) {
pce.printStackTrace();
}catch (IOException ie) {
ie.printStackTrace();
}
}

/**
* Iterate through the list and print
* the contents
*/
private void printData(){

System.out.println("No of Employees '" + myEmpls.size() + "'.");

Iterator it = myEmpls.iterator();
while(it.hasNext()) {
System.out.println(it.next().toString());
}
}


//Event Handlers
public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
//reset
tempVal = "";
if(qName.equalsIgnoreCase("Employee")) {
//create a new instance of employee
tempEmp = new Employee();
tempEmp.setType(attributes.getValue("type"));
}
}


public void characters(char[] ch, int start, int length) throws SAXException {
//append rather than overwrite: the parser may deliver one element's text in several chunks
tempVal += new String(ch,start,length);
}

public void endElement(String uri, String localName, String qName) throws SAXException {

if(qName.equalsIgnoreCase("Employee")) {
//add it to the list
myEmpls.add(tempEmp);

}else if (qName.equalsIgnoreCase("Name")) {
tempEmp.setName(tempVal);
}else if (qName.equalsIgnoreCase("Id")) {
tempEmp.setId(Integer.parseInt(tempVal));
}else if (qName.equalsIgnoreCase("Age")) {
tempEmp.setAge(Integer.parseInt(tempVal));
}

}

public static void main(String[] args){
SAXParserExample spe = new SAXParserExample();
spe.runExample();
}

}






Running SAXParserExample (JDK 1.5+)


  1. Download SAXParserExample.java, Employee.java, employees.xml to c:\xercesTest
  2. Go to command prompt and type
    cd c:\xercesTest
  3. To compile, type
    javac -classpath . SAXParserExample.java
  4. To run, type
    java -classpath . SAXParserExample
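
The steps above refer to Employee.java, which is not listed in this post. The following is only a guess at what such a class could look like, inferred from the calls the parser examples make (a no-argument constructor, the setters setName/setId/setAge/setType, and a four-argument constructor taking name, id, age and type). The toString() format is an assumption.

```java
// Hypothetical sketch of Employee.java; the original file is not shown in the post.
class Employee {

    private String name;
    private int id;
    private int age;
    private String type;

    // Used by the SAX example, which fills the fields through setters.
    public Employee() {
    }

    // Used by the DOM example, which reads all values before creating the object.
    public Employee(String name, int id, int age, String type) {
        this.name = name;
        this.id = id;
        this.age = age;
        this.type = type;
    }

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    public int getId() { return id; }
    public void setId(int id) { this.id = id; }

    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }

    public String getType() { return type; }
    public void setType(String type) { this.type = type; }

    // The output format below is an assumption made for this sketch.
    @Override
    public String toString() {
        return "Employee[name=" + name + ", id=" + id
                + ", age=" + age + ", type=" + type + "]";
    }

    public static void main(String[] args) {
        // prints Employee[name=Seagull, id=3674, age=34, type=permanent]
        System.out.println(new Employee("Seagull", 3674, 34, "permanent"));
    }
}
```

Saved as Employee.java next to the parser class, a class of this shape would let the compile and run commands above work without further classpath changes.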

Parsing xml with DOM

The steps are

  • Get a document builder using document builder factory and parse the xml file to create a DOM object
  • Get a list of employee elements from the DOM
  • For each employee element get the id,name,age and type. Create an employee value object and add it to the list.
  • At the end iterate through the list and print the employees to verify we parsed it right.

Let us take this xml file:

<?xml version="1.0" encoding="UTF-8"?>
<Personnel>
<Employee type="permanent">
<Name>Seagull</Name>
<Id>3674</Id>
<Age>34</Age>
</Employee>
<Employee type="contract">
<Name>Robin</Name>
<Id>3675</Id>
<Age>25</Age>
</Employee>
<Employee type="permanent">
<Name>Crow</Name>
<Id>3676</Id>
<Age>28</Age>
</Employee>
</Personnel>


Full code listing - DomParserExample

a) Getting a document builder

private void parseXmlFile(){
//get the factory
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();

try {

//Using factory get an instance of document builder
DocumentBuilder db = dbf.newDocumentBuilder();

//parse using builder to get DOM representation of the XML file
dom = db.parse("employees.xml");


}catch(ParserConfigurationException pce) {
pce.printStackTrace();
}catch(SAXException se) {
se.printStackTrace();
}catch(IOException ioe) {
ioe.printStackTrace();
}
}



b) Get a list of employee elements
Get the root element from the DOM object. From the root element get all Employee elements. Iterate through each Employee element to load the data.

 

private void parseDocument(){
//get the root element
Element docEle = dom.getDocumentElement();

//get a nodelist of elements
NodeList nl = docEle.getElementsByTagName("Employee");
if(nl != null && nl.getLength() > 0) {
for(int i = 0 ; i < nl.getLength();i++) {

//get the employee element
Element el = (Element)nl.item(i);

//get the Employee object
Employee e = getEmployee(el);

//add it to list
myEmpls.add(e);
}
}
}


c) Reading in data from each employee.


/**
* I take an employee element and read the values in, create
* an Employee object and return it
*/
private Employee getEmployee(Element empEl) {

//for each <employee> element get text or int values of
//name, id, age and type
String name = getTextValue(empEl,"Name");
int id = getIntValue(empEl,"Id");
int age = getIntValue(empEl,"Age");

String type = empEl.getAttribute("type");

//Create a new Employee with the value read from the xml nodes
Employee e = new Employee(name,id,age,type);

return e;
}


/**
* I take a xml element and the tag name, look for the tag and get
* the text content
* i.e for <employee><name>John</name></employee> xml snippet if
* the Element points to employee node and tagName is 'name' I will return John
*/
private String getTextValue(Element ele, String tagName) {
String textVal = null;
NodeList nl = ele.getElementsByTagName(tagName);
if(nl != null && nl.getLength() > 0) {
Element el = (Element)nl.item(0);
textVal = el.getFirstChild().getNodeValue();
}

return textVal;
}


/**
* Calls getTextValue and returns a int value
*/
private int getIntValue(Element ele, String tagName) {
//in production application you would catch the exception
return Integer.parseInt(getTextValue(ele,tagName));
}
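
As the comment in getIntValue hints, these helpers assume a well-behaved document: a missing or empty element ends in a NullPointerException or a NumberFormatException. A more defensive variant could look like the sketch below. The class SafeDomValuesDemo, its helpers and the fallback value are hypothetical and not part of DomParserExample; it also uses Node.getTextContent() in place of getFirstChild().getNodeValue().

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class SafeDomValuesDemo {

    // Parses an XML string and returns its root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML", e);
        }
    }

    // Returns the trimmed text of the first matching descendant, or null if absent.
    static String textOf(Element parent, String tagName) {
        NodeList nl = parent.getElementsByTagName(tagName);
        if (nl.getLength() == 0) {
            return null;
        }
        // getTextContent() returns "" for an empty element instead of failing.
        return nl.item(0).getTextContent().trim();
    }

    // Returns the element's text as an int, or the fallback if missing or malformed.
    static int intOf(Element parent, String tagName, int fallback) {
        String text = textOf(parent, tagName);
        if (text == null) {
            return fallback;
        }
        try {
            return Integer.parseInt(text);
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        Element emp = parseRoot(
                "<Employee><Name>Seagull</Name><Id>3674</Id><Age></Age></Employee>");
        System.out.println(textOf(emp, "Name"));    // prints Seagull
        System.out.println(intOf(emp, "Id", -1));   // prints 3674
        System.out.println(intOf(emp, "Age", -1));  // prints -1 (empty element)
        System.out.println(textOf(emp, "Dept"));    // prints null (no such element)
    }
}
```

Returning a fallback is only one option; depending on the application it may be better to report the bad record rather than silently substitute a value.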



d) Iterating and printing.


private void printData(){

System.out.println("No of Employees '" + myEmpls.size() + "'.");

Iterator it = myEmpls.iterator();
while(it.hasNext()) {
System.out.println(it.next().toString());
}
}


See Xml parsing using sax.


Listing full code :


import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class DomParserExample {

//No generics
List myEmpls;
Document dom;


public DomParserExample(){
//create a list to hold the employee objects
myEmpls = new ArrayList();
}

public void runExample() {

//parse the xml file and get the dom object
parseXmlFile();

//get each employee element and create a Employee object
parseDocument();

//Iterate through the list and print the data
printData();

}


private void parseXmlFile(){
//get the factory
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();

try {

//Using factory get an instance of document builder
DocumentBuilder db = dbf.newDocumentBuilder();

//parse using builder to get DOM representation of the XML file
dom = db.parse("employees.xml");


}catch(ParserConfigurationException pce) {
pce.printStackTrace();
}catch(SAXException se) {
se.printStackTrace();
}catch(IOException ioe) {
ioe.printStackTrace();
}
}


private void parseDocument(){
//get the root element
Element docEle = dom.getDocumentElement();

//get a nodelist of <employee> elements
NodeList nl = docEle.getElementsByTagName("Employee");
if(nl != null && nl.getLength() > 0) {
for(int i = 0 ; i < nl.getLength();i++) {

//get the employee element
Element el = (Element)nl.item(i);

//get the Employee object
Employee e = getEmployee(el);

//add it to list
myEmpls.add(e);
}
}
}


/**
* I take an employee element and read the values in, create
* an Employee object and return it
* @param empEl
* @return
*/
private Employee getEmployee(Element empEl) {

//for each <employee> element get text or int values of
//name, id, age and type
String name = getTextValue(empEl,"Name");
int id = getIntValue(empEl,"Id");
int age = getIntValue(empEl,"Age");

String type = empEl.getAttribute("type");

//Create a new Employee with the value read from the xml nodes
Employee e = new Employee(name,id,age,type);

return e;
}


/**
* I take a xml element and the tag name, look for the tag and get
* the text content
* i.e for <employee><name>John</name></employee> xml snippet if
* the Element points to employee node and tagName is name I will return John
* @param ele
* @param tagName
* @return
*/
private String getTextValue(Element ele, String tagName) {
String textVal = null;
NodeList nl = ele.getElementsByTagName(tagName);
if(nl != null && nl.getLength() > 0) {
Element el = (Element)nl.item(0);
textVal = el.getFirstChild().getNodeValue();
}

return textVal;
}


/**
* Calls getTextValue and returns a int value
* @param ele
* @param tagName
* @return
*/
private int getIntValue(Element ele, String tagName) {
//in production application you would catch the exception
return Integer.parseInt(getTextValue(ele,tagName));
}

/**
* Iterate through the list and print the
* content to console
*/
private void printData(){

System.out.println("No of Employees '" + myEmpls.size() + "'.");

Iterator it = myEmpls.iterator();
while(it.hasNext()) {
System.out.println(it.next().toString());
}
}


public static void main(String[] args){
//create an instance
DomParserExample dpe = new DomParserExample();

//call run example
dpe.runExample();
}

}


Running DomParserExample (JDK 1.5+)


  1. Download DomParserExample.java, Employee.java, employees.xml to c:\xercesTest
  2. Go to command prompt and type
    cd c:\xercesTest
  3. To compile, type
    javac -classpath . DomParserExample.java
  4. To run, type
    java -classpath . DomParserExample