Android XML Adventure – Create & Write XML Data

October 27, 2011

Article Series: Android XML Adventure

Author: Pete Houston (aka. `xjaphx`)

TABLE OF CONTENTS

  1. What is the “Thing” called XML?
  2. Parsing XML Data w/ SAXParser
  3. Parsing XML Data w/ DOMParser
  4. Parsing XML Data w/ XMLPullParser
  5. Create & Write XML Data
  6. Compare: XML Parsers
  7. Parsing XML using XPath
  8. Parsing HTML using HtmlCleaner
  9. Parsing HTML using JSoup
  10. Sample Project 1: RSS Parser – using SAXParser
  11. Sample Project 1: RSS Parser – using DOM Parser
  12. Sample Project 1: RSS Parser – using XMLPullParser
  13. Sample Project 2: HTML Parser – using HtmlCleaner
  14. Sample Project 2: HTML Parser – using JSoup
  15. Finalization on the “Thing” called XML!

=========================================

Have you mastered parsing XML yet?

Today, I’d like to talk about writing an XML file. It works just like in a normal Java program; there is no difference.

Our steps are going to be:

+ First, create the XML string (which resides in memory for later use).

+ Second, write the XML string to a file in the internal storage of the Android application.

As for the second step, I assume that you already know how to do it. In short, `Context.openFileOutput()` will do the job.
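
As a minimal sketch of that second step, the helper below writes an XML string to any `OutputStream` as UTF-8. The class name `XmlFileWriter` and the file name in the comment are hypothetical, not part of the original project; on Android the stream would typically be the one returned by `Context.openFileOutput()`.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
class XmlFileWriter {
	// Writes the XML string to the given stream as UTF-8, then closes the stream.
	// On Android, `out` would typically come from something like
	// context.openFileOutput("study.xml", Context.MODE_PRIVATE) (file name assumed).
	static void write(OutputStream out, String xml) throws IOException {
		try (Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
			writer.write(xml);
		}
	}
}
```

Taking an `OutputStream` instead of a `Context` keeps the helper usable and testable outside of Android as well.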

For the first step, there are various ways to do it; however, I’d like to introduce only the three most common ones.

1. A normal String format

Understand what I mean?

	public static String writeUsingNormalOperation(Study study) {
		String format =
				"<?xml version='1.0' encoding='UTF-8'?>" +
				"<record>" +
				"	<study id='%d'>" +
				"		<topic>%s</topic>" +
				"		<content>%s</content>" +
				"		<author>%s</author>" +
				"		<date>%s</date>" +
				"	</study>" +
				"</record>";
		return String.format(format, study.mId, study.mTopic, study.mContent, study.mAuthor, study.mDate);
	}

The Good:

  • Very quick and easy.
  • Not much code or custom objects required.

The Bad:

  • The format string is static, so it cannot output a variable-length list of XML tags.
  • It is easy to make a mistake when writing the format string, and the values are not escaped, so characters such as `<` or `&` in the data produce malformed XML.
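
If you still want to use the plain string approach, one way to reduce the risk of malformed output is to escape each value before formatting it. The following is only a sketch; `XmlEscaper` is a hypothetical helper, not something from the original project.

```java
// Hypothetical helper for illustration only.
class XmlEscaper {
	// Replaces the five characters that have a special meaning in XML
	// with their predefined entities.
	static String escape(String value) {
		StringBuilder sb = new StringBuilder(value.length());
		for (int i = 0; i < value.length(); i++) {
			char c = value.charAt(i);
			switch (c) {
				case '&': sb.append("&amp;"); break;
				case '<': sb.append("&lt;"); break;
				case '>': sb.append("&gt;"); break;
				case '"': sb.append("&quot;"); break;
				case '\'': sb.append("&apos;"); break;
				default: sb.append(c);
			}
		}
		return sb.toString();
	}
}
```

It could then be applied to each string argument, for example `XmlEscaper.escape(study.mTopic)`, before passing it to `String.format()`.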

2. Using DOM

	public static String writeUsingDOM(Study study) throws Exception {
		Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
		// create root: <record>
		Element root = doc.createElement(Study.RECORD);
		doc.appendChild(root);

		// create: <study>
		Element tagStudy = doc.createElement(Study.STUDY);
		root.appendChild(tagStudy);
		// add attr: id =
		tagStudy.setAttribute(Study.ID, String.valueOf(study.mId));

		// create: <topic>
		Element tagTopic = doc.createElement(Study.TOPIC);
		tagStudy.appendChild(tagTopic);
		tagTopic.setTextContent(study.mTopic);

		// create: <content>
		Element tagContent = doc.createElement(Study.CONTENT);
		tagStudy.appendChild(tagContent);
		tagContent.setTextContent(study.mContent);

		// create: <author>
		Element tagAuthor = doc.createElement(Study.AUTHOR);
		tagStudy.appendChild(tagAuthor);
		tagAuthor.setTextContent(study.mAuthor);

		// create: <date>
		Element tagDate = doc.createElement(Study.DATE);
		tagStudy.appendChild(tagDate);
		tagDate.setTextContent(study.mDate);

		// create Transformer object
		Transformer transformer = TransformerFactory.newInstance().newTransformer();
		StringWriter writer = new StringWriter();
		StreamResult result = new StreamResult(writer);
		transformer.transform(new DOMSource(doc), result);

		// return XML string
		return writer.toString();
	}

The Good:

  • Works for dynamic data output.
  • Flexible XML configuration.

The Bad:

  • Creates a lot of objects and uses a lot of memory.
  • Performance gets worse if the XML document is large.
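
To illustrate the "dynamic data" point, here is a rough sketch that writes a list of entries with DOM. To stay self-contained it takes a list of plain topic strings instead of `Study` objects; the class `DomListWriter`, the literal tag names (assumed to match the `Study` constants) and the generated `id` values are my own assumptions.

```java
import java.io.StringWriter;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper for illustration only.
class DomListWriter {
	static String writeTopics(List<String> topics) throws Exception {
		Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
		// create root: <record>
		Element root = doc.createElement("record");
		doc.appendChild(root);

		// one <study> element per list entry
		for (int i = 0; i < topics.size(); i++) {
			Element tagStudy = doc.createElement("study");
			tagStudy.setAttribute("id", String.valueOf(i + 1));
			root.appendChild(tagStudy);

			Element tagTopic = doc.createElement("topic");
			// text content is escaped automatically when serialized
			tagTopic.setTextContent(topics.get(i));
			tagStudy.appendChild(tagTopic);
		}

		// serialize the tree to a string, without the XML declaration
		Transformer transformer = TransformerFactory.newInstance().newTransformer();
		transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
		StringWriter writer = new StringWriter();
		transformer.transform(new DOMSource(doc), new StreamResult(writer));
		return writer.toString();
	}
}
```

Unlike the format-string approach, the number of `<study>` elements here depends only on the size of the list.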

3. Using XmlSerializer

	public static String writeUsingXMLSerializer(Study study) throws Exception {
		XmlSerializer xmlSerializer = Xml.newSerializer();
		StringWriter writer = new StringWriter();

		xmlSerializer.setOutput(writer);
		// start DOCUMENT
		xmlSerializer.startDocument("UTF-8", true);
		// open tag: <record>
		xmlSerializer.startTag("", Study.RECORD);
		// open tag: <study>
		xmlSerializer.startTag("", Study.STUDY);
		xmlSerializer.attribute("", Study.ID, String.valueOf(study.mId));

		// open tag: <topic>
		xmlSerializer.startTag("", Study.TOPIC);
		xmlSerializer.text(study.mTopic);
		// close tag: </topic>
		xmlSerializer.endTag("", Study.TOPIC);

		// open tag: <content>
		xmlSerializer.startTag("", Study.CONTENT);
		xmlSerializer.text(study.mContent);
		// close tag: </content>
		xmlSerializer.endTag("", Study.CONTENT);

		// open tag: <author>
		xmlSerializer.startTag("", Study.AUTHOR);
		xmlSerializer.text(study.mAuthor);
		// close tag: </author>
		xmlSerializer.endTag("", Study.AUTHOR);

		// open tag: <date>
		xmlSerializer.startTag("", Study.DATE);
		xmlSerializer.text(study.mDate);
		// close tag: </date>
		xmlSerializer.endTag("", Study.DATE);

		// close tag: </study>
		xmlSerializer.endTag("", Study.STUDY);
		// close tag: </record>
		xmlSerializer.endTag("", Study.RECORD);

		// end DOCUMENT
		xmlSerializer.endDocument();

		return writer.toString();
	}

The Good:

  • Unlike DOM, it needs only two objects to create an XML document.
  • The syntax is so damn easy to work with.
  • The performance is pretty good.

The Bad:

  • So far, I haven’t run into any drawbacks with it.

4. Simple Framework

Homepage: http://simple.sourceforge.net/

This is an amazing XML serialization/deserialization framework; it is really easy to use and performs quite well on XML data structures.

However, I’m not going to talk about it here, so you might want to explore it yourselves. Or maybe I will take a quick look at this framework in another article.

Well, that’s the end of today’s post. Happy learning!

Stay tuned for the next article in the series.

Cheers,

Pete Houston

Android XML Adventure – Parsing XML Data with DOM

October 12, 2011

Article Series: Android XML Adventure

Author: Pete Houston (aka. `xjaphx`)

=========================================

In this article, I’d like to introduce you to the concepts of parsing XML using DOM (Document Object Model).

In DOM, everything is treated as a node, as you know from data structures; all nodes link together to form a whole, called the `Document Tree`. There is always a node called `ROOT`, which contains the links to all other child nodes. An XML document has one and only one `ROOT`, often called the `Root Element` or `Document Element`.

I assume that you already know what XML is and what kinds of data it holds. In DOM, `Node` is an interface that represents all other XML objects.

So what can a `Node` in DOM be?

[Figure: a DOM Node represents an XML object]

Reference to Android – DOM Node Class

That’s the concept of DOM. As for how it works, to spare you a long and boring wall of text, I’ve visualized the process for a better explanation:

[Figure: DOM - XML Parsing Process]

Starting from `DocumentBuilderFactory`, create a `DocumentBuilder`. The builder needs XML input (such as a string or an input stream) in order to create a `Document` that represents the DOM tree. Parsing starts by retrieving the root element; afterward, you can choose which XML tags in the document to find and parse by using `getElementsByTagName()`.

Do you understand how DOM works now? Let’s practice: we will parse the input XML from the previous article. The layout and main application interface stay the same; we just need to update `StudyParser.java` to use the DOM methods.
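
The chain described above can be sketched in a few lines. This is only an illustration: the class `DomPipelineDemo`, its method and the literal tag name `topic` are assumptions for the example, and the XML comes from a string rather than from an Android resource.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration only.
class DomPipelineDemo {
	// Returns the text of the first <topic> element, or null if there is none.
	static String firstTopic(String xml) throws Exception {
		// factory -> builder
		DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
		DocumentBuilder builder = factory.newDocumentBuilder();
		// builder + input -> Document (the DOM tree)
		InputStream input = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
		Document doc = builder.parse(input);
		// Document -> root element -> elements found by tag name
		Element root = doc.getDocumentElement();
		NodeList topics = root.getElementsByTagName("topic");
		return topics.getLength() > 0 ? topics.item(0).getTextContent() : null;
	}
}
```

Note that `getElementsByTagName()` searches all descendants of the element it is called on, not only the direct children.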


package pete.android.study.parser;

import java.io.IOException;
import java.io.InputStream;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

import pete.android.study.data.Study;

public class StudyParser {
	public static Study parse(InputStream is) {
		// create new Study object to hold data
		Study study = null;

		try {
			// init Study object
			study = new Study();
			// create Document object
			Document xmlDoc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(is);
			// get `root` Element, the first thing always should be done!!!
			Element root = xmlDoc.getDocumentElement();
			// collect child tags need to be parsed <study>
			NodeList listStudyNodes = root.getElementsByTagName(Study.STUDY);
			// iterate through list nodes of <study> tag
			for(int i = 0; i < listStudyNodes.getLength(); ++i) {
				// get current node
				Node curNode = listStudyNodes.item(i);
				// this tag is <study>; read its `id` attribute by name rather than by position
				study.mId = Integer.parseInt( ((Element)curNode).getAttribute("id") );

				// get all child tags inside <study>
				NodeList listChilds = curNode.getChildNodes();
				for(int j = 0; j < listChilds.getLength(); ++j) {
					// get a child
					Node child = listChilds.item(j);
					// if this node is ELEMENT type
					if(child.getNodeType() == Node.ELEMENT_NODE) {
						// get element name
						String childName = child.getNodeName();
						// get element text content
						String childValue = ((Element)child).getTextContent();
						// if this tag is <topic>
						if(Study.TOPIC.equalsIgnoreCase(childName)) {
							study.mTopic = childValue;
						}
						// if this tag is <content>
						else if(Study.CONTENT.equalsIgnoreCase(childName)) {
							study.mContent = childValue;
						}
						// if this tag is <author>
						else if(Study.AUTHOR.equalsIgnoreCase(childName)) {
							study.mAuthor = childValue;
						}
						// if this tag is <date>
						else if(Study.DATE.equalsIgnoreCase(childName)) {
							study.mDate = childValue;
						}
					}
				}
			}
		// of course, handling exception
		} catch (ParserConfigurationException e) {
			study = null;
		} catch (SAXException e) {
			study = null;
		} catch (IOException e) {
			study = null;
		}

		// return Study object
		return study;
	}
}

Read the comments in the source code for a more detailed explanation.

Here are some notes I’d suggest keeping in mind when using DOM:

1. A `Node` can represent any XML object, so always check the `Node`’s type before using it.

2. Always cast a `Node` to its concrete type to get access to more specific methods and properties.

3. If the XML document is really deep (many levels of nested child tags), I suggest not using DOM, since it would really hurt performance: it needs time to construct the tree and to search through every single node.

4. If you use DOM a lot, always remember to refer to the developer API references (Java SE 7 or the Android DOM reference).
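
Notes 1 and 2 can be illustrated with a short sketch. The class `NodeTypeDemo` is hypothetical; it collects the names of the direct child elements of the root and skips everything else, such as the whitespace between tags, which a default parser reports as text nodes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only.
class NodeTypeDemo {
	static List<String> childElementNames(String xml) throws Exception {
		Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
				.parse(new InputSource(new StringReader(xml)));
		NodeList children = doc.getDocumentElement().getChildNodes();
		List<String> names = new ArrayList<String>();
		for (int i = 0; i < children.getLength(); i++) {
			Node child = children.item(i);
			// note 1: check the node type first (text nodes are skipped here)
			if (child.getNodeType() == Node.ELEMENT_NODE) {
				// note 2: cast to Element to use element-specific methods
				Element element = (Element) child;
				names.add(element.getTagName());
			}
		}
		return names;
	}
}
```

Without the type check, the cast to `Element` would fail with a `ClassCastException` on the whitespace text nodes of an indented document.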

OK! That’s it for now. Practice as much as possible to learn it. I’ve only shown you the way and the fundamentals; you need to keep walking by yourself if you want to learn more about DOM.

Good luck and see you in the next article!

Cheers,

Pete Houston