Android XML Adventure – Parsing XML Data with SAXParser


Article Series: Android XML Adventure

Author: Pete Houston (aka. `xjaphx`)

TABLE OF CONTENTS

  1. What is the “Thing” called XML?
  2. Parsing XML Data w/ SAXParser
  3. Parsing XML Data w/ DOMParser
  4. Parsing XML Data w/ XMLPullParser
  5. Create & Write XML Data
  6. Compare: XML Parsers
  7. Parsing XML using XPath
  8. Parsing HTML using HtmlCleaner
  9. Parsing HTML using JSoup
  10. Sample Project 1: RSS Parser – using SAXParser
  11. Sample Project 1: RSS Parser – using DOM Parser
  12. Sample Project 1: RSS Parser – using XMLPullParser
  13. Sample Project 2: HTML Parser – using HtmlCleaner
  14. Sample Project 2: HTML Parser – using JSoup
  15. Finalization on the “Thing” called XML!

=========================================

I hope you already know what XML is from the previous article, `What is the “Thing” called XML?`.

I'm starting the series with how to parse XML data. Why parsing first? Because most applications need to parse XML data from other sources; RSS feeds are a very common example.

There are many ways to parse an XML file in Android; the three most common are:

  1. SAXParser
  2. DOM
  3. XmlPullParser

I’ll talk about SAXParser in this article.

For a quick and easy understanding, I’ve created this flow chart to describe how SAX-Parser works.

How SAX-Parser works!

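Every element in the XML document is reported to a ContentHandler. The interface declares more callbacks than we need; the five used in this article are startDocument(), endDocument(), startElement(), endElement() and characters().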

Class Overview

This is the main interface that most SAX applications implement: if the application needs to be informed of basic parsing events, it implements this interface and registers an instance with the SAX parser using the setContentHandler method. The parser uses the instance to report basic document-related events like the start and end of elements and character data.

The order of events in this interface is very important, and mirrors the order of information in the document itself. For example, all of an element’s content (character data, processing instructions, and/or subelements) will appear, in order, between the startElement event and the corresponding endElement event.

This interface is similar to the now-deprecated SAX 1.0 DocumentHandler interface, but it adds support for Namespaces and for reporting skipped entities (in non-validating XML processors).

(@Quote from: http://developer.android.com/reference/org/xml/sax/ContentHandler.html)
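To make the event order described in the quote concrete, here is a minimal sketch that records every callback as the parser fires it. It runs on a desktop JVM rather than Android, and the names `SaxEventTrace`, `TraceHandler` and `trace()` are made up for this illustration; they are not part of the article's project.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxEventTrace {

    // Hypothetical handler: stores each callback in the order it arrives.
    static class TraceHandler extends DefaultHandler {
        final List<String> events = new ArrayList<>();

        @Override
        public void startDocument() {
            events.add("startDocument");
        }

        @Override
        public void endDocument() {
            events.add("endDocument");
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            events.add("startElement:" + qName);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            events.add("endElement:" + qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Skip whitespace-only chunks so the trace stays readable.
            String text = new String(ch, start, length).trim();
            if (!text.isEmpty()) {
                events.add("characters:" + text);
            }
        }
    }

    // Parses the given XML string and returns the recorded events.
    static List<String> trace(String xml) {
        try {
            TraceHandler handler = new TraceHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.events;
        } catch (Exception e) {
            throw new RuntimeException("parsing failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<record><study id=\"1\"><topic>SAX Parser</topic></study></record>";
        for (String event : trace(xml)) {
            System.out.println(event);
        }
    }
}
```

Running it prints the events in document order: the document starts, each element opens before its children, and every element closes before its parent does.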

I guess the picture I drew above explains everything. Let’s head to work.

First, we need a sample XML file. I called it “record.xml” and put it under the “assets” directory, so it ships with the app as an asset.

<?xml version="1.0" encoding="UTF-8"?>
<record>
	<study id="1">
		<topic>SAX Parser</topic>
		<content>Learn how to parse XML using SAXParser</content>
		<author>Pete Houston</author>
		<date>02-Oct-2011</date>
	</study>
</record>

We will use SAX-Parser to parse this data and display it in a TextView.

The data we need is [ study id, topic, content, author, date ]. I store it in an entity class called “Study”, in `Study.java`.

package pete.android.study.data;

public class Study {
	public int mId;
	public String mTopic;
	public String mContent;
	public String mAuthor;
	public String mDate;

	public static final String STUDY = "study";
	public static final String ID = "id";
	public static final String TOPIC = "topic";
	public static final String CONTENT = "content";
	public static final String AUTHOR = "author";
	public static final String DATE = "date";
}

Great, half of the job is done. Next, we need a ContentHandler that processes every element in the XML document. I called it “StudyHandler.java”:

package pete.android.study.parser;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

import pete.android.study.data.Study;

/*
 * class StudyHandler *
 */
public class StudyHandler extends DefaultHandler {

	// members
	private boolean isTopic;
	private boolean isContent;
	private boolean isAuthor;
	private boolean isDate;
	// 'Study' entity to parse
	private Study mStudy;

	// 'getter' is enough
	public Study getStudy() {
		return mStudy;
	}

	/*
	 * (non-Javadoc)
	 * @see org.xml.sax.helpers.DefaultHandler#startDocument()
	 */
	@Override
	public void startDocument() throws SAXException {
		// create new object
		mStudy = new Study();
	}

	/*
	 * (non-Javadoc)
	 * @see org.xml.sax.helpers.DefaultHandler#endDocument()
	 */
	@Override
	public void endDocument() throws SAXException {
		// nothing we need to do here
	}

	/*
	 * (non-Javadoc)
	 * @see org.xml.sax.helpers.DefaultHandler#startElement(java.lang.String, java.lang.String, java.lang.String, org.xml.sax.Attributes)
	 */
	@Override
	public void startElement(String namespaceURI, String localName, String qName, Attributes atts)
							throws SAXException {
		// if this element value equals "study"
		if(localName.equals(Study.STUDY)) {
			// get id right away
			mStudy.mId = Integer.parseInt(atts.getValue(Study.ID));
		}
		// if this element value equals "topic"
		else if(localName.equals(Study.TOPIC)) {
			// mark current element is "topic"
			isTopic = true;
		}
		// if this element value equals "content"
		else if(localName.equals(Study.CONTENT)) {
			// mark current element is "content"
			isContent = true;
		}
		// if this element value equals "author"
		else if(localName.equals(Study.AUTHOR)) {
			// mark current element is "author"
			isAuthor = true;
		}
		// if this element value equals "date"
		else if(localName.equals(Study.DATE)) {
			// mark current element is "date"
			isDate = true;
		}
	}

	/*
	 * (non-Javadoc)
	 * @see org.xml.sax.helpers.DefaultHandler#endElement(java.lang.String, java.lang.String, java.lang.String)
	 */
	@Override
	public void endElement(String namespaceURI, String localName, String qName) throws SAXException {

		if(localName.equals(Study.STUDY)) {
			// the "id" attribute was already read in startElement(); nothing to do here
		}
		// if this element is "topic"
		else if(localName.equals(Study.TOPIC)) {
			// clear the marker
			isTopic = false;
		}
		// if this element is "content"
		else if(localName.equals(Study.CONTENT)) {
			// clear the marker
			isContent = false;
		}
		// if this element is "author"
		else if(localName.equals(Study.AUTHOR)) {
			// clear the marker
			isAuthor = false;
		}
		// if this element is "date"
		else if(localName.equals(Study.DATE)) {
			// clear the marker
			isDate = false;
		}
	}

	/*
	 * (non-Javadoc)
	 * @see org.xml.sax.helpers.DefaultHandler#characters(char[], int, int)
	 */
	@Override
	public void characters(char ch[], int start, int length) {
		// get the text reported for the current element
		String chars = new String(ch, start, length);
		chars = chars.trim(); // strip leading and trailing whitespace only

	    // if this tag is "topic", set "topic" value
	    if(isTopic) mStudy.mTopic = chars;
	    // if this tag is "content", set "content" value
	    else if(isContent) mStudy.mContent = chars;
	    // if this tag is "author", set "author" value
	    else if(isAuthor) mStudy.mAuthor = chars;
	    // if this tag is "date", set "date" value
	    else if(isDate) mStudy.mDate = chars;
	}
}
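One caveat about the handler above: a SAX parser is allowed to deliver the text of a single element through several characters() calls, in which case assigning the value directly keeps only the last chunk. A common way around this is to collect the text in a StringBuilder and store it in endElement(). The sketch below shows that idea under a few assumptions: `BufferedStudyHandler` and its nested `Study` stand-in are hypothetical names, it runs on a desktop JVM, and there `setNamespaceAware(true)` is needed for `localName` to be filled in.

```java
import java.io.StringReader;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class BufferedStudyHandler extends DefaultHandler {

    // Minimal stand-in for the article's Study entity.
    static class Study {
        int mId;
        String mTopic;
        String mContent;
        String mAuthor;
        String mDate;
    }

    private final StringBuilder text = new StringBuilder();
    private Study study;

    Study getStudy() {
        return study;
    }

    @Override
    public void startDocument() {
        study = new Study();
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0); // start collecting text for the new element
        if ("study".equals(localName)) {
            study.mId = Integer.parseInt(atts.getValue("id"));
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one element, so append instead of assign.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String value = text.toString().trim();
        switch (localName) {
            case "topic":   study.mTopic = value;   break;
            case "content": study.mContent = value; break;
            case "author":  study.mAuthor = value;  break;
            case "date":    study.mDate = value;    break;
            default:        break; // "study" and "record" carry no text of their own
        }
        text.setLength(0);
    }

    // Convenience entry point for trying the handler on an XML string.
    static Study parse(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // assumption: desktop JVM, where localName is otherwise empty
            BufferedStudyHandler handler = new BufferedStudyHandler();
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
            return handler.getStudy();
        } catch (Exception e) {
            throw new RuntimeException("parsing failed", e);
        }
    }
}
```

With this design the boolean markers are no longer needed, because the element name is known again when endElement() fires.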

There we go, the data handler is done. Basically, that’s enough; however, for convenience, I’d like to create a utility class that handles the whole parsing process, called “StudyParser.java”:

package pete.android.study.parser;

import java.io.InputStream;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

import android.util.Log;

import pete.android.study.data.Study;

public class StudyParser {

	public static Study parse(InputStream is) {
		Study study = null;
		try {
			// create a XMLReader from SAXParser
			XMLReader xmlReader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
			// create a StudyHandler too
			StudyHandler studyHandler = new StudyHandler();
			// apply handler to the XMLReader
			xmlReader.setContentHandler(studyHandler);
			// the process starts
			xmlReader.parse(new InputSource(is));
			// get the target `Study`
			study = studyHandler.getStudy();

		} catch(Exception ex) {
			// pass the exception along so the cause shows up in logcat
			Log.e("XML", "StudyParser: parse() failed", ex);
		}

		// return Study we found
		return study;
	}
}
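StudyParser.parse() returns a single Study because the handler keeps only one object. If the file held several study elements, one possible variant is to create a new object at each opening study tag and add it to a list at the matching closing tag. The following is only a sketch of that idea: `StudyListParser`, `StudyListHandler` and the trimmed-down `Study` stand-in are hypothetical, and it runs on a desktop JVM with `setNamespaceAware(true)` so that `localName` is populated.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class StudyListParser {

    // Minimal stand-in for the article's Study entity (two fields only).
    static class Study {
        int mId;
        String mTopic;
    }

    // Hypothetical handler: builds one Study per <study> element.
    static class StudyListHandler extends DefaultHandler {
        final List<Study> studies = new ArrayList<>();
        private final StringBuilder text = new StringBuilder();
        private Study current;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            text.setLength(0);
            if ("study".equals(localName)) {
                current = new Study(); // a new record begins
                current.mId = Integer.parseInt(atts.getValue("id"));
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (current == null) {
                return; // outside any <study> element
            }
            if ("topic".equals(localName)) {
                current.mTopic = text.toString().trim();
            } else if ("study".equals(localName)) {
                studies.add(current); // the record is complete
                current = null;
            }
        }
    }

    // Parses an XML string and returns every study found, in document order.
    static List<Study> parse(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // assumption: desktop JVM
            StudyListHandler handler = new StudyListHandler();
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
            return handler.studies;
        } catch (Exception e) {
            throw new RuntimeException("parsing failed", e);
        }
    }
}
```

The remaining fields (content, author, date) would follow the same pattern as topic.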

OK! In the main program, a single static call to “StudyParser.parse()” returns a “Study” object built from the input XML.
Here is my main program; you can create any program you like. Mine has just a TextView to display the result, which is enough for a demonstration.

package pete.android.study;

import java.io.IOException;

import pete.android.study.data.Study;
import pete.android.study.parser.StudyParser;
import android.app.Activity;
import android.os.Bundle;
import android.util.Log;
import android.widget.TextView;

public class Main extends Activity {
    TextView tvStudy;
    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);

        tvStudy = (TextView)findViewById(R.id.text);
        Study study = null;
        try {
			study = StudyParser.parse(getAssets().open("record.xml"));
		} catch (IOException e) {
			Log.d("XML","onCreate(): parse() failed");
			return;
		}

		// StudyParser.parse() returns null when parsing fails
		if (study == null) {
			return;
		}

		String output = "";
		output += "Study ID: " + study.mId + "\n";
		output += "Topic: " + study.mTopic + "\n";
		output += "Content: " + study.mContent + "\n";
		output += "Author: " + study.mAuthor + "\n";
		output += "Date: " + study.mDate + "\n";

		tvStudy.setText(output);
    }
}

And this is the output result:

SAX-Parser Result

The data displays as expected!

Simple enough with SAX-Parser?

If you’re a beginner, just read this article slowly without skimming, and you will understand it.

There is still more to SAX-Parser; I have only shown the most common usage and the simplest concept of parsing XML data with it. You will find more by Googling, I believe!

See you in the next article!!!

Cheers,

Pete Houston

  1. October 12, 2013 at 4:13 pm

    This is pure excellence. The ONLY tutorial that actually WORKED for me.
    You deserve more than a THANK YOU.

  2. October 6, 2013 at 1:57 am

    Thank you!
    Very halpfull article – form me is missing only one thing – parsing XML with multiple elements (an Array of elements).
    One again – nice work!

  3. James
    December 8, 2012 at 2:32 am

    Great Stuff. Thanks!

  4. Sobuj
    November 26, 2012 at 8:26 pm

    Thanks Pete Houston

  5. vaish
    October 7, 2012 at 2:37 pm

    hey some1 pls email me a tutorial, for xml sax parsing which parse data from web using async task… pls help!!..
    Tenx in advnce…

  6. khatri
    September 6, 2012 at 2:14 pm

    thanks for this tutorial

  7. Dardie
    June 28, 2012 at 10:38 am

    I am getting an error when I try to run any activity with any parser… it says

    Error occurred during initialization of VM
    java/lang/NoClassDefFoundError: java/lang/ref/FinalReference

  8. March 31, 2012 at 2:29 pm

    Hi Pete
    Thanks for this helpfull blog.I am working on applications which needs lots of XML Parsing.And I use the same way as you have taught us in this blog for parsing.But my problem is that where should I save the parsed data.
    Right now what I am doing is I parse the data and save that data in static ArrayList or HashMap variables.So whenever I need that data I directly uses the static variables and show to my ListView or wherever I needed.And it works fine most of the time.
    But what exactly problem I am facing is when I switch to some other app like Browser so I lost my previous apps data (static variable which I initilaized by parsing XMLs).So in such cases how can I retain my app’s status.

    Please Help

  9. jimmy
    December 1, 2011 at 7:18 pm

    so what if we had many records and wanted to just get only the study id in an array???

    • Cyril
      June 19, 2013 at 3:54 pm

      Hey Jimmy, did you solve that problem? I am looking for passing a variable from my activity to select just one element too. I think it could be used like this :

      in the StartElement :

      if(localName.equals(Study.STUDY)) {

      mStudy.mId = Integer.parseInt(atts.getValue(Study.ID));
      if (mStudy.mId==MYVARIABLE) {

      // get id right away

      }
      }
      .. but I just dont know how to pass MYVARIABLE to xmlhandler..

  10. aniket
    November 30, 2011 at 7:50 pm

    what if i have large xml file and show onto the view as that data loading progresses….

  11. Tuan
    November 13, 2011 at 12:08 am

    can you share ur source code

  12. rechinul13gabriel
    October 28, 2011 at 10:19 pm

    Great article. I have a question, how can I pars an xml with the same tag () multiple times?
    like this:

    SAX Parser
    Learn how to parse XML using SAXParser
    Pete Houston-0-
    Pete Houston-1-
    Pete Houston-2-
    Pete Houston-3-
    02-Oct-2011

  13. emmanuel
    October 12, 2011 at 6:53 pm

    After a long journey i find soln here….
