
A Note when Using Jsoup: User-Agent

January 29, 2013

Several days ago, I tried running Jsoup on a mobile device to parse some data. My goal was to parse all questions posted on

However, the result was not what I expected.

My first attempt used this simple Android code:

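import android.app.Activity;
import android.os.Bundle;
import android.widget.ArrayAdapter;
import android.widget.ListView;
import java.util.ArrayList;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class MainScreen extends Activity {
    ArrayList<String> mData = new ArrayList<String>();
    ListView mListView;
    ArrayAdapter<String> mAdapter;

    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        // The layout setup and the view ID were lost from the original listing.
        mListView = (ListView) findViewById(/* view ID missing in the original */);

        // Fill mData before binding it to the list (call assumed; the original line was lost).
        processData();

        mAdapter = new ArrayAdapter<String>(this, android.R.layout.simple_list_item_1, mData);
        mListView.setAdapter(mAdapter);
    }

    private void processData() {
        String URL = ""; // the URL was omitted in the original post
        try {
            Document doc = Jsoup.connect(URL).get();
            Elements questions = doc.select(".summary h3 a");
            for (Element question : questions) {
                // Loop body lost in the original; collecting the link text is assumed.
                mData.add(question.text());
            }
            if (mData.size() == 0) {
                mData.add("Empty result");
            }
        } catch (Exception ex) {
            mData.add("Exception: " + ex.toString());
        }
    }
}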

The result was empty. Suspecting that something else was wrong, I next printed the HTML from the “doc” object: it contained only part of the HTML I expected. I then tried the selector “div.nav li a”, which returned results, while “.summary h3 a” still returned nothing.

After two days of working with Jonathan Hedley on GitHub, we finally found the problem: the user-agent sent from a mobile device differs from a desktop browser's, so the server returns different HTML.

A note for mobile developers who use Jsoup:

+ always set a desktop user-agent

+ set a timeout

Both are good practice for avoiding unexpected results.
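To see why the user-agent matters, here is a minimal sketch that uses only the Java standard library rather than Jsoup. It starts a local test server that, like the site in this post, returns different HTML depending on the User-Agent header, then fetches the page twice with explicit timeouts. The class name, the sample markup, the "Dalvik" user-agent string and the "contains Android" check are all assumptions made for illustration; a real site decides in its own way.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;

public class UserAgentDemo {
    // Hypothetical markup: the "summary" block is sent only to non-mobile clients.
    static final String DESKTOP_HTML =
            "<div class=\"summary\"><h3><a href=\"/q/1\">Question</a></h3></div>";
    static final String MOBILE_HTML =
            "<div class=\"nav\"><li><a href=\"/\">Home</a></li></div>";

    // Starts a local server on a free port that picks its response from the User-Agent header.
    static HttpServer startServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            String ua = exchange.getRequestHeaders().getFirst("User-Agent");
            // Assumed rule for this demo: anything mentioning "Android" counts as mobile.
            boolean mobile = ua == null || ua.contains("Android");
            byte[] body = (mobile ? MOBILE_HTML : DESKTOP_HTML).getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    // Fetches a URL with an explicit User-Agent and explicit connect/read timeouts.
    static String fetch(String url, String userAgent, int timeoutMillis) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestProperty("User-Agent", userAgent);
        conn.setConnectTimeout(timeoutMillis);
        conn.setReadTimeout(timeoutMillis);
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = startServer();
        try {
            String url = "http://127.0.0.1:" + server.getAddress().getPort() + "/";
            String mobile = fetch(url, "Dalvik/1.6.0 (Linux; U; Android 4.1)", 5000);
            String desktop = fetch(url,
                    "Mozilla/5.0 (Macintosh; U; Intel Mac OS X; de-de) Safari/523.10", 5000);
            System.out.println("mobile has summary:  " + mobile.contains("class=\"summary\""));
            System.out.println("desktop has summary: " + desktop.contains("class=\"summary\""));
        } finally {
            server.stop(0);
        }
    }
}
```

Only the desktop request receives the "summary" block, which is why a selector such as “.summary h3 a” can come back empty on a phone. With Jsoup, the corresponding calls on the connection are userAgent(...) and timeout(...), the latter taking milliseconds.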

This is the updated, working line:

Document doc = Jsoup.connect(URL).userAgent("Mozilla/5.0 (Macintosh; U; Intel Mac OS X; de-de) AppleWebKit/523.10.3 (KHTML, like Gecko) Version/3.0.4 Safari/523.10").get();

This issue was discussed on GitHub:


Pete Houston

Categories: Tricks & Tips