I have played with HtmlUnit before for similar purposes.
In fact, you can find all the necessary information here. HtmlUnit includes AJAX support by default, so once you have a page object in your code, you can fire click events on the page (search for a specific element and call its click()).
The hardest part is that AJAX is asynchronous, so after a virtual click you have to call wait() or sleep() to give the JavaScript code on the site time to handle the action. A fixed sleep() is not a good approach, though, because network latency varies and makes it unreliable. Instead, find something on the page that changes when the event triggers the AJAX call (for example, a change in the title of the header) and check periodically whether that change has happened yet. (I should mention that a resynchronizer for such events is built into HtmlUnit; however, I was not able to get it to work as I expected.)
I use Firebug or the Chrome developer tools to examine the site. You can compare the DOM tree before and after the AJAX calls, and this way you will learn how to reach certain controls (such as links and drop-down menus) on the page.
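
The "check periodically" idea can be sketched as a small polling helper. This is only an illustration: `AjaxWait` and `waitUntil` are hypothetical names, not part of HtmlUnit, and the condition in `main` is a stand-in for a real check on the page.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of HtmlUnit: polls a condition until it holds
// or a timeout expires, instead of relying on one fixed sleep().
class AjaxWait {

    /**
     * Re-checks the condition every intervalMs milliseconds.
     * Returns true as soon as it holds, false if timeoutMs elapses first
     * or the waiting thread is interrupted.
     */
    static boolean waitUntil(BooleanSupplier condition, long timeoutMs, long intervalMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt flag set
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in for "the header changed": becomes true after about 200 ms.
        boolean changed = waitUntil(() -> System.currentTimeMillis() - start >= 200, 2000, 50);
        System.out.println("condition met: " + changed);
    }
}
```

With HtmlUnit, the condition would be whatever you observed in the DOM, for example a lambda that compares the page title or the text of a header element with the value you expect after the AJAX call.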
I would use XPath to get specific elements. For example, you can do this (from the HtmlUnit examples):
// get the div which has a 'name' attribute of 'John'
final HtmlDivision div = (HtmlDivision) page.getByXPath("//div[@name='John']").get(0);
As it turns out, YouTube does not use AJAX to fetch the results. When you click on the sorting drop-down menu on the results page (it is rendered as a <button>), an absolutely positioned <ul> appears (it emulates the drop-down part of a combo box), with one <li> element for each menu item. Each <li> contains a special <span> element that carries an href attribute. When you click on the <span>, JavaScript navigates the browser to this href value.
For example, in my case, the <span> for the relevance sort item looks like this:
<span href="/results?search_type=videos&search_query=test&suggested_categories=2%2C24%2C10%2C1%2C28" class=" yt-uix-button-menu-item" onclick=";window.location.href=this.getAttribute('href');return false;">Relevancia</span>
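
Since the onclick handler only assigns the href value to window.location.href, one possible shortcut is to read the attribute yourself and load the resulting URL directly. The sketch below shows just the URL resolution step with the standard library; `SortLink` is a hypothetical name, and the page URL is an assumption, not something taken from the real site.

```java
import java.net.URI;

// Hypothetical helper: turns the relative href of a menu item into an absolute URL.
class SortLink {

    // Resolves the href attribute value against the URL of the page it was found on.
    static String resolve(String pageUrl, String href) {
        return URI.create(pageUrl).resolve(href).toString();
    }

    public static void main(String[] args) {
        // Assumed URL of the results page; replace with the URL you actually loaded.
        String pageUrl = "https://www.youtube.com/results?search_query=test";
        // Shortened version of the href value shown above.
        String href = "/results?search_type=videos&search_query=test";
        System.out.println(resolve(pageUrl, href));
    }
}
```

In HtmlUnit you would get the href with getAttribute("href") on the <span> element and pass the resolved URL to the web client's getPage().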
You can get a list of these <span> elements relatively easily, since the <ul> that hosts them is the only such child element of <body>. First, though, you need to click on the drop-down button, because that click is what makes JavaScript create the <ul> element with all the child elements described above. You can get the sort button using this XPath:
//div[@class='sort-by floatR']/button
You can test your XPath queries directly in Chrome, for example: open the developer tools and switch to the JavaScript console. Then you can try the following:
> $x("//div[@class='sort-by floatR']/button")
[ <button type="button" class=" yt-uix-button yt-uix-button-text yt-uix-button-active" onclick=";return false;" role="button" aria-pressed="true" aria-expanded="true" aria-haspopup="true" aria-activedescendant data-button-listener="26">…</button> ]
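
If you prefer to check an expression from Java without a browser, the XPath API in the standard library can evaluate it against a small snippet. This is a rough sketch under clear assumptions: `XPathCheck` is a hypothetical name, the markup is a made-up, simplified stand-in for the real page, and it must be well-formed XML, which real-world HTML often is not.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for trying out XPath expressions on well-formed snippets.
class XPathCheck {

    // Counts the nodes that the XPath expression selects in the given markup.
    static int countMatches(String markup, String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(markup)));
            NodeList nodes = (NodeList) XPathFactory.newInstance()
                    .newXPath()
                    .evaluate(xpath, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not evaluate XPath: " + xpath, e);
        }
    }

    public static void main(String[] args) {
        // Made-up, simplified stand-in for the results page.
        String markup = "<html><body><div class='sort-by floatR'>"
                + "<button type='button'>Sort by</button></div></body></html>";
        System.out.println(countMatches(markup, "//div[@class='sort-by floatR']/button"));
    }
}
```

For the live page, the Chrome console or HtmlUnit's own getByXPath() is still the better check, because both work on the DOM as the browser (or HtmlUnit) actually built it.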
Hope this helps you in the right direction.