Parser for exported bookmark HTML files from Google Chrome and Mozilla Firefox in Java

How can I parse the exported bookmark files from Google Chrome and Mozilla Firefox in Java? Are there any libraries available to parse them and extract the URLs they contain?

Code examples for parsing them in Java are also most welcome.

2 answers

Based on the newly posted comments, one solution is to use jsoup, an open-source Java HTML parser. Jsoup.connect only accepts HTTP or HTTPS URLs, so one option is to host the exported bookmark HTML on a local server, such as Tomcat, and fetch the DOM from it (alternatively, Jsoup.parse can read a local file directly):

http://yourip:<port>/<yourProject>/<bookmark.html>. 

jsoup is pretty straightforward to use.
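If adding a dependency is not an option, a rough alternative is to scan the exported file for link targets with a regular expression. The sketch below assumes the export uses `<A HREF="...">` entries with double-quoted attribute values, as Chrome and Firefox HTML exports normally do. The class and method names are made up for illustration, and a real HTML parser such as jsoup remains the more robust choice.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: not part of jsoup or of any browser API.
public class BookmarkHtmlLinks {

    // Matches an <a ...> tag and captures the double-quoted HREF value.
    // Assumption: attribute values are double-quoted and contain no '>'.
    private static final Pattern HREF = Pattern.compile(
            "<a\\s[^>]*?href\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    // Returns every HREF value in document order.
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the path to the exported bookmarks HTML file.
        String html = new String(
                Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        extractLinks(html).forEach(System.out::println);
    }
}
```

Run it with the path to the exported file as the only argument; it prints one URL per line.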

Other simpler ways:

Chrome stores its bookmarks as JSON (Firefox can also produce JSON bookmark backups), as shown below.

Java way: I would suggest you use a JSON parser on that file. Create a Java class that mirrors the structure below.

Or just use the UNIX command line (this relies on each "url" entry sitting on its own line, as in Chrome's file) and run:

  grep '"url":' <bookmark file path> | cut -d'"' -f4 
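The same extraction as the command above can be sketched in plain Java without a JSON library. This is only a pattern match on `"url": "..."` pairs, not a real JSON parse: it does not unescape JSON escape sequences, and the class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: a regex scan, not a JSON parser.
public class BookmarkJsonUrls {

    // Matches the key "url" followed by a string value and captures the value.
    // The entry "type": "url" is skipped because there "url" is not followed by ':'.
    private static final Pattern URL_FIELD = Pattern.compile(
            "\"url\"\\s*:\\s*\"((?:[^\"\\\\]|\\\\.)*)\"");

    // Returns the raw (still escaped) value of every "url" field.
    static List<String> extractUrls(String json) {
        List<String> urls = new ArrayList<>();
        Matcher m = URL_FIELD.matcher(json);
        while (m.find()) {
            urls.add(m.group(1));
        }
        return urls;
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the path to Chrome's Bookmarks file.
        String json = new String(
                Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        extractUrls(json).forEach(System.out::println);
    }
}
```

For anything beyond pulling out URLs (folder names, dates, nesting), mapping the file onto Java classes with a JSON library is the better route.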

However, if you are still interested in using the Chrome APIs, visit: http://developer.chrome.com/extensions/bookmarks.html

 {
   "checksum": "702d8e600a3d70beccfc78e82ca7caba",
   "roots": {
     "bookmark_bar": {
       "children": [
         {
           "date_added": "12939920104154671",
           "id": "3",
           "name": "Development/Tutorials/Git/git-svn - KDE TechBase",
           "type": "url",
           "url": "http://techbase.kde.org/Development/Tutorials/Git/git-svn"
         },
         {
           "date_added": "12939995405838705",
           "id": "4",
           "name": "QJson - Usage",
           "type": "url",
           "url": "http://qjson.sourceforge.net/usage.html"

In most cases, you do not need to parse the HTML file. Chrome stores its bookmarks in a JSON file, and it is much easier to read that file with a JSON parser.

The file you want is located at (on Linux, anyway; search for the path on other operating systems):

 /home/your_name/.config/google-chrome/Default/Bookmarks 
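As a small sketch, that location can be built in Java from the `user.home` system property instead of hard-coding the user name. The class and method names are made up for illustration, and the path layout is the Linux one given above; other platforms and non-default Chrome profiles will differ.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: resolves the Linux location quoted above.
public class ChromeBookmarksPath {

    // Builds <user.home>/.config/google-chrome/Default/Bookmarks.
    static Path linuxDefault() {
        return Paths.get(System.getProperty("user.home"),
                ".config", "google-chrome", "Default", "Bookmarks");
    }

    public static void main(String[] args) throws Exception {
        Path bookmarks = linuxDefault();
        if (Files.isReadable(bookmarks)) {
            // The file is JSON text; hand its content to a JSON parser from here.
            System.out.println(bookmarks + " (" + Files.size(bookmarks) + " bytes)");
        } else {
            System.out.println("No readable Bookmarks file at " + bookmarks);
        }
    }
}
```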

JSON parsing is simple. Search the web, or start with "How to parse JSON in Java".

If you want to visualize the JSON data before you start digging into it, also see http://chris.photobooks.com/json/default.htm.
