Parsing a CSS file using Java

First I want to explain what I'm doing, and then my problem. I need to scan a CSS file and extract all the links it contains (mainly images), along with the line number where each link was found.

Currently I parse the files with the Flute library, which works fine, and I use LineNumberReader to get the line number where each link was found, but this class reports the wrong line number.

For example, the link ../../image/bg.gif is on line 350, but the getLineNumber method of LineNumberReader returns 490.

So I would be grateful if someone could point me in the right direction and give me a possible explanation of why LineNumberReader behaves this way.

P.S.: An alternative solution would also be much appreciated.

  • Sorry for any typos; English is not my first language.
+1
3 answers

Hi @eakbas and @Favonius, thanks for your replies.
I finally found a solution. It may not be the best one, but at least it works for me.
As I said, I used the Flute library, implementing the DocumentHandler interface from the org.w3c.css.sac package to parse the CSS file.
I therefore implemented the property method, which takes three parameters: the name of the property, a LexicalUnit object, and a boolean indicating whether the property is marked !important.

public void property(String property, LexicalUnit lexicalUnit, boolean important) 

Since I need the line number where a particular property was found, I searched around and found that the class Flute uses to implement the LexicalUnit interface, LexicalUnitImpl, stores the line number. So I used reflection to cast the LexicalUnit to a LexicalUnitImpl object and call its getLineNumber method.

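 // LexicalUnitImpl cannot be referenced directly, so look it up by name
 Class<?> clazz = ClassUtils.getClass("org.w3c.flute.parser.LexicalUnitImpl");
 Object lexicalObject = clazz.cast(lexicalUnit);
 // Call getLineNumber() reflectively (no arguments, no parameter types)
 Integer line = (Integer) MethodUtils.invokeMethod(lexicalObject, "getLineNumber", null, null);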

I did it this way because the LexicalUnitImpl class is package-private (it is declared without the public modifier), so I cannot reference it directly from my own code:

 class LexicalUnitImpl implements LexicalUnit 

Note: ClassUtils and MethodUtils come from the Apache Commons libraries (MethodUtils is in commons-beanutils; ClassUtils is in commons-lang).
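
The reflective call described above can also be sketched with plain java.lang.reflect, without the Commons helpers. The Unit and UnitImpl types below are hypothetical stand-ins for LexicalUnit and LexicalUnitImpl so that the example is self-contained; with Flute on the classpath, the same lookup would presumably be applied to the real LexicalUnit object passed to property.

```java
import java.lang.reflect.Method;

// Hypothetical stand-in for the public LexicalUnit interface.
interface Unit {
    String getText();
}

// Hypothetical stand-in for the non-public LexicalUnitImpl class.
class UnitImpl implements Unit {
    private final String text;
    private final int line;

    UnitImpl(String text, int line) {
        this.text = text;
        this.line = line;
    }

    public String getText() { return text; }

    // Not declared on the interface, so callers holding a Unit cannot see it.
    public int getLineNumber() { return line; }
}

class ReflectiveLineNumber {

    // Returns the line number, or -1 if the object has no usable getLineNumber().
    static int lineOf(Object unit) {
        try {
            Method m = unit.getClass().getMethod("getLineNumber");
            m.setAccessible(true); // needed when the declaring class is not public
            return (Integer) m.invoke(unit);
        } catch (ReflectiveOperationException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        Unit unit = new UnitImpl("url(../../image/bg.gif)", 350);
        System.out.println(lineOf(unit)); // prints 350
    }
}
```

Returning -1 when the method is missing is just one possible choice here; it keeps the handler working if a different SAC parser is plugged in.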

0

Another solution: look at parser generator tools.

JavaCC and ANTLR both provide a way to get the line number and column number.

A possible reason for the wrong line number is the way generated parsers work. They try to find the best match, and to do that they sometimes have to backtrack or rewind the stream. Because of this, your LineNumberReader instance gets out of step with the position the parser is actually processing.

The ideal way to get the line or column number is to use the methods provided by the tool itself.
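
To illustrate the explanation above, here is a minimal sketch, using only the JDK, of how LineNumberReader can run ahead of whoever is consuming its output. It assumes the parser pulls characters in chunks; that is an assumption about how a generated parser buffers its input, not something confirmed for Flute. The class and method names are made up for the example.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical demo class; not part of Flute or any parser library.
class LineNumberDrift {

    // Reads one chunk of up to chunkSize chars, the way a buffering parser might,
    // and returns the line number the reader reports afterwards.
    static int reportedLineAfterChunk(String css, int chunkSize) {
        try (LineNumberReader reader = new LineNumberReader(new StringReader(css))) {
            char[] chunk = new char[chunkSize];
            reader.read(chunk, 0, chunk.length);
            return reader.getLineNumber();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String css = "a { color: red; }\n"
                   + "b { background: url(../../image/bg.gif); }\n"
                   + "c { color: blue; }\n";
        // A parser that has only reached the url(...) on the second line
        // may already have pulled the whole file into its buffer.
        System.out.println(reportedLineAfterChunk(css, 4096)); // prints 3
    }
}
```

The url(...) sits on the second line, yet the reader reports 3, because it counts every line terminator it has handed over rather than the position the consumer is currently working on.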

+1

Alternatively, you can use ph-css as a parsing library. See the “Visit All URLs Contained in CSS” example at https://github.com/phax/ph-css#code-examples for an example of how to extract the URLs and determine the correct source position.

0
