With the caveat that the significance of the “lines of code” metric is very doubtful, you can start by excluding blank lines, for example:
find . -name '*.php' -print0 | xargs -0 cat | grep -Ev '^[[:space:]]*$' | wc -l
For languages like JavaScript, personal coding style can significantly affect the raw LOC. Consider that some people write like this:
if (testSomething()) return null;
if (somethingElse()) {
  doThis();
}
else {
  doThat();
}
And some people write like this:
if (testSomething())
{
  return null;
}

if (somethingElse())
{
  doThis();
}
else
{
  doThat();
}
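To make the effect concrete, here is a minimal sketch that counts non-blank lines for two layouts of the same logic. The helper name `countNonBlankLines` and the two sample strings are illustrative assumptions, not part of any existing tool.

```javascript
// Count lines that contain at least one non-whitespace character.
function countNonBlankLines(source) {
  return source.split(/\r?\n/).filter((line) => line.trim() !== "").length;
}

// The same logic written in two personal styles (hypothetical samples).
const compact = [
  "if (testSomething()) return null;",
  "if (somethingElse()) {",
  "  doThis();",
  "}",
  "else {",
  "  doThat();",
  "}",
].join("\n");

const expanded = [
  "if (testSomething())",
  "{",
  "  return null;",
  "}",
  "",
  "if (somethingElse())",
  "{",
  "  doThis();",
  "}",
  "else",
  "{",
  "  doThat();",
  "}",
].join("\n");

// Prints "7 12": identical behavior, very different LOC.
console.log(countNonBlankLines(compact), countNonBlankLines(expanded));
```

Nothing about the program changed between the two samples, yet the second one "weighs" almost twice as much by this measure.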
What would be more useful (although, in my opinion, still doubtful) would be something like a count of "statements." Of course, you would need a tool that actually understands the syntax of the different languages.
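As a rough illustration of why a statement-style count is less sensitive to layout, the sketch below counts semicolons that fall outside string literals and comments. This is only a crude proxy, not a real statement counter: it assumes there are no regex literals or template-literal interpolations, it misses statements written without a semicolon, and it over-counts constructs such as `for (;;)`. The name `countSemicolons` is a hypothetical helper; a serious tool would use a proper parser for each language.

```javascript
// Crude proxy for "statements": count semicolons outside strings and comments.
// Assumption: the input has no regex literals or template-literal interpolation.
function countSemicolons(source) {
  let count = 0;
  // state is "code", "line" (// comment), "block" (/* comment */),
  // or the quote character that opened the current string.
  let state = "code";
  for (let i = 0; i < source.length; i++) {
    const ch = source[i];
    const next = source[i + 1];
    if (state === "code") {
      if (ch === "/" && next === "/") { state = "line"; i++; }
      else if (ch === "/" && next === "*") { state = "block"; i++; }
      else if (ch === '"' || ch === "'" || ch === "`") { state = ch; }
      else if (ch === ";") { count++; }
    } else if (state === "line") {
      if (ch === "\n") state = "code";
    } else if (state === "block") {
      if (ch === "*" && next === "/") { state = "code"; i++; }
    } else {
      // Inside a string: skip the escaped character, close on the matching quote.
      if (ch === "\\") i++;
      else if (ch === state) state = "code";
    }
  }
  return count;
}

const oneLine = "if (a()) return null; if (b()) { c(); } else { d(); }";
const manyLines =
  "if (a())\n{\n  return null;\n}\nif (b())\n{\n  c();\n}\nelse\n{\n  d();\n}";

// Prints "3 3": the layout no longer changes the number.
console.log(countSemicolons(oneLine), countSemicolons(manyLines));
```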
I call these statistics “doubtful” because, in organizations, the weak nature of the numbers is usually forgotten as they make their way into spreadsheet after spreadsheet. Project managers begin to extrapolate trends based on LOC, bugs (also doubtful), checkins (ditto), etc., and the fact that these numbers correlate so weakly with real productivity is simply lost.
End of sermon :-)