Lines of Code? We Don’t Need No Stinking Lines of Code

Let’s look at an old application my department started in 2001 and developed sporadically until finally launching it in 2007:

[me@desktop old]$ find . | grep -E '(php|js)$' | xargs wc -l | tail -n 1
53063 total
[me@desktop old]$ find . | grep -E '(html|css)$' | xargs wc -l | tail -n 1
9726 total

Now let’s switch over to the replacement application I started in January 2009 and launched in May 2009:

[me@desktop new]$ find . | grep -E '(php|js)$' | xargs wc -l | tail -n 1
5955 total
[me@desktop new]$ find . | grep -E '(html|css)$' | xargs wc -l | tail -n 1
2302 total

For those unfamiliar with find, grep, regular expressions, xargs, or tail, that means the old application took 53,063 lines of PHP and JavaScript to do what I did in 5,955.  The old used 9,726 lines of HTML and CSS; I used only 2,302.
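
If the pipeline itself is the unfamiliar part, here is a rough stage-by-stage sketch of it. The count_lines wrapper is a hypothetical name I made up for the walkthrough and is not part of either application. It just changes into a directory and runs the same four commands with a comment on each:

```shell
# count_lines is a hypothetical wrapper, used only to annotate the pipeline.
# Usage: count_lines DIR
count_lines() {
    (
        cd "$1" || exit 1
        find . |                    # print every path under the current directory
            grep -E '(php|js)$' |   # keep only paths that end in "php" or "js"
            xargs wc -l |           # count lines per file; with several files a "total" row comes last
            tail -n 1               # keep only that last row
    )
}
```

Running count_lines on a directory prints the same kind of "NNN total" row shown above. Two caveats: when only one file matches, wc prints no total row, so the last row is that file's own count; and the pattern has no dot in it, so it matches any name that merely ends in those letters.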

So, basically, the entire rewritten application (8,257 lines in all) is smaller than its predecessor's markup and styles alone (9,726 lines).  That’s fantastic!

Of course, some of you are thinking that’s just because I write much longer lines of code, right?  And you naturally want me to compute the average number of characters per line used in each application to compare, right?  And you demand — demand — that it be done with a single command chain in UNIX?  As you wish!

[me@desktop old]$ (find . | grep -E '(html|css|php|js)$' | tee temp | xargs wc -l | tail -n 1 | awk '{print $1}' ; cat temp | xargs wc -c | tail -n 1 | awk '{print $1}') | sed 'N;s/\n/ /' | awk '{print $2 " / " $1 " = " $2 / $1}'
1523405 / 51063 = 29.8338

[me@desktop new]$ (find . | grep -E '(html|css|php|js)$' | tee temp | xargs wc -l | tail -n 1 | awk '{print $1}' ; cat temp | xargs wc -c | tail -n 1 | awk '{print $1}') | sed 'N;s/\n/ /' | awk '{print $2 " / " $1 " = " $2 / $1}'
252837 / 8257 = 30.6209

See?  I used less than one extra character per line on average (about 30.6 versus 29.8)!  On the other hand, I spent a good 30 minutes writing that command: 1 minute composing what I put there and the other 29 trying to figure out a way to do it without either running the find twice or writing anything to a temp file.  (You’ll notice I gave up and threw in a tee halfway through.)

UPDATE: I was too focused on the chaining problem to recognize that I could just have wc calculate the number of characters and lines at the same time.  This would have worked just as well, with no temp file, and with far less complexity:

find . | grep -E '(php|js|html|css)$' | xargs wc -l -c | tail -n 1 | awk '{print $2 " / " $1 " = " $2/$1}'
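
For the cautious, here is a slightly more defensive sketch of the same idea. It is one possible variant and not what I actually ran. The function name avg_line_length is hypothetical, and it assumes a find and xargs that support -print0 and -0 (GNU and BSD both do, but they are not POSIX). It matches on real extensions, survives file names with spaces, and pipes everything through cat so that wc sees one stream and reports one total, even if xargs has to split a long file list into several batches. Note that wc -c counts bytes, which only equals characters for single-byte text.

```shell
# avg_line_length is a hypothetical helper name, not part of either application.
# Usage: avg_line_length DIR
# Prints "bytes / lines = average" for the PHP, JS, HTML, and CSS files under DIR.
avg_line_length() {
    # -print0 and -0 delimit file names with NUL bytes, so spaces in names are safe.
    find "$1" -type f \( -name '*.php' -o -name '*.js' -o -name '*.html' -o -name '*.css' \) -print0 |
        xargs -0 cat |   # concatenate all matching files into a single stream
        wc -l -c |       # one row: line count, then byte count
        awk '{ if ($1 > 0) print $2 " / " $1 " = " $2 / $1; else print "no lines found" }'
}
```

Calling avg_line_length . from the top of an application prints a row in the same format as the outputs above.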