Doofus Software: A Linux one-liner to find all the acronyms in your Latex files

Wednesday, 5 September 2012

A Linux one-liner to find all the acronyms in your Latex files

At the beginning of my PhD thesis, I include a List of Acronyms. Of course, I would like to be sure that my list is comprehensive: I don't want any strange acronym to appear in the text of my thesis without first appearing in that list. But how can I easily identify all of the acronyms in the LaTeX source without having to read all 244 pages manually?

grep to the rescue, again

Like most other problems in my life, this one can be easily solved with a Linux one-liner centered around grep:

cat *.tex | grep -wo "[A-Z]\{2,10\}" | sort | uniq -c | sort -gr

Let's take a look at the pipeline:

The cat *.tex writes all of my LaTeX source to standard output.
The grep -wo "[A-Z]\{2,10\}" matches whole words (the -w flag) that consist of between 2 and 10 upper case letters. The -o flag prints only the match, not the entire line.
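The first sort groups identical acronyms together, which the next step requires.
The uniq collapses adjacent duplicates into a single line each, and the -c flag prefixes every line with the number of occurrences.
Finally, the second sort orders the entries numerically by that count (-g) and reverses the result (-r), so the most frequent acronyms come first.

The output looks like this: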


    292 IR
    241 LDA
    166 LSI
    125 VSM
     87 TCP
     80 EM
     35 APFD
     34 SUT
     29 HSD
     22 TOPIC
     22 II
     18 MALLET
     16 LOC
     14 PS
     14 OR
     14 IDE
     12 CALLG
     11 ICA
     10 RNDM
     10 MAP
     10 KL
     10 CS
     ...
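
To see the stages at work on something smaller than a thesis, here is a minimal sketch that runs the same pipeline on a made-up sample. The sample text and the function name count_acronyms are my own inventions for illustration. The sketch uses the pattern [A-Z]\{2,10\}, which limits a match to 2 to 10 upper case letters, and sets LC_ALL=C as a precaution, on the assumption that some locales might otherwise let the range [A-Z] match more than the plain ASCII capitals.

```shell
# Hypothetical sample text standing in for the output of `cat *.tex`.
sample='We compare LDA and LSI. LDA is a topic model.
The VSM baseline uses TCP traces, unlike LDA.'

# Same stages as the one-liner: extract, group, count, rank.
count_acronyms() {
    # LC_ALL=C is a precaution so that [A-Z] means ASCII capitals only.
    LC_ALL=C grep -wo '[A-Z]\{2,10\}' | sort | uniq -c | sort -gr
}

printf '%s\n' "$sample" | count_acronyms
```

LDA appears three times in the sample, so it should come out on top; capitalised ordinary words such as "We" and "The" are skipped because they contain only one upper case letter.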

Note that this command works with any text file; it is not specific to LaTeX. Just change the argument to cat.
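
As a sketch of that idea, the pipeline can be wrapped in a small shell function that takes the files as arguments. The function name acronyms and the two sample files are hypothetical. This variant hands the files straight to grep instead of going through cat, so it adds the -h flag to stop grep from prefixing each match with a file name.

```shell
# Hypothetical helper: count all-caps words in whatever files are passed in.
acronyms() {
    # -h suppresses the file-name prefix grep adds when given several files.
    LC_ALL=C grep -woh '[A-Z]\{2,10\}' "$@" | sort | uniq -c | sort -gr
}

# Example usage with made-up files in a temporary directory.
tmpdir=$(mktemp -d)
printf 'The API returns JSON.\n' > "$tmpdir/notes.txt"
printf 'JSON over HTTP, not XML.\n' > "$tmpdir/readme.md"

result=$(acronyms "$tmpdir/notes.txt" "$tmpdir/readme.md")
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

With the function defined, something like acronyms *.tex or acronyms notes/*.txt covers the different cases without editing the pipeline each time.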

3 comments:

Homayoun, 7 January 2013 at 03:19
Hi Stephen,

Two questions.

1- Some acronyms, like "QoS" (Quality of Service), combine uppercase and lowercase letters. How can I change your command to detect such acronyms as well?

2- What are the numbers on the left-hand side of the acronyms?

PS: It seems that some words written in uppercase are included in the list.

Cheers,
Homayoon