Doofus Software: A Linux one-liner to count the unique words in a set of files

Thursday, 26 July 2012

Say you have a bunch of text files, and you want to find all the unique words in them and how often each word appears. Using the brilliance of Linux commands, this can be achieved in a few keystrokes:

$ cat *.txt | tr " " "\n" | sort | uniq -c

Use tr to put each word (separated by spaces, but you can make this more complex if you need) on its own line. sort then sorts all these words, which matters because uniq only collapses duplicate lines that are adjacent. Finally, uniq gets rid of the duplicate lines, and its -c flag prefixes each remaining line with the number of times it occurred. I'll run this on a small set of bug reports from the Eclipse Bugzilla repository:

$ cat * | tr " " "\n" | sort | uniq -c
      1 ^^^^
      6 able
     12 about
      1 aboutdialog
      1 abovebackground
      1 absolute
     18 abstract
      2 abstractannotationprocessormanager
      4 abstractcompletiontest
      1 abstractdebugeventhandler
    [...]
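
Splitting on a single space keeps tabs, punctuation, and case differences inside the tokens, so punctuation-only entries such as "^^^^" get counted as words. As a rough sketch of the "more complex" splitting mentioned earlier, the variant below treats every run of non-letters as a separator and folds everything to lowercase. The function name split_words is made up for this example, and the character classes follow the current locale, so non-ASCII text may behave differently:

```shell
# Hypothetical helper: read text on stdin, print one lowercase word per line.
# tr -c complements the set (everything that is NOT a letter) and
# -s squeezes each run of such characters into a single newline.
# grep drops the empty line that appears when the input starts with a non-letter.
split_words() {
    tr -cs '[:alpha:]' '\n' |
        tr '[:upper:]' '[:lower:]' |
        grep -v '^$'
}

# Small self-contained demonstration
printf 'Hello, world! Hello again.\n' | split_words | sort | uniq -c
```

On this sample the counts come out as 1 for again, 2 for hello, and 1 for world. Whether stripping digits and punctuation is desirable depends on the data; for bug reports full of identifiers, a looser character set may suit better.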

In the bug-report listing, the words appear in alphabetical order, since uniq keeps the order that sort produced. (Yes, the word "^^^^" actually appeared somewhere in the bug reports.) To display them by frequency in decreasing order instead, run it all through sort one more time:

$ cat * | tr " " "\n" | sort | uniq -c | sort -gr
   1118 java
   1006 eclipse
   1005 at
    956 org
    825 the
    546 internal
    361 to
    356 jdt
    320 ui
    318 in
   [...]

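The -g option (GNU sort's general numeric sort) compares lines by numeric value instead of character by character, and the -r option reverses the order so the largest counts come first. For plain integer counts like these, the more common -n option gives the same result. Voilà!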
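
To keep the finished pipeline around for reuse, something like the following shell function would do. It is only a sketch: the name top_words and its argument order are invented here, it uses -n rather than -g because the counts are plain integers, and it adds head to trim the list to the N most frequent words:

```shell
# Hypothetical helper: top_words N FILE...
# Prints the N most frequent space-separated words in the given files.
top_words() {
    n=$1
    shift
    cat -- "$@" |
        tr ' ' '\n' |
        grep -v '^$' |     # skip empty tokens caused by repeated spaces
        sort |
        uniq -c |
        sort -nr |
        head -n "$n"
}

# Example: the ten most frequent words across all .txt files
# top_words 10 *.txt
```

Words with equal counts may come out in any relative order that sort chooses, so only the ranking between different counts should be relied upon.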

19 comments:

Unknown, 4 November 2014 at 03:48
Thanks for this one-liner! Using the tr command is key here.

JESii, 7 November 2017 at 05:49
And tr is SO much faster than using a "while read line" loop.

Miles O'Neal, 3 June 2018 at 14:33
Thank you! I was a bit boggled to realize there wasn't a utility for this.
