MCSA Learning Channel: Fun in Linux Terminal – Play with Word and Character Counts

Thursday, June 29, 2017

Fun in Linux Terminal – Play with Word and Character Counts

The Linux command line can be a lot of fun, and many tedious tasks can be performed on it easily and accurately. In this article we play with words and characters and their frequency in a text file.

The command that first comes to mind for working with the words and characters of a text file on the Linux command line is wc.

Fun with Word and Letter Counts in Shell

The ‘wc’ command, short for word count, prints the newline, word, and byte counts of a text file.

To analyze a text file with small scripts, we first need a text file. For uniformity, we create one from the output of the man command, as shown below.
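As a quick illustration of those counts, here is a minimal sketch that runs wc on a tiny file. The file name sample.txt and its contents are made up for this example. Reading from standard input keeps wc from printing the file name.

```shell
# Create a small sample file (sample.txt is a hypothetical name for this demo)
printf 'hello world\nfoo bar baz\n' > sample.txt

# Default output: newline, word, and byte counts, followed by the file name
wc sample.txt

# Individual counts; redirecting the file to stdin omits the file name
lines=$(wc -l < sample.txt)   # newline count
words=$(wc -w < sample.txt)   # word count
bytes=$(wc -c < sample.txt)   # byte count
chars=$(wc -m < sample.txt)   # character count (equals bytes for plain ASCII)

echo "lines=$lines words=$words bytes=$bytes chars=$chars"
```

Some wc implementations pad the numbers with leading spaces, so the exact spacing of the output may differ between systems.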

$ man man > man.txt

The above command creates a text file ‘man.txt’ containing the manual page for the ‘man’ command.

To find the most common words in that file, run the script below.

$ cat man.txt | tr ' '  '\012' | tr '[:upper:]' '[:lower:]' | tr -d '[:punct:]' | grep -v '[^a-z]' | sort | uniq -c | sort -rn | head

Sample Output

7557 
262 the 
163 to 
112 is 
112 a 
78 of 
78 manual 
76 and 
64 if 
63 be

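This one-liner lists the most frequent entries in the text file along with their counts. The first row, with no word beside it, counts the empty lines created when each space (including the runs of spaces used for indentation) is turned into a newline; the remaining nine rows are the most common words.

How about breaking a word down into individual characters, using the following command?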
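To keep the blank first row of the sample output out of the ranking, one option is to squeeze runs of whitespace into a single newline and keep only lines with at least one letter. The following is a sketch of that variation, not the article's original command; top_words is a hypothetical helper name introduced here.

```shell
# top_words FILE [N]: print the N most frequent words in FILE (default 10).
# top_words is a hypothetical helper name, not a standard command.
top_words() {
    tr -s '[:space:]' '\n' < "$1" |    # split on whitespace, squeezing repeated separators
        tr '[:upper:]' '[:lower:]' |   # fold everything to lower case
        tr -d '[:punct:]' |            # strip punctuation
        grep -E '^[a-z]+$' |           # keep only non-empty, all-letter lines
        sort | uniq -c |               # count identical words
        sort -rn |                     # highest count first
        head -n "${2:-10}"
}
```

It can be called as top_words man.txt for the top ten words, or top_words man.txt 5 for the top five.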

$ echo 'tecmint team' | fold -w1

Sample Output

t 
e 
c 
m 
i 
n 
t 
t 
e 
a 
m

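Note: Here, ‘-w1’ sets the line width to one character, so every character is printed on its own line. The space between the two words also gets a line of its own, although it is not visible in the listing above.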
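Combining fold with sort and uniq -c turns this into a per-character frequency count. The sketch below assumes plain ASCII input, since fold may split multibyte characters on some systems; char_freq is a hypothetical helper name.

```shell
# char_freq: read text on stdin and print each character with its count,
# most frequent first. char_freq is a hypothetical helper name.
char_freq() {
    tr -d '[:space:]' |    # ignore spaces and newlines
        fold -w1 |         # one character per line
        sort | uniq -c |   # count identical characters
        sort -rn           # highest count first
}

echo 'tecmint team' | char_freq
```

For this input the most frequent character is t, which occurs three times.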

