Revision: 55129
at January 27, 2012 13:17 by clopez


Initial Code
tr -sc 'A-Za-z' '\012' < text_file.txt | sort | uniq -c | sort -nr > output_ngram.txt

Initial URL

                                

Initial Description
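For any text file, this snippet writes a word frequency list to output_ngram.txt: one word per line, preceded by its count, most frequent first. Only the unaccented letters A-Z and a-z count as word characters, so every other character (digits, punctuation, accented letters) acts as a separator. That would explain truncated entries such as "aplicaci" in the sample, which appears to come from a Spanish text. The output looks like this: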

  30 m
  29 por
  29 aplicaci
  27 modelo
  27 datos
  24 con
  21 este
  21 esta
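
The pipeline can also be read one stage at a time. The following is a minimal sketch, not the original snippet: the function name word_freq is hypothetical, and the lowercase folding and empty-line filter are added assumptions (the original counts "Este" and "este" separately and may emit one empty "word" when the file starts with a non-letter).

```shell
#!/usr/bin/env bash
# word_freq FILE: print "count word" lines, most frequent first.
# (word_freq is a hypothetical helper name, not part of the original snippet.)
word_freq() {
    # 1. tr -sc: turn every run of non-letters into a single newline
    #    (-c takes the complement of A-Za-z, -s squeezes repeats),
    #    which leaves one word per line.
    # 2. sed: drop the empty line produced when the input starts with
    #    a non-letter (added assumption, not in the original).
    # 3. tr 'A-Z' 'a-z': fold to lowercase (added assumption).
    # 4. sort: bring identical words together so uniq -c can count them.
    # 5. sort -nr: order numerically by count, largest first.
    tr -sc 'A-Za-z' '\012' < "$1" |
        sed '/^$/d' |
        tr 'A-Z' 'a-z' |
        sort |
        uniq -c |
        sort -nr
}
```

Usage: word_freq text_file.txt > output_ngram.txt. Like the original, this sketch treats accented letters as separators; handling them correctly depends on the locale and on the tr implementation, so it is left out here.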

Initial Title
Get word frequency distribution

Initial Tags
Bash, text

Initial Language
Bash