Revision: 55130
at January 27, 2012 13:29 by clopez
Initial Code
tr -sc 'A-Za-z' '\012' < text.txt | sort | uniq -c | sort -nr > output_ngram.txt
Initial URL
Initial Description
Running this on a file named text.txt writes the word frequency distribution to output_ngram.txt, one "count word" pair per line, sorted from most to least frequent. Sample output:
30 m
29 por
29 aplicaci
27 modelo
27 datos
24 con
21 este
21 esta
20 En
18 posible
18 palabras
18 como
17 texto
14 tem
14 no
14 documentos
14 cada
14 Por
13 ya
13 todo
13 textos
13 proceso
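The sample output lists "Por" and "por" as separate entries, because the pipeline is case-sensitive. A minimal sketch of a case-insensitive variant follows. It is not part of the original snippet: the function name word_freq is hypothetical, and it assumes ASCII-only letters, as the original does. It also drops the empty "word" that appears when the input starts with a non-letter, and trims the padding that uniq -c puts before each count.

```shell
# Hypothetical helper (not in the original snippet): reads text on stdin
# and prints "count word" lines, most frequent first. ASCII letters only.
word_freq() {
  tr '[:upper:]' '[:lower:]' |   # fold upper case to lower case
    tr -sc 'a-z' '\n' |          # turn each run of non-letters into one newline
    sed '/^$/d' |                # drop the empty line left by leading separators
    sort | uniq -c | sort -nr |  # count identical words, highest count first
    awk '{ print $1, $2 }'       # strip the padding uniq -c adds before counts
}

# Small usage example on inline text
printf 'B a b, b A c\n' | word_freq
```

To reproduce the original workflow with this variant, run: word_freq < text.txt > output_ngram.txt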
Initial Title
Get word frequency distribution
Initial Tags
Bash, text
Initial Language
Bash