Need a word count in Emacs? Strangely enough, that’s one thing that is not as easy as one would think. If you do need a word count, chances are it’s because you’re using LaTeX to write your document and exporting it to PDF. If so, here’s a solution that should work if you’re using Linux:
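(defun pdf-word-count ()
  "Count the words in the PDF file visited by the current buffer."
  (interactive) ; required so the function can be bound to a key
  (let* ((cfile (buffer-file-name))
         (txt-file (concat cfile ".txt")))
    ;; Extract the text of the PDF into a temporary text file.
    (shell-command (concat "pdftotext " (shell-quote-argument cfile)
                           " " (shell-quote-argument txt-file)))
    ;; Count lines, words and bytes; the result is shown in the echo area.
    (shell-command (concat "wc " (shell-quote-argument txt-file)))
    ;; Remove the temporary text file again.
    (delete-file txt-file)))
(global-set-key "\C-xw" 'pdf-word-count)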
Just add this to your .emacs file, and you will be able to count the words in any PDF file. To count the words in your LaTeX document, there are three steps:
- C-c C-c: Compile the PDF, as always
- C-c C-c again: The generated PDF should now open in Emacs, so you can see it, and its buffer becomes the active one.
- C-x w: This executes the code and displays the count in the echo area, at the bottom of the frame.
The result should look something like this:
24 7490 46689 /home/mystuff/mydocument.pdf.txt
The first number is the newline count, the second is the word count, and the last is the number of bytes in the temporary file whose name follows (in this case, “mydocument.pdf.txt”, which contains the text extracted from “mydocument.pdf”).
The way this works is a little crude, but it gets the job done. It fetches the name of the file in the currently active buffer and stores it in a variable, then creates a second variable holding the same name with “.txt” appended. The text in the PDF is then extracted with the Linux command “pdftotext” and written to a file with that second name. The word count is performed on that text with the “wc” command, and finally the text file is deleted.
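The temporary file is not strictly needed. As a rough sketch of a variation (not the method described above), pdftotext can write to standard output when the output name is “-”, and that output can be piped straight into “wc -w”, which prints only the word count. The function name pdf-word-count-stdout is made up for this example:

```elisp
(require 'subr-x) ; provides string-trim on older Emacs versions

;; Hypothetical variant of pdf-word-count that skips the temporary file.
(defun pdf-word-count-stdout ()
  "Show the word count of the PDF file visited by the current buffer."
  (interactive)
  (let ((pdf (buffer-file-name)))
    (if (null pdf)
        (message "This buffer is not visiting a file")
      ;; "pdftotext FILE -" sends the extracted text to standard output,
      ;; and "wc -w" counts only the words.
      (message "Words: %s"
               (string-trim
                (shell-command-to-string
                 (concat "pdftotext " (shell-quote-argument pdf)
                         " - | wc -w")))))))
```

If pdftotext fails, its error text will probably show up in the message as well, so an odd-looking result usually means the PDF could not be read.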
- This should work on any PDF file, as long as the text can be extracted.
- If you want to use a different key combo, just change “\C-xw” in the last line.
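For example, changing the key could look like the sketch below. The key C-c w is just an arbitrary choice, and the second form assumes that the PDF is displayed by Emacs’s built-in DocView mode, which may not be the case in every setup:

```elisp
;; Assumes pdf-word-count has already been defined in your .emacs file.

;; Option 1: use a different global key, here C-c w.
(global-set-key (kbd "C-c w") 'pdf-word-count)

;; Option 2: make the key available only in DocView buffers,
;; so it does not take up a global binding.
(with-eval-after-load 'doc-view
  (define-key doc-view-mode-map (kbd "C-c w") 'pdf-word-count))
```

Only one of the two options is needed. The second one keeps the binding out of the way in buffers where a PDF word count makes no sense.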