At least for some PDF files, Google does an excellent job of preserving math formulas in its "View as HTML" view. Basically, what it appears to do is replace the math symbols with their corresponding Unicode characters, so ideally most of the information is preserved.

Emacspeak currently has some trouble reading such characters, but Emacs 22 has promising features to remedy this problem some time in the (hopefully near) future. In the meantime, I use the following command to read the name of the character at point:

    (require 'descr-text)

    (defun unicode-name-at (pos)
      "Show the Unicode name of the character at POS in the echo area."
      (interactive "d")
      (let* ((char (char-after pos))
             ;; Prefer the original code point if Emacs recorded one
             ;; for a character it could not translate.
             (unicode (or (get-char-property pos 'untranslated-utf-8)
                          (encode-char char 'ucs))))
        (message "%s"
                 (downcase
                  (or (cadr (assoc "Name" (describe-char-unicode-data unicode)))
                      "Unknown character")))))

This works in Emacs 22 only, and make sure you look at the documentation of describe-char-unicodedata-file (it has to point to a copy of UnicodeData.txt for the lookup to return anything). Naturally, this can only work in multibyte mode.

Of course, the above only helps you with PDF files that were indexed by Google. It would be interesting to know how exactly a PDF must be made up for this conversion to work and what kind of PDF-to-HTML converter they use.

Best regards,
Lukas

Kalyan Mukherjea writes ("Re: This is off-topic? perhaps."):
>
> The only formula in Mannin.txt (the text file produced by pdftotxt)
> caught my attention when it was read out:
>
> I heard:
> 32 + 42 = 52!!!
>
> Naturally I "woke up" paid attention and realized that this was the
> rendition of the Pythagorean identity:
>
> $3^2+ 4^2= 5^2$.
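
For readers on a later Emacs, here is a rough sketch of the same command without the UnicodeData.txt lookup. It assumes Emacs 23 or later, where `get-char-code-property` is built in; that is an assumption beyond the Emacs 22 setup described above, and the name `my-unicode-name-at` is made up for illustration.

```elisp
;; Sketch only: assumes Emacs 23 or later, where `get-char-code-property'
;; is available.  `my-unicode-name-at' is a hypothetical name.
(defun my-unicode-name-at (pos)
  "Show the Unicode name of the character at POS in the echo area."
  (interactive "d")
  (let* ((char (char-after pos))
         ;; `char-after' returns nil at the end of the buffer.
         (name (and char (get-char-code-property char 'name))))
    ;; Characters without a name fall back to a fixed string.
    (message "%s" (downcase (or name "Unknown character")))))
```

With point on the superscript in a line such as "3² + 4² = 5²", M-x my-unicode-name-at should report the character's name, so a formula like the one quoted above is no longer heard as plain digits.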