I've installed the Tesseract 4 library on the Mac via brew.
I am also using the Tess4J library, which seems to have some issues with Tesseract.
In particular, when I try to OCR a file, I get an error that many, many others have reported. I've yet to find a working solution.
!strcmp(locale, "C"):Error:Assert failed:in file baseapi.cpp, line 209
#
# A fatal error has been detected by the Java Runtime Environment:
# SIGILL (0x4) at pc=0x000000012db6626e, pid=1762, tid=0x0000000000005e03
The issue is that you need to set the locale environment variable. The assert in the error above requires the locale to be "C". On the Mac, if you type "locale" in a terminal window, you may find that "LC_ALL" has no value on its right-hand side. The authors of the API apparently didn't handle that missing value before doing the "strcmp", so the assert fails and crashes the JVM. Setting the variable explicitly, for example by running export LC_ALL=C before starting the JVM, avoids the failed assert.

On Windows Server, I did not run into the locale issue that I had on the Mac. I installed Tesseract from this site: Tesseract Download Windows, and it worked right away.
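As a rough pre-flight check, something like the following could be run before any OCR call. It is only a sketch: LocaleCheck and hasCLocale are hypothetical names, not part of Tess4J, and it assumes that having LC_ALL set to "C" in the JVM's environment is what satisfies the assert. It does not change the locale itself; it only reports whether the variable is set as expected.

```java
import java.util.Map;

// Hypothetical helper (not part of Tess4J or Tesseract).
// Assumption: the assert in baseapi.cpp passes when the JVM was started
// with LC_ALL=C in its environment.
final class LocaleCheck {

    // Returns true only when LC_ALL is explicitly set to "C".
    // A missing or empty LC_ALL returns false.
    static boolean hasCLocale(Map<String, String> env) {
        return "C".equals(env.get("LC_ALL"));
    }

    public static void main(String[] args) {
        if (hasCLocale(System.getenv())) {
            System.out.println("LC_ALL=C is set.");
        } else {
            // The variable must be set before the JVM starts.
            System.out.println("LC_ALL is not \"C\"; try: LC_ALL=C java ...");
        }
    }
}
```

Passing the environment in as a map, rather than reading it inside the method, keeps the check easy to try with different values.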
If you want to install languages other than the default English, you can find the language training files here: Tesseract Training Files.
The other thing to watch closely is that you have to tell Tesseract where the training data directory for those languages is. On the Mac, that's (for me): /usr/local/share/tessdata. On Windows: C:\Program Files (x86)\Tesseract-OCR\tessdata
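To avoid hard-coding one path, the directory could be picked per operating system, roughly as below. This is a sketch under assumptions: TessdataLocator is a hypothetical helper, and the two paths are simply the ones from my machines above, so yours may differ. The result is meant to be handed to Tess4J through its setDatapath method on the Tesseract instance.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Locale;

// Hypothetical helper (not part of Tess4J). The paths are the install
// locations mentioned above and are assumptions for any other machine.
final class TessdataLocator {

    // Maps an os.name value to a likely tessdata directory, or null if unknown.
    static String defaultTessdataDir(String osName) {
        String os = osName.toLowerCase(Locale.ROOT);
        if (os.startsWith("mac")) {
            return "/usr/local/share/tessdata";
        }
        if (os.startsWith("windows")) {
            return "C:\\Program Files (x86)\\Tesseract-OCR\\tessdata";
        }
        return null;
    }

    public static void main(String[] args) {
        String dir = defaultTessdataDir(System.getProperty("os.name"));
        if (dir == null) {
            System.out.println("No default known for this OS; set the path by hand.");
            return;
        }
        // Check that the directory is really there before giving it to Tess4J,
        // e.g. via setDatapath(dir) on the Tesseract instance.
        boolean exists = Files.isDirectory(Paths.get(dir));
        System.out.println(dir + (exists ? " (found)" : " (not found)"));
    }
}
```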