
Mike London

since Jul 12, 2002

Recent posts by Mike London


The issue was that I needed the overloaded version of inbox.close() that takes a boolean.

New line of code in finally block: inbox.close(false);

Resource issue fixed.


-- mike
2 weeks ago
It isn't that it "always worked before"; it's that it works on Mac but not on Windows.

I created another REST service (non-SpringBoot, non-"mstor") deployed in Tomcat that uses regular Java file streams to open a file, read it, and close it. That causes no problems; the file is correctly closed.

I'll try a SpringBoot simple file read example as the last test before I conclude it's the mstor library in some way not releasing the file.
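
A plain-streams test like the one described above could be sketched as follows. This is only an illustration, not the code from the REST service: the class name FileHandleCheck and both method names are made up for the example. It reads a file with try-with-resources and then tries to rename it, which on Windows typically fails while any handle to the file is still open.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of mstor or the original service.
class FileHandleCheck {

    // Reads the file with try-with-resources so the handle is closed
    // even if reading throws; returns the number of lines read.
    static int countLines(Path file) throws IOException {
        int lines = 0;
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            while (reader.readLine() != null) {
                lines++;
            }
        }
        return lines;
    }

    // Renames the file and renames it back. Returns false if either
    // move fails, for example because a handle is still open on Windows.
    static boolean canRename(Path file) {
        Path probe = file.resolveSibling(file.getFileName() + ".probe");
        try {
            Files.move(file, probe);
            Files.move(probe, file);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("handle-check", ".txt");
        Files.write(file, "line one\nline two\n".getBytes(StandardCharsets.UTF_8));
        System.out.println("lines read: " + countLines(file));
        System.out.println("rename possible: " + canRename(file));
        Files.delete(file);
    }
}
```

Running the same rename probe right after the mstor code, inside Tomcat, would show whether the handle is released at that point.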


- mike
2 weeks ago
I ran the exact same mstor code in a standalone Java application on Windows Server 2016 and it did not keep the file open.

The problem only happens when running under Tomcat.


-- mike
2 weeks ago
Thanks Tim.

I've never had Tomcat keep a file open beyond a method's completion (on Mac or Windows), so as it stands this code is not production ready and can't be delivered to a client.

Therefore, practically speaking, I believe there's a problem with the API itself.


-- mike
2 weeks ago
I am executing a method to read an inbox using the mstor API, but for some reason, Tomcat (on Windows only) is keeping a reference to the 'mbox' file. I've attached the error that appears below if I try to, say, rename the file on the desktop after the code runs.

I went into the Windows Server 2016 management console and selected Folders, then Files, but no files show up there.

Went into Tomcat manager after running the code once, but nothing jumps out there either as to how Tomcat is keeping this file open.

Also tried adding antiResourceLocking="true" to the <Context> element in the Tomcat context.xml, since I read that it can help with resource locking issues (it didn't in this case, however).

Nothing is working thus far to close this file in Windows after the code runs. (On the Mac, the same code works fine, not that that helps here.)

Using Java 8 (201), Tomcat 8.5 installed as a service (with 4 GB available).

On both Mac and Windows, the code works; it's just that on Windows Tomcat is keeping the file open somehow so the code is not ready for the client...

Suggestions appreciated!

2 weeks ago
All fixed. Thanks!!!

-- mike
Great info!

I changed my code to use the Executor Service, like this:

This works great.

I ended up creating a separate file per thread, as I was having difficulty synchronizing the writes to a single file using a FileWriter. Most of the file was OK, but there would be random, apparently unsynchronized writes here and there.
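
For readers following along, the "executor plus one file per task" approach can be sketched roughly like this. It is not the code from the post (that snippet is not shown here); the class name PerTaskFiles, the method name, and the file naming are all assumptions made for the example.

```java
import java.io.BufferedWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class; names and file layout are illustrative.
class PerTaskFiles {

    // Runs taskCount tasks on a fixed pool. Each task writes linesPerTask
    // lines to its own file, so no two threads ever share a writer.
    static List<Path> writeInParallel(Path dir, int taskCount, int linesPerTask) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(taskCount);
        try {
            List<Future<Path>> futures = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                final int id = i;
                futures.add(pool.submit(() -> {
                    Path out = dir.resolve("task-" + id + ".txt");
                    try (BufferedWriter w = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
                        for (int n = 0; n < linesPerTask; n++) {
                            w.write("task " + id + " line " + n);
                            w.newLine();
                        }
                    }
                    return out;
                }));
            }
            List<Path> files = new ArrayList<>();
            for (Future<Path> f : futures) {
                files.add(f.get()); // waits for the task and rethrows its failure
            }
            return files;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("per-task");
        for (Path p : writeInParallel(dir, 4, 1000)) {
            System.out.println(p + ": " + Files.readAllLines(p).size() + " lines");
        }
    }
}
```

The design trade-off is that nothing needs to be synchronized, at the cost of merging the files afterwards if a single output is required.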

-- mike

I created four threads.

When the code runs, I see:

Thread-0 is running.
Thread-1 is running.
Thread-2 is running.
Thread-3 is running.

Yet, when the code executes, all of Thread-0 runs first, then Thread-1 and so on.

Since I have a multi-core machine, I expected to see the thread numbers jumbled up to show that multiple threads were running at the same time, but that doesn't appear to be happening.

I am not using ".join()".

Just creating four threads and starting them, like this:

(Hard-coded loop is only for testing)

I've done some searching on this but it's still not clear.

Is creating threads enough to have the system execute them in parallel or do I really need to use fork-join framework?
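
For what it's worth, started threads are eligible to run concurrently without fork-join; when each task is short, one thread can simply finish before the next one has been started, which makes the output look sequential. The sketch below is not the code from this post (the class and method names are made up). Each thread checks in and then waits for all the others, which can only succeed if all four are alive at the same time. It demonstrates overlap, not how many cores are actually used.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not the original test code.
class ThreadOverlapDemo {

    // Starts threadCount plain threads. Each one checks in and then waits
    // until every other thread has checked in as well. If the threads ran
    // strictly one after another, the wait would time out instead.
    static boolean allOverlapped(int threadCount) throws InterruptedException {
        CountDownLatch allStarted = new CountDownLatch(threadCount);
        Set<String> overlapped = ConcurrentHashMap.newKeySet();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                allStarted.countDown();
                try {
                    if (allStarted.await(5, TimeUnit.SECONDS)) {
                        overlapped.add(Thread.currentThread().getName());
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "worker-" + i);
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join(); // only so the caller can read the result afterwards
        }
        return overlapped.size() == threadCount;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("all four threads overlapped: " + allOverlapped(4));
    }
}
```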


-- mike
Follow up ....

I found that you don't need to compile your own mbox data storage class using C or other methods.

There's a maven dependency for mbox processing you can add to your project and parse mbox files.

- mike
1 month ago

Tim Holloway wrote: The docs on MBox-store are vague on whether the C compiling is required if you don't want/need native-os file locking.

From experience, I recommend avoiding anything that requires compiling C code to build it. Not only is C definitely not write-once/run-anywhere, it's not even write-once/compile-anywhere. The example seemed to use a compiler named "C89". I'm not sure if that's a generic compiler reference or maybe a Sun/Solaris/Oracle product.

Not only are C builds touchy based on the OS platform, but they often break if you don't have the right compiler, right version of the compiler or right installed development libraries and that's even before you get into tweaking compiler options. I pretty well gave up on Perl CPAN because of the constant breakage in C components - Python's equivalent is much saner.


In Python, I'm assuming you mean "import mailbox". It's strange that Java doesn't have something like that built in (similar comment for machine learning, statistics, and other stuff Python excels at), say, in the JavaMail API. I did see a couple of Java code samples that were supposed to work, but neither did.

I suppose in my REST service, I could just run the Python code using something like: "Process p = Runtime.getRuntime().exec("python");"


- mike
1 month ago
Yep, I saw these too, but was hoping for something a bit more ready to go.

Apache Tika also has an "MboxParser" so I'll test that as well.

Thanks Tim.
1 month ago
Having looked for an API that would handle opening/reading mbox files (email programs), I haven't really found anything.

Since the mbox format is well documented, is this requirement a case where I'd be just as well reading the mbox file line by line using Java File classes?
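
To give a feel for what the line-by-line route involves, here is a minimal sketch using only the standard library. It is an illustration under simplifying assumptions, not a full mbox parser: it treats every line starting with "From " as a message separator, ignores folded (multi-line) headers, and only pulls out the Subject header. The class name SimpleMboxReader is made up.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, deliberately simplified reader for illustration only.
class SimpleMboxReader {

    // Returns the Subject of each message ("" when a message has none).
    // A message starts at a line beginning with "From "; its headers end
    // at the first blank line. Body lines that begin with an unescaped
    // "From " would be misread as separators by this sketch.
    static List<String> subjects(BufferedReader reader) throws IOException {
        List<String> subjects = new ArrayList<>();
        boolean inMessage = false;
        boolean inHeaders = false;
        String subject = null;
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.startsWith("From ")) {
                if (inMessage) {
                    subjects.add(subject == null ? "" : subject);
                }
                inMessage = true;
                inHeaders = true;
                subject = null;
            } else if (inHeaders) {
                if (line.isEmpty()) {
                    inHeaders = false; // blank line ends the header block
                } else if (subject == null && line.regionMatches(true, 0, "Subject:", 0, 8)) {
                    subject = line.substring(8).trim();
                }
            }
        }
        if (inMessage) {
            subjects.add(subject == null ? "" : subject);
        }
        return subjects;
    }

    public static void main(String[] args) throws IOException {
        String sample = "From a@example.com Mon Jan  1 00:00:00 2019\n"
                + "Subject: Hello\n\nBody text\n\n"
                + "From b@example.com Tue Jan  2 00:00:00 2019\n"
                + "Subject: Second\n\nMore text\n";
        System.out.println(subjects(new BufferedReader(new StringReader(sample))));
    }
}
```

Even this small version shows where the edge cases live (separator escaping, header folding, encodings), which is the main argument for using an existing library when one works.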

-- mike
1 month ago

I've installed the Tesseract 4 library on the Mac via brew.

I'm also using the Tess4j library, which seems to have some issues with Tesseract.

In particular, when I try to OCR a file, I get an error that many, many others have reported. I've yet to find a working solution.

!strcmp(locale, "C"):Error:Assert failed:in file baseapi.cpp, line 209
# A fatal error has been detected by the Java Runtime Environment:
#  SIGILL (0x4) at pc=0x000000012db6626e, pid=1762, tid=0x0000000000005e03



The issue is that you need to set the locale environment variable. On the Mac, if you type "locale" in a terminal window, you may find that "LC_ALL" has no value on the right-hand side. The authors of the API apparently didn't check for that missing value before doing a "strcmp" and crashing the JVM.

On Windows Server, I had no locale issue like the one on the Mac. I installed Tesseract from this site: Tesseract Download Windows and it worked right away.

If you want to install other languages, other than default English, you can find the language training files here: Tesseract Training Files.

The other thing to really watch is that you have to make sure you tell Tesseract where the training data directory is for languages. On the Mac, that's (for me): /usr/local/share/tessdata.  On Windows: C:\Program Files (x86)\Tesseract-OCR\tessdata
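
The two checks above (LC_ALL present, correct tessdata directory) can be run before any OCR call. The following is only a sketch with made-up names (TessPreflight, hasLcAll, tessdataDir); the two paths are the ones from this post and will differ on other installs. It does not call Tesseract or Tess4j itself.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical preflight helper; not part of Tesseract or Tess4j.
class TessPreflight {

    // True when LC_ALL is present and non-empty in the given environment.
    static boolean hasLcAll(Map<String, String> env) {
        String value = env.get("LC_ALL");
        return value != null && !value.trim().isEmpty();
    }

    // Picks the tessdata directory mentioned in the post for the given
    // os.name value. These paths are install-specific assumptions.
    static String tessdataDir(String osName) {
        String os = osName.toLowerCase(Locale.ROOT);
        if (os.startsWith("windows")) {
            return "C:\\Program Files (x86)\\Tesseract-OCR\\tessdata";
        }
        return "/usr/local/share/tessdata";
    }

    public static void main(String[] args) {
        if (!hasLcAll(System.getenv())) {
            System.out.println("LC_ALL is not set; set it before starting the JVM.");
        }
        System.out.println("tessdata: " + tessdataDir(System.getProperty("os.name")));
    }
}
```

Because the environment is read when the JVM starts, the variable has to be set in the shell or service configuration that launches Java, not from inside the running program.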

Hope this update helps someone.

-- mike
1 month ago