Mike London

Bartender
since Jul 12, 2002

Recent posts by Mike London

That was just an example program to see if the libraries would work.

I wasn't worried about the System.out.println.

But they do work since it's basically a console Spring (TEST) application.

The real SpringBoot uses Log4j, of course.

I don't need help with logging, and that was not my question.

I also posted the actual error received by the more detailed logging.

Thanks Tim.
3 months ago
I have a running Spring REST service that uses the Tika libraries.

Currently, these Tika libraries are 2.9.1.

At some point while adding other features (perhaps this is the issue), the DOCX portion of the Tika extract stopped working. The same basic code in a standalone (not REST) Maven project works OK, so I'm a bit baffled.

Having spent about 8 hours on this, I thought I'd ask the community if anyone had run across this issue with DOCX files in Tika.

The error generated in the SpringBoot REST service is: "TIKA-198: Illegal IOException from org.apache.tika.parser.microsoft.ooxml.OOXMLParser"

Below is the test code that works in the standalone Spring project and is the same code spread across Spring's Controller and Service methods but doesn't work.



In the REST project where the error occurs, the data are "POSTed" -- not referenced with a disk path, but XLSX, PDF, TXT all work fine. It's just DOCX that is failing.

Thanks in advance for any suggestions.

- mike
3 months ago
On Windows, it looks like the logging is eventually working, but there are tons of startup messages logged that I don't see at all on the Mac with the same application. I've tried various log4j2 settings to suppress these startup messages, which only appear on Windows (the basic log file is 350KB after them), but nothing has worked.

Reading the environment variable set in the log4j2.properties file does not work in Windows (but works fine on Mac)  --> ${sys:LOG_DIR}. This is the most disappointing of all.

Well, just wanted to post a quick update.

Thx,

- mike
11 months ago

Tim Holloway wrote:Judging from the sheer variety of JNDI objects not registered, I'd have to say that you haven't configured Log4j properly.

This looks like a good helper: https://www.baeldung.com/spring-boot-logback-log4j2

There are two noteworthy items in Section 3 (Log4j2). First, to MAKE SURE that Logback is not present by overriding the default Maven bundle.

Secondly, the alternative name for your log4j.properties file for reasons explained therein.



Thanks Tim.

Here is the relevant pom.xml section which disables LogBack:



On the mac, the configuration appears to be fine. I can modify values in the log4j2.properties file and the log file responds appropriately.

I have no idea what could be wrong on Windows.

I've worked through every tutorial and "AI Assistant", but no luck thus far.

There are no compile errors...the logging works (on the Mac).

Not sure how to debug/fix this problem.

Thanks,

-- mike
11 months ago
Thanks for the replies.

The problem (still not resolved on Windows!) was that I didn't have a log4j configuration file.

Once I created that, logging and rollover worked as expected .. on the Mac.

Yet, on Windows, the same code throws hundreds of JNDI errors like these:



In the log4j2.properties file, I read the system environment variable for the log directory like this:

property.basePath = ${sys:LOG_DIR}

Again, this works on Mac. It also creates the log file itself on Windows (so it seems it's getting the right log directory from the environment variable on Windows), but then on Windows it fills the log with hundreds of JNDI errors like I posted above.
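
One detail that may be worth checking here (it may or may not be related to the JNDI errors): in Log4j 2, the sys: lookup reads Java system properties, while the env: lookup reads operating system environment variables. The sketch below is only a diagnostic, with a hypothetical class name and a hypothetical "logs" fallback, that prints both values for LOG_DIR so it is clear which one the running JVM actually sees.

```java
public class LogDirResolver {

    // Returns the first non-blank value among: JVM system property,
    // OS environment variable, and the supplied fallback.
    static String resolve(String key, String fallback) {
        String fromProperty = System.getProperty(key); // the value a sys: lookup would see
        if (fromProperty != null && !fromProperty.isBlank()) {
            return fromProperty;
        }
        String fromEnv = System.getenv(key); // the value an env: lookup would see
        if (fromEnv != null && !fromEnv.isBlank()) {
            return fromEnv;
        }
        return fallback;
    }

    public static void main(String[] args) {
        // "logs" is a hypothetical fallback directory used only for illustration.
        System.out.println("system property LOG_DIR = " + System.getProperty("LOG_DIR"));
        System.out.println("environment LOG_DIR     = " + System.getenv("LOG_DIR"));
        System.out.println("resolved                = " + resolve("LOG_DIR", "logs"));
    }
}
```

If the system property line prints null on one machine but not on the other, the two Tomcat installations are probably being started with different -D options.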

I've also tried a hard-coded full path to the log directory on Windows, but I get the same hundreds of JNDI errors.

How can this work perfectly on the Mac but not at all on Windows using the exact Tomcat and WAR file?

Baffling!!!

Suggestions?

Thanks,

-- mike

11 months ago
When I had SpringBoot Logback installed (SpringBoot's default logger), SpringBoot would rotate the log files as desired.

Yet, due to a compatibility problem with a library, I had to switch SpringBoot's logging to log4j.

Logging still works, but the rotating log files no longer do. It's just one log file that gets bigger and bigger.

My pom.xml includes:



My application.properties file includes:



Yet, the logs do not rotate any longer like they did when I used Spring's default logger.... Logback.

I am not sure if I need a separate log4j.properties file (I don't think so) or what exactly is off - that is, why the log files no longer rotate.

Would appreciate suggestions.
11 months ago
I have a Java program in a Rest service that runs a Python program using the "Runtime.getRuntime().exec(cmdArray)". The command array has the name of the python program and the data file it should read.
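
For reference, here is a minimal sketch of this kind of launch written with ProcessBuilder rather than Runtime.exec. It is not the original code: the class name and the paths in the usage message are hypothetical. It passes the interpreter, script, and data file as separate arguments and merges stderr into stdout, so any Python traceback shows up in the captured output.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

public class PythonLauncher {

    // Builds the argument list: interpreter first, then the script, then the data file.
    static List<String> buildCommand(String interpreter, String script, String dataFile) {
        return List.of(interpreter, script, dataFile);
    }

    // Runs the command and returns the exit code followed by the combined stdout/stderr text.
    static String run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout
        Process process = builder.start();
        String output = new String(process.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        int exitCode = process.waitFor();
        return "exit=" + exitCode + System.lineSeparator() + output;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            // Hypothetical example paths, shown only to illustrate the expected arguments.
            System.out.println("usage: PythonLauncher /usr/bin/python3 /path/to/script.py /path/to/data.csv");
            return;
        }
        System.out.println(run(buildCommand(args[0], args[1], args[2])));
    }
}
```

Using an absolute path to the interpreter removes any dependence on the PATH that Tomcat inherits, which can differ between a debug launch and a service launch.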

Now, if I run this program from the command line it works fine -- expected result.

If I run this program in Tomcat debug mode, it also works fine -- expected result.

Yet, when running from the actual WAR file inside Tomcat (not in debug mode), the Python code appears to be calling a similarly named class (in the Java Apache math3 library) which has a bug. The python program result is actually the Java (Apache) result.

-----------
Specifics:
-----------
The Python program computes the Mann-Whitney-U statistical test. I had to use Python since the Apache math3 library has a serious bug with this logic but also since the Apache 4 math library is only in "beta 1" (still beta 1, since Dec, 2022). Can't use "beta 1" for a paying client.

The name of the MannWhitneyU test method call is the same for both Python and Java.

Somehow when running in Tomcat with the Apache math3 libraries present in the "lib" directory, the Python code (which is essentially running at the command line with the "exec()") is calling the Java MannWhitneyU test in the Apache math 3 library instead of the referenced MannWhitneyU in the Python code. I've also tried using a complete path to the python MannWhitneyU test in code, but that didn't make any difference.

If I turn on remote debugging and run the code, then the issue does not occur.

If I run this WAR file on Windows, then this issue does not occur.

Unfortunately, I need the Apache math3 libraries for other calculations so until Apache has a non-buggy replacement, I'm stuck with version 3.

Has anyone run into a similar situation with advice for moving forward?

Thanks in advance,

-- mike
11 months ago
Sorry for any confusion. This PDF was related to a court filing, but it's public (and a regular PDF) so I don't think there is any need for concern. My comment above was just checking with the client first.

Thanks

- mike
11 months ago

Tim Holloway wrote:Have you analyzed the document structure with a PDF editing tool? Aside from Adobe, there are several open-source tools that will break down a PDF into its elements, including OpenOffice, Okular and PDFEdit. Or you could use the pdf2ps utility, although digging through the raw PostScript would probably be more tedious.



That's a good tip about LibreOffice. I didn't know it could edit PDFs.

In any case, clicking on the bates number at the bottom just highlights the box around it. I still don't see any obvious way to remove it.

I've attached a sample of what I see in the footer area of the PDF. The "RE-15" is easily removable, but the bates number (in red), isn't so far...

Thanks,
1 year ago
I have a bates number at the bottom of a PDF that is in the footer area.

I am using Apache PDFBox to overwrite the header and footer with a white image, and this works fine.

The bates number, however, I cannot remove. This number appears to be a clickable object that is perhaps in another plane in the z-direction?  

This is a complex PDF that appears to have been created with iText originally, not that that's too important.

I'm wondering what's going on, since even if I place the bates number right on top of the footer, it is still there after the code writes over the footer with the white PNG image. That again suggests a depth issue, or some other object I don't know how to handle.

Suggestions?

1 year ago

Campbell Ritchie wrote:

Ron McLeod wrote:. . . remove them before using trim.  U+00A0, U+2007, and U+202F are the likely ones. . . .

It is a long time since I used trim() and I might be mistaken, but I believe it doesn't remove such characters as hard space (\u00a0). Try String#strip() instead.




Cool tip, thank you! Will experiment.
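
As a starting point for that experiment, here is a small sketch (the class and method names are made up for illustration). One caveat: String#strip() relies on Character.isWhitespace(), which deliberately excludes the three non-breaking spaces U+00A0, U+2007, and U+202F. So strip() removes characters such as U+2003 (em space) that trim() leaves alone, but the non-breaking spaces still have to be removed explicitly.

```java
public class WhitespaceDemo {

    // Removes the three non-breaking space characters anywhere in the string,
    // then strips any remaining leading and trailing Unicode whitespace.
    static String clean(String s) {
        return s.replaceAll("[\\u00A0\\u2007\\u202F]", "").strip();
    }

    public static void main(String[] args) {
        String emSpace = "\u2003x";  // EM SPACE followed by x
        String hardSpace = "\u00A0x"; // NO-BREAK SPACE followed by x

        System.out.println(emSpace.trim().length());    // 2: trim() only removes characters <= U+0020
        System.out.println(emSpace.strip().length());   // 1: strip() removes the em space
        System.out.println(hardSpace.strip().length()); // 2: strip() does not remove U+00A0
        System.out.println(clean(hardSpace).length());  // 1: the explicit replace removes it
    }
}
```

Note that clean() removes non-breaking spaces from the middle of the string too, which may not be what is wanted for free text.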

-- mike
1 year ago

Ron McLeod wrote:It looks like that file is UTF-8 with a BOM sequence.

I tried this and it did work.  It helps show what the problem is, but is a bit of a hack and a better solution should be used.



Thanks. Weird. I just created two new BBEdit files and they worked, too.

Don't you love Encoding issues?
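
For anyone who hits the same thing, here is a minimal sketch of one way to handle it (not Ron's code, and the class name is hypothetical). Java's UTF-8 decoder keeps the BOM as the character U+FEFF at the start of the first line, and neither trim() nor strip() treats that character as whitespace, so it has to be removed explicitly before trimming.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class BomAwareReader {

    private static final String BOM = "\uFEFF";

    // Drops a single leading byte order mark, if present.
    static String stripBom(String line) {
        return line.startsWith(BOM) ? line.substring(1) : line;
    }

    // Reads all lines as UTF-8, removes a BOM from the first line, then trims every line.
    static List<String> readTrimmedLines(Path file) throws IOException {
        List<String> lines = new ArrayList<>(Files.readAllLines(file, StandardCharsets.UTF_8));
        if (!lines.isEmpty()) {
            lines.set(0, stripBom(lines.get(0)));
        }
        lines.replaceAll(String::trim);
        return lines;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: BomAwareReader <path to text file>");
            return;
        }
        readTrimmedLines(Path.of(args[0])).forEach(System.out::println);
    }
}
```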

Thanks very much.

-- mike
1 year ago

Ron McLeod wrote:Can you share a file (attachment to post, not text in post) which is problematic?



Sure, attached.

Just set the path to the text file (list1.txt) on your system, set a breakpoint on the SOP and note that no trimming occurred of the first value.

Look forward to hearing your results.

Thanks Ron.

-- mike
1 year ago

Ron McLeod wrote:If you can determine which characters are causing you grief, you could remove them before using trim.  U+00A0, U+2007, and U+202F are the likely ones.



That's a good idea, I'll investigate ... but here's the thing... this should work (famous last words, I know).

In BBEdit, I created a simple text file:

   1
2
3
4

In a standalone Java program, the file should be read as UTF-8, so why the Unicode characters?



Putting the break point on the sop, I get:

0 = "    1"
1 = "2"
2 = "3"
3 = "4"

WTF?
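
A quick way to see what is actually in that first value is to dump its code points, since an invisible character will not show up in the debugger's string view. This is only a diagnostic sketch with a hypothetical class name; the sample string in main assumes an invisible U+FEFF in front of the spaces, purely for illustration.

```java
public class CodePointDump {

    // Renders each code point of the string as U+XXXX, separated by spaces.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed sample: an invisible U+FEFF, one space, and the digit 1.
        System.out.println(describe("\uFEFF 1")); // prints "U+FEFF U+0020 U+0031"
    }
}
```

Any code point above U+0020 at the front of the string explains why trim() stops there and leaves the spaces behind it in place.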
1 year ago