Opening/Parsing mbox file?

 
Bartender
Posts: 1649
17
I've looked for a Java API that handles opening and reading mbox files (the mailbox format used by many email programs), but I haven't really found anything.

Since the mbox format is well documented, is this a case where I'd be just as well off reading the mbox file line by line using Java's file classes?
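
For reference, a line-by-line reader could look roughly like the sketch below. It assumes the common mbox convention that each message begins with a separator line starting with "From " at the start of the file or right after a blank line. The names MboxSplitter, split, and splitFile are made up for illustration, and the sketch does not undo ">From " quoting or handle other mbox variants.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits the lines of an mbox file into individual messages.
final class MboxSplitter {

    // Each returned message is a list of lines and keeps its "From " separator line.
    static List<List<String>> split(List<String> lines) {
        List<List<String>> messages = new ArrayList<>();
        List<String> current = null;
        boolean previousBlank = true; // the start of the file counts as a boundary

        for (String line : lines) {
            // Assumption: a separator is a "From " line preceded by a blank line.
            if (previousBlank && line.startsWith("From ")) {
                current = new ArrayList<>();
                messages.add(current);
            }
            if (current != null) {
                current.add(line); // lines before the first separator are ignored
            }
            previousBlank = line.isEmpty();
        }
        return messages;
    }

    // Reads the whole file into memory; fine for small mailboxes, not for huge ones.
    static List<List<String>> splitFile(Path mbox) throws IOException {
        // ISO-8859-1 maps every byte to a character, so odd bytes never cause a decoding error.
        return split(Files.readAllLines(mbox, StandardCharsets.ISO_8859_1));
    }
}
```

Headers and bodies would still need decoding (MIME, encodings, attachments), which is the part where a real mail library earns its keep.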

-- mike
 
Saloon Keeper
Posts: 5288
143
I'm aware of two options, Mbox-Store and mstor, both of which work with JavaMail. The former requires you to build it using C, though. I haven't used either, so I can't say which one might be easier to use.
 
Mike London
Bartender
Posts: 1649
17
Yep, I saw these too, but was hoping for something a bit more ready to go.

Apache Tika also has an MboxParser class, so I'll test that as well.

Thanks Tim.
 
Saloon Keeper
Posts: 20510
115
Android Eclipse IDE Linux
The docs on Mbox-Store are vague on whether compiling the C code is required if you don't want or need native OS file locking.

From experience, I recommend avoiding anything that requires compiling C code to build it. Not only is C definitely not write-once/run-anywhere, it's not even write-once/compile-anywhere. The example seemed to use a compiler named "C89". I'm not sure if that's a generic compiler reference or maybe a Sun/Solaris/Oracle product.

C builds are not only touchy about the OS platform; they also often break if you don't have the right compiler, the right version of the compiler, or the right development libraries installed, and that's before you even get into tweaking compiler options. I pretty much gave up on Perl's CPAN because of the constant breakage in C components. Python's equivalent is much saner.
 
Mike London
Bartender
Posts: 1649
17

Tim Holloway wrote:The docs on MBox-store are vague on whether the C compiling is required if you don't want/need native-os file locking.

From experience, I recommend avoiding anything that requires compiling C code to build it. Not only is C definitely not write-once/run-anywhere, it's not even write-once/compile-anywhere. The example seemed to use a compiler named "C89". I'm not sure if that's a generic compiler reference or maybe a Sun/Solaris/Oracle product.

Not only are C builds touchy based on the OS platform, but they often break if you don't have the right compiler, right version of the compiler or right installed development libraries and that's even before you get into tweaking compiler options. I pretty well gave up on Perl CPAN because of the constant breakage in C components - Python's equivalent is much saner.



LOL.

In Python, I'm assuming you mean "import mailbox". It's strange that Java doesn't have something like that built in, say, in the JavaMail API (the same goes for machine learning, statistics, and other things Python excels at). I did see a couple of Java code samples that were supposed to work, but neither did.

I suppose in my REST service, I could just run the Python code using something like: Process p = Runtime.getRuntime().exec("python yourapp.py");
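
A slightly more defensive version of that call might use ProcessBuilder so the script's output and exit code can be checked. This is only a sketch: the class name ExternalScript and its Result type are invented for illustration, and "python yourapp.py" is assumed to be on the service's PATH and working directory.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper: runs an external command and captures its output.
final class ExternalScript {

    // Exit code plus everything the process wrote to stdout and stderr.
    static final class Result {
        final int exitCode;
        final String output;

        Result(int exitCode, String output) {
            this.exitCode = exitCode;
            this.output = output;
        }
    }

    static Result run(List<String> command) {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout so one reader drains both

        try {
            Process process = builder.start();
            String output;
            // Read until the process closes its output; this blocks for as long as it keeps running.
            try (InputStream in = process.getInputStream()) {
                output = new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
            int exitCode = process.waitFor();
            return new Result(exitCode, output);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            throw new IllegalStateException("Interrupted while waiting for " + command, e);
        }
    }
}
```

Usage would be along the lines of ExternalScript.run(List.of("python", "yourapp.py")), treating a non-zero exitCode as a failure. There is no timeout here, so a hung script would tie up the request thread; a real service would want one.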

Thanks,

- mike
 
Tim Holloway
Saloon Keeper
Posts: 20510
115
Android Eclipse IDE Linux
Actually, I was making a blanket statement about languages that require C compiled extensions versus those that do not.

I've had so many projects break at random times because they were written in Perl and used CPAN modules. The Python equivalents to CPAN have never given me that kind of grief.

Although, as it happens, I do have a cron job that runs a Python script daily on my mailboxes. It sorts the low-grade stuff from the important stuff and applies different aging levels, so that daily ads get purged after a week but monthly ads stick around longer.
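
The aging idea translates to Java easily enough. The sketch below is not the actual script, just an illustration of per-category retention: the category names and the 60-day figure for monthly ads are assumptions, and only the one-week limit for daily ads comes from the description above.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.EnumMap;
import java.util.Map;

// Hypothetical retention policy: how long each kind of mail is kept before purging.
final class MailAging {

    enum Category { DAILY_AD, MONTHLY_AD, IMPORTANT }

    private static final Map<Category, Duration> RETENTION = new EnumMap<>(Category.class);

    static {
        RETENTION.put(Category.DAILY_AD, Duration.ofDays(7));    // purged after a week
        RETENTION.put(Category.MONTHLY_AD, Duration.ofDays(60)); // assumed value for "longer"
        // IMPORTANT has no entry, which this sketch treats as "never purge".
    }

    // True when the message is older than the retention limit for its category.
    static boolean shouldPurge(Category category, Instant received, Instant now) {
        Duration limit = RETENTION.get(category);
        return limit != null && Duration.between(received, now).compareTo(limit) > 0;
    }
}
```

Passing the current time in as a parameter keeps the check easy to test and lets one run of the job use a single consistent timestamp.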

I also had an OSGi app running in Apache Karaf that scanned emails from a popular job-posting source. The source was keyed to look for "remote work", but the mail scanner had to excise all the ads that actually said "remote work not allowed". So it pulled mail from my inbox, cleaned out the gunk, and put the cleaned mail back.
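
The filtering step itself is simple string matching. A minimal sketch, not the original OSGi code: the class and method names are invented, and it assumes the message bodies have already been extracted as plain text.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical filter: drops job ads that match "remote work" only to rule it out.
final class RemoteWorkFilter {

    private static final String REJECT_PHRASE = "remote work not allowed";

    // Lower-cases the text and collapses whitespace so line wrapping does not hide the phrase.
    static boolean isFalsePositive(String body) {
        String normalized = body.toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
        return normalized.contains(REJECT_PHRASE);
    }

    // Keeps only the bodies that do not contain the reject phrase.
    static List<String> keepGenuine(List<String> bodies) {
        return bodies.stream()
                .filter(body -> !isFalsePositive(body))
                .collect(Collectors.toList());
    }
}
```

Real ads phrase the restriction in many ways ("no remote work", "on-site only"), so a single phrase like this would only catch the most literal cases.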
 
Mike London
Bartender
Posts: 1649
17
Follow-up:

I found that you don't need to build your own mbox storage class, whether by compiling C code or by other means.

There's a Maven dependency for mbox processing that you can add to your project and use to parse mbox files.

- mike
 