Generating 700,000 pdfs in one night?

 
Ranch Hand
Posts: 43
Hello,

We have a project coming up in my financial firm requiring we generate 700,000 PDFs in one night. Each PDF will be 7-10 pages and will contain charts and graphs.

We are leaning towards XSL-FO and some type of SVG graphics package. Does anyone with experience in this kind of thing have advice or links that could help my search for a solution?

Any help is appreciated, Thanks,
 
Rancher
Posts: 43027
In my experience, FOP (the standard Java implementation of XSL-FO) is not the speediest package, and combining it with SVG is probably not going to make it any faster. You might want to take a look at the iText library as well. It can include images that you have created elsewhere, which may or may not suffice for your purposes.

You'd need to create more than 12 files each second (700,000 files in even a 16-hour night), so performance is an issue. I'm not sure either FOP or iText can be that fast for documents of the size you mention that include graphics/images, so I'd do some timings first.
 
author
Posts: 288
The following sites may help:


http://www.jroller.com/page/fate?entry=good_vs_evil

http://www.oreillynet.com/cs/user/view/cs_msg/20299
 
Author and all-around good cowpoke
Posts: 13078
I did a project for a client transforming XML to PDF. The documents ended up being 200-300 pages in some cases, taking 20-30 seconds each on my slow machine. There were some embedded graphics in JPG or GIF format; I didn't try SVG. PDF file sizes were over 1.5 MB.

If your data can easily be created in XML format, I would suggest giving XSL-FO a try.

Since this runs overnight, you may have spare processing power sitting on people's desktops that could be used for a "render farm" type of operation if it proves too much for one machine.
Bill
[ July 18, 2006: Message edited by: William Brogden ]
 
author and iconoclast
Posts: 24203

Originally posted by William Brogden:
if it proves too much for one machine.



As indeed it probably will. Even if we define "overnight" as between 5 PM and 9 AM, that's 57,600 seconds. To produce 700,000 documents in that time, you'd have to generate over 12 documents per second for the entire window, which sounds like a very ambitious goal for a single machine.
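The arithmetic above can be checked with a few lines of Java. This is only a back-of-the-envelope sketch: the class and method names are made up for illustration, and the 16-hour window is the generous "5 PM to 9 AM" assumption from the post.

```java
import java.time.Duration;

/** Back-of-the-envelope throughput check for the overnight batch. */
class RequiredRate {

    /** Documents per second needed to finish `documents` within `window`. */
    static double docsPerSecond(long documents, Duration window) {
        return (double) documents / window.getSeconds();
    }

    public static void main(String[] args) {
        long documents = 700_000;
        // 5 PM to 9 AM, the generous definition of "overnight" (an assumption).
        Duration window = Duration.ofHours(16);
        System.out.printf("window: %d s%n", window.getSeconds());
        System.out.printf("required rate: %.2f docs/s%n",
                docsPerSecond(documents, window));
    }
}
```

With these inputs the required rate works out to a little over 12 documents per second; a shorter window, say 8 hours, doubles it.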
 
Ranch Hand
Posts: 1847
If the documents are mostly static data, a lot can be pulled into the XSL file and precompiled, speeding up the generation process enormously.
But it still sounds like a job that will require at least a multi-CPU machine with heaps of RAM and fast connections to your database (or EJB server farm or whatever).
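One way to read "precompiled" here is the standard JAXP `Templates` object: the stylesheet is compiled once, and each worker thread gets its own cheap `Transformer` from it. The sketch below uses only the JDK. The stylesheet, the `<account>` input, and the class name are hypothetical, and the output is plain text so the example runs on its own; a real job would emit XSL-FO and hand it to a formatter such as FOP.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class PrecompiledXslt {

    // Hypothetical stylesheet; a real one would produce XSL-FO, not text.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/account'>Statement for <xsl:value-of select='@id'/></xsl:template>"
      + "</xsl:stylesheet>";

    /** Compiles the stylesheet once; Templates may be shared across threads. */
    static Templates compile(String xsl) {
        try {
            return TransformerFactory.newInstance()
                    .newTemplates(new StreamSource(new StringReader(xsl)));
        } catch (TransformerException e) {
            throw new IllegalStateException("stylesheet did not compile", e);
        }
    }

    /** Transformer is not thread-safe, so each call creates its own. */
    static String render(Templates templates, String xml) {
        try {
            StringWriter out = new StringWriter();
            templates.newTransformer().transform(
                    new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transform failed", e);
        }
    }

    /** Renders all inputs on a fixed-size pool, preserving input order. */
    static List<String> renderAll(Templates templates, List<String> inputs, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<CompletableFuture<String>> futures = new ArrayList<>();
            for (String xml : inputs) {
                futures.add(CompletableFuture.supplyAsync(() -> render(templates, xml), pool));
            }
            List<String> results = new ArrayList<>();
            for (CompletableFuture<String> f : futures) {
                results.add(f.join());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Templates templates = compile(XSL);
        List<String> inputs = List.of("<account id='1'/>", "<account id='2'/>");
        int threads = Runtime.getRuntime().availableProcessors();
        renderAll(templates, inputs, threads).forEach(System.out::println);
    }
}
```

Sizing the pool to the number of processors is only a starting point; if the workers spend much of their time waiting on the database, a larger pool may do better, and that is worth measuring.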
 
Ranch Hand
Posts: 1923
I'm reading data from a CSV file, filtering it, doing some calculations, and writing the result to PDF with iText.
For a 25-page output with about 60 sections containing 60 tables and diagrams, I need 3 s per document.
The diagrams are very simple, made with pngencoder: http://catcode.com/pngencoder/

The user may choose more elaborate diagrams made with JFreeChart: http://www.jfree.org/jfreechart/index.php

That leads to 8 s per PDF.
My machine is a 2 GHz Pentium M.
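To put those per-document timings in terms of the original problem, here is a rough estimate of how many parallel workers 700,000 documents would need. It assumes the 16-hour window (57,600 s) mentioned earlier in the thread and perfectly linear scaling across workers, which is optimistic; the class and method names are made up for illustration.

```java
/** Rough worker-count estimate from a measured per-document time. */
class WorkerEstimate {

    /** Workers needed if throughput scales linearly (an optimistic assumption). */
    static long workersNeeded(long documents, double secondsPerDoc, long windowSeconds) {
        return (long) Math.ceil(documents * secondsPerDoc / windowSeconds);
    }

    public static void main(String[] args) {
        long documents = 700_000;
        long window = 57_600; // 16 hours, assumed
        System.out.println("at 3 s/doc: " + workersNeeded(documents, 3.0, window));
        System.out.println("at 8 s/doc: " + workersNeeded(documents, 8.0, window));
    }
}
```

On those assumptions, 3 s per document needs 37 workers of that speed and 8 s per document needs 98. The documents in the original question are shorter (7-10 pages), so the real figure may well be lower.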

Since I don't create multiple documents in one run, you may save some time there if you can reuse parts of the document.

You could try generating 700,000 PDFs containing just the word 'Foo' to find the lower limit for your machine.
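A minimal timing harness for that kind of baseline test might look like the following. The `DocumentGenerator` interface is a hypothetical hook, not part of iText or FOP, and the stub in `main` only writes the bytes of "Foo" to memory; the real library call would be plugged in there to get a meaningful number.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class BaselineTiming {

    /** Hypothetical hook: the real iText or FOP call would go behind this. */
    interface DocumentGenerator {
        void generate(int docNumber, OutputStream out) throws IOException;
    }

    /** Runs the generator `count` times and returns measured documents per second. */
    static double measureDocsPerSecond(DocumentGenerator generator, int count) {
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            // In-memory sink, so disk speed is excluded from this baseline.
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            try {
                generator.generate(i, sink);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        long elapsedNanos = Math.max(1, System.nanoTime() - start);
        return count / (elapsedNanos / 1e9);
    }

    public static void main(String[] args) {
        // Stub generator: not a real PDF, just the word "Foo".
        DocumentGenerator stub =
                (n, out) -> out.write("Foo".getBytes(StandardCharsets.UTF_8));
        double rate = measureDocsPerSecond(stub, 100_000);
        System.out.printf("baseline: %.0f docs/s%n", rate);
    }
}
```

A short run like this includes JIT warm-up, so a longer run, or a discarded first pass, should give a steadier figure. Writing to real files instead of memory would add the disk cost back in.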
 