Generating 700,000 PDFs in one night?

 
Nick Delauney
Ranch Hand
Posts: 43
Hello,

We have a project coming up in my financial firm requiring we generate 700,000 PDFs in one night. Each PDF will be 7-10 pages and will contain charts and graphs.

We are leaning towards XSL-FO and some type of SVG graphics package. Does anyone with experience in this kind of thing have advice or links that could help my search for a solution?

Any help is appreciated, Thanks,
 
Ulf Dittmer
Rancher
Posts: 42972
In my experience, FOP (the standard Java implementation of XSL-FO) is not the speediest package, and combining it with SVG is probably not going to make it any faster. You might want to take a look at the iText library as well. It can include images that you have created elsewhere, which may or may not suffice for your purposes.

You'd need to create roughly 12 to 24 files each second (700,000 files in a 16 to 8 hour night), so performance is an issue. I'm not sure either FOP or iText can be that fast for documents of the size you mention that include graphics/images, so I'd do some timings first.
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13078
I did a project for a client transforming XML to PDF. The documents ended up being 200-300 pages in some cases, taking 20-30 seconds each on my slow machine. There were some embedded graphics in JPG or GIF format; I didn't try SVG. PDF file sizes were over 1.5 MB.

If your data can easily be created in XML format, I would suggest giving XSL-FO a try.

Since this runs overnight, you may have spare processing power sitting on people's desktops that could be used for a "render farm" type of operation if it proves too much for one machine.
Bill
[ July 18, 2006: Message edited by: William Brogden ]
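
The "render farm" idea can be tried on a single multi-core box before involving desktops. What follows is only a minimal sketch, assuming each document can be rendered independently: it fans document IDs out over a fixed thread pool. The class name `ParallelRender` and the `renderOne` callback are hypothetical; `renderOne` stands in for whatever FOP or iText call actually produces a PDF.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntConsumer;

public class ParallelRender {

    /**
     * Runs renderOne for every id in [0, documentCount) on a fixed-size pool
     * and returns how many documents completed without throwing.
     * renderOne is a hypothetical placeholder for the real PDF generation call.
     */
    public static int renderAll(int documentCount, int workers, IntConsumer renderOne) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int id = 0; id < documentCount; id++) {
                final int docId = id;
                pending.add(pool.submit(() -> renderOne.accept(docId)));
            }
            int completed = 0;
            for (Future<?> f : pending) {
                try {
                    f.get();          // waits for this document to finish
                    completed++;
                } catch (Exception e) {
                    // One failed document should not stop the batch;
                    // a real job would log the id and retry later.
                }
            }
            return completed;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int workers = Runtime.getRuntime().availableProcessors();
        int done = renderAll(1_000, workers, id -> { /* render document id here */ });
        System.out.println(done + " documents rendered on " + workers + " workers");
    }
}
```

Spreading the same work over several machines would additionally need some way to partition the ID range per machine, which this sketch does not cover.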
 
Ernest Friedman-Hill
author and iconoclast
Sheriff
Posts: 24217
38
Originally posted by William Brogden:
if it proves too much for one machine.


As indeed it probably will. Even if we define "overnight" as between 5 PM and 9 AM, that's 57,600 seconds; to do 700,000 documents in that time, you'd have to do over 12 documents per second the entire time, which sounds like a very ambitious goal for a single machine.
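
The arithmetic is easy to redo for other definitions of "overnight". A small sketch (the class and method names are made up for illustration, and the 8 and 12 hour windows are just example values):

```java
public class RequiredRate {

    /** Documents per second needed to finish `documents` within `hours`. */
    public static double docsPerSecond(long documents, double hours) {
        return documents / (hours * 3600.0);
    }

    public static void main(String[] args) {
        long documents = 700_000;
        // 16 h is the 5 PM to 9 AM window; the shorter windows are examples.
        for (double hours : new double[] {8, 12, 16}) {
            System.out.printf("%2.0f h window: %.1f docs/s%n",
                    hours, docsPerSecond(documents, hours));
        }
    }
}
```

For the 16 hour window this gives about 12.2 documents per second; an 8 hour window doubles that to about 24.3.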
 
Jeroen T Wenting
Ranch Hand
Posts: 1847
If the documents are mostly static data, a lot can be pulled into the XSL file and precompiled, speeding up the generation process enormously.
Still, it sounds like a job that will require at least a multi-CPU machine with heaps of RAM and fast connections to your database (or EJB server farm or whatever).
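
One way to get the "precompiled" part with the standard JAXP API is to compile the stylesheet once into a `Templates` object and create a cheap `Transformer` from it per document. The sketch below is only an illustration: the tiny stylesheet, the `client` element, and the class name are invented, and it emits plain text rather than real XSL-FO so that it runs without any FO processor on the classpath.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class PrecompiledStylesheet {

    // Toy stylesheet standing in for a real XSL-FO stylesheet (assumption).
    private static final String XSL =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/client'>Statement for <xsl:value-of select='@name'/></xsl:template>"
        + "</xsl:stylesheet>";

    // Compiled once; Templates instances are safe to share between threads.
    private final Templates compiled;

    public PrecompiledStylesheet() {
        try {
            compiled = TransformerFactory.newInstance()
                    .newTemplates(new StreamSource(new StringReader(XSL)));
        } catch (TransformerException e) {
            throw new IllegalStateException("stylesheet did not compile", e);
        }
    }

    /** Applies the precompiled stylesheet to one input document. */
    public String apply(String xml) {
        try {
            // Transformers are not thread-safe, so create one per document.
            Transformer t = compiled.newTransformer();
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transform failed", e);
        }
    }

    public static void main(String[] args) {
        PrecompiledStylesheet sheet = new PrecompiledStylesheet();
        System.out.println(sheet.apply("<client name='Acme'/>"));
    }
}
```

In a real pipeline the transform result would be FO markup handed to the FO processor instead of a string; how much this saves depends on how large the stylesheet is relative to the rendering work.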
 
Stefan Wagner
Ranch Hand
Posts: 1923
I'm reading data from a CSV file, filtering it, doing some calculations, and writing to PDF with iText.
For a 25-page output with about 60 sections containing 60 tables and diagrams, I need 3 s per document.
The diagram is very simple, made with pngencoder: http://catcode.com/pngencoder/

The user may choose more elaborate diagrams with JFreeChart: http://www.jfree.org/jfreechart/index.php

That leads to 8 s per PDF.
My machine is a 2 GHz Pentium M.

Since I don't create multiple documents in one run, you may save some time there if you can reuse parts of the document.

You could try generating 700,000 "Foo" PDFs containing just the word 'Foo' to find the lower limit for your machine.
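
A rough way to turn such a baseline run into a sizing estimate is sketched below. Everything here is illustrative: the class and method names are made up, `generate` is a placeholder for the real "Foo" PDF call, and the worker estimate assumes perfectly linear scaling, which real hardware will not quite deliver.

```java
import java.util.function.IntConsumer;

public class BaselineTiming {

    /** Average wall-clock seconds per document over `samples` calls to `generate`. */
    public static double secondsPerDocument(int samples, IntConsumer generate) {
        long start = System.nanoTime();
        for (int i = 0; i < samples; i++) {
            generate.accept(i);
        }
        return (System.nanoTime() - start) / 1e9 / samples;
    }

    /**
     * Parallel workers needed to finish within the window, rounded up.
     * Assumes ideal linear scaling across workers (an optimistic assumption).
     */
    public static long workersNeeded(long documents, double secondsPerDoc, double windowHours) {
        return (long) Math.ceil(documents * secondsPerDoc / (windowHours * 3600.0));
    }

    public static void main(String[] args) {
        // Placeholder generator; swap in the code that writes a one-word PDF.
        double perDoc = secondsPerDocument(1_000, i -> { /* write Foo PDF number i */ });
        System.out.printf("measured %.6f s per document%n", perDoc);

        // The 3 s and 8 s figures quoted above, against a 16 hour night.
        System.out.println("at 3 s/doc: " + workersNeeded(700_000, 3.0, 16) + " workers");
        System.out.println("at 8 s/doc: " + workersNeeded(700_000, 8.0, 16) + " workers");
    }
}
```

With the 3 s and 8 s timings and a 16 hour window, this comes out to 37 and 98 workers respectively, on hardware comparable to the one those timings came from.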
 