Hi!
I'm having an issue: it appears you cannot remove an element from a PDF with PDFBox, but I need to do that, or something equivalent.
Not too long ago I found out about these really cool things called Optional Content Groups. That's basically the PDF specification's name (which PDFBox also uses) for a 'layer', and I actually like that name better, because these layers don't work the way they do in Photoshop or whatever. Their primary purpose isn't layered display; in display terms, one of these 'layers' can hold objects that overlap each other, so they aren't real 'layers' so much as things that *contain* real 'layers'. What they really are is "optional content" groups that you can use to say "Hey, I don't want to display this. Turn this off."
When you do that, though, it doesn't actually remove the content from the document; it just adds a flag that says "Don't show this." So I have a situation where I need to selectively hide images in a PDF and save it so people can view it in-browser.
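To make that concrete, here is a minimal sketch of how I understand it. This is not PDFBox code. As far as I can tell, content belonging to a group is wrapped in the page's content stream between "/OC /SomeName BDC" and a matching "EMC", and the show/hide flag lives elsewhere in the document. A viewer that ignores the flag draws everything in between, so really removing the content would mean cutting out that whole section. The class and method names, the plain-string token model, and the names MC0, MC1, Im1 and Im2 are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OptionalContentStripper {

    /**
     * Returns a copy of the tokens without the marked-content section(s)
     * opened by "/OC /<propertyName> BDC", up to the matching "EMC".
     * Tokens are modeled as plain strings (an assumption for this sketch).
     */
    public static List<String> stripGroup(List<String> tokens, String propertyName) {
        List<String> out = new ArrayList<>();
        int depth = 0; // > 0 while we are inside the section being removed
        for (int i = 0; i < tokens.size(); i++) {
            String t = tokens.get(i);
            if (depth > 0) {
                // Track nested marked-content sections so we stop at the right EMC.
                if (t.equals("BDC") || t.equals("BMC")) {
                    depth++;
                } else if (t.equals("EMC")) {
                    depth--;
                }
                continue; // drop everything inside the section, including its EMC
            }
            boolean startsTarget = t.equals("/OC")
                    && i + 2 < tokens.size()
                    && tokens.get(i + 1).equals("/" + propertyName)
                    && tokens.get(i + 2).equals("BDC");
            if (startsTarget) {
                depth = 1;
                i += 2; // skip the property name and the BDC operator as well
                continue;
            }
            out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> page = Arrays.asList(
                "/OC", "/MC0", "BDC", "q", "/Im1", "Do", "Q", "EMC",
                "/OC", "/MC1", "BDC", "q", "/Im2", "Do", "Q", "EMC");
        // Only the section tagged MC0 is cut; the MC1 section is kept as-is.
        System.out.println(stripGroup(page, "MC0"));
    }
}
```

In a real document the tokens would have to come from parsing the page's content stream rather than from a hand-written list, and the filtered result would have to be written back to the page before saving.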
So here's the problem: it seems that all of the most commonly used browsers come with a default PDF viewer that has NO support for PDF layers, which is incredibly inconvenient and kind of irritating. That seems like a pretty important part of the PDF specification; I don't know how a browser that 'views PDFs' can just ignore it. So they support "almost" all of the PDF specification. On a related note, there's "almost" enough oxygen dissolved in ocean water for me to breathe it.
So I have some options... In order of most to least terrible:
1) I could just add a note that tells people "Hey, this probably won't display right in your browser, download it and open it with Acrobat Reader. It's free." But I don't want to do that for many, many reasons.
2) I can *add* images with PDFBox, so I could just add a little white square on top of each thing I need to hide, but that is a kludge and I do not want to do that. Also, I'd have to know exactly what positions to add them at. I also think it would be slow; it would have to add several images before saving every time someone asked us to generate a PDF.
3) I could put text fields with non-transparent backgrounds over the images I might need to hide, then populate a field with a few spaces when I want to hide a given image. That is also a kludge, but it isn't as bad a kludge and it would be less resource-intensive.
4) I could try to find a way to remove the elements from the document entirely; not just mark them as hidden, but kill them. I have actually found the object in the document's COSObject list and removed it, and I've also removed it from the OptionalContentGroups list and from the COSObject dictionary, but none of that has worked: every time, it stubbornly still appears on the page after the document is saved. This is the solution I would prefer, but I can't figure out how to make it happen. Has anyone else run into this sort of thing before? I found a few records of people who wanted *all* images removed or extracted, but I don't want them all gone, and it isn't the image *resources* I care about, it's the PDF elements that use them...
On second thought though, since the images aren't really being referenced at their original locations prior to importation, it's possible that they're all there as separate image objects in the document resources... I'm going to try combing through that and seeing if I can eliminate it from there, then I'll head back here and report on whether that worked.
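In case it helps anyone answering, here is a rough sketch of what I think "removing the element" has to mean. My understanding is that a page draws an image with a "/Name Do" operator pair in its content stream, where the name points at an entry in the page resources. That would explain why deleting the object elsewhere doesn't stop the page from drawing it. The sketch below is plain Java, not PDFBox: tokens are modeled as strings, and the class name, method name, and the Im1/Im2 names are invented for illustration. With PDFBox I believe the real tokens would come from PDFStreamParser and be written back with ContentStreamWriter, but I haven't verified that end to end.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

public class ContentStreamFilter {

    /**
     * Returns a copy of the token list with every "/Name Do" pair removed
     * when Name is in namesToDrop. All other tokens are kept in order.
     */
    public static List<String> dropImageDraws(List<String> tokens, Set<String> namesToDrop) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            if (token.equals("Do") && !out.isEmpty()) {
                // The operand of Do is the name token immediately before it.
                String operand = out.get(out.size() - 1);
                if (operand.startsWith("/") && namesToDrop.contains(operand.substring(1))) {
                    out.remove(out.size() - 1); // discard the name operand
                    continue;                   // and skip the Do operator itself
                }
            }
            out.add(token);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> page = Arrays.asList(
                "q", "100", "0", "0", "50", "72", "700", "cm", "/Im1", "Do", "Q",
                "q", "100", "0", "0", "50", "72", "600", "cm", "/Im2", "Do", "Q");
        // Im1 is no longer drawn; Im2 and the surrounding graphics state are untouched.
        System.out.println(dropImageDraws(page, Collections.singleton("Im1")));
    }
}
```

This only stops the page from painting the image. The image resource itself would still be in the file unless it is also removed from the resources, which is a separate step.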