The basic problem is that it is very hard to extract style information from a PDF. POI can create Word files, but has no PDF capabilities. PDFBox can extract text from a PDF, but has no easy API to extract style information.
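To give an idea of what "no easy API" means in practice, here is a rough sketch of the usual workaround: subclass PDFTextStripper and look at the TextPosition objects it hands to writeString. This assumes PDFBox 2.x (where PDDocument.load exists); the class name FontDumper and the describe helper are made up for illustration. It only reports the font name and size of the first character in each run, which is a long way from full style information.

```java
import java.io.File;
import java.io.IOException;
import java.util.List;
import java.util.Locale;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;
import org.apache.pdfbox.text.TextPosition;

// Hypothetical class: prints each run of text prefixed with its font name and size.
public class FontDumper extends PDFTextStripper {

    public FontDumper() throws IOException {
        super();
    }

    // Hypothetical helper: formats one run of text together with its font details.
    static String describe(String text, String fontName, float sizePt) {
        return String.format(Locale.ROOT, "[%s %.1fpt] %s", fontName, sizePt, text);
    }

    @Override
    protected void writeString(String text, List<TextPosition> positions) throws IOException {
        if (positions.isEmpty()) {
            super.writeString(text, positions);
            return;
        }
        // Assumption: the first character's font is representative of the whole run.
        TextPosition first = positions.get(0);
        writeString(describe(text, first.getFont().getName(), first.getFontSizeInPt()));
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path of the PDF to inspect.
        try (PDDocument doc = PDDocument.load(new File(args[0]))) {
            System.out.println(new FontDumper().getText(doc));
        }
    }
}
```

Note that the font name is whatever the PDF embeds (often something like a subset-prefixed name), so mapping it back to "bold" or "italic" is guesswork.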
The PDFRenderer project on GitHub can display PDFs, so obviously it knows how to extract styles. You could check what it does and try to do the same. Be prepared for much work.
The bottom line is that this will be a lot of work, and I predict that you will not get it to work. So let's take a step back and ask: why do you think you need to do it?
You should be able to extract the styles with PDFBox. I have used it before to render a page to a buffered image and show it in an ImageView; the same logic could be used to place the buffered image in a Word file.
I have used POI before to create Excel sheets that contained styling and images, although there I believe I had to create the style myself; it wasn't copied automatically.
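The render-to-image idea could look roughly like the sketch below. It assumes PDFBox 2.x and POI's XWPF classes on the classpath; the class name PdfPagesToDocx, the scaledHeightPt helper, the 150 DPI setting and the 450 pt target width are all arbitrary choices for illustration. Keep in mind that the result is a Word file full of pictures, so the text is not editable.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;

import javax.imageio.ImageIO;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.ImageType;
import org.apache.pdfbox.rendering.PDFRenderer;
import org.apache.poi.util.Units;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFRun;

// Hypothetical class: renders every PDF page to a PNG and puts one picture per paragraph.
public class PdfPagesToDocx {

    // Hypothetical helper: height (in points) that keeps the image's aspect ratio.
    static double scaledHeightPt(int widthPx, int heightPx, double targetWidthPt) {
        return targetWidthPt * heightPx / widthPx;
    }

    public static void main(String[] args) throws Exception {
        double targetWidthPt = 450; // assumed usable width of the Word page, in points
        // args[0] = input PDF, args[1] = output .docx
        try (PDDocument pdf = PDDocument.load(new File(args[0]));
             XWPFDocument docx = new XWPFDocument();
             OutputStream out = new FileOutputStream(args[1])) {
            PDFRenderer renderer = new PDFRenderer(pdf);
            for (int i = 0; i < pdf.getNumberOfPages(); i++) {
                BufferedImage page = renderer.renderImageWithDPI(i, 150, ImageType.RGB);
                ByteArrayOutputStream png = new ByteArrayOutputStream();
                ImageIO.write(page, "png", png);

                XWPFRun run = docx.createParagraph().createRun();
                double heightPt = scaledHeightPt(page.getWidth(), page.getHeight(), targetWidthPt);
                run.addPicture(new ByteArrayInputStream(png.toByteArray()),
                        XWPFDocument.PICTURE_TYPE_PNG, "page-" + (i + 1) + ".png",
                        Units.toEMU(targetWidthPt), Units.toEMU(heightPt));
            }
            docx.write(out);
        }
    }
}
```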
It's good to be able to use something, it's better to understand how it works.
You have two different kinds of document format here: Word is a word-processing format, PDF is a page-layout format.
They may look a lot alike, but there's a very big difference.
In a PDF, the page layout is fixed (mostly - there's a format called "reflowable PDF"). The sizes and positions of everything are quite firmly nailed down onto each page of the document.
In a Word document, the page layout is more fluid, as anyone who's taken a Word document and moved it to a different computer - or even opened it with a different word-processor such as LibreOffice - can attest. Paragraphs slide around from page to page. Fonts don't always match (this used to be a major problem before FreeType).
So at a minimum, expect to lose some things when converting.
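To make the fixed-versus-fluid difference concrete, here is a small sketch using only the Java standard library. It models a PDF page as a bag of text fragments with coordinates (a simplified, made-up model, not any library's API) and shows the kind of guesswork a converter needs just to get back to lines of text: fragments whose y coordinates are within a tolerance are assumed to share a line. The names LineGrouper, Fragment and toLines are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: rebuild text lines from positioned fragments.
public class LineGrouper {

    // A piece of text nailed to a position on the page (y assumed to grow downward).
    static final class Fragment {
        final double x;
        final double y;
        final String text;

        Fragment(double x, double y, String text) {
            this.x = x;
            this.y = y;
            this.text = text;
        }
    }

    // Groups fragments into lines: a fragment whose y differs from the line's first
    // fragment by more than yTolerance starts a new line.
    static List<String> toLines(List<Fragment> fragments, double yTolerance) {
        List<Fragment> sorted = new ArrayList<>(fragments);
        sorted.sort(Comparator.comparingDouble((Fragment f) -> f.y));

        List<List<Fragment>> lines = new ArrayList<>();
        double lineY = 0;
        for (Fragment f : sorted) {
            if (lines.isEmpty() || Math.abs(f.y - lineY) > yTolerance) {
                lines.add(new ArrayList<>());
                lineY = f.y;
            }
            lines.get(lines.size() - 1).add(f);
        }

        List<String> result = new ArrayList<>();
        for (List<Fragment> line : lines) {
            // Within a line, read left to right.
            line.sort(Comparator.comparingDouble((Fragment f) -> f.x));
            StringBuilder sb = new StringBuilder();
            for (Fragment f : line) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(f.text);
            }
            result.add(sb.toString());
        }
        return result;
    }
}
```

Even this toy version has to pick a tolerance, and it knows nothing about columns, paragraphs, tables or headers, which is where the "expect to lose some things" part comes from.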
A PDF is a set of metadata combined with a series of page-drawing commands derived from PostScript. A Word document can be represented using Rich Text Format in a plain text file, in the traditional binary .doc format, or in the XML-based .docx format. The underlying document model is much the same in each case, so mostly the notation varies.
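As a small illustration of the "plain text file" point, the sketch below writes a minimal RTF document using nothing but the Java standard library. The class and method names (RtfSketch, escape, bold, document) are made up, and it assumes ASCII-only text; real RTF needs extra escapes for other characters.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical sketch: build a tiny RTF document as an ordinary string.
public class RtfSketch {

    // Backslash and braces are RTF syntax, so they must be escaped in text.
    static String escape(String text) {
        return text.replace("\\", "\\\\").replace("{", "\\{").replace("}", "\\}");
    }

    // A group with the \b control word makes its content bold.
    static String bold(String text) {
        return "{\\b " + escape(text) + "}";
    }

    // Wraps already-formatted body content in a minimal RTF header with one font.
    static String document(String body) {
        return "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Times New Roman;}}\\f0\\fs24 "
                + body + "\\par}";
    }

    public static void main(String[] args) throws IOException {
        String rtf = document(escape("Plain and ") + bold("bold") + escape(" text."));
        // The result is plain ASCII; a word processor should open it as formatted text.
        Files.write(Paths.get("sample.rtf"), rtf.getBytes(StandardCharsets.US_ASCII));
    }
}
```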
There are some websites that claim to be able to convert PDFs or PostScript to Word format. Personally, I prefer something I can run in-house, for security reasons. But I haven't been able to actually find anything like that.
Linux has a very rich set of document-processing tools, so it's possible I could pipe a few of them together to do what you want, but no immediate solution comes to mind.