
Read RTF file, Replace Placeholder with Text and Save Back to File System

 
Hitesh Patel Patel
Greenhorn
Posts: 28
Hi there,

I need your help & here is my requirement.

1. Read RTF file
2. Find a placeholder, e.g. {REPLACE_ME}, and replace it with some text, e.g. " I AM REPLACED ".
3. Save RTF back to File System.

I looked into Apache POI but it seems it supports only DOC and DOCX. Can anyone advise how this can be achieved?

Thanks - Hitesh
 
Tim Moores
Saloon Keeper
Posts: 4035
One advantage of RTF is that it's plain text, so you can use the Reader/Writer classes in the java.io package, and perform the replacement with the String class. No other libraries are needed (and yes, POI is for doc and docx files).
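
A minimal sketch of that approach, assuming the placeholder sits in the file as one contiguous run of characters. The class name `RtfPlaceholderReplacer` and the command-line arguments are hypothetical. One thing to watch: RTF uses braces as group delimiters, so a writer normally stores a literal `{` or `}` as `\{` or `\}`. The sketch therefore looks for the escaped form first and falls back to the bare form. ISO-8859-1 is used only because it maps every byte to a char and back unchanged.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class RtfPlaceholderReplacer {

    /** Replaces every occurrence of {name} (escaped or bare) in raw RTF source. */
    public static String replacePlaceholder(String rtf, String name, String replacement) {
        // Literal braces are usually written as \{ and \} inside an RTF file.
        String escaped = "\\{" + name + "\\}";
        String bare = "{" + name + "}";
        if (rtf.contains(escaped)) {
            return rtf.replace(escaped, replacement);
        }
        return rtf.replace(bare, replacement);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: RtfPlaceholderReplacer <input.rtf> <output.rtf>");
            return;
        }
        Path in = Paths.get(args[0]);
        Path out = Paths.get(args[1]);

        // Read the whole file as text without altering any byte values.
        String rtf = new String(Files.readAllBytes(in), StandardCharsets.ISO_8859_1);
        String result = replacePlaceholder(rtf, "REPLACE_ME", " I AM REPLACED ");
        Files.write(out, result.getBytes(StandardCharsets.ISO_8859_1));
    }
}
```

`String.replace` does a literal (non-regex) substitution, so the braces and backslashes need no extra quoting.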
 
Norm Radder
Rancher
Posts: 2240
An RTF file may be composed of ASCII text, but it does contain control fields that need to be parsed. Look at an RTF file in an editor that does not do any parsing to see what I mean.
 
J. Kevin Robbins
Bartender
Posts: 1801
I once had to write a program to manipulate RTF files. I used a class called RTFEditorKit. It's part of Swing, but I wrote a console app with it, not a Swing app.

Maybe that will help you.
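
For reference, a rough sketch of how `javax.swing.text.rtf.RTFEditorKit` could be used from a console program. The class name `RtfKitReplacer` and its helper methods are hypothetical. Bear in mind that the kit understands only a subset of RTF, so formatting it does not recognize may not survive the round trip.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.swing.text.AttributeSet;
import javax.swing.text.BadLocationException;
import javax.swing.text.DefaultStyledDocument;
import javax.swing.text.rtf.RTFEditorKit;

public class RtfKitReplacer {

    /** Parses RTF bytes into a styled document. */
    private static DefaultStyledDocument parse(RTFEditorKit kit, byte[] rtf)
            throws IOException, BadLocationException {
        DefaultStyledDocument doc = new DefaultStyledDocument();
        kit.read(new ByteArrayInputStream(rtf), doc, 0);
        return doc;
    }

    /** Returns the document text with all RTF control words stripped. */
    public static String plainText(byte[] rtf) throws IOException, BadLocationException {
        DefaultStyledDocument doc = parse(new RTFEditorKit(), rtf);
        return doc.getText(0, doc.getLength());
    }

    /** Replaces each occurrence of the placeholder in the parsed text. */
    public static byte[] replace(byte[] rtf, String placeholder, String replacement)
            throws IOException, BadLocationException {
        RTFEditorKit kit = new RTFEditorKit();
        DefaultStyledDocument doc = parse(kit, rtf);

        // Search the parsed text, where braces appear unescaped.
        int at = doc.getText(0, doc.getLength()).indexOf(placeholder);
        while (at >= 0) {
            // Reuse the character attributes found at the placeholder's start.
            AttributeSet attrs = doc.getCharacterElement(at).getAttributes().copyAttributes();
            doc.remove(at, placeholder.length());
            doc.insertString(at, replacement, attrs);
            at = doc.getText(0, doc.getLength())
                    .indexOf(placeholder, at + replacement.length());
        }

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        kit.write(out, doc, 0, doc.getLength());
        return out.toByteArray();
    }
}
```

Because the search runs on the parsed text, the placeholder is passed as `{REPLACE_ME}` with plain braces, and it is found even if its characters carry different formatting in the file.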
 
Tim Holloway
Saloon Keeper
Posts: 18799
Norm Radder wrote: An RTF file may be composed of ASCII text, but it does contain control fields that need to be parsed. Look at an RTF file in an editor that does not do any parsing to see what I mean.


That's overkill here. The requirement is to do simple text content substitution. Nothing was said about reformatting the document or even about adjusting its appearance to allow for the differences in real estate occupied by the text, and since MS-WORD and its relatives are word processors and not page layout programs, they re-arrange the page based on content except as explicitly instructed.

Absent any additional qualifications, I could accomplish this task with no programming whatsoever using the Unix/Linux sed utility program. The RTF does not need to be parsed because it's all text itself and therefore doesn't have to be broken down into its components to find the text to be replaced.
 
Norm Radder
Rancher
Posts: 2240
Here are the contents of an RTF file that has formatting in it. The text is intermingled with format controls:

{\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\fswiss\fcharset0 Arial;}{\f1\fswiss\fprq2\fcharset0 Arial;}{\f2\fswiss\fprq2\fcharset0 Arial Narrow;}}
{\colortbl ;\red255\green0\blue0;\red0\green0\blue0;}
{\*\generator Msftedit 5.41.21.2510;}\viewkind4\uc1\pard\cf1\f0\fs20 This\cf0 is\cf2\b\f1 some\f2\fs40 \cf0\b0 text\f0\fs20\par
}


How would the text: "This is some" be found for a simple edit to change it?
 
Tim Moores
Saloon Keeper
Posts: 4035
The task is not general text substitution; the task is to replace certain predefined placeholders. That the remainder of the RTF is full of control structures is not relevant in that context.
 
Norm Radder
Rancher
Posts: 2240
Whoops. I missed that. I guess there wouldn't be a problem if the text being replaced was a contiguous string of characters.
 
Tim Holloway
Saloon Keeper
Posts: 18799
As it happens, "This is some" in the sample RTF is NOT plain text, but text that shifts between fonts and colors within the selection to be replaced. That would not be the case if a simple placeholder (for example: "{REPLACE_ME}") was the text to be substituted.
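
To make the difference concrete, here is a small sketch that counts how often a search string appears contiguously in raw RTF. The class `PlaceholderCheck` and its method are hypothetical, and the sample string is the body line from the RTF posted earlier in the thread.

```java
public class PlaceholderCheck {

    /** Counts non-overlapping, contiguous occurrences of needle in the raw RTF. */
    public static int countOccurrences(String rtf, String needle) {
        if (needle.isEmpty()) {
            throw new IllegalArgumentException("needle must not be empty");
        }
        int count = 0;
        int at = rtf.indexOf(needle);
        while (at >= 0) {
            count++;
            at = rtf.indexOf(needle, at + needle.length());
        }
        return count;
    }

    public static void main(String[] args) {
        // Body line from the sample RTF: the words are separated by control words.
        String sample = "\\cf1\\f0\\fs20 This\\cf0 is\\cf2\\b\\f1 some"
                + "\\f2\\fs40 \\cf0\\b0 text\\f0\\fs20\\par";
        System.out.println(countOccurrences(sample, "This is some"));

        // A placeholder typed in one go is stored as a single run.
        String template = "\\f0\\fs20 Dear \\{REPLACE_ME\\},\\par";
        System.out.println(countOccurrences(template, "\\{REPLACE_ME\\}"));
    }
}
```

A count of zero before replacing is a useful warning sign: either the placeholder is missing or formatting codes have split it, and a plain string replace would silently do nothing.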

Replacing the placeholder with text that itself contains RTF directives is not a problem, as long as you do it in a way that would result in valid RTF after replacement.
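
When the replacement is ordinary text rather than RTF markup, one way to keep the result valid is to escape it first. The sketch below is an assumption about what such a helper might look like (`RtfEscaper` is a hypothetical name). It escapes backslashes and braces, turns newlines into `\line`, and writes non-ASCII characters as `\uN?`, where N is the signed 16-bit code unit and `?` is the single fallback character that readers without Unicode support display.

```java
public class RtfEscaper {

    /** Escapes plain text so it can be inserted into RTF as literal content. */
    public static String escape(String text) {
        StringBuilder sb = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\\' || c == '{' || c == '}') {
                // These three characters have structural meaning in RTF.
                sb.append('\\').append(c);
            } else if (c == '\n') {
                // Raw line breaks are ignored by RTF readers; \line forces one.
                sb.append("\\line ");
            } else if (c < 0x80) {
                sb.append(c);
            } else {
                // \uN expects a signed 16-bit decimal, followed by one fallback char.
                sb.append("\\u").append((int) (short) c).append('?');
            }
        }
        return sb.toString();
    }
}
```

The escaped string can then be handed to whatever replacement routine is in use. The single `?` fallback assumes the document uses `\uc1`, which is the RTF default and also what the sample file in this thread declares.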

Trying to generically replace a given sequence of text that internally shifts RTF attributes would be MUCH stickier, however. You could apply brute force and regexes, but basically once you've declared a format free-for-all, even parsing out the RTF wouldn't be a one-size-fits-all solution, just a way of arranging data in the hopes that you could find the magic text sequence pieces amongst the general detritus.
 