
manipulating a string

 
jos graber
Greenhorn
Posts: 4
Hey all,

I have strings in forms like s = "(S (NP (DT This)) (VP (VBZ is) (NP (DT an) (JJ easy) (NN sentence))))"

Programmatically, I need every word that is immediately followed by ')' to be wrapped in an extra pair of brackets, such that

s = "(S (NP (DT (This))) (VP (VBZ (is)) (NP (DT (an)) (JJ (easy)) (NN (sentence)))))"

Note that words like "This" and "is" are now enclosed in brackets.

I tried to write a program that searches for a letter followed by ')' and puts the word in brackets, but arrays don't allow insertions in the middle.

Is there any way this can be achieved?

Thanks

Jos
 
Campbell Ritchie
Marshal
Posts: 56541
172
Welcome to the Ranch

What sort of program is that? It looks like a parser for English sentences.
Why are you putting the letters in an array? Why do you not have classes to represent the words, e.g. Word, Noun, Verb? Can you give those classes a prependBrackets method? Can you make the top of the inheritance tree (Word) an interface?
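To make the suggestion above concrete, here is a minimal sketch of what such a hierarchy might look like. Everything in it is hypothetical: the thread never shows the original poster's classes, so `Word`, `Noun`, `Verb` and the exact behaviour of `prependBrackets` are assumptions made purely for illustration.

```java
// Hypothetical types, not taken from the original poster's program.
interface Word {
    String text();

    // Assumed behaviour: wrap the word in one extra pair of brackets,
    // e.g. "This" becomes "(This)".
    default String prependBrackets() {
        return "(" + text() + ")";
    }
}

final class Noun implements Word {
    private final String text;

    Noun(String text) {
        this.text = text;
    }

    @Override
    public String text() {
        return text;
    }
}

final class Verb implements Word {
    private final String text;

    Verb(String text) {
        this.text = text;
    }

    @Override
    public String text() {
        return text;
    }
}
```

With types like these, the bracketing lives with the word itself, so no character array needs to be edited in place.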
 
Tim Cooke
Marshal
Posts: 4044
239
Clojure IntelliJ IDE Java
Hello Jos. While we don't mind if you cross-post your questions on other sites, we do ask that you let us know that you have done so. Please have a read of the link to see why: BeForthrightWhenCrossPostingToOtherSites.

This question was also posted here
 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
jos graber wrote:Programatically, I need to have words which end with ')' to be in an extra pair of brackets..
...
Is there any way this can be achieved ?

Sure, but you first need to define precisely what a "word" is (for your purposes at least). And don't write one line of Java code until you have a definition that covers EVERY possible eventuality, including nasty ones that might be hard to work out.

So: what is a "word"? Describe it for us - in detail.

You might also want to think about scenarios where you have more than one word, e.g.:
...(DT This is an)...
(if it can happen), and also cases where the input contains invalid information (unless you've been specifically told to assume that it will always be "correct").

Anyone can write code that works when everything is fine; good programmers write code that works when it isn't.
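As one example of guarding against invalid input, a cheap first check is whether the brackets balance at all. The sketch below is only an illustration: the class and method names are made up, and balanced brackets are a necessary condition for a well-formed tree, not a sufficient one.

```java
final class BracketCheck {
    // Returns true when every '(' has a matching ')' and no ')' appears
    // before the '(' it would close.
    static boolean isBalanced(String s) {
        int depth = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
                if (depth < 0) {
                    return false; // closing bracket with nothing open
                }
            }
        }
        return depth == 0; // anything still open means a missing ')'
    }
}
```

Running a check like this before any rewriting means a malformed string is rejected up front rather than silently producing a malformed result.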

HIH

Winston
 
jos graber
Greenhorn
Posts: 4
Hey all,
just consider it a normal string...

By "word" I mean words like "This", "is", "an" and so on..

I managed to put a starting bracket before the "words" such that

S=(S (NP (DT This)) (VP (VBZ is) (NP (DT an) (JJ easy) (NN sentence))))

becomes
S=(S(NP(DT(This))(VP(VBZ(is)(NP(DT(an)(JJ(easy)(NN(sentence)))) // removal of whitespace is all right

The code is :




What I need is a closing bracket after the words such that

S=(S(NP(DT(This)))(VP(VBZ(is))(NP(DT(an))(JJ(easy))(NN(sentence)))))
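One possible way to get both brackets in a single pass is a regular-expression replacement on the original string, rather than patching the half-bracketed one. This is a sketch under a stated assumption, not a definitive answer: it treats a "word" as any run of characters other than whitespace and brackets that is immediately followed by ')'. The class and method names are invented for the example.

```java
final class WrapWords {
    // Assumption: a "word" is a run of non-whitespace, non-bracket characters
    // immediately followed by ')'. Labels such as DT or VBZ are followed by a
    // space in the sample input, so they are left alone.
    static String wrap(String s) {
        // Group 1 captures the word; the replacement re-emits it inside an
        // extra pair of brackets and puts back the ')' that followed it.
        return s.replaceAll("([^\\s()]+)\\)", "($1))");
    }

    // Optional: strip all whitespace afterwards, as in the compact form above.
    static String wrapCompact(String s) {
        return wrap(s).replaceAll("\\s+", "");
    }
}
```

Because `String.replaceAll` builds a new string, nothing has to be inserted into an array. The assumption breaks down for inputs such as a bare `(S)` or a leaf containing several words, so it only suits input that always looks like the sample.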


Thanks
 
jos graber
Greenhorn
Posts: 4
Regarding the cross posting, sorry about that, I should have told the forum.
 
jos graber
Greenhorn
Posts: 4
Winston Gutkowski wrote:

You might also want to think about scenarios where you have more than one word, eg:
...(DT This is an)...
(if it can happen), and also cases where the input contains invalid information (unless you've been specifically told to assume that it will always be "correct").




It is assumed that it will always be "correct".
 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
jos graber wrote:By "word" I mean words like "This", "is", "an" and so on...

Yes, but you need to define it precisely. Clearly, things like "DT" and "VBZ" are not words as far as you're concerned, even though they're alphabetic; so what is it exactly that makes "This" and "is" words? What about hyphenated words? Or punctuation? Or special strings like email addresses or filenames ... ?

You need a precise description of exactly what a "word" is before you can start tackling a problem like this.

Just a few possibilities:
  • Do you have a list of known "labels" ("VP", "S", "NP"... etc)? If so, a word could be any alpha string that is NOT a label.
  • Is a "word" always followed by a ")"? And if so, does the ")" follow it immediately, or can there be whitespace in between?
  • Is a "word" always preceded by a label in the same set of brackets?
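The first possibility in the list above could be sketched roughly as follows. The label set here is an assumption containing only the labels visible in the sample string, and the class name is invented. Any token that is not a known label is treated as a word, which also copes with a leaf holding several words such as `(DT This is an)`.

```java
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LabelAwareWrapper {
    // Assumed label set: just the labels that appear in the thread's example.
    private static final Set<String> LABELS =
            Set.of("S", "NP", "VP", "DT", "VBZ", "JJ", "NN");

    // A token is any run of characters that are neither whitespace nor brackets.
    private static final Pattern TOKEN = Pattern.compile("[^\\s()]+");

    static String wrap(String s) {
        Matcher m = TOKEN.matcher(s);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy brackets and whitespace between tokens unchanged.
            out.append(s, last, m.start());
            String token = m.group();
            if (LABELS.contains(token)) {
                out.append(token); // labels stay as they are
            } else {
                out.append('(').append(token).append(')'); // words get wrapped
            }
            last = m.end();
        }
        out.append(s, last, s.length());
        return out.toString();
    }
}
```

The obvious weakness is a word that happens to spell a label (for example a sentence containing "S"), which is exactly why the other two questions in the list matter.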

Winston