Win a copy of Programming with Types this week in the Angular and TypeScript forum
or The Design of Web APIs in the Web Services forum!
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Liutauras Vilda
  • Bear Bibeault
  • Paul Clapham
  • Jeanne Boyarsky
Sheriffs:
  • Junilu Lacar
  • Knute Snortum
  • Henry Wong
Saloon Keepers:
  • Ron McLeod
  • Tim Moores
  • Stephan van Hulst
  • Tim Holloway
  • Carey Brown
Bartenders:
  • Frits Walraven
  • Joe Ess
  • salvin francis

What's wrong with my code?

 
Greenhorn
Posts: 7
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I need help with this method, which is to count how many times "hi" occurs in a string.

The code below works with "", " ", "hi" or anything with no more than one "hi" but seems to enters an infinite loop with strings that contain more than one "hi" such as "hihi".


 
Sheriff
Posts: 14614
243
Mac Android IntelliJ IDE Eclipse IDE Spring Debian Java Ubuntu Linux
  • Likes 1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Line 12 is your problem. Walk through the code and track the values that i goes through.
 
Marshal
Posts: 66975
255
  • Likes 1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Is there any need to use substring() at all?
 
Ranch Foreman
Posts: 240
5
Eclipse IDE C++ Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
i think the code can be simplified to remove some unnecessary constructs, and that will make it easier to read as well as debug.
fewer IF statements, avoid the break statement, and avoid overuse of comments.
here's a quick stab at rewriting it, i didn't test it.



you can also rewrite it with a for-loop instead of a while loop if you prefer it.
 
Campbell Ritchie
Marshal
Posts: 66975
255
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
SF: Please don't post complete solutions. Please note what it says on the title page of this forum:-

We're all here to learn, so when responding to others, please focus on helping them discover their own solutions, instead of simply providing answers.

I think your solution would have worked before I got at it, and I might (if I remember) restore it later. I think there is another solution not using substring().
 
Campbell Ritchie
Marshal
Posts: 66975
255
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator

Half an hour ago, I wrote:Is there any need to use substring() at all?

I have tried it and it works without substring().
 
Bao Truong
Greenhorn
Posts: 7
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Thanks guys, I fixed the problem and made a less bulky version without using substring()
 
S Fox
Ranch Foreman
Posts: 240
5
Eclipse IDE C++ Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
you should post the code of your solution, what did you come up with?
i'd like to know what the most efficient way to do this actually is, because i need to count occurrences of substrings all the time in the programs i make.
 
Campbell Ritchie
Marshal
Posts: 66975
255
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
In which case you need a utility class:-
 
Bao Truong
Greenhorn
Posts: 7
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Here's the first method with some changes and the method with no substring()

 
S Fox
Ranch Foreman
Posts: 240
5
Eclipse IDE C++ Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
i do have a utility class for all the special methods that i created.

i meant the actual most efficient implementation of this, because i'm calling this function into a hashset of 6 million+ unique strings. inefficiency in my algorithms hurts performance. so i'm trying to learn how to make things more efficient and "scalable." i got only partway there with rewriting my app to use multi-threading.
 
Junilu Lacar
Sheriff
Posts: 14614
243
Mac Android IntelliJ IDE Eclipse IDE Spring Debian Java Ubuntu Linux
  • Likes 3
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
This is one case where a regex is actually a pretty good solution, in my opinion:

I tried this out here: https://repl.it/repls/GrowlingMuddyMachinecodeinstruction
 
Saloon Keeper
Posts: 6588
61
Eclipse IDE Firefox Browser MySQL Database VI Editor Java Windows
  • Likes 1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
The caller would need to make sure that 'substr' did not contain any regex meta characters. I would suggest renaming the 'substr' parameter to 'regex' to remind the developer of this potential gotcha.
 
Campbell Ritchie
Marshal
Posts: 66975
255
  • Likes 1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator

Bao Truong wrote:Here's the first method with some changes and the method with no substring()

I am afraid the second method is simply another way to implement substring(). I tested it and it failed to count for me.
The first method looks dreadfully complicated. You do not need to use the substring() method nor create other Strings equal to“hi”.
 
Greenhorn
Posts: 12
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Now I rechecked  indexOf has fromIndex argument. Can be used without substring  method as I think Campbell tried to point.
 
Campbell Ritchie
Marshal
Posts: 66975
255
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
You appear to have worked out that you need to start searching a String from i + 1 otherwise you will find the same substring repeatedly. Let's try it on JShell (some responses omitted):-

jshell> String longString = "This is the sort of thing in which one would seek his his. How many do you think there are?"
jshell> String shortString = "hi";
jshell> int countInstances(String longString, String shortString)
  ...> {
  ...>    int count = 0;
  ...>    int i = 0;
  ...>    while ((i = longString.indexOf(shortString, i)) >= 0)
  ...>    {
  ...>       count++;
  ...>       i++;
  ...>    }
  ...>    return count;
  ...> }
jshell> countInstances(longString, shortString)
$5 ==> -99999

Yes, there is an awkward line with an assignment inside the while (...) I could have made it even more awkward by changing i to i++ and leaving it out from later. The suggestion in the previous sentence wouldn't have worked. Otherwise you would repeat the indexOf() call.
Try the code yourself and see if you can't get a more accurate count than I got
Note that indexOf doesn't seem to throw out of bounds exceptions:-

jshell> "Campbell".indexOf("amp", 99999)
$10 ==> -1

jshell> "Campbell".indexOf("amp", -99999)
$11 ==> 1

 
Campbell Ritchie
Marshal
Posts: 66975
255
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
This is what my code would have looked like without using JShell. In which case the method would have been public and static.Followed by
 
Junilu Lacar
Sheriff
Posts: 14614
243
Mac Android IntelliJ IDE Eclipse IDE Spring Debian Java Ubuntu Linux
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator

Carey Brown wrote:The caller would need to make sure that 'substr' did not contain any regex meta characters. I would suggest renaming the 'substr' parameter to 'regex' to remind the developer of this potential gotcha.


Sorry for the late reply on this...

Renaming any parameter to suggest what the implementation is would be a poor choice. The API of a method should support the semantics of the intent, not the implementation. However, the gotcha is valid and might be addressed in a couple of different ways:

1. JavaDocs calling out the gotcha and what to do about special regex characters in the search string
2. Add code to automatically escape special regex characters. I'd probably add another (private) method to do that.

I think I'd prefer the second approach.
 
Campbell Ritchie
Marshal
Posts: 66975
255
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I think I tend to agree with Carey. Methods of the String class taking a regex usually call the parameter regex. That reminds the user that the argument should conform to the usual conventions for regexes. If I were writing that method, I hope I would take a third approach, combining the method to add escape characters with a comment in the Javadoc to explain what happens.
Remembering that a definite and fixed Javadoc comment may condemn commit me to maintaining an implementation detail for ever.
 
kiros haregewoine
Greenhorn
Posts: 12
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator

Campbell Ritchie wrote:
  ...> {
  ...>    int count = 0;
  ...>    int i = 0;
  ...>    while ((i = longString.indexOf(shortString, i)) >= 0)
  ...>    {
  ...>       count++;
  ...>       i++;
  ...>    }
  ...>    return count;
  ...> }
jshell> countInstances(longString, shortString)
$5 ==> -99999


Maybe we need to change to

to handle matching the last charater and string of one character.
 
Junilu Lacar
Sheriff
Posts: 14614
243
Mac Android IntelliJ IDE Eclipse IDE Spring Debian Java Ubuntu Linux
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator

Campbell Ritchie wrote:I think I tend to agree with Carey. Methods of the String class taking a regex usually call the parameter regex. That reminds the user that the argument should conform to the usual conventions for regexes. If I were writing that method, I hope I would take a third approach, combining the method to add escape characters with a comment in the Javadoc to explain what happens.
Remembering that a definite and fixed Javadoc comment may condemn commit me to maintaining an implementation detail for ever.


If the requirement specifically was to accept regular expressions, I'd agree that regex would be an appropriate parameter name. However, the original requirement was to see how many times one string occurred in another and there was no mention of using regular expressions at all. I would assume, therefore, that the substring is intended to be taken as raw input and placing the burden of escaping special regex characters on the user of the API just because the implementation happened to use regex is not appropriate, IMO.
 
Campbell Ritchie
Marshal
Posts: 66975
255
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I shall have to let you correct me; if it is raw input, then you wouldn't want to call it regex.
 
Junilu Lacar
Sheriff
Posts: 14614
243
Mac Android IntelliJ IDE Eclipse IDE Spring Debian Java Ubuntu Linux
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator

Campbell wrote:I shall have to let you correct me...


Not really a matter of correcting you in my mind, just weighing options. You were right after all about getting boxed in to a specific implementation if you only resorted to a callout in the API documentation. These are the kind of design discussions I'd expect to have with a good teammate: point, counterpoint and then consensus. To me it's about deciding on the best path forward, not who has the best (or worst) idea. I like having all the bad ideas on the table, especially if they're my bad ideas, so they don't have a chance to come back and bite me. Whether it's me or someone else I'm working with who calls it out doesn't matter that much to me.
 
Campbell Ritchie
Marshal
Posts: 66975
255
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator

kiros haregewoine wrote:. . . Maybe we need to change . . . to handle matching the last charater and string of one character.

No, you don't. Maybe beginners should avoid break; except after case xyz: anyway, but the break; is unnecessary. The loop will finish when you run out of text because after the last occurrence of shortString, the next index will be −1. Your line 3 will therefore never be executed. That makes the else keyword redundant. Don't consider having one character left to investigate because you are thereby restricting yourself to shortStrings with two letters.

Many people are happy to use break; but it is always possible to write a loop without break;

Please don't write an if all on one line, and please be sure to indent your code correctly. If you have problems about indentation, it is you yourself who is most likely to get confused.
 
Bartender
Posts: 2431
107
Google Web Toolkit Eclipse IDE Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator

Junilu Lacar wrote:...
2. Add code to automatically escape special regex characters. I'd probably add another (private) method to do that.

I think I'd prefer the second approach.



Instead of using your own method, you can simply call: Pattern.quote(String s)
 
Campbell Ritchie
Marshal
Posts: 66975
255
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I think the time has come to restore S Fox's solution which I deleted a few days ago:-
 
salvin francis
Bartender
Posts: 2431
107
Google Web Toolkit Eclipse IDE Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Would like to point out two issues in the code above:
  • The code is searching the string twice for every iteration using "contains" and "indexOf" methods
  • Every iteration creates a new String object that is a subset of the original string using substring method
  •  
    Bartender
    Posts: 1735
    17
    • Mark post as helpful
    • send pies
    • Quote
    • Report post to moderator
    Mr. OP: You're making the code more complicated than it needs to be.

    This works fine in my testing and is much simpler with fewer methods.

    I also genericized the code so it didn't have hard-coded strings. So, you could now take the below and create a general purpose method.



    Prints: Found: 0 occurrences of HI in Hi, the 1st. And hi the 2nd, and Hi the 3rd and hi the 4th

    HTH

    -- mike
     
    kiros haregewoine
    Greenhorn
    Posts: 12
    • Mark post as helpful
    • send pies
    • Quote
    • Report post to moderator
    @Mike, why toUpperCase, which is not the requirement of OP.
     
    salvin francis
    Bartender
    Posts: 2431
    107
    Google Web Toolkit Eclipse IDE Java
    • Mark post as helpful
    • send pies
    • Quote
    • Report post to moderator
    even if you don't want the method to be case-sensitive ...

    why are both the calls to "toUpperCase" within a loop ?
     
    Mike London
    Bartender
    Posts: 1735
    17
    • Mark post as helpful
    • send pies
    • Quote
    • Report post to moderator

    salvin francis wrote:even if you don't want the method to be case-sensitive ...

    why are both the calls to "toUpperCase" within a loop ?



    You're right. I was inferring a requirement and not implementing as efficiently as possible.

    I updated the code above to remove the toUpperCase().

    Thanks!

    -- mike

    ----

    “A program is never less than 90% complete, and never more than 95% complete.”
        – Terry Baker
     
    Mike London
    Bartender
    Posts: 1735
    17
    • Mark post as helpful
    • send pies
    • Quote
    • Report post to moderator

    kiros haregewoine wrote:@Mike, why toUpperCase, which is not the requirement of OP.



    Yep, agreed. Removed.

    Code even simpler now.

    Thanks for noticing that.

    -- mike
     
    You ridiculous clown, did you think you could get away with it? This is my favorite tiny ad!
    Java file APIs (DOC, XLS, PDF, and many more)
    https://products.aspose.com/total/java
    • Post Reply Bookmark Topic Watch Topic
    • New Topic
    Boost this thread!