• Post Reply Bookmark Topic Watch Topic
  • New Topic

Complexity of this program: search a word based on prefix program  RSS feed

 
Ranch Hand
Posts: 78
1
Java jQuery Spring
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hi,

Problem statement is like:
Read a file and find a word which is starts provided prefix. The word is not equal to prefix.
The sample data in a file like:

My name is atul more
I am using Windows operating system
sample text for testing
Designation assign to me is tech lead
I want to convey only this today



Lets say you provide prefix as "to", so the program should return "today" because it is the maximum word string which starts with prefix "to".

The program is:


I want to understand, how can I measure the time (best and worst)complexity of the program.


Thanks,
Atul
 
Master Rancher
Posts: 1251
36
Android Chrome IntelliJ IDE Java MySQL Database
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
One thing that you may be able to do is to run a code profiler/analyzer. There are some which can be run standalone and others which can be installed into IDEs.

I pretty sure that the deeper the nested the "IF" statements or constructs are the more complex the code profilters/analyzers will state the program is.
I recall using CodeMaid (http://www.codemaid.net/) for Visual Studio when I was programming in C# and I suspect that there is something which does approximately the same for Java IDEs or as a standalone program.

However, if you are asking how do you know before you create the program so that you could use words list "easy" or "difficult" if you were to talk to your manager about this then I'm not too sure.

Did you want a tool which can be installed into an IDE? If so which IDE are you using?
 
Pete Letkeman
Master Rancher
Posts: 1251
36
Android Chrome IntelliJ IDE Java MySQL Database
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I came across this from a five days ago which may be add to the conversion
https://coderanch.com/t/685206/java/Real-World-Programming-Calculating-Complexity
 
Author
Ranch Hand
Posts: 157
29
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hi Atul,

If I understand the requirements correctly, here is how I would write a functional implementation of the program using Java 8 standard constructs:



As you can see, this is a simple fold of a list (although Java 8 calls it "reduce"). The complexity is then O(n).
 
Atul More
Ranch Hand
Posts: 78
1
Java jQuery Spring
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hi Pierre-Yves,

Thanks for the solution.
Even I tried the same thing to convert the Set into List and then traverse it.
There is always O(n) complexity I found.

Just tried to reduce it by some way but don't know how?

Thanks,
Atul
 
Saloon Keeper
Posts: 8457
155
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
You could implement a prefix tree or use one from a library, such as PatriciaTrie from the Apache Commons library. Use the prefix to search for the node that is the root of all words that start with the prefix, and then from that node iterate through the tree to find the leaf containing the longest word.

The worst case is when all words are prefixes of other words, e.g. "a", "at", "attack", "attacker", "attackers". In that case, the tree devolves to a linked list. In the best case, the tree is completely balanced.

Getting all words with a common prefix is an O(m) operation if child nodes are stored in a HashMap, where m is the length of the prefix. If the tree is balanced, this reduces the size of the tree to iterate through to O(log n), where n is the number of nodes in the tree. If the tree is a linked list, the size of the sub-tree remains in O(n).

So in the best case of a completely balanced prefix tree, getting the longest word with a certain prefix is an O(m + log n) operation, while in the worst case of a linked list of words that are all prefixes of the longest word, getting the longest word is an O(n) operation.
 
Pierre-Yves Saumont
Author
Ranch Hand
Posts: 157
29
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
You would have to add the time for constructing the prefix tree. If you don't take in account the time for constructing a dedicated data structure, there are faster solutions, for example:



I did not include the prefixes function, but it is trivial. It just computes a list of all possible prefixes for a given word.
 
Stephan van Hulst
Saloon Keeper
Posts: 8457
155
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Yes, that would give constant time complexity, at the cost of exponential memory usage. I chose a prefix tree because they work well even in the case of very large word files.

On an unrelated note, personally I really don't like using the reduce() method with stateful operations, even if the reduction is performed sequentially. If the reduction can't be performed easily with chained collectors, then write a custom collector. That way, you can even perform the reduction in parallel. In this case however, you can just do it with a single toMap() collector.
 
Pierre-Yves Saumont
Author
Ranch Hand
Posts: 157
29
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Of course, your solution using a prefix tree is much better. What I wanted to show is that the main source of problem is not choosing a sub-optimal solution, but solving the wrong problem. The requirement was "Read a file and find the longest word starting with a given prefix." So if a specific data structure is to be used, the complexity should take in account populating this data structure from reading the file.

Of course, my solution is uninteresting, since it pre-computes all possible results.  But it could still be the best solution depending upon how many words there can be, and how many times the function would be called, and whether the number of words could change between function calls, and more.

Given this information, one could chose between solutions with longer insertion time and shorter retrieval time, or the other way round.

If I had to solve the problem, I would chose to to try implementing a functional prefix tree. But this would not be optimal in terms of development time!

Regarding reduce, I completely agree. Java streams were obviously designed with mutation based reduction in mind. The reduce method is interesting to show how a fold could be used in functional programming, but it should be used with an immutable map. Using a collector makes much more sense, of course.

 
Stephan van Hulst
Saloon Keeper
Posts: 8457
155
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Fair point about the initial data structure setup costs, and the costs of code development.

As a final remark, I would say that the initial costs may be well worth it if multiple retrievals are performed with the same input data.
 
Atul More
Ranch Hand
Posts: 78
1
Java jQuery Spring
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hi,

Made some improvement in the code and fix couple of issues.



Thanks,
Atul
 
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
Boost this thread!