Win a copy of The Way of the Web Tester: A Beginner's Guide to Automating Tests this week in the Testing forum!

# Need Suggestion to increase efficiency

Vipul Bahuguna
Greenhorn
Posts: 4
Hi All,

I have an ArrayList with huge number of Strings added to it.
I have a String as an input to my program which will be a concatenation of two Strings present in the ArrayList.
I need to write a program that will check if the input String can be matched with the concatenation of two Strings in the ArrayList.

If I go by brute force approach I can achieve this as follows -

ArrayList<String> lst = new ArrayList<String>();
...
...

String[] values = lst.toArray(new String[0]);
int len = values.length;
StringBuilder sb = new StringBuilder();
int sbLen=sb.length();
for (int i=0; i<len; i++)
{
for (int j=0; j<len; j++)
{
sb.delete(0, sbLen);
sb.append(values[i]);
sb.append(values[j]);

if (sb.toString().equals(input) )
{
System.out.println("Match Found at location i:" + i + " and j: " + j);
break;
}
}
}

Because my ArrayList can have very huge number of records, going by the brute force technique can have serious performance issues.
Is there a way I can use Executor framework or some other threading technique to enhance the performance of this logic.

I would appreciate any good suggestions.

Paul Clapham
Sheriff
Posts: 21443
33
Well, comparing against the concatenations of all possible pairs of strings is about the worst possible method you could imagine. It's an O(N^2) algorithm.

What I would do is this:

(1) Find if the input string starts with one of the strings in the huge list.

(2) If it does, find if the rest of the input string (the part after what you matched in step 1) is the same as one of the strings in the huge list.

That reduces the complexity to O(N). And if you sorted the huge list first and used binary search, that would reduce it to O(log N).

Raymond Tong
Ranch Hand
Posts: 255
2
Is using ArrayList one of the requirements? or your choice?
If not, using Set may help.

assert assertion
Greenhorn
Posts: 8
hi Vipul

How often this list changes, because if it does not change much we can pre process this list to create multiple lists (Using Executor Framework, since the thread access will be read only sync should not be an issue). Once this list is prepared, search should just be a matter of locating the correct map.

Please let me know the update frequency of this list

Vipul Bahuguna
Greenhorn
Posts: 4
Hi guys,

thanks for the responses.

i have managed to reduce the time of search by using executor framework.

this is what i m doing ... created one thread that reads the list and creates another String which is concatenation of two Strings in the list. And then put it in a snychronized LinkedList. Another two threads start picking up the concatenated Strings kept in this LinkedList and comparing them to the input String. If the match is found I exit and shutdown my executor.

1. Obtained array of Strings from ArrayList.
2. Created a thread that starts reading String's in array and concatenating them and putting them in the LinkedList.
3. The two other threads start removing them from the LinkedList (linkedList.remove(0)). And then compare with the input. If match found stop all threads and exit.

Now i believe i can try to further increase performance of this program by creating may be 2 threads to read the original array and concatenate Strings and put them in the LinkedList??
One question ... how to determine how many threads would be optimal. I believe creating too many threads will also decrease the performance ???

assert assertion
Greenhorn
Posts: 8
well I guess you could do a thread pool to avoid creating too many threads. Out of curiosity, are you creating threads for searching. Or you are building a data dictionary using multiple threads, and then using that structure ?

Vipul Bahuguna
Greenhorn
Posts: 4
Hi assert,
I m using threads to create dictionary and searching both. and i m using executorservice to execute them.

Nitesh Kant
Bartender
Posts: 1638
Vipul, I think by increasing efficiency, someone (if you were asked this question) would have intended to increase the algorithmic/space efficiency and not program efficiency by having more threads.
Since, this is essentially a partial match of a string, Paul's solution is good. In order to implement the first step of his solution you can use a Trie datastructure to find the matching word with a worst time complexity O(m) where m is the length of the input string.

assert assertion
Greenhorn
Posts: 8
If you are using thread for searching, you need to have inter thread communication. Since it is about strings (java automatically creates a pool for string, does not instantiate a new one every time) I was wondering if we can have a map of maps for the starting words.

This one time dictionary can be built using Executor framework threadpool.

Once it is built the algorithm will increase the search effeciancy substantially.