file search using scanner

 
prince davies
Ranch Hand
Posts: 74



How do I optimize the Scanner object to scan through the files and find a given search parameter?
 
Tim McGuire
Ranch Hand
Posts: 820
IntelliJ IDE Tomcat Server VI Editor
This code will not compile as is (the code that defines "directory" is commented out). Could you post your actual working code?

You are swallowing an exception at line 70, which will just confuse you. You should at least call e.printStackTrace(), and you should not use finally to print the results of your operation as if it had succeeded. Finally is really for closing resources.
I guess a better question is: does this code already work the way you want it to, and you now want to optimize it? If so, what is driving your need to optimize it?
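
To illustrate the point about exceptions and finally, here is a minimal sketch. The class name, the containsTerm helper and the default file name sample.txt are made up for the example and are not from the original code. It closes the Scanner with try-with-resources, reports the failure instead of hiding it, and prints a result only when the search actually completed.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Scanner;

class SafeFileSearch {

    // Hypothetical helper: true if any line of the file contains the term.
    static boolean containsTerm(Path file, String term) throws IOException {
        // try-with-resources closes the Scanner (and the file) on every path
        try (Scanner scanner = new Scanner(file)) {
            while (scanner.hasNextLine()) {
                if (scanner.nextLine().contains(term)) {
                    return true;
                }
            }
            // Scanner hides read errors; ioException() exposes the last one
            IOException readError = scanner.ioException();
            if (readError != null) {
                throw readError;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // "sample.txt" is a placeholder file name for the demo
        Path file = Paths.get(args.length > 0 ? args[0] : "sample.txt");
        try {
            boolean found = containsTerm(file, "PCPT1");
            // Only reached when the search finished without an error
            System.out.println("found = " + found);
        } catch (IOException e) {
            // Report the failure instead of printing a misleading result
            e.printStackTrace();
        }
    }
}
```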
 
Tim McGuire
Ranch Hand
Posts: 820
IntelliJ IDE Tomcat Server VI Editor
Why not use Scanner in the "readCSVFile" method:



I think that StringTokenizer has been supplanted by Scanner (for use with files) and/or String.split(). In fact, this is in the javadocs for StringTokenizer:
"StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead."
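
As a rough sketch of the String.split() alternative the javadoc recommends (the class and method names here are invented for the example), splitting one comma-separated line could look like this. It assumes simple fields with no quoted commas inside them.

```java
import java.util.ArrayList;
import java.util.List;

class SplitLine {

    // Hypothetical helper: splits one line on commas and trims each field.
    static List<String> splitCsvLine(String line) {
        List<String> values = new ArrayList<>();
        for (String field : line.split(",")) {
            String trimmed = field.trim();
            if (!trimmed.isEmpty()) {
                values.add(trimmed);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // prints [PCPT1, PCPT2, PCPT3]
        System.out.println(splitCsvLine("PCPT1, PCPT2,PCPT3"));
    }
}
```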
 
prince davies
Ranch Hand
Posts: 74

You might want to add the following jars to your classpath.




As you requested, here is the working code that I want to optimize:


 
Tim McGuire
Ranch Hand
Posts: 820
IntelliJ IDE Tomcat Server VI Editor
The first thing that jumps out is that you are creating the directory object and the full list of files over and over again, once for each string in the list:



Wouldn't it make sense to do this just once, as follows?



The second thing is to think about which loop you want as your outside loop. Is it better to loop through and open each file in the list of files once or many times?
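
One way to put both suggestions together, as a sketch only: the class name, the search method and its return type are assumptions for illustration, and it further assumes the files are UTF-8 text small enough to read into memory. The directory is listed once, each file is opened once in the outer loop, and every search term is checked against that file's content in the inner loop.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class DirectorySearch {

    // Hypothetical helper: maps each search term to the files that contain it.
    static Map<String, List<Path>> search(Path directory, List<String> terms)
            throws IOException {
        // List the directory once, not once per search term
        List<Path> files;
        try (Stream<Path> listing = Files.list(directory)) {
            files = listing.filter(Files::isRegularFile)
                           .sorted()
                           .collect(Collectors.toList());
        }

        Map<String, List<Path>> hits = new LinkedHashMap<>();
        for (String term : terms) {
            hits.putIfAbsent(term, new ArrayList<>());
        }

        // Outer loop over files: each file is read from disk exactly once
        for (Path file : files) {
            String content =
                new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            // Inner loop over terms: cheap in-memory substring checks
            for (Map.Entry<String, List<Path>> entry : hits.entrySet()) {
                if (content.contains(entry.getKey())) {
                    entry.getValue().add(file);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: DirectorySearch <directory> <term>...");
            return;
        }
        List<String> terms = Arrays.asList(args).subList(1, args.length);
        search(Paths.get(args[0]), terms)
            .forEach((term, found) -> System.out.println(term + " -> " + found));
    }
}
```

With 100 values and 1000 files this still performs 100 * 1000 substring checks, but only 1000 file reads instead of 100,000.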

 
prince davies
Ranch Hand
Posts: 74







According to the code above, when we use Scanner to split the comma-separated values in the CSV file, it returns only one element in the list. Why does it return a single string with all the values in the file appended together?
If we print the count, it prints list.size() = 1. If the file has 100 values, it should return 100 as the size. Am I misunderstanding something?



The above code prints a count of 100:

list count100.
 
prince davies
Ranch Hand
Posts: 74

If I use this method, it adds each element in the CSV file to my list. Now the list will have all 100 values.

 
Tim McGuire
Ranch Hand
Posts: 820
IntelliJ IDE Tomcat Server VI Editor
prince davies wrote:
when we use Scanner to split the comma-separated values in the CSV file, it returns only one element in the list. Why does it return a single string with all the values in the file appended together?
If we print the count, it prints list.size() = 1. If the file has 100 values, it should return 100 as the size. Am I misunderstanding something?


I would expect that to happen if your delimiter wasn't correct. The one I used:


means the values are separated by commas followed by optional whitespace (including newlines). If your file does not match this pattern, then of course Scanner won't find the delimiter. So, can you post part of your CSV file so we can look at the actual file contents and then modify the useDelimiter() call to match?
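
For illustration, a delimiter matching that description can be written as the regex ",\\s*". This is an assumed pattern that fits the wording above, not necessarily the exact one from the original listing, and the class and helper names are made up. The sketch shows why such a delimiter yields a single token when the input contains no commas.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class DelimiterDemo {

    // Hypothetical helper: collects every token Scanner finds in the text.
    static List<String> tokens(String text, String delimiterRegex) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            scanner.useDelimiter(delimiterRegex);
            while (scanner.hasNext()) {
                result.add(scanner.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // A comma, then optional whitespace (which includes newlines)
        String commaThenSpace = ",\\s*";

        // Input with commas: three separate tokens, prints 3
        System.out.println(tokens("PCPT1, PCPT2,\nPCPT3", commaThenSpace).size());

        // Input without commas: the whole text is one token, prints 1
        System.out.println(tokens("PCPT1\nPCPT2\nPCPT3", commaThenSpace).size());
    }
}
```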
 
Tim McGuire
Ranch Hand
Posts: 820
IntelliJ IDE Tomcat Server VI Editor
prince davies wrote:
If I use this method, it adds each element in the CSV file to my list. Now the list will have all 100 values.



Exactly. You used a delimiter that worked for your file. It is hard for me to see into your hard drive and know the structure of these CSV documents. If your CSV file only has 100 elements, you should be able to grab them all in one operation instead of going line by line, but if you are happy with this method, then go for it.
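
As one possible reading of "in one operation", and swapping in a plain String.split() where the thread uses Scanner, the sketch below reads the whole file at once and splits on commas or whitespace, so it works whichever of the two separates the values. The names are invented for the example, and it assumes a small UTF-8 file whose values contain no embedded commas or spaces.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class ReadAllValues {

    // Hypothetical helper: reads every value from the file in one pass.
    static List<String> readValues(Path csvFile) throws IOException {
        // Read the whole (small) file into memory at once
        String content =
            new String(Files.readAllBytes(csvFile), StandardCharsets.UTF_8);

        List<String> values = new ArrayList<>();
        // Split on any run of commas and/or whitespace (newlines included)
        for (String token : content.split("[,\\s]+")) {
            if (!token.isEmpty()) {
                values.add(token);
            }
        }
        return values;
    }
}
```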
 
prince davies
Ranch Hand
Posts: 74
Here is the CSV document. Since the forum does not allow me to attach CSV or XLS files, please copy these values into a CSV file and run it. I need to take each value from this CSV file and search all the files in a directory to find out whether it exists in any file.

PCONCO100
PCONN25
PCORPDMP
PCPT1
PCPT2
PCPT3
PCPT4
PCPT5
PCPT6
PCRFV45S6
PCRFV55S6
PCRFV80S6
PCRPDMPI
PCRPMHSUB
 
prince davies
Ranch Hand
Posts: 74
If there are 100 values in the CSV and 1000 files in a folder, I need to take each value from the CSV file and loop through all the files in that directory. That means there will be 100 * 1000 searches.

How does your method search for a value in all the files? When I debug your method, it concatenates all the values into one string and checks whether that concatenated value exists in each file in the directory. That's not my requirement. The value returned from your method is a list, and it contains a single string like "PCONCO100PCONN25 PCORPDMP PCPT1 PCPT2 PCPT3 PCPT4 PCPT5 PCPT6", which is then searched for in each file in the folder. That does not seem correct. My requirement is to take the first value, e.g. PCONCO100, and search for it in all the files in the folder, then take the next one, PCONN25, and search for it in each file in that folder, and so on. I hope you understand my requirement.
 
Tim McGuire
Ranch Hand
Posts: 820
IntelliJ IDE Tomcat Server VI Editor
prince davies wrote: Here is the CSV document. Since the forum does not allow me to attach CSV or XLS files, please copy these values into a CSV file and run it. I need to take each value from this CSV file and search all the files in a directory to find out whether it exists in any file.



I don't see any commas in what you pasted here, so obviously we shouldn't use a comma as our delimiter. If each line contains only one value, then, since "By default, a scanner uses white space to separate tokens", you don't have to call useDelimiter() at all. You can make your method much simpler and more efficient by following the example at http://docs.oracle.com/javase/tutorial/essential/io/scanning.html .

 
Tim McGuire
Ranch Hand
Posts: 820
IntelliJ IDE Tomcat Server VI Editor
prince davies wrote: If there are 100 values in the CSV and 1000 files in a folder, I need to take each value from the CSV file and loop through all the files in that directory. That means there will be 100 * 1000 searches.

How does your method search for a value in all the files? When I debug your method, it concatenates all the values into one string and checks whether that concatenated value exists in each file in the directory. That's not my requirement. The value returned from your method is a list, and it contains a single string like "PCONCO100PCONN25 PCORPDMP PCPT1 PCPT2 PCPT3 PCPT4 PCPT5 PCPT6", which is then searched for in each file in the folder. That does not seem correct. My requirement is to take the first value, e.g. PCONCO100, and search for it in all the files in the folder, then take the next one, PCONN25, and search for it in each file in that folder, and so on. I hope you understand my requirement.


I understand your requirement, but you will never get there if you don't understand the concepts. Do you understand why my method returns one big string?
 
prince davies
Ranch Hand
Posts: 74
Do you understand why my method returns one big string?


Sorry Tim, I really do not understand why your method returns one big string. Can you please explain it to me?
 
Tim McGuire
Ranch Hand
Posts: 820
IntelliJ IDE Tomcat Server VI Editor
Yup. My method assumed that there were commas separating the values in the csv file (I assumed this because csv stands for Comma Separated Values). I should have asked to see the file first.

When you used my method, with a comma as the delimiter, on your file where there is no comma, the scanner would "scan" through the entire file and return only one big string. To my scanner, using a comma as the delimiter, your file looks like one value. If you do not declare a delimiter, Scanner falls back to whitespace (which includes line breaks) as the delimiter and will work the way you want on your file.

It will look like:
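
A minimal sketch of such a method follows. The method name readCSVFile comes from earlier in the thread, but the class name, signature and body are assumptions, as is the UTF-8 encoding of the file.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class CsvValueReader {

    // Assumed signature: returns every whitespace-separated value in the file.
    static List<String> readCSVFile(Path csvFile) throws IOException {
        List<String> values = new ArrayList<>();
        // No useDelimiter() call: Scanner splits on whitespace by default,
        // so each line holding a single value becomes one list element
        try (Scanner scanner = new Scanner(csvFile, "UTF-8")) {
            while (scanner.hasNext()) {
                values.add(scanner.next());
            }
        }
        return values;
    }
}
```

For the values posted above, the returned list would hold one entry per line (PCONCO100, PCONN25, and so on) rather than one concatenated string.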
 
prince davies
Ranch Hand
Posts: 74
Thanks a lot, Tim. I sincerely appreciate your help. Thank you, thank you!
 
prince davies
Ranch Hand
Posts: 74
Hi Tim,

I have another post where I used Lucene as a search engine. But there is a catch: I can't search all file formats, because I need to provide the file format before running that program. You can find my question at

http://www.coderanch.com/t/573001/Streams/java/search-content-file-file-search
Heading = search content in a file, file search

Can you please help me use that program without providing the file format?

Thanks in advance
 
prince davies
Ranch Hand
Posts: 74


This is the code for the Lucene search engine, where it asks for a suffix to be supplied.
 