The script needs to read an Excel file in which column A holds a set of keywords, say 100 of them. After reading the keywords from
column A, the Python script should search for them under a particular path on the D drive, through all folders and subfolders, which
contain miscellaneous file types (.xml, .txt, .html, .yml, .sh, etc.). Suppose the first keyword is matched in one file at a specific
line number and again in another file at a different line number: it was then found 2 times in total, on different lines in different
files. The same keyword may also occur more than once in the same file, either on the same line or on different lines.
So for each keyword we finally need these statistics:
1) The names of the files where it was found.
2) The line numbers in those files where it was found (a line number may be listed more than once, depending on how many times the
keyword occurs in a single file).
3) The total count of occurrences of that keyword across all the files, folders and subfolders in the given disk drive.
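The search described above could be sketched roughly as follows. This is only a minimal sketch under some assumptions that are not stated in the question: matching is a case-insensitive plain substring match, files are read as UTF-8 with undecodable bytes ignored, and `search_tree` is a hypothetical helper name. It returns one (file, line number) entry per occurrence, so the total count for a keyword is simply the length of its list.

```python
import os
from collections import defaultdict


def search_tree(root_dir, keywords):
    """Return {keyword: [(file_path, line_no), ...]}, one entry per occurrence.

    Keys are the lower-cased keywords (hypothetical helper, not from the question).
    """
    # Assumption: case-insensitive substring matching; blanks and duplicates dropped.
    lowered = list(dict.fromkeys(kw.lower() for kw in keywords if kw))
    hits = defaultdict(list)
    # os.walk visits root_dir and every subfolder below it.
    for folder, _subdirs, files in os.walk(root_dir):
        for name in files:
            path = os.path.join(folder, name)
            try:
                with open(path, encoding="utf8", errors="ignore") as handle:
                    # enumerate gives the real 1-based line number.
                    for line_no, line in enumerate(handle, start=1):
                        text = line.lower()
                        for kw in lowered:
                            # str.count counts non-overlapping occurrences, so a
                            # keyword repeated on one line is recorded each time.
                            for _ in range(text.count(kw)):
                                hits[kw].append((path, line_no))
            except OSError:
                # Unreadable file (permissions, locks): skip it.
                continue
    return hits
```

It might be called as `hits = search_tree('D:\\Tool devlopment\\tool\\Folder', keywords)`; `len(hits[kw])` is then the total count for `kw`, and `sorted({path for path, _ in hits[kw]})` gives the distinct file names.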
After reading the Excel file and searching, the script should also write out the details it found for each keyword:
1) The names of the files that matched the keyword.
2) All the line numbers where the keyword was found, together with their file names.
3) The total count for the keyword, since each keyword may be found a different number of times across all the files, folders and
subfolders in the D drive (which contains different types of files).
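One possible way to read the keywords and write those three details is sketched below. It rests on assumptions not given in the question: the workbook is an .xlsx file read with the third-party openpyxl package, the keywords sit in column A of the active sheet, the results are already collected as a mapping `{keyword: [(file_path, line_no), ...]}` with one entry per occurrence, and a CSV file is an acceptable report format. `load_keywords` and `write_report` are hypothetical names.

```python
import csv


def load_keywords(xlsx_path):
    """Read the non-empty cells of column A from the active sheet."""
    # Assumption: openpyxl is installed. It is imported here so that
    # write_report below can be used without it.
    from openpyxl import load_workbook

    workbook = load_workbook(xlsx_path, read_only=True)
    try:
        sheet = workbook.active
        keywords = []
        for row in sheet.iter_rows(min_col=1, max_col=1, values_only=True):
            if row and row[0] is not None:
                value = str(row[0]).strip()
                if value:
                    keywords.append(value)
        return keywords
    finally:
        workbook.close()


def write_report(hits, csv_path):
    """Write one row per keyword: file names, file:line locations, total count."""
    with open(csv_path, "w", newline="", encoding="utf8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["keyword", "files", "locations", "total_count"])
        for keyword, found in hits.items():
            # Distinct file names, in a stable order.
            files = sorted({path for path, _line in found})
            # One "file:line" entry per occurrence, so repeats stay visible.
            locations = [f"{path}:{line}" for path, line in found]
            writer.writerow(
                [keyword, "; ".join(files), "; ".join(locations), len(found)]
            )
```

A CSV file opens directly in Excel, which is why it is used here instead of writing back into the workbook; if the results must go into the original Excel file, that part would need a different approach.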
I tried the code below, but it does not produce the desired output:
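import os
import re
import time
from collections import Counter

yourpath = 'D:\\Tool devlopment\\tool\\Folder'
timestr = time.strftime("%Y%m%d-%H%M%S")

# Keywords exported from the Excel sheet, whitespace-separated and lower-cased
with open("D:\\Tool devlopment\\tool\\Module\\excel.txt", 'r') as file2:
    word2 = file2.read().lower().split()

sast_txt = "D:\\Tool devlopment\\tool\\Module\\SAST_ScanReport" + timestr + ".txt"
cnt = Counter()

with open(sast_txt, 'w') as FO:
    # os.walk already visits every subfolder, so glob.glob is not needed
    for root, dirs, files in os.walk(yourpath, topdown=False):
        for name in files:
            path = os.path.join(root, name)
            with open(path, encoding="utf8", errors='ignore') as f:
                # Read line by line so the real line number is known
                # (the old str(i + 1) was the keyword's index, not a line number)
                for line_no, line in enumerate(f, start=1):
                    # "+" instead of "*": the pattern must not match the empty string.
                    # Splitting on non-letters means keywords with digits or
                    # punctuation can never match.
                    word1 = re.split("[^a-zA-Z]+", line.lower())
                    for keyword in word2:
                        hits = word1.count(keyword)
                        if hits:
                            # Count every occurrence, not just one per file
                            cnt[keyword] += hits
                            str2 = "Vulnerability '" + keyword + "' found at line: " + str(line_no) + ' in file ' + path + "\n"
                            FO.write(str2)
    # Total number of occurrences of each keyword
    for keyword, total in cnt.items():
        FO.write(keyword + ": " + str(total) + "\n")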