Win a copy of Java 9 Revealed this week in the Features new in Java 9 forum!
  • Post Reply Bookmark Topic Watch Topic
  • New Topic

Read number of occurrence of word from file  RSS feed

 
Atul More
Greenhorn
Posts: 24
1
Java jQuery Spring
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hi,

I have written a code which read the file from location, count the word based on the separator and count the occurrence of each word.
Just want to analyze the code and is it well written? Is there any improvement required. I am trying to write it on Java8.

 
Stephan van Hulst
Saloon Keeper
Posts: 7179
117
  • Likes 1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
  • Only make classes public when they're needed by classes outside of their package. Only make classes non-final when you plan to override them.
  • Don't use hard-coded file paths.
  • Constants should be static.
  • occuranceData should be occurrenceData, or simply occurrences. There are also typos in calculateTopPhresesFromFile and calcualteOccurance. Be diligent.
  • Don't allow nulls to be passed to your methods, unless that implies something specific. Never return null from a method.
  • Don't use arrays in method signatures or return types. Use collections.
  • Don't use String to represent anything other than names. For file paths, use Path.
  • Prefer enhanced for loops if you don't expressly need an index.
  • Don't create objects over and over again if you don't have to. fileFilter can be a static final field.
  • Don't use forEach if you can use a collector. You can use flatMap() and Arrays.stream() on your lines.
  • Don't catch exceptions if you don't have a good way of handling them. Propagate them from your method if you want the caller to deal with them.
  • Don't pass Comparator.reverseOrder() to a Comparator. Just use reversed() on the Comparator.
  • You don't need to pass a merge function to toMap() if you can guarantee that the key selector never returns duplicates.
  • You should perform parameter checking in all your methods and constructors.
  • Make your fields final wherever possible.
  •  
    Campbell Ritchie
    Sheriff
    Posts: 54030
    130
    • Mark post as helpful
    • send pies
    • Quote
    • Report post to moderator
    The field limit_count shou‍ld read limitCount, or as a constant LIMIT_COUNT.

    Documentation comments don't need two *s at their end. Line 42 shou‍ld read */ not * */
    No space before [] in arrays.
    Line 70 is too long; divide it with a new line before each dot.I am sure there is a better way to do that. What you are trying to do is get all the words from your file into a List. I am getting suspicious because I could see a forEach method call.
    How do you know you will have enough information to make it worthwhile parallelising the Stream? Why not parallelise it earlier?
    Now, let's try something different. You get a Stream<String> from the file reading with the lines method.
    Then let's parallelise it straight away.
    Now, let's see if we can't get the Stream<String> into another Stream<String> with its flatMap() method, which takes some Function or other as its argument. Let's see if we can get away with a straight λ call to String#split. I tried it and it didn't work.
    You have to use map to get a Stream<String[]> and then pass Arrays::stream as a method reference to flatMap, or possibly a λ. Look at my lines 3‑4. [Addition: That is the same as Stephan said.]
    As a terminal operation, let's see if we can't turn that straight into a List. You can use the Collectors class' methods to get a ready‑made Collector, and this method returns a Collector which creates a List. Let's see whether this will work:-Your methods look confused. You have multiple return statements, and a lot of methods seem simply to call other methods with similar names. It is difficult to read the code or follow the flow of execution. I agree with Stephan: don't let nulls in and don't let nulls out. There is nothing wrong with a 0‑length List.

    Please explain more about line 82‑88. That look interesting, but is too tightly packed to read properly.
     
    Scott Shipp
    Ranch Hand
    Posts: 222
    11
    Eclipse IDE IntelliJ IDE Java Scala Ubuntu
    • Mark post as helpful
    • send pies
    • Quote
    • Report post to moderator
    Stephan van Hulst wrote:
  • Don't use arrays in method signatures or return types. Use collections.



  • What is the rationale for this advice? And does this apply to methods with any access modifier or would you differentiate between public, private, etc? Would you add any caveats for performance considerations?
     
    Stephan van Hulst
    Saloon Keeper
    Posts: 7179
    117
    • Mark post as helpful
    • send pies
    • Quote
    • Report post to moderator
    Arrays are not good for APIs. They're basic building blocks of more advanced data structures. Whether a method is private or not doesn't matter. You're still a consumer of your own private methods, and collections are much more versatile and lead to code that's easier to read. Performance will rarely play a part, because collections that use an array as their backing store (ArrayList, ArrayDeque, ArrayBlockingQueue) shouldn't perform that much poorer.

    There might be some caveats, but those will be known by software architects who have a very specific and well thought out reason to use an array instead of a collection, and only after profiling. If you're not sure you need one, then you don't need one.
     
    Piet Souris
    Rancher
    Posts: 1821
    60
    • Mark post as helpful
    • send pies
    • Quote
    • Report post to moderator
    An alternative to
          
    is

    This is not limited to limit_count, but in your code you might miss some Strings, if the Set of Strings with equal frequency is larger than limit_count.

    Edit: and oh yeah: is it not wise to convert all strings to lowercase?
     
    Scott Shipp
    Ranch Hand
    Posts: 222
    11
    Eclipse IDE IntelliJ IDE Java Scala Ubuntu
    • Mark post as helpful
    • send pies
    • Quote
    • Report post to moderator
    Stephan van Hulst wrote:Arrays are not good for APIs. They're basic building blocks of more advanced data structures. Whether a method is private or not doesn't matter. You're still a consumer of your own private methods, and collections are much more versatile and lead to code that's easier to read. Performance will rarely play a part, because collections that use an array as their backing store (ArrayList, ArrayDeque, ArrayBlockingQueue) shouldn't perform that much poorer.

    There might be some caveats, but those will be known by software architects who have a very specific and well thought out reason to use an array instead of a collection, and only after profiling. If you're not sure you need one, then you don't need one.


    What about all of these methods that take arrays in the standard Java library? What do you say about those? Or these?
     
    Campbell Ritchie
    Sheriff
    Posts: 54030
    130
    • Mark post as helpful
    • send pies
    • Quote
    • Report post to moderator
    If people are going to use arrays, they will require methods to handle them. The existence of the Arrays class means that some people use arrays, not necessarily that they are a good thing to use in production code. The Java® APIs are designed to provide methods for anybody who might want them, not simply methods frequently used.
     
    Stephan van Hulst
    Saloon Keeper
    Posts: 7179
    117
    • Mark post as helpful
    • send pies
    • Quote
    • Report post to moderator
    It would be silly for a utility class for array operations, not to work on arrays.

    String is an extremely low level data structure. It's not a good example of a high level application API.
     
    Atul More
    Greenhorn
    Posts: 24
    1
    Java jQuery Spring
    • Mark post as helpful
    • send pies
    • Quote
    • Report post to moderator
    Hi,

    Thanks Stephan for your inputs, will keep in mind those.
    @Campbell: Thanks for the explanation and alternative. The code from line 82 on wards till the end to read the list of words and get the count of each word.
    Let me write it again.

    Thanks,
    Atul
     
    Campbell Ritchie
    Sheriff
    Posts: 54030
    130
    • Mark post as helpful
    • send pies
    • Quote
    • Report post to moderator
    Atul More wrote:. . . The code from line 82 on wards till the end to read the list of words and get the count of each word. . . .
    I suggest you start by writing an explanation of each method call and what it returns; seeing the structure of the code will probably help you enhance it.
     
    Campbell Ritchie
    Sheriff
    Posts: 54030
    130
    • Mark post as helpful
    • send pies
    • Quote
    • Report post to moderator
    . . . Also please explain how you are working out whether to use a sequential Stream or a parallel Stream.
     
    • Post Reply Bookmark Topic Watch Topic
    • New Topic
    Boost this thread!