• Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Liutauras Vilda
  • Ron McLeod
  • Jeanne Boyarsky
  • Paul Clapham
Sheriffs:
  • Junilu Lacar
  • Tim Cooke
Saloon Keepers:
  • Carey Brown
  • Stephan van Hulst
  • Tim Holloway
  • Peter Rooke
  • Himai Minh
Bartenders:
  • Piet Souris
  • Mikalai Zaikin

Trying to understand why string arrays are different

 
Ranch Hand
Posts: 167
1
IntelliJ IDE MySQL Database Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
As a fairly green, relatively amateur Java programmer, I am bit upset after spending somewhere around three hours trying to figure out how to integrate the commons lang libraries into my project (which I eventually did figure out and was thereafter able to use the libraries successfully in my code), only to learn that the ArrayUtils.removeElement function will NOT remove an array element from a String[] Array BY ITS INDEX ... apparently it will do it based on a String pattern (at least I think it does), but not by the index number specifically ... so I did a little googling to see what I could do about this, only to learn that removing an element from a String[] Array BY ITS INDEX is just something you don't do ... so to heck with you if you WANT or NEED to do it ... better switch up to a ListArray and handle your stuff that way...

All I wanna do is ArrayUtils.removeElement(myArray, myArray.length - 1) or something simple like that ... WHY? Why does Java make String Arrays behave differently than any other class of Array ... maybe if someone can help me understand the unique qualities of a String array such that ... manipulating the Array by its index isn't really natural because of said qualities ... I wouldn't be so flustered over this...

Sorry for the rant, but there is a question buried in here too...

:-)

Mike
 
Bartender
Posts: 11497
19
Android Google Web Toolkit Mac Eclipse IDE Ubuntu Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Umm did you check out
 
Rancher
Posts: 4801
50
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Assuming by ArrayUtils you mean the Apache Commons class, then none of the removeElement methods work using an index. They all state:
"Removes the first occurrence of the specified element from the specified array."

A String[] is, in that regard, no different than any of the other arrays.

If you want to remove by index, then the API has a remove method which states:
"Removes the element at the specified position from the specified array."
 
Java Cowboy
Posts: 16084
88
Android Scala IntelliJ IDE Spring Java
  • Likes 2
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Note that in Java, arrays have a fixed length once they are created. If you need a data structure in which the number of elements should vary at runtime, it is better to use a collection class.

Use, for example, a List<String> instead of a String[].
 
Michael D Sims
Ranch Hand
Posts: 167
1
IntelliJ IDE MySQL Database Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Dave Tolls wrote:If you want to remove by index, then the API has a remove method which states: "Removes the element at the specified position from the specified array."



You see, thats EXACTLY how I read that to mean as well ... but it DOESN'T mean that ... look at all of the return classes that the ArrayUtil can send back ... NONE of them are String[], and as far as the "position" to be removed, it appears to me that it is only a position that it locates based on criteria that you provide which has everything to do with the data in the array, and not the array index itself.

Either way, I switched over to a List array type and life is much better now.

I had to tackle an interesting problem with this project, which inspired me to blow the dust off my Java knowledge which consists of about two Java classes and two OOP classes in college. But essentially, what I needed to do was take a text file, and format it sufficiently for importing into a database. The text files that will be converted can be anywhere from 10,000 lines of text to almost a hundred thousand or more. And here is a sample of what kind of data I had to deal with...

March 8, 2014, 8:49 a.m. Buy from xmasterg
1 18597
March 8, 2014, 8:47 a.m. Buy from xmasterg
1 18596
March 8, 2014, 8:43 a.m. Buy from xmasterg
1 18595
March 8, 2014, 8:41 a.m. Buy from xmasterg
43 22322
Oct. 27, 2013, 8:37 a.m. Buy from somerandomuser
1 61
Oct. 27, 2013, 6:33 a.m. Buy from laurahoney
if your interested, I have an easle you can have
25 60
Oct. 27, 2013, 6:23 a.m. Buy from spades
have you ever painted that in red before
5 35
Oct. 27, 2013, 6:22 a.m. Buy from spades
thank you for the lovely art piece
5 30
Oct. 27, 2013, 6 a.m. Buy from riverdude


What had to happen, is that each line in the file needed to start with a date ... and the date formats were all over the map. Sometimes they spelled out the whole month, sometimes they abbreviated it. Time formats were almost always consistent unless the transaction happened at the top of the hour in which case, the was no :00 after the hour. Sometimes AM or PM was presented as a.m. or am or p.m. or PM or P.M.... and with all of the different kinds of transactions there are (only one is shown here out of six), each transaction type did not have the same number of fields in it ... some transactions had only date, time, transaction type and monetary values with no comments at all ... other transactions had to do with actually purchasing items ... and all of this is presented in a single text file.

The hardest part ended up being how to load the data into an array, then properly append the broken lines together. What I ended up doing - because I re-wrote the code last night and made it a HECK of a lot more efficient, was load the file into a ListArray, then starting at the bottom (the largest element number), I set the index to an integer, then I threw it to a boolean hasDate function which checks the beginning of the line against an array of every possible month that they could throw at me abbreviated, full month names etc., usually the last line would be a false, so the integer is DECREMENTED, then the array is checked again, if the check fails again, it keeps decrementing until it finds a month then it takes that index checking integer and starts increasing it, appending each next increment to the index that tested positive for the date. Once its done, it deletes the index that just appended to the positive match index, the it continues, checking each next index for a date ... if it hits positive then it has sufficiently appended each line and created a single line and it starts over again .... decrementing until it finds a date etc.

here is the new code (it's about a fifth as large as the first attempt:



Anyways, it ended up being a real challenge for me ... I also got a good dose of Regex knowledge, something I used once or twice in Microsoft Word ... but the application, RegExRx was critical in learning RegEx and in testing my RE match strings. So besides straightening out all the weird line issues, I had to use two different date functions from Javas core classes (no commons at that time to use), several sub routines to split the data out into different files that comprise the tables that will get imported into the database ... the need to use RegEx to properly insert missing tabs or take away too many tabs for that particular transaction type ...

This thing ended up being a hell of a program ... and I recently added a JavaFX form (never messed with those in my life) so that the data can be copied and pasted right from a web site without the need to save it to a file ... and thats working very well...

Since I have a ton of experience with Access and VBA, I created an Access 2013 database for this project and used strictly VBA to import all of the tables (which are as close to third normal form as I think anyone could achieve), create the relationships, define the constraints and of course add new data as it comes in ... now I have to write the queries to answer all of the questions that I ultimately needed to answer with this data. But all of the access stuff was done in SQL using record sets based on SQL queries, so it could easily port over to mySQL or whatever.

It's been a fun project ... still much to do, but it's awoken the closet programmer in me.
 
Dave Tolls
Rancher
Posts: 4801
50
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Michael D Sims wrote:

Dave Tolls wrote:If you want to remove by index, then the API has a remove method which states: "Removes the element at the specified position from the specified array."



You see, thats EXACTLY how I read that to mean as well ... but it DOESN'T mean that ... look at all of the return classes that the ArrayUtil can send back ... NONE of them are String[], and as far as the "position" to be removed, it appears to me that it is only a position that it locates based on criteria that you provide which has everything to do with the data in the array, and not the array index itself.



A String is an Object.
It uses the Object[] version of the ArrayUtils.remove().
See:

Which results in the following output:
Before = [A, B, C, D, E]
After = [A, C, D, E]
 
Sheriff
Posts: 17530
300
Mac Android IntelliJ IDE Eclipse IDE Spring Debian Java Ubuntu Linux
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Michael D Sims wrote:
here is the new code (it's about a fifth as large as the first attempt:
...

It's been a fun project ... still much to do, but it's awoken the closet programmer in me.


Not trying to rain on your parade but if you look at the structure of your new code and read this article: http://blog.codinghorror.com/flattening-arrow-code/, you'll realize that there's still some things that your newly-awakened closet programmer can do to shake off the cobwebs
 
Michael D Sims
Ranch Hand
Posts: 167
1
IntelliJ IDE MySQL Database Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Junilu Lacar wrote:
Not trying to rain on your parade but if you look at the structure of your new code and read this article: http://blog.codinghorror.com/flattening-arrow-code/, you'll realize that there's still some things that your newly-awakened closet programmer can do to shake off the cobwebs


Seriously, I'm NEVER above correction, coaching, training, being code slapped ... thank you for the link, I already started reading it.
 
Michael D Sims
Ranch Hand
Posts: 167
1
IntelliJ IDE MySQL Database Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Junilu Lacar wrote:
Not trying to rain on your parade but if you look at the structure of your new code and read this article: http://blog.codinghorror.com/flattening-arrow-code/, you'll realize that there's still some things that your newly-awakened closet programmer can do to shake off the cobwebs



(referring to the code snippet below) I moved the iCheckIndex variable out into public to avoid needing to pass both a List and an int back to the caller in the CrunchLines function as I THINK I would have to make a custom data type and get real funky with the code in order to pass back two different data types from a function ... but I can't be sure about that.

Anyways ... hows this for aesthetic complexity reduction hehe ....

 
Junilu Lacar
Sheriff
Posts: 17530
300
Mac Android IntelliJ IDE Eclipse IDE Spring Debian Java Ubuntu Linux
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Ok, since we're doing a code review now...

1. You're not following Java naming conventions. Method names, variables, fields should be start with lowercase and camel case for the rest of the name. MakeSingleLines should be makeSingleLines, Counter should be counter, CrunchLines should be crunchLines, etc.

2. Comments like the ones you wrote, while I applaud the effort, are likely to be ignored. Plus, they tend not to get updated along with the code so they become obsolete fairly quickly. When I see comments that explain what the code is doing, I strive to refactor to make the code more expressive and make those comments redundant. As I was saying in another thread, I reserve writing comments for those times when I feel I need to explain why the design is the way it is.

3. Style-wise, your indentation and brace placement are non-standard, too. I use the more common style, which is default in Eclipse and I believe what has been in the official Java Coding Style Guide since forever. If I'm not mistaken, JavaRanch Cattle Drive recommended style is to put braces on a separate line aligned with the statement. Examples:

It's a matter of taste but it's better to follow one of the latter two so more people are comfortable reading the code. People are funny when it comes to style and indentation and brace placement.

If you're up for it, we can do some refactoring as I would do it with developers on my team. And I don't mean just addressing the style and, as you say, "aesthetics" of the code either. I'd go for clarity first, then flexibility and maintainability next. Performance always comes last, if I feel that I need to address it. If this is something that will be used often and there's a chance that you may need to modify the code as you encounter more date formats that you have to handle, then it may be worth structuring the code so you can do that with minimal effort and change.
 
Michael D Sims
Ranch Hand
Posts: 167
1
IntelliJ IDE MySQL Database Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Junilu Lacar wrote:Ok, since we're doing a code review now...

1. You're not following Java naming conventions. Method names, variables, fields should start with lowercase and camel case for the rest of the name. MakeSingleLines should be makeSingleLines, Counter should be counter, CrunchLines should be crunchLines, etc.


I have been use to doing as you describe, but I saw (several times) on various web pages, where people were capitalizing the first letter ... and even that didn't follow suit when it came to the naming of a variable vs. a method or a class ... so I've been a little lazy when it comes to that in this project. I actually prefer the lowercase / camel case look. What confuses me, however, is when i'm cruising along, and I'm using the IDEs feature of helping me find class methods etc., I often see methods start with capital and lower case letters ... now, after reading your reply here, I'm thinking that the ones that are capitalized must be something completely different than the ones that aren't. It's been about 4 hours since I started writing this reply, because when I realized I could not define the difference between capitalized class methods and lower case class methods, I decided to *TRY* and not waste too much of your time, so I googled around, and found a PDF version of Oracle's "Java, The Complete Reference" ... at the moment, I'm reading about literals, but I skip ahead when I get to stuff I understand fairly well. It's been many years since I took Java in college, and I had completely forgotten about polymorphism, inheritance, and honestly, the basic structure of classes themselves - specifically, getters and setters and class scope including how to properly interact with each. I am beginning to realize that my little project is becoming more and more a bowl of spaghetti than I would ever want to admit. Looking it over, it's not easily scalable, easy to read, nor does it have any custom classes when it probably should in several places. Everything is in the main class except for code excerpts that I found via Google and tossed into the mix.

I have yet to read anything in the book about naming conventions though ... although I'm sure I'll run into it soon.

2. Comments like the ones you wrote, while I applaud the effort, are likely to be ignored. Plus, they tend not to get updated along with the code so they become obsolete fairly quickly. When I see comments that explain what the code is doing, I strive to refactor



Can you explain "refactor" to me please?

to make the code more expressive and make those comments redundant. As I was saying in another thread, I reserve writing comments for those times when I feel I need to explain why the design is the way it is.



This makes perfect sense to me. In fact, it seems more like common sense than anything else. It is, however, the first time I've seen any formal definition for the use of comments in at least two decades ... what I remember is "comment, comment, COMMENT - you can never have too many comments" ... obviously with the increased ability of the computer, came the increased natural language-ability of programming languages which allowed us to write our code in such a way that it almost comments itself... now ... getting use to writing code that way will be a small challenge.

3. Style-wise, your indentation and brace placement are non-standard, too. I use the more common style, which is default in Eclipse and I believe what has been in the official Java Coding Style Guide since forever. If I'm not mistaken, JavaRanch Cattle Drive recommended style is to put braces on a separate line aligned with the statement. Examples:

It's a matter of taste but it's better to follow one of the latter two so more people are comfortable reading the code. People are funny when it comes to style and indentation and brace placement.


Understood ... I had set up IntelliJ to format my braces starting on a new line, but then I had to manually indent the first brace to make it look like I wanted it to look ... but I can get use to the brace being lined up at the start of whichever declaration it is embracing code for. I just don't like it when the brace starts at the end of the declaration line ... for some reason, my brain doesn't find that method to be very clean at all.

If you're up for it, we can do some refactoring as I would do it with developers on my team. And I don't mean just addressing the style and, as you say, "aesthetics" of the code either. I'd go for clarity first, then flexibility and maintainability next. Performance always comes last, if I feel that I need to address it. If this is something that will be used often and there's a chance that you may need to modify the code as you encounter more date formats that you have to handle, then it may be worth structuring the code so you can do that with minimal effort and change.



This is a very kind offer. I know time is valuable, but I most DEFINITELY could use the advice of an experienced coder. I haven't decided yet if I want to migrate career wise from networking into a more programming centric field. I do love programming as much I love networking. I've gotten very good with routers, switches, server and workstation operating system design, analysis, implementation and troubleshooting. So good in fact, I've been a solo consultant for several years now with clients such as realtor associations, water districts, local government, highly specialized manufacturing, CPAs, retail etc. I don't do home user computer consulting AT ALL (except the occasional home computer issue where the customer is a member of one of my corporate accounts). But a recent and dramatic change in my personal life has left me feeling like I want to make a change in my professional life too. Either way, the mentoring can NEVER be a bad thing and can only make me that much more knowledgable when it comes to working with programmers which happens from time to time.

The project I'm working on now, as I've said in past posts, takes raw and completely un-normalized records and then formats then normalizes the data into proper tables with tab delimitation. I just started including JavaFX with a Scene Builder form so that the program is easier to interact with. my main.java file has gotten littered with "methods". For some reason, I still think sub-routines or functions in my head, and after my reading this morning, I realize that referring to them as methods implies actually using them as an integral part of a class where it interacts with the core properties of that class, and my code does nothing of the sort ... it's been written more like process code and not object oriented code ... I have much work to do if I'm going to clean it up I'm afraid. I would happily employ your guidance in as much as you are willing to offer it.

Mike
 
Michael D Sims
Ranch Hand
Posts: 167
1
IntelliJ IDE MySQL Database Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
I’m very proud of myself, because I was able to drastically UN-COMPLICATE this portion of my code, and I even reduced it down to PROPER OOP methodology (which was the most exciting part for me, because I seem to understand classes well enough, that I’ve reduced the code into a fairly natural read).

If you don’t remember what I had before with all the nested if statements within nested loops being executed against a ListArray complete with index shifting etc., then scroll up a few posts and take a look. But essentially, I’ve broken it down to this:

This is not my Main class, but it is the class that does the meat of the work. The Main class sets up the fxml form and then calls the ProcessData method.


The StringProcessor class is new as of this small re-mod. My intention here is to use this class for various string manipulation such as converting dates into formats that SQL tables will accept etc. The fun for me will be in creating new instances of the class and invoking its useful methods against that string in one line of code as I did for rawData.toSingleLines above.



And here you have your every day, run of the mill file reader class.

:-)

 
Junilu Lacar
Sheriff
Posts: 17530
300
Mac Android IntelliJ IDE Eclipse IDE Spring Debian Java Ubuntu Linux
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Ah, great way to start the weekend... do a code review!

It's always good to start these with a little bit of humor, so here's something to think and perhaps chuckle about:

In a 12 August 1997 Usenet post, Netscape engineer Jamie Zawinski wrote:Some people, when confronted by a problem, think "I know, I'll use regular expressions." Now they have two problems.


We'll circle back to regexs later but first, let's talk a little about names and naming conventions.

Objects and classes represent things and therefore should generally have names that are nouns or noun phrases. Names like ProcessData, ReadWholeFile, and TestThings are verb phrases. Verbs and verb phrases are appropriate for method names, not object and class names.

Also, words like "process", "data", and "things" are very general. Combined, as in "ProcessData", they don't really say anything more specific than "DoSomethingWithSomeStuff". I always ask my team to change names like that. Put a little more effort into clearly expressing what that "process" actually does to that "data" or explain exactly what those "things" are. And don't give that name to an object or class. That's a method name. Would you name your dog "Stay"? Can you imagine how hard it would be to train that dog?

You: "Stay!"
Stay: ("Wait, what?! Is he calling me or telling me not to move? Oh, look, a squirrel!")
You:
 
Junilu Lacar
Sheriff
Posts: 17530
300
Mac Android IntelliJ IDE Eclipse IDE Spring Debian Java Ubuntu Linux
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

You already know about the issue with the name ProcessData so let's look at line 3.

The name useTextFile is perfect; it reveals exactly how you intend to use it.

However, you declared it as public. Do you really want this available to everyone? That means that anyone and anything can see it and worse, change it. Imagine putting the switch to your master bathroom outside, on your front porch. Aside from the obvious inconvenience of having to run to your front door before and after going to the bathroom, anybody can come up to your porch at any hour and flip that switch. Imagine how disconcerting that would be if that happened while you were conducting some private "business." You might as well put that switch in a checkout lane at your nearest WalMart

Not a good idea, right? As a general rule, keep the scope of things as small as you can until you have a compelling reason to make it more generally available. This includes classes, objects, methods, instance variables, and class variables like useTextFile.

 
Junilu Lacar
Sheriff
Posts: 17530
300
Mac Android IntelliJ IDE Eclipse IDE Spring Debian Java Ubuntu Linux
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
There's a lot more (don't get discouraged, you're making progress) but I'll leave it with this one for now:

First of all, I don't know if you intended to make this a constructor or a class method. If you wanted to make a constructor, you would not make it static or declare a return type. Not even a void return type.

Second, again with the long comments. While I applaud your effort, you should write comments with someone else in mind. You may not realize it, but most people who aren't the original author of the code get easily bored by comments like this. I can hardly make it past "dataTextBox". It's best to use a JavaDoc comment here, like so (assuming you wanted to make this a constructor):

Notice how I don't really care where the value for the filename parameter is coming from. To mention that it happens to come from a data text box on the GUI is inappropriate here. You want your comment to be irrespective of any kind of implementation detail like that - that way, the comment stays valid if the implementation details change. And of course, the name ProcessData is not a good one, as we've said before.

Read up on how to write good JavaDocs
 
Saloon Keeper
Posts: 10167
81
Eclipse IDE Firefox Browser MySQL Database VI Editor Java Windows
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
You mention that the input file may become extremely large. I don't think reading the whole thing into memory is a good idea. In addition, your input file is neatly laid out in lines of text though it appears that some lines are optional. I suggest making a Record class that embodies the structure of a record, including optional fields. Then I recommend a Parser class with a getNextRecord() method that reads the text file, one line at a time, and handles the optional fields. Use BufferedReader to read the lines. Do not do massive amounts of string replace all to paste records together and then break them apart.


 
Junilu Lacar
Sheriff
Posts: 17530
300
Mac Android IntelliJ IDE Eclipse IDE Spring Debian Java Ubuntu Linux
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Carey raises some good points but I'm going to continue reviewing the code that Michael posted. Carey is following a tendency that many programmers have: to try to make the program more efficient. Donald Knuth is often quoted as saying "Premature optimization is the root of all evil" and I believe there's a lot of truth to that. If you do TDD for a while, your focus tends to be more about the design - you want your design to be well-factored and you want your code to clearly express the design.

So, I did a little reformatting but here's the code I want to look at next:

Let's go over the design a little bit and focus on two variables: useTextFile and dataTextBox.

If useTextFile is true, dataTextBox is pretty much ignored and the program will ask the user to choose a file in a GUI dialog. If useTextFile is false, dataTextBox is used as a String of raw data. This violates the KISS Principle: Keep It Simple, St****! (fill in the blanks yourself). The poor choice of names makes matters worse and the code itself is telling you this with line 31 where you're essentially renaming dataTextBox to rawData. This approach makes the method currently known as ProcessData a bit incoherent; it's all over the place with causing a GUI dialog to pause execution and wait for user input, checking if the input matches a file name pattern, to causing a file to be opened and read in, or using a parameter as raw data. If you imagine a harried housewife running around trying to cook, do dishes, vacuum, do laundry, iron, shop for groceries, mow the lawn, etc. it wouldn't be very different from how I see this code.

KISS is key to making the program more coherent and methods simpler and more cohesive.

So what should you do? Here's my suggestion for refactoring this: separate the work of processing the input from the work of obtaining the input. This is known as "Separation of Concerns" or as I like to think of it "The left hand not knowing what the right hand does."

I've provided a number of links that will lead you to more information that you will find useful. Right now I have to go watch a football game (Go Bucks!). I'll post a followup on this later.
 
Junilu Lacar
Sheriff
Posts: 17530
300
Mac Android IntelliJ IDE Eclipse IDE Spring Debian Java Ubuntu Linux
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Halftime and the Bucks are up by a touchdown , despite two turnovers

As I was saying, after some renaming of variables and methods, I would continue refactoring by separating some of these concerns that are making a mixed bag of this one method. Before writing any code, I'd try to list out what those concerns are. Still looking only at the code from lines 8 to 32, some candidate concerns/tasks are:

1. Reading text to be processed from a data file
2. Letting user pick a file to read
3. Using a given string as the text to be processed

I not convinced that you need the useTextFile field. Looking back at that long comment again:

I would say change your UI a bit so that you have a text box into which the user can paste data directly and a button they can click to show a File Selection dialog. This way, you won't need a useTextFile checkbox. This is the part that deals with the concern of obtaining the text to be processed.
When you separate this concern sufficiently, then the handoff to the concern of processing the text becomes cleaner. One hand obtains the text then passes that text to the other hand which processes it. Coded in the simplest way possible:

Of course, as I've been saying all along, you should find some proper names for these classes/objects/methods/variables. But this is where I would start. Then I would drill down and start breaking up the tasks into smaller, more detailed chunks of work.

Second half is starting so I'll be back again later. Hopefully, I'll be like and not
 
Michael D Sims
Ranch Hand
Posts: 167
1
IntelliJ IDE MySQL Database Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Junilu Lacar wrote:Ah, great way to start the weekend... do a code review!

It's always good to start these with a little bit of humor, so here's something to think and perhaps chuckle about:

In a 12 August 1997 Usenet post, Netscape engineer Jamie Zawinski wrote:Some people, when confronted by a problem, think "I know, I'll use regular expressions." Now they have two problems.




OK, I did chuckle ... but then thought ... “wait ... what?"

Junilu Lacar wrote:Objects and classes represent things and therefore should generally have names that are nouns or noun phrases. Names like ProcessData, ReadWholeFile, and TestThings are verb phrases. Verbs and verb phrases are appropriate for method names, not object and class names.



This makes TOO much sense ... I need to start being more aware of what I’m creating with my code and then more cognizant of potential humans reading it (or heck, me reading it a week later for that matter).

Junilu Lacar wrote:Also, words like "process", "data", and "things" are very general. Combined, as in "ProcessData", they don't really say anything more specific than "DoSomethingWithSomeStuff". I always ask my team to change names like that. Put a little more effort into clearly expressing what that "process" actually does to that "data" or explain exactly what those "things" are. And don't give that name to an object or class. That's a method name.



Again, this makes perfect sense.

Junilu Lacar wrote:Would you name your dog "Stay"? Can you imagine how hard it would be to train that dog?



A Stephen Wright reference???

:-)

Junilu Lacar wrote:
The name useTextFile is perfect; it reveals exactly how you intend to use it.

However, you declared it as public. Do you really want this available to everyone?



I admit … I was being lazy doing that… I didn’t want to create a setter and I assumed that my main class would have to instantiate an instance of my ProcessData class which I really didn’t want main to be tied to ProcessData that intimately because ProcessData is MUCH MUCH MUCH larger than I presented it here … I merely clipped the code that was relevant to the point I was trying to make… But now that I think about it … I could have Main instantiate ProcessData (or whatever new name I’m going to give it soon since apparently there is no such thing as a class verb … *hmphh*) … then set useTextFile and then make the call to the main proc in that class and let it run from there …

I’m thinking I really need to re-think this program a little harder and drag it’s fat arse into the light of OOP … but it will go kicking and screaming, I assure you.

Junilu Lacar wrote:There's a lot more (don't get discouraged, you're making progress) but I'll leave it with this one for now:


PLEASE! you are so far from my discouragement level that you should actually picture me like a three year old that your in the process of handing a new ice cream cone to … hand clapping and everything … :-)

Junilu Lacar wrote:

First of all, I don't know if you intended to make this a constructor or a class method.



Ummmmm….. lemma google the difference before I answer that ….

OK, I think I understand … lets realize first and foremost that I get most of my knowledge from examples from googling around … so lets take for example a Hello World bit of code that I downloaded and played with some time ago … here’s how it declared the Main class:



So following that example … it only seemed right to declare the “main” procedure for the stated class in like manner …

This was a bad assumption on my part I assume?

Junilu Lacar wrote:Second, again with the long comments. While I applaud your effort ...



I forgot to mention … those comments don’t actually exist in my code … I write them for the benefit of this forum … should I not do that anymore?

I’ll comment on some of your other posts soon …

Michael
 
Junilu Lacar
Sheriff
Posts: 17530
300
Mac Android IntelliJ IDE Eclipse IDE Spring Debian Java Ubuntu Linux
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Well, here I am again, two days after the Bucks served up a nice cold dish of revenge to Sparty for last year's loss in the B1G championship game. Completely forgot about coming back after the game. Anyway...

Michael D Sims wrote:A Stephen Wright reference?


No, it was actually in reference to something that Michael Sims wrote

Google gave me a bunch of Steven Wright quotes but I've never heard of either Steven or Stephen before your mention. Funny guy though, that Steven. Thanks for the introduction.

I could have Main instantiate ProcessData


Yes, that's exactly one of the strategies suggested in our "Main Is A Pain" article. But, just to deal the fatal blow to this dead horse named ProcessData one more time, make sure to take care of that name, Ok?

So following that example … it only seemed right to declare the “main” procedure for the stated class in like manner …


I can see how you might think that and that's actually kind of logical if you didn't know otherwise. However, some things defy conventional logic which is why main can be a confusing pain. All public static void main methods have a "special" standing with the JVM since they are the ones that the runtime system will look for, literally by name, at startup. No other method has this (dubious) honor. It's like a curse to the poor fella. He's the JVM's #1 goto guy but once they know better, all the programmers want as little to do with main as possible and try to stay the heck away from him. Despite his favored position with the JVM, main is essentially just a Redshirt.

those comments don’t actually exist in my code … I write them for the benefit of this forum … should I not do that anymore?


Yeah, let's leave those out of the code tags since they're only serving to trigger my OCD tendencies towards comments. If you really feel the need to, just put your clarification notes in the main post text, like this.

PLEASE! you are so far from my discouragement level that you should actually picture me like a three year old that your in the process of handing a new ice cream cone to … hand clapping and everything … :-)


Well, I guess you're too young to be afraid of widths then. Put your big boy pants on and get ready for a lot more ice cream, buddy!

In case you missed it, that there was a reference to your friend:

Steven Wright wrote:A lot of people are afraid of heights. Not me, I'm afraid of widths.

 
Michael D Sims
Ranch Hand
Posts: 167
1
IntelliJ IDE MySQL Database Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Junilu Lacar wrote:Well, here I am again, two days after the Bucks served up a nice cold dish of revenge to Sparty for last year's loss in the B1G championship game. Completely forgot about coming back after the game.


Hate to say it ... but you may as well have said that the NASCAR players are having a meet against the San Jose hockey team ... thats about how much I’m into sports ... then again you did say the Bucks ... maybe that means Bucaneers? and that would be .... football?

Junilu Lacar wrote:

Michael D Sims wrote:A Stephen Wright reference?


No, it was actually in reference to something that Michael Sims wrote

Google gave me a bunch of Steven Wright quotes but I've never heard of either Steven or Stephen before your mention. Funny guy though, that Steven. Thanks for the introduction.


He had a famous stand up routine that HBO played to death in the 1980’s ... one of his jokes was, “I named my dog ‘Stay’ so that when I call him, he gets confused ... ‘come here ... stay ... come here ...... stay....’ He had another famous one ... “I spilled spot remover on my dog .... now he’s gone.”

:-)

Junilu Lacar wrote:

I could have Main instantiate ProcessData


Yes, that's exactly one of the strategies suggested in our "Main Is A Pain" article. But, just to deal the fatal blow to this dead horse named ProcessData one more time, make sure to take care of that name, Ok?


So I’ve been thinking about this whole naming thing in context of the building reference made earlier ... (at least I think you made a building reference) ... you know ... a warehouse actually houses wares ... and a restaurant literally comes from the French word " la restauration", meaning “restoration” ... a chicken coupe literally coupes chickens and so on ... so even though they are nouns, they have verbs wrapped up inside them and very neatly disguised as nouns...

Just making a point is all on that ....

But been thinking about the class name ... and I think I may have come up with some ideas ...

1) InformationCruncher
2) DataSorter
3) KnowerOfAllThingsData
4) DataMonster (Sesame Street reference - I have kids ... :-)
5) BoyTheThingsICanDoWithData
6) NotYourMomsDataCruncher

I could go on ... but I digress...

Junilu Lacar wrote:

So following that example … it only seemed right to declare the “main” procedure for the stated class in like manner …


I can see how you might think that and that's actually kind of logical if you didn't know otherwise. However, some things defy conventional logic which is why main can be a confusing pain. All public static void main methods have a "special" standing with the JVM since they are the ones that the runtime system will look for, literally by name, at startup. No other method has this (dubious) honor. It's like a curse to the poor fella. He's the JVM's #1 goto guy but once they know better, all the programmers want as little to do with main as possible and try to stay the heck away from him. Despite his favored position with the JVM, main is essentially just a Redshirt.



Yes, I remember reading this... the first main must be public so that the VM can call it ... but after that, there is OBVIOUSLY (by deduction) no such need for every class to be exposed in like manner... I get it.

Junilu Lacar wrote:

those comments don’t actually exist in my code … I write them for the benefit of this forum … should I not do that anymore?


Yeah, let's leave those out of the code tags since they're only serving to trigger my OCD tendencies towards comments. If you really feel the need to, just put your clarification notes in the main post text, like this.


Is OCD a pre-requisite for successful software engineers? Cause I run into that a lot when I deal with programmers ... it is a curious phenomena if not...

Junilu Lacar wrote:

PLEASE! you are so far from my discouragement level that you should actually picture me like a three year old that your in the process of handing a new ice cream cone to … hand clapping and everything … :-)


Well, I guess you're too young to be afraid of widths then. Put your big boy pants on and get ready for a lot more ice cream, buddy!

In case you missed it, that there was a reference to your friend:

Steven Wright wrote:A lot of people are afraid of heights. Not me, I'm afraid of widths.


Yes, I remember this! And the red shirt comment I got immediately being a trekkie myself... :-)
 
Junilu Lacar
Sheriff
Posts: 17530
300
Mac Android IntelliJ IDE Eclipse IDE Spring Debian Java Ubuntu Linux
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
It's college football. Being from Columbus OH, I root for the OSU Buckeyes, as in "Go Bucks!" The Tampa Bay Buccaneers are called the "Bucs" for short.

I honestly never heard of Steven Wright before you mentioned him and I've never seen his bit about the dog Stay before. I was actually considering using "Heel", "PlayDead", or "RollOver" but settled on "Stay" for some reason. Maybe I was channeling him or something, I don't know. And for some reason, I had the image of the dog in the Disney movie "Up" when I was writing that which is why I added "Oh look, a squirrel!"

Anyway, to continue the code review...

I want to skip ahead to this code:

You already know what I'm going to say about the class name, ReadWholeFile so I'll assume you're going to change that. The method name ReadFile() should also be readFile() instead, so you're following standard naming conventions for methods.

I don't know if I can explain very well what I find "a little off" about this code. There's really nothing technically bad about it besides the issue that Carey pointed out about reading the entire file into memory. I think my inclination would be to not make it an entirely different class. I just don't see that there's enough encapsulated information and behavior in it to justify a whole 'nother class. It's kind of like how scientists didn't think that Pluto was "planet enough" anymore so they knocked its status down to "dwarf planet". It's no biggie and it's probably just that O-C part of my being a programmer kicking in again. And yes, most good programmers have a tendency to be O-C about things like indentation, matching braces, names, blank spaces, and even number of return statements. There are lines between being diligent and meticulous, being O-C, being dogmatic, and just being plain old pig-headed, so you'll have to learn how to spot the differences and deal with these on a case-to-case basis.

Maybe you should try to make a few changes based on the feedback so far before we go on, assuming you still want to go on.
 
Michael D Sims
Ranch Hand
Posts: 167
1
IntelliJ IDE MySQL Database Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Junilu Lacar wrote:Maybe you should try to make a few changes based on the feedback so far before we go on, assuming you still want to go on.


I absolutely do ... I apologize for my lengthy time away ... had a sizable job to do and was a little under the weather to boot... going to do some code mods now... wanted to check in first to see if there was anything new.

:-)
 
Michael D Sims
Ranch Hand
Posts: 167
1
IntelliJ IDE MySQL Database Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Junilu Lacar wrote:There's really nothing technically bad about it besides the issue that Carey pointed out about reading the entire file into memory.


How can I enact change on the entire file at once if I don't first read the whole thing into memory? Processing this file one line at a time for my purposes is cumbersome and the code gets ... a little twisted. Doing it this way proved to be MUCH simpler and it actually formats properly under more diverse situations than the line by line method as the data thats copied into the program could face a slew of different formatting methods down the road, and the line by line method that I wrote assumes a static layout for the data. Using the RegEx method acting on all of it at once zaps the data properly without caring about different ways of possible presentation from the source.

 
Michael D Sims
Ranch Hand
Posts: 167
1
IntelliJ IDE MySQL Database Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Junilu Lacar wrote:Anyway, to continue the code review...

I want to skip ahead to this code:

I don't know if I can explain very well what I find "a little off" about this code. <snip> I just don't see that there's enough encapsulated information and behavior in it to justify a whole 'nother class.



How about this ... combining the StringProcessor class and the ReadWholeFile class (hence forth known as SourceDataReader) into a single class? The only thing that you *MAY* find odd about it, is that I need to process the data whether it's provided in a file, or it's been copied and pasted into a text box on the main form. I commented the line that handles this, however ... you tell me if it's acceptable or if it's ... "red neck tech".



And here's the method that calls it...



Better???
 
Junilu Lacar
Sheriff
Posts: 17530
300
Mac Android IntelliJ IDE Eclipse IDE Spring Debian Java Ubuntu Linux
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Hi Michael,

Tell you what, I think this problem can be solved elegantly with Java 8 streams. How about I give it a shot over the weekend and try to come up with a solution. I'll post it here and we can discuss from there. Just got back from a business trip and I need to catch up on some work but I think I can come up with something pretty quickly tonight.
 
Junilu Lacar
Sheriff
Posts: 17530
300
Mac Android IntelliJ IDE Eclipse IDE Spring Debian Java Ubuntu Linux
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Here are a few comments on your SourceDataReader class:

1. It takes on more responsibility than its name leads you to believe. One responsibility is to manage and read from a FileInputStream. Another is to convert a String to single lines (toSingleLines). These responsibilities together don't make for a very coherent class that follows the Single Responsibility Principle.

2. It shirks a responsibility that its name leads you to believe it would take on: when filePath is empty, it bails out on reading input because it assumes that data was entered directly by the user. As the SourceDataReader name implies, it should accept input from either a file or a String entered directly by the user.

3. The instance variable userString is used only in the toSingleLines method. It seems more appropriate to declare it as a local variable to that method instead.

First change I would try would be to have this API instead:

With this API, you can create a SourceDataReader with a File or a String. It's pretty obvious which case each will handle as far as the source of the data is concerned. The getEncodedData method will be responsible for the encoding, which it looks like you want UTF-8. This should work with either method of input (raw String entered by user or read from a file). Additionally, you might want to provide an overloaded getEncodedData method that accepts an encoding as a parameter.
 
Junilu Lacar
Sheriff
Posts: 17530
300
Mac Android IntelliJ IDE Eclipse IDE Spring Debian Java Ubuntu Linux
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
I think in general, your code does not have responsibilities separated and assigned very well. Since it's Sunday and "The Walking Dead" is on, it's kind of like how Hershel used to say, "Everybody has their job to do" and frankly, the code you have has classes that do things that aren't really their job to do. Here's how I'd divvy up the work:

1. UI - two tabs: one tab with a File Chooser, the other tab with a text area. User can choose one way to specify the data source and then click on a button to start processing.
2. SourceReader with the API that I mentioned in the previous reply
3. DataFormatter - takes care of formatting a String (provided by a SourceReader) and formats it in the way you want it formatted.


That last part where you construct a DataFormatter and pass it the reader to use is an example of Dependency Injection, where objects don't create other objects with which they collaborate but instead have some external entity "inject" or hand them a reference to the collaborator. This results in a loosely coupled design and greater flexibility. It also leads you to better ways to separate concerns and assign responsibilities appropriately.
 
Junilu Lacar
Sheriff
Posts: 17530
300
Mac Android IntelliJ IDE Eclipse IDE Spring Debian Java Ubuntu Linux
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Another observation: I think it might be worthwhile considering using a FileOutputStream as the final output destination. It makes more sense if you get input from a file and end up with another file rather than just have a long string as the final product. And even if the input was a String that the user typed in or copy/pasted from somewhere, it still might make sense to write the final string to a file. Again, separation of concerns comes into play. If you eventually want to import the cleaned up log somewhere, that import functionality should be assigned to another class and it would be simpler to have just one source of data for that class to deal with, that is, the file that the formatting component produced. Of course, that's assuming that the import will be done some time later, in a different process. If you want to import the data in the same process right after you format it properly, then the File will not be necessary and you can deal with the String directly.
 
Consider Paul's rocket mass heater.
reply
    Bookmark Topic Watch Topic
  • New Topic