Analyzing data from several csv files and writing a result to a new file

 
Tony Marcello
Greenhorn
Posts: 27
I am a Java beginner and am having great difficulty getting my head around certain behaviors. I would greatly appreciate any guidance as I have hit a brick wall and don't know if what I have done is all wrong or if I am on the right path but just need to course correct.

It's my first time connecting methods and classes to each other, so I need some pointers.

I am reading 3 different CSV files within the CourseAnalytics class. However, the relevant data for the end output is confined to 2 of them. In one file I have a column in which I am counting the occurrences of each course ID, and then ultimately I need to match that up with the course names and output to a new CSV.

My generateCourseCounts method is supposed to call the previous methods, which read the 3 CSV files, and then analyze the data to produce the results. However, I can't figure out how to do the calculation currently in generateCourseCounts without hardcoding the file reading again. My end result needs to show how many students are enrolled in each course. From what I can make out, that would require counting the occurrences of the courseId data in studentcourses.csv and outputting each count together with the course name from courses.csv.

Even though the way it's currently set up isn't ideal, I can print to console the results I need to correlate but then I don't know how to use methods in another method to print to file.

Sorry if this is all confusing but am banging my head against a brick wall, struggling with terminology and putting things into action that I don't understand fully.

I got the aforementioned calculation in generateCourseCodes from Stack Overflow but don't understand the lambda expression. I would imagine that, rather than printing, it needs to return a value which can then be used together with the course name to write to a new file.

Very much appreciate any help.

[Attachments: 1DC2E117-AD87-4819-B18F-848CEEDCE3AE.jpeg, 4FF21128-11FA-48B8-AFBE-2641013A3BC0.jpeg, 4BD986BF-5A12-45C5-85B4-4E12606EB048.jpeg]
 
Knute Snortum
Sheriff
Posts: 7125
Welcome to the Ranch, Tony Marcello!

I am having some trouble figuring out the best way to help you. Are you saying that what is printed in generateCourseCounts() is correct and you just need to figure out how to post them to a file?  Or is there more to it than that?

There are a lot of changes you could make to your classes, but maybe let's focus on getting the class working.
 
Tony Marcello
Greenhorn
Posts: 27

Knute Snortum wrote:Welcome to the Ranch, Tony Marcello!



Thanks very much for the welcome and offer to support. I have revised a lot of the code and tidied it up somewhat today and now know specifically what I wish to do with it. I left some old code in but commented out as I still think that it possibly contains part of my solution, I just need to figure out how to edit it to suit my needs.

What I would like to do is this:

generateCourseCounts loops through the StudentCourse collection using the object's courseId getter, and increments the appropriate count in the Map local variable.
generateCourseCounts then uses another local variable, a collection, to sort the counts in order of # Students, decreasing.
generateCourseCounts finally creates the expected output file CourseCounts.csv. This file will be tab-delimited, with a header row. Each line in the file (after the header row) will have two values - the name of the course, and the # students registered for that course.

Head is fried and am going to park it for tonight but hoping to get this out there now in case there are any thoughts and suggestions before I get going again. As always, thanking you.


 
Knute Snortum
Sheriff
Posts: 7125

generateCourseCounts loops through the StudentCourse collection using the object's courseId getter, and increments the appropriate count in the Map local variable.
generateCourseCounts then uses another local variable, a collection, to sort the counts in order of # Students, decreasing.
generateCourseCounts finally creates the expected output file CourseCounts.csv. This file will be tab-delimited, with a header row. Each line in the file (after the header row) will have two values - the name of the course, and the # students registered for that course.


This is too much for one method to do.  The second requirement can be done in one line, so I think we need two methods; one to create the data and one to write it to a file.

You're on the right track to use a Map<Integer, Integer> as you'll need a courseId and a count of students.

I'm not sure why you are loading Students, StudentCourses, and Courses without using the return values.  Do you need any of that?  Your commented lines don't need them.

The commented lines use the wrong type of Map and accumulate a List of Students instead of a count.

After you have your Map created, pass it to a method that writes it to a file.

There is a lot more to do, but start with cleaning up generateCourseCounts() to just create the correct Map.
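
As a rough illustration of the "one line" sort mentioned above, here is a minimal sketch. It assumes the counts are already in a Map<Integer, Integer> of course ID to student count; the class name CountSorter, the method name, and the sample numbers are made up for this example and are not taken from the code in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper; names are assumptions, not from the original posts.
final class CountSorter {

    // Returns the map's entries ordered by student count, largest first.
    static List<Map.Entry<Integer, Integer>> sortByCountDescending(Map<Integer, Integer> countsByCourseId) {
        // Copy the entries into a List, because a Map itself has no sort order.
        List<Map.Entry<Integer, Integer>> entries = new ArrayList<>(countsByCourseId.entrySet());
        // The "one line": compare entries by value, then reverse for decreasing order.
        entries.sort(Map.Entry.<Integer, Integer>comparingByValue().reversed());
        return entries;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> counts = Map.of(1, 4, 2, 9, 3, 6); // made-up sample data
        for (Map.Entry<Integer, Integer> entry : sortByCountDescending(counts)) {
            System.out.println(entry.getKey() + "\t" + entry.getValue());
        }
    }
}
```

Keeping the sort in its own small method leaves generateCourseCounts() free to do nothing but build the Map.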
 
Tony Marcello
Greenhorn
Posts: 27

Knute Snortum wrote:
The commented lines use the wrong type of Map and accumulate a List of Students instead of a count.
After you have your Map created, pass it to a method that writes it to a file.
There is a lot more to do, but start with cleaning up generateCourseCounts() to just create the correct Map.



Once again, thank you kindly for your valuable input. It is appreciated.

So, I have started working on the for loop and I have come up with this. Although it doesn't work, as some of the code is wrong, it is a big win for me because I have figured out how to access the data from loadStudentCourses.

I know this does not work as it stands but is this moving in the right direction?

The issue is with split AND understanding lambda expressions, which are my Achilles heel.


 
Tony Marcello
Greenhorn
Posts: 27
Alternatively this:



BUT as it stands it is not actually increasing the count... it just returns a value of 1 for each occurrence.
 
Paul Clapham
Marshal
Posts: 28177
You call loadStudentCourses and assign it to a list of StudentCourse objects, but then you subsequently don't use any of those objects. I think that's a significant thing.
 
Tony Marcello
Greenhorn
Posts: 27

Paul Clapham wrote:You call loadStudentCourses and assign it to a list of StudentCourse objects, but then you subsequently don't use any of those objects. I think that's a significant thing.



Thanks very much for your input. Would you be kind enough to advise on how I use the object? I think I have bitten off more than I can chew with this project.
 
Paul Clapham
Marshal
Posts: 28177


Let's have a look at this declaration. Your variable name suggests it's for a map which contains "counts" per "course". And by the way your variable names are extremely good -- descriptive variable names! They help greatly with design.

So anyway, "counts" per "course". This should be implemented as a Map<course, count>. But you haven't quite done that...
 
Tony Marcello
Greenhorn
Posts: 27

Paul Clapham wrote:
This should be implemented as a Map<course, count>. But you haven't quite done that...



Again, thank you for your input/guidance.
So, something like this?:



I had to change containsKey to contains. Is that acceptable? However, I know this isn't right because the get method isn't working. sc is where the data has been loaded, so sc must be where data is iterated through, where it contains data and gets it from. Whereas accummulateCountOfStudentsPerCourse is the map which we wish to put the data into. Or am I wrong?

*****

After some longer thought, should my starting point actually be:

 
Paul Clapham
Marshal
Posts: 28177
No.



This doesn't map courses to counts. A count is obviously an integer, but a course isn't an integer.
 
Tony Marcello
Greenhorn
Posts: 27

Paul Clapham wrote:No.



This doesn't map courses to counts. A count is obviously an integer, but a course isn't an integer.



Thank you very much for your patience.

So, like this or am I missing the point?

 
Paul Clapham
Marshal
Posts: 28177
No, you don't want to map lists of courses to integers, you want to map courses to integers. So:

 
Tony Marcello
Greenhorn
Posts: 27

Paul Clapham wrote:No, you don't want to map lists of courses to integers, you want to map courses to integers. So:



Thank you! And does that mean that StudentCourse is now loaded? And that I don't have to do:

 
Paul Clapham
Marshal
Posts: 28177
No. Just because you declare the type of a variable, it doesn't follow that you have assigned a value to that variable. Consider



That doesn't mean you have assigned a value to that variable.

So in terms of your code: you've declared a Map which maps courses (StudentCourse to you) to integers. And you're going to assign an empty Map to that variable. Next you're going to go through the list of courses and fill in the map, using code similar to what you already wrote. Only you're going to write a loop which goes through the courses and not a loop over some integers, because after all it's the courses you're interested in.
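
The declare / assign / loop sequence described above could be sketched as follows. This is only an illustration under assumptions: the StudentCourse class here is a stand-in (the real one was posted as an image and may differ), IfElseCounter is a made-up name, and the map is keyed by course ID, which is what the later posts in this thread work with. The same loop shape applies to any key type that has correct equals() and hashCode() methods.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in for the poster's class; fields and constructor are assumptions.
final class StudentCourse {
    private final int studentId;
    private final int courseId;

    StudentCourse(int studentId, int courseId) {
        this.studentId = studentId;
        this.courseId = courseId;
    }

    int getStudentId() { return studentId; }
    int getCourseId() { return courseId; }
}

// Hypothetical helper class, not from the original posts.
final class IfElseCounter {

    static Map<Integer, Integer> countStudentsPerCourse(List<StudentCourse> studentCourses) {
        // Declaring the type alone assigns nothing; the "new HashMap<>()" is the assignment.
        Map<Integer, Integer> countsByCourseId = new HashMap<>();

        // Loop over the loaded objects, not over some range of integers.
        for (StudentCourse sc : studentCourses) {
            int courseId = sc.getCourseId();
            if (countsByCourseId.containsKey(courseId)) {
                // Seen before: add one to the stored count.
                countsByCourseId.put(courseId, countsByCourseId.get(courseId) + 1);
            } else {
                // First occurrence: without this else branch the key is never added.
                countsByCourseId.put(courseId, 1);
            }
        }
        return countsByCourseId;
    }
}
```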
 
Tony Marcello
Greenhorn
Posts: 27
Something like this then:



Which I think works but not entirely sure, it certainly does something.
 
Paul Clapham
Marshal
Posts: 28177
I don't understand why you're so resistant to the idea of



As for your code which updates the map, you had some reasonable-looking code with an if-else clause in an earlier post. Now you have a similar-looking if-clause but the else-clause has vanished.
 
Tony Marcello
Greenhorn
Posts: 27
Apologies, but in my ignorance I couldn't get to grips with what you were advising.
I'd forgotten about the else and that was what was stopping it from working. However, I have definitely got it working now:



The only thing left to fix is that my loadCourses is displaying the first row, i.e. the header row. Then I need to output to CSV, but I think I am onto that part.
 
Rancher
Posts: 285
Doing it the way Paul suggests is good for a few reasons:

#1 If a course number changes, then all you need to do is update it in your StudentCourse object. Otherwise you'd need to edit the HashMap itself to fix that course number, which isn't a very good way of maintaining records like this. Think of it like a database where each object is a record.

#2 If you ever need to represent 2 different courses that have the exact same courseNumber, this makes it possible. For example, an autoshop class might have been 2121 and then discontinued, with a new woodshop class now numbered 2121, and both classes need to be represented in the database.
 
Tony Marcello
Greenhorn
Posts: 27
I am 100% sure that the generous suggestions put forward are for my benefit, but the imperative for me is to get this working. Then, when I fully understand the mechanics of it, I shall go back and try to understand how to refine it.
 
Campbell Ritchie
Marshal
Posts: 79153
We appreciate being called generous but I think you are putting the cart before the horse. You won't be able to implement anything until you understand how it works. I haven't read enough of this discussion to know whether the following link is directly relevant, but there is a counting exercise in the Java™ Tutorials. If you don't understand the explanation, please ask again.
 
Tony Marcello
Greenhorn
Posts: 27

Campbell Ritchie wrote:I think you are putting the cart before the horse.



I'm actually very aware that I am. Normally this is not my strategy, and going forward I will need to master things before I proceed, but unfortunately I am not in a position to do that right now. I am going to look at that link right now. Thank you ever so much for sharing it.
 
Campbell Ritchie
Marshal
Posts: 79153

Tony Marcello wrote:. . . . Thank you ever so much . . . .

That's a pleasure
 
Tony Marcello
Greenhorn
Posts: 27

Campbell Ritchie wrote: there is a counting exercise in the Java™ Tutorials. If you don't understand the explanation, please ask again.



That is quite hilarious. Hitherto, I have rarely referenced the documentation as I find it somewhat impenetrable, and yet there lies a cure to my ills. I thank you again!

So, this is my count:



Which produces:



Now, what I need to do is either remove the 1,2,3 = or substitute it with the names housed in:

 
Campbell Ritchie
Marshal
Posts: 79153

Tony Marcello wrote:. . . cure to my ills. I thank you again! . . .

That's a pleasure. I trust you understood how it works. Isn't it nice and simple when you see it working?
 
Tony Marcello
Greenhorn
Posts: 27

Campbell Ritchie wrote:Isn't it nice and simple when you see it working.


Isn't that always the way?!

I understand it as:
For each variable of sc, it will iterate through the data in lSc.
Declare an integer variable called freq and assign it to data stored in the Map (except it isn't stored there yet).
It will then put into the map the data. If the freq is thus null, it will add 1 and then each time it loops and gets it again, it will increment by 1.

End of loop currently produces a console print out of results so I can see if it is correct.

My question is regarding:

Is sc.getCourseId() the key that is being assigned?
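
For reference, the get-then-put counting idiom from the Java Tutorials that the steps above describe can be sketched like this. It is a simplified stand-in, not the posted code: the list of plain Integer course IDs takes the place of the values returned by sc.getCourseId(), and FrequencySketch is a made-up class name. In Map.put(key, value) the first argument is always the key, so in a loop of this shape the course ID is the key and the running count is the value.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class illustrating the tutorial's frequency idiom.
final class FrequencySketch {

    static Map<Integer, Integer> countByCourseId(List<Integer> courseIds) {
        Map<Integer, Integer> m = new HashMap<>();
        for (Integer courseId : courseIds) {
            // get() returns null when this course ID has not been stored yet.
            Integer freq = m.get(courseId);
            // put(key, value): courseId is the key; the value starts at 1 and
            // grows by 1 each time the same key turns up again.
            m.put(courseId, (freq == null) ? 1 : freq + 1);
        }
        return m;
    }
}
```

Nothing is added to a null here: the conditional expression stores 1 the first time a key is seen, and freq + 1 only runs when a previous count exists.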
 
Tony Marcello
Greenhorn
Posts: 27
I have also figured out a way to display just the counted values:



But how would I use that for my final process of writing to a tab delimited file?
 
Campbell Ritchie
Marshal
Posts: 79153
Start by defining what the tab separated file (it is akin to a CSV file) will look like. You may have to give your Map a different “K” that will correspond to what you want in the output file.
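
One possible shape for that file, sketched under assumptions: a header row followed by one "name, tab, count" line per course. The header text, the class name TabFileWriter, and the choice of a Map<String, Integer> keyed by course name are all illustrative guesses, not taken from the code in this thread.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

// Hypothetical writer; names and header text are assumptions.
final class TabFileWriter {

    // Writes a header row, then one tab-separated line per map entry,
    // in the map's own iteration order.
    static void write(Path target, Map<String, Integer> countsByCourseName) throws IOException {
        // try-with-resources closes (and so flushes) the writer even if something fails.
        try (PrintWriter out = new PrintWriter(Files.newBufferedWriter(target, StandardCharsets.UTF_8))) {
            out.println("Course Name\t# Students");
            for (Map.Entry<String, Integer> entry : countsByCourseName.entrySet()) {
                out.println(entry.getKey() + "\t" + entry.getValue());
            }
        }
    }
}
```

If the order of the lines matters, passing a LinkedHashMap that was filled in the desired order keeps that order in the file, whereas a plain HashMap gives no ordering guarantee.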
 
Tony Marcello
Greenhorn
Posts: 27

Campbell Ritchie wrote:Start by defining what the tab separated file (it is akin to a CSV file) will look like. You may have to give your Map a different “K” that will correspond to what you want in the output file.



I can find plenty of resources about changing the values from one map to another but not the key.
 
Campbell Ritchie
Marshal
Posts: 79153
 
Tony Marcello
Greenhorn
Posts: 27

Campbell Ritchie wrote:



I am afraid I do not understand that. I think you're advising it to go in the existing Map that was created to do the counting but where do you initialize the new variable and how does that not override the values being established via the count?

Meanwhile, I did some more poking and came up with this which is a stone's throw from getting the final output.
The below writes a tab delimited csv file 👌 2 columns 👌 2 headers 👌 it puts the correct data that has been counted into the correct column in the correct order 👌
The only thing now missing is writing the key from loadCourses() to file. That is the element of my code that is wrong as I cannot obtain c.getKey(). I am guessing the above suggestion could rectify that so with a better understanding I can go to bed happy tonight.

 
Paul Clapham
Marshal
Posts: 28177
I thought it might happen that you needed to write the name of the course to the CSV file output. But is that name not part of the StudentCourse object?

If it is, then you can solve your problem by using a Map<StudentCourse, Integer> object to summarize the courses in the way you're doing now. Only then when you finish your summarization loop, the key of the Map you have is a StudentCourse object, not an Integer, and you can get the course name directly from that.
 
Tony Marcello
Greenhorn
Posts: 27

Paul Clapham wrote:I thought it might happen that you needed to write the name of the course to the CSV file output. But is that name not part of the StudentCourse object?


Indeed I do need to write the name of the course to the CSV file output but the name does not originate from the StudentCourse object. Instead it lies with the Course object.
 
Paul Clapham
Marshal
Posts: 28177
I was wondering why the class was named "StudentCourse" and not "Course". So what's in StudentCourse, then?
 
Tony Marcello
Greenhorn
Posts: 27

Paul Clapham wrote:I was wondering why the class was named "StudentCourse" and not "Course". So what's in StudentCourse, then?



I shall do this in two posts so it is easier to see. This is the StudentCourse class:

 
Tony Marcello
Greenhorn
Posts: 27
And this is the course class:

 
Paul Clapham
Marshal
Posts: 28177
Oh dear. I would have expected a StudentCourse object to contain a reference to the related Course object. As it is, and as you've already found out, if you're given a StudentCourse object there's no way to get the related Course object.

In other words I would have expected StudentCourse to include a method "public Course getCourse()". That would be a typical object-oriented design.

However I can see that given the source of the data, it's not straightforward to make that happen. What you would need to do is, as you're reading the CSV file and creating the Course objects, you'd need to build a Map<Integer, Course> so that given a course ID you could find the Course corresponding to that ID.

For a quick finish after doing that, you could just use that Map to answer the last problem you had -- but don't forget to take care of the situation where a StudentCourse might not actually match any Course object.

It would be nicer if your code which read in all of the CSVs was changed so that it built the StudentCourse objects to contain Course references instead of course ID numbers, but that would take a fair amount of restructuring of the code you have now.
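
The "quick finish" route could look roughly like the sketch below: index the Course objects by ID, then walk the counts and look each ID up, with a fallback for IDs that match no course. The Course class shown is a stand-in (the real one was posted as an image and may differ), and CourseLookup, its method names, and the "Unknown course" label are assumptions made for this example.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-in for the poster's Course class; fields and getters are assumptions.
final class Course {
    private final int id;
    private final String name;

    Course(int id, String name) {
        this.id = id;
        this.name = name;
    }

    int getId() { return id; }
    String getName() { return name; }
}

// Hypothetical helper class, not from the original posts.
final class CourseLookup {

    // Builds the Map<Integer, Course> so a course can be found from its ID.
    static Map<Integer, Course> indexById(List<Course> courses) {
        Map<Integer, Course> coursesById = new HashMap<>();
        for (Course course : courses) {
            coursesById.put(course.getId(), course);
        }
        return coursesById;
    }

    // Replaces each course ID key with the course name, keeping the counts.
    static Map<String, Integer> toCountsByName(Map<Integer, Integer> countsByCourseId,
                                               Map<Integer, Course> coursesById) {
        Map<String, Integer> countsByName = new LinkedHashMap<>();
        for (Map.Entry<Integer, Integer> entry : countsByCourseId.entrySet()) {
            Course course = coursesById.get(entry.getKey());
            // A StudentCourse row may refer to an ID that is not in courses.csv.
            String name = (course != null) ? course.getName() : "Unknown course " + entry.getKey();
            // merge() adds counts together if two IDs happen to share a name.
            countsByName.merge(name, entry.getValue(), Integer::sum);
        }
        return countsByName;
    }
}
```

This uses one loop over the counts with a lookup inside it, rather than two separate loops over the two maps.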
 
Tony Marcello
Greenhorn
Posts: 27

Paul Clapham wrote:
What you would need to do is, as you're reading the CSV file and creating the Course objects, you'd need to build a Map<Integer, Course> so that given a course ID you could find the Course corresponding to that ID.
For a quick finish after doing that, you could just use that Map to answer the last problem you had



So, I created a new Map but I am still having trouble combining them.


Specifically:

for (Entry<String, String> courseKey : courses.entrySet());
for (Entry<Integer, Integer> numberKey : countStudents.entrySet()) {
printWriter.println(courseKey.getKey() + "\t" + numberKey.getValue());}
 
Campbell Ritchie
Marshal
Posts: 79153
Look at your equals() methods, remembering that a hash map will only work for “K”s with a correctly implemented equals() method.
  • 1: Why are you using getClass()? If you subclass your code, getClass() will cause equals() to return false. That problem won't occur if you make your classes final.
  • 2: You need to override hashCode() too, using exactly the same “alphabet” of variables as you used in equals().
  • No 2 is particularly important, otherwise you won't find your right place in the Map.
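
A minimal sketch of the two points above, assuming a key class with an ID and a name. CourseKey and its fields are hypothetical and stand in for whichever class is used as the “K” of the Map. The class is final, so an instanceof test is enough, and hashCode() uses exactly the same fields as equals().

```java
import java.util.Objects;

// Hypothetical key class; final, so no subclass can break equals().
final class CourseKey {
    private final int id;
    private final String name;

    CourseKey(int id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof CourseKey)) {
            return false; // also covers other == null
        }
        CourseKey that = (CourseKey) other;
        // Compare exactly the fields that hashCode() uses below.
        return id == that.id && Objects.equals(name, that.name);
    }

    @Override
    public int hashCode() {
        // Same fields as equals(), so equal objects always get equal hash codes.
        return Objects.hash(id, name);
    }
}
```

With both methods overridden, a HashMap lookup using a second, equal CourseKey instance finds the entry stored under the first one.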
     