How do you find the median of a CSV file?

 
Ranch Hand
I have a CSV file with several types of data. One column is all doubles, another is all ints, and the rest are all strings. I'm wondering how to take that into account when calculating the median. As I understand it, the median is the "middle number" of a sorted set of values. The columns are unsorted, so first I'll need some way of sorting a column from least to greatest, and I'm not sure how that would be done in code. Also, what if the column has an even or odd number of values? If it's even, then presumably the code must add the two middle numbers and divide the sum by two?

Pretty confused about this.


If possible could someone give a step-by-step process of how to solve this?
Thank you
 
Bartender

Justin Robbins wrote:I have a CSV file with several types of data. One column is all doubles, the other all ints, and the remaining are all strings.


OK, well that sounds wrong right there.

What are they? What data do they represent? What do you mean by "the median"?

I understand in the math world, the median is when you find the "middle number". So for the columns they are unsorted, so first I'll need some way of sorting the columns from least to greatest.


Which again we can't really help you with unless we know more about what we (or actually you) are talking about.

Also, what if the column is even or odd? If it's even, then somehow the code must take two middle numbers and divides that by two?


That's generally only an issue when whatever the "value" you're counting is numeric - and it doesn't have to be.

For example, the "median state" in the US for income, based on some sample you take, could be either "Illinois" or "Maryland", but there's no way to decide between them, since exactly half of your sample lies in the bottom half topped by Illinois, and the top half is "bottomed" by Maryland.

You can't have "Illinois.5".
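To illustrate the point for non-numeric values, here is a minimal sketch that returns the middle element(s) of a list that is assumed to be already ordered by whatever ranking you care about. The class and method names (`OrdinalMedian`, `middleValues`) are hypothetical, and the state ranking in the usage note is made up for the example, not real income data.

```java
import java.util.ArrayList;
import java.util.List;

class OrdinalMedian {

    /**
     * Returns the middle element(s) of a list that is ALREADY ordered by rank:
     * one element for an odd size, the two middle elements for an even size.
     * Nothing is averaged, so this also works for values that are not numbers.
     */
    static <T> List<T> middleValues(List<T> ranked) {
        if (ranked.isEmpty()) {
            throw new IllegalArgumentException("no values to take the middle of");
        }
        int mid = ranked.size() / 2;
        if (ranked.size() % 2 == 1) {
            // Odd size: a single middle element exists.
            return new ArrayList<>(ranked.subList(mid, mid + 1));
        }
        // Even size: return both candidates and let the caller decide.
        return new ArrayList<>(ranked.subList(mid - 1, mid + 1));
    }
}
```

With an invented four-element ranking such as `List.of("Ohio", "Illinois", "Maryland", "Texas")`, `middleValues` returns both `Illinois` and `Maryland`; choosing between them (or reporting both) is a decision the code cannot make for you.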

Tell us more about this file and exactly what it contains, and then we'll be able to help you better.

Winston
 
Marshal
I would read all the data into a list using something like the Apache Commons CSV library, then find the median of the list.
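As a rough sketch of that idea, the code below pulls one numeric column into a list and takes its median. To stay self-contained it does not use Apache Commons CSV; it assumes simple comma-separated lines with no quoted fields and no header row. The names `ColumnMedian`, `readColumn`, and `median`, and the sample rows, are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ColumnMedian {

    /** Parses one numeric column from CSV lines (naive split, assumes no quoted commas). */
    static List<Double> readColumn(List<String> lines, int columnIndex) {
        List<Double> values = new ArrayList<>();
        for (String line : lines) {
            String[] cells = line.split(",");
            values.add(Double.parseDouble(cells[columnIndex].trim()));
        }
        return values;
    }

    /** Middle value for an odd count; mean of the two middle values for an even count. */
    static double median(List<Double> values) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("cannot take the median of an empty list");
        }
        List<Double> sorted = new ArrayList<>(values); // copy so the caller's list is untouched
        Collections.sort(sorted);                      // ascending order
        int mid = sorted.size() / 2;
        if (sorted.size() % 2 == 1) {
            return sorted.get(mid);
        }
        return (sorted.get(mid - 1) + sorted.get(mid)) / 2.0;
    }

    public static void main(String[] args) {
        // Hypothetical rows: a String column, an int column, a double column.
        List<String> lines = List.of("apple,3,2.5", "pear,1,4.0", "plum,2,1.5", "fig,4,3.0");
        System.out.println(median(readColumn(lines, 2))); // prints 2.75
    }
}
```

The int column can go through the same `median` method, since `Double.parseDouble` accepts integer text such as `"3"`.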
 
Bartender
  • Create a class ("Record"?) to represent a single row in the CSV file.
  • Have one field per column.
  • Most fields will be String except for a few that might be of type Double or Integer.
  • Have a constructor that takes a single line from the CSV file, splits it on the delimiter (assuming a comma) and populates the fields.
  • Have a main class with a main() method that opens the CSV file.
  • Iterate through all the lines.
  • Create a new Record from each line read and add it to a list.
  • Create a Comparator class (or classes) that define how you want the list to be sorted.
  • Sort the list.
  • Go to the middle of the sorted list and pull out the field in the Record that you are interested in.
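The steps above could be sketched roughly as follows. Everything specific here is an assumption: the column layout (a String, an int, a double), the class names `CsvRecord` and `MedianByRecord`, the field names, and the naive comma split that ignores quoted fields. For an even number of rows the sketch averages the two middle values, which goes one step beyond the last bullet.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** One row of the CSV file; the three columns are a hypothetical layout. */
class CsvRecord {
    final String name;   // hypothetical String column
    final int count;     // hypothetical int column
    final double price;  // hypothetical double column

    /** Splits one CSV line on commas and populates the fields. */
    CsvRecord(String line) {
        String[] cells = line.split(",");
        this.name = cells[0].trim();
        this.count = Integer.parseInt(cells[1].trim());
        this.price = Double.parseDouble(cells[2].trim());
    }
}

class MedianByRecord {

    /** Builds records from the lines, sorts them by price, and returns the median price. */
    static double medianPrice(List<String> lines) {
        List<CsvRecord> records = new ArrayList<>();
        for (String line : lines) {
            records.add(new CsvRecord(line));
        }
        if (records.isEmpty()) {
            throw new IllegalArgumentException("no rows to take the median of");
        }
        // The Comparator defines the sort order: ascending by the price field.
        records.sort(Comparator.comparingDouble((CsvRecord r) -> r.price));
        int mid = records.size() / 2;
        if (records.size() % 2 == 1) {
            return records.get(mid).price;
        }
        return (records.get(mid - 1).price + records.get(mid).price) / 2.0;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the CSV file (assumed to have no header row).
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        System.out.println(medianPrice(lines));
    }
}
```

Sorting whole records rather than a bare list of numbers keeps the other columns attached to each value, which helps if you later want to know which row the median came from.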