How would you best describe what parsing methods do?

 
Ranch Hand
I've been learning more and more about parsing, but I still come up empty as to what parsing methods are actually doing. I've heard parsing described as chunking things up, like a sushi chef or something, and then working with those individual pieces of data. Like if you have a String array such as:



We can parse the array of strings, remove the commas, and work with each piece separately. But I've also heard parsing described as conversion. So which one is it? Is it chopping up data into pieces, or is it converting from one form to another, like ints to String or String to double, if that even works?

The things I've seen so far, and don't understand, are:



What do any of these things mean? How are they working? What exactly is happening here?


Thank you
 
Marshal
Parsing converts data from an external format to an internal format. So for example you might convert an XML document into a tree structure of Java objects; that would be parsing. Or you might convert a line of text from a file into an array of strings by splitting it based on the commas, as in your example. That would be parsing too.
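
As a rough illustration of the comma-splitting case, here is a minimal sketch. The class name, method name, and sample names are made up for the example, and it assumes fields never contain commas themselves.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class; not taken from the original post.
final class SplitLineDemo {

    // Turns one line of external text into an internal list of strings:
    // split on commas, then trim the whitespace around each piece.
    static List<String> splitOnCommas(String line) {
        return Arrays.stream(line.split(","))
                .map(String::trim)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Sample input is an assumption for illustration only.
        System.out.println(splitOnCommas("Bob, John, Victor"));
    }
}
```

The input is one flat piece of text; the result is a list whose elements can be handled one at a time.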

You wouldn't describe converting a string to an integer as parsing, though. Normally parsing produces a collection of objects which are related by a structure.

 
Sheriff
Actually, parsing is done from a string or text. Any parse() method will take a string as its argument and produce whatever that String/text actually represents, whether it's an int, double, Boolean, Date, or even structured types like an XML document or a JSON object. It's always from a textual representation to the actual type that the information represents.
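
A few of the standard library parse methods side by side might make this concrete. This is only a sketch; the class name and the input strings are invented for the example.

```java
import java.time.LocalDate;

// Hypothetical demo class showing text going in and typed values coming out.
final class StandardParseDemo {

    public static void main(String[] args) {
        int count = Integer.parseInt("42");              // text -> int
        double price = Double.parseDouble("3.5");        // text -> double
        boolean enabled = Boolean.parseBoolean("true");  // text -> boolean
        LocalDate date = LocalDate.parse("2020-02-29");  // ISO-8601 text -> LocalDate

        System.out.println(count + 1);
        System.out.println(price * 2);
        System.out.println(!enabled);
        System.out.println(date.getDayOfMonth());

        // Text that does not follow the rules for an int is rejected.
        try {
            Integer.parseInt("4x2");
        } catch (NumberFormatException e) {
            System.out.println("not an int: " + e.getMessage());
        }
    }
}
```

In every case the argument is a String and the result is the type that the text stands for.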

See https://en.m.wikipedia.org/wiki/Parsing
 
Paul Clapham
Marshal
And yes, it would have been better to name the method "convertFrom" ...



or something like that, maybe. These days the Java designers are more careful about method naming.
 
Marshal
You would not usually parse a comma‑separated list of names; you would, however, parse a sentence:
"Bob, John, Victor, Sally, Janet and Fred read Coderanch."
You would divide that into subject verb object. You can parse code similarly.
 
Bartender

Campbell Ritchie wrote: You would not usually parse a comma‑separated list of names; you would however parse a sentence...


So, do we actually have an answer?

Personally, I've always thought of parsing as "conversion involving rules", and also - usually - conversion that doesn't involve a simple 1:1 mapping; so on that basis, converting a comma‑separated list of names would be parsing - albeit dirt-simple parsing.

And dealing with CSVs generically - i.e., handling "values" that can include the delimiter - is definitely parsing, IMO.
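
To show why the generic case needs real rules, here is a minimal sketch of a splitter that respects quoted values. It assumes the common convention that a field wrapped in double quotes may contain commas and that a doubled quote inside such a field means a literal quote; the class and method names are hypothetical, and it does not handle quoted fields that span lines.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; a sketch, not a complete CSV implementation.
final class QuotedCsvDemo {

    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // doubled quote is a literal quote
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);       // commas are ordinary text here
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);        // start the next field
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());      // last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(parseLine("a,\"b,c\",d"));
    }
}
```

A plain `split(",")` would cut the quoted value in two; the state flag `inQuotes` is the "rule" that prevents it.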

Winston
 
Junilu Lacar
Sheriff
I see nothing wrong with the name "parse" for the behavior it has in Integer and other classes that have a parse method. Sure, parsing is usually associated with compilation/interpretation at a much larger scale, i.e., turning textual code in a programming language into an executable form, but the mechanics of turning a String into an integer or Date are basically the same. You have text as input, you have some rules for how the text should be broken down, and you have some rules for how to treat each small bit of text that has been broken down.

In the case of an integer, the rules are pretty simple: each character must be a digit. The conversion rule is that for each character parsed from the input text, you multiply the running result by 10 and then add the value of the digit. There are other rules, of course, but these are the main ones.
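
Those two rules can be sketched in a few lines. This is a simplified illustration, not how `Integer.parseInt` is actually implemented: it ignores signs and overflow, and the class and method names are made up.

```java
// Hypothetical demo class illustrating the digit-by-digit rule.
final class DigitParseDemo {

    static int parseDigits(String text) {
        if (text.isEmpty()) {
            throw new NumberFormatException("empty input");
        }
        int result = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            // Rule 1: each character must be a digit.
            if (c < '0' || c > '9') {
                throw new NumberFormatException("not a digit: " + c);
            }
            // Rule 2: shift what we have one decimal place, then add the digit.
            result = result * 10 + (c - '0');
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parseDigits("123") + 1);
    }
}
```

For "123" the running result goes 1, then 12, then 123.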

As for CSV, the main rules are simply that each line in the input represents a row and commas separate the values that belong in different columns/fields. Full CSV parsing also involves further parsing of each field into an appropriate data type.
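
Here is a minimal sketch of both levels together, assuming a made-up two-column layout of name and age with no quoted fields. The `Row` type, the column layout, and the sample data are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: lines become rows, fields become typed values.
final class TypedCsvDemo {

    // Assumed row shape for the example: a name column and an age column.
    static final class Row {
        final String name;
        final int age;

        Row(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    static List<Row> parse(String csv) {
        List<Row> rows = new ArrayList<>();
        for (String line : csv.split("\n")) {          // each line is a row
            String[] fields = line.split(",");         // commas separate fields
            String name = fields[0].trim();            // first field stays text
            int age = Integer.parseInt(fields[1].trim()); // second field parsed again, into an int
            rows.add(new Row(name, age));
        }
        return rows;
    }

    public static void main(String[] args) {
        for (Row row : parse("Bob,31\nSally,27")) {
            System.out.println(row.name + " is " + row.age);
        }
    }
}
```

The outer loop applies the row and column rules; `Integer.parseInt` is the further parsing of a single field.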

I think it comes down to context and intent when naming a method "parse" or something else like "split" because technically, split is still parsing the string it is operating on.
 
Saloon Keeper
For me, parsing is "assigning meaning to symbols". A symbol doesn't necessarily have to be a string of characters; you could also parse audio (which is what we do in our heads when we listen to somebody: we assign a meaning to waves of air pressure).

After parsing, you're left with something that can have meaning in one context, but not in another. You can parse it again to assign meaning to it using the extra context.

For instance, you can parse the string "1", so it means 'the number 1'. But 1 doesn't necessarily have a meaning by itself. If it's in the context of f(x) = 2x + 1, you can parse it again so it means 'the initial value in a linear function being 1'.
 
Paul Clapham
Marshal

Junilu Lacar wrote: I see nothing wrong with the name "parse" to represent the behavior that it has in Integer and other classes that have a parse method.



Yes... there's no reason for me to insist that parsing must produce more than one data item. After all, one is a perfectly good number like any other so that rule would be nugatory. So I agree with you.
 