Text processing in Java with regex

 
Ranch Hand
Posts: 185
Hi,

I just did an interview coding test that required me to read in a line of text from a file and do the following:

- Get the frequency of each character
- Get the sum of all the numbers

The text string looked something like "7ghksh @4ndng754jndv= *&wbd234 Kner75>< wfs093"

A number was any run of consecutive digits, i.e. 7, 4, 754, 234, 75, 093

I failed to finish the test in time because I got stuck extracting the numbers from the string correctly.

My question is: would this have been possible, and easier, using a regular expression to find sequences of digits? It's one of the things I have never looked at, but after this test I plan on doing so. In the test I was using nested loops to go through the string and try to match each character against a separate array I had containing the digits 0 - 9. It was a lot trickier than I thought it would be! I know some of you think this would be a breeze, but text processing is something I never really came across or learned with Java.

Thanks,
Alan


 
Marshal
Posts: 27371
Possible? Yes, I'm pretty sure it would be possible to extract all of the numeric sequences from a string using a regex.

Easy? Well, it would be easy for somebody who had enough experience with regexes. For me it wouldn't be easy, but there are plenty of people who could tell you the correct expression right away.
 
Rancher
Posts: 43028
This is a really bad interview question, as it requires you to know and recall some very arcane details. During a normal workday, you'd rarely need those, and could easily google them if you actually did. I'd say either that company interviews badly, or they have a bad engineering culture. Too bad (for them and for you) if it's the former, but lucky you if it's the latter.
 
Alan Smith
Ranch Hand
Posts: 185

It was actually a phone interview with the usual "what's polymorphism, interfaces, generics" questions, and then they sent me this test to do by email. I had an hour and a half, and I completely botched the number-extraction part trying different things. I came close, but not close enough. I felt it was a bad test as well: I am quite competent in Java, but I have never used it for this kind of thing. It just took me by surprise. Better luck next time, hopefully.
 
author
Posts: 23928


Debates about the validity of the test aside, to answer the original question: yes, in my opinion, regular expressions are definitely a worthwhile tool to have in your arsenal -- and definitely worth getting good at too.

In this case, the pattern for a series of digits is "\\d+", and you could have extracted all six numbers with a single loop.
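
As a rough sketch of what that single loop could look like (the class and method names here are made up for illustration, and this is not the code anyone submitted for the test), `Matcher.find()` walks through the string one digit run at a time:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexSum {
    // Adds up every run of consecutive digits found in the text.
    static long sumNumbers(String text) {
        Matcher m = Pattern.compile("\\d+").matcher(text);
        long sum = 0;
        while (m.find()) {
            // m.group() is the digit run just matched, e.g. "754" or "093"
            sum += Long.parseLong(m.group());
        }
        return sum;
    }

    public static void main(String[] args) {
        String line = "7ghksh @4ndng754jndv= *&wbd234 Kner75>< wfs093";
        System.out.println(sumNumbers(line)); // prints 1167
    }
}
```

Note that `Long.parseLong("093")` yields 93, so the leading zero does no harm here.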

Henry
 
Alan Smith
Ranch Hand
Posts: 185

Henry Wong wrote:
In this case, the pattern for a series of digits is "\\d+", and you could have extracted all six numbers with a single loop.
Henry



That's exactly what I thought while I was doing the test. That's good to hear. I'm off to buy this. Thanks guys.
 
Bartender
Posts: 5167

Alan Smith wrote:I was using nested loops to loop through the string and try and match the numbers with a separate array of numbers I had containing 0 - 9.


Regex aside, you need to familiarize yourself with the methods of the Character class. I'm not a programmer, but I wouldn't think it unreasonable for an interviewer to expect you to be aware of the API of all the primitive wrapper classes, and String. Professional developers, please correct me if that's not realistic.

FWIW, this took a lot less than 1½ hours, using a single loop.
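
The code from this post is not preserved in this copy of the thread, so the following is only a sketch of how a single-loop approach might look, not a reconstruction of it. It assumes the `Character` class does the digit test, borrows the variable name `lastNumber` mentioned elsewhere in the thread, and uses a hypothetical `LineStats` holder class:

```java
import java.util.Map;
import java.util.TreeMap;

class LineStats {
    final Map<Character, Integer> frequency = new TreeMap<>();
    long sum;

    // One pass over the text: count every character and add up each digit run.
    static LineStats of(String text) {
        LineStats stats = new LineStats();
        long lastNumber = 0; // value of the digit run currently being read
        for (char c : text.toCharArray()) {
            stats.frequency.merge(c, 1, Integer::sum);
            int digit = Character.digit(c, 10); // -1 if c is not a decimal digit
            if (digit >= 0) {
                lastNumber = lastNumber * 10 + digit;
            } else {
                stats.sum += lastNumber; // a non-digit ends the current run
                lastNumber = 0;
            }
        }
        stats.sum += lastNumber; // the text may end in the middle of a run
        return stats;
    }

    public static void main(String[] args) {
        LineStats stats = of("7ghksh @4ndng754jndv= *&wbd234 Kner75>< wfs093");
        System.out.println(stats.frequency);
        System.out.println(stats.sum); // prints 1167
    }
}
```

For plain ASCII digits, `Character.digit(c, 10)` gives the same value as `c - '0'`.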
 
Bartender
Posts: 10780

Darryl Burke wrote:FWIW, this took a lot less than 1½ hours, using a single loop...


Weirdly enough, that's pretty much exactly the way I'd have done it too; except I think I'd have used a HashMap<Character, AtomicInteger>.

@Alan: Regexes are very useful, but they're not for everything. Just for starters, it's likely that a regex-based solution would be significantly slower than Darryl's.

Secondly, when you read the book, be sure to digest the Java chapters (if it has them) as well as the standard operators. One in particular to know about is the 'possessive' quantifier, which Java supports but which not every regex flavour offers (Perl and a few others do).
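
For anyone who has not met it, here is a small sketch of the difference (the class and helper names are invented for the example). A greedy `\\d+` will hand back characters if the rest of the pattern needs them; the possessive `\\d++` never does:

```java
import java.util.regex.Pattern;

class PossessiveDemo {
    // True if the regex matches anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Greedy: \d+ first grabs "755", then backs off to "75" so the final 5 can match.
        System.out.println(found("\\d+5", "755"));  // prints true
        // Possessive: \d++ grabs all the digits and refuses to give any back.
        System.out.println(found("\\d++5", "755")); // prints false
    }
}
```

For a bare `\\d+` with nothing after it, as in this thread's problem, the two forms match exactly the same text.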

Winston
 
Alan Smith
Ranch Hand
Posts: 185


@Darryl, very nice! Answers like yours never hit me straight away; I always overcomplicate things. Guess I'll have to go back to the drawing board with the wrapper classes. I feel like I'm going backwards with programming!

Thanks Winston, I'll have a look. If anything, regex will be good to have under my belt, like another poster said.
 
Rancher
Posts: 4686

Winston Gutkowski wrote:except I think I'd have used a HashMap<Character, AtomicInteger>



Why Atomic..., since it's not multithreaded?
 
Master Rancher
Posts: 4280
I imagine it's not the atomic part that's useful here, but just the fact that it's a mutable int-like object. For this use case, it reduces the number of key lookups you do: with an immutable Integer value you have to look up one object and then insert a different one on every update. With a mutable counter it's just one lookup per update, or two for the very first update of a given key.
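
A minimal sketch of that idea, with a hypothetical `CharCounter` class standing in for whatever the real code would be:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

class CharCounter {
    // Counts characters using a mutable counter object per key.
    static Map<Character, AtomicInteger> count(String text) {
        Map<Character, AtomicInteger> counts = new HashMap<>();
        for (char c : text.toCharArray()) {
            AtomicInteger counter = counts.get(c); // the one lookup per update
            if (counter == null) {
                counter = new AtomicInteger();
                counts.put(c, counter);            // second map access, first time only
            }
            counter.incrementAndGet();             // changes the object in place, no put needed
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count("7ghksh @4ndng754jndv= *&wbd234 Kner75>< wfs093"));
    }
}
```

With a `Map<Character, Integer>` the equivalent update would be a `get` followed by a `put` of a new `Integer` every time. On Java 8 and later, `merge` or `computeIfAbsent` can hide that bookkeeping either way.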
 
Alan Smith
Ranch Hand
Posts: 185


Hi Darryl,

I am looking back over this again, the frequency part is ok but I am confused about what exactly is happening on this line:



Say we focus on the '745' sequence in the string... if 7 is the lastNumber variable and 4 is the current character in the loop, then that line will look like this:

7 = 7 * 10 + (4 - '0');

This works without the brackets as well, but what does the - '0' actually achieve? I understand the * 10 multiplication is to get 70 + 4, i.e. 74, but why the - '0'? I also know it doesn't work without the - '0', but I can't see why.

Thanks,
Alan
 
Sheriff
Posts: 22683
Not (4 - '0') but ('4' - '0'). The issue here is that '0' is not the same as 0.

If you take a look at http://www.asciitable.com/ you will see that the character '4' has ASCII value 52. Likewise, '0' has the value 48. By subtracting '0' from '4' you are subtracting 48 from 52, yielding 4, the numeric value of the digit character.
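
A quick way to see this for yourself is the sketch below; the class and method names are just for illustration:

```java
class DigitValue {
    // Converts an ASCII digit character to the number it represents.
    static int valueOf(char c) {
        return c - '0'; // both chars are promoted to int: '4' - '0' is 52 - 48
    }

    public static void main(String[] args) {
        char c = '4';
        System.out.println((int) c);              // prints 52
        System.out.println((int) '0');            // prints 48
        System.out.println(valueOf(c));           // prints 4
        System.out.println(7 * 10 + c);           // prints 122, because c counts as 52
        System.out.println(7 * 10 + (c - '0'));   // prints 74
        System.out.println(7 * 10 + c - '0');     // prints 74, the brackets are optional
    }
}
```

`Character.digit(c, 10)` returns the same value for '0' to '9' and has the advantage of returning -1 for anything that is not a digit.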
 
Alan Smith
Ranch Hand
Posts: 185


Ah cool, thanks Rob!
 
Rob Spoor
Sheriff
Posts: 22683
You're welcome.
 