
[newbie] regex anomaly

 
Jon Camilleri
Ranch Hand
Posts: 664
Chrome Eclipse IDE
I'm trying to create a program that reads source files line by line (commonly .java source files that I've pasted from a book in .chm format) and removes the line numbers so that I can compile them straight away.

NOTE: Disregard my 2nd post. When I tried to edit my initial post, the option was not available, so I worked around it by quoting myself.

I'm stuck on the regex part, where I'm trying to match lines like:


1. import java.net.*;
2. import java.awt.*;
...
...


NOTE: The line number (in bold for emphasis) should be removed.

 
Jon Camilleri
Ranch Hand
Posts: 664
Chrome Eclipse IDE
Update...

Jon Camilleri wrote: I'm trying to create a program that reads source files line by line (commonly .java source files that I've pasted from a book in .chm format) and removes the line numbers so that I can compile them straight away.

I'm stuck at the regex part where I'm trying to match for:


1. import java.net.*;
2. import java.awt.*;
...
...


NOTE: The line number (in bold for emphasis) should be removed.

 
David Newton
Author
Rancher
Posts: 12617
IntelliJ IDE Ruby
Why not just use String's replaceFirst(regex, "") method and use (any number of digits followed by a period) as the regex?

(This is really just a sed script, I'm assuming you're doing it this way to get more Java practice in.)
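
The replaceFirst() suggestion can be sketched roughly as follows. This is a minimal illustration, not code from the thread: the class name LineNumberStripper, the helper stripLineNumber, and the exact pattern (anchored at the start of the line, tolerating leading whitespace and one space after the period) are all assumptions.

```java
class LineNumberStripper {

    // Hypothetical helper: removes a leading "12. " style line number.
    //   ^\s*  optional leading whitespace at the start of the line
    //   \d+   one or more digits
    //   \.    a literal period
    //   \s?   at most one whitespace character after the period
    static String stripLineNumber(String line) {
        return line.replaceFirst("^\\s*\\d+\\.\\s?", "");
    }

    public static void main(String[] args) {
        System.out.println(stripLineNumber("1. import java.net.*;")); // import java.net.*;
        System.out.println(stripLineNumber("10. /**"));               // /**
        System.out.println(stripLineNumber("int x = 3;"));            // int x = 3; (unchanged)
    }
}
```

One caveat with this sketch: a source line that legitimately begins with digits and a period would also be altered, so it only suits input where every leading number really is a pasted line number.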
 
Michael Angstadt
Ranch Hand
Posts: 277
Eclipse IDE Java PHP
When you define a regex in Java, you have to be careful about using backslashes. When you want to use a backslash as part of your regex, you have to use two backslashes in your Java string.

So this:
should be this:
 
Jon Camilleri
Ranch Hand
Posts: 664
Chrome Eclipse IDE
David Newton wrote:Why not just use String's replaceFirst(regex, "") method and use (any number of digits followed by a period) as the regex?

(This is really just a sed script, I'm assuming you're doing it this way to get more Java practice in.)


The problem is finding the right regex pattern.
 
John de Michele
Rancher
Posts: 600
if (Pattern.matches("[\d]\..", line)) //this is incorrect should include X+ ...
{//match?!}


The '\d' is already a character class, so you don't need to put it in square brackets. You might have been thinking of this:


However, matches() most likely won't work, since it only succeeds when the entire string matches the pattern. David's suggestion of using String's replaceFirst() method is probably what you really want.

John.
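
The difference between a whole-string match and a prefix match can be sketched like this. The class and method names are hypothetical, and the pattern (one or more digits followed by a literal period) is an assumption based on the discussion above, not the original poster's code.

```java
import java.util.regex.Pattern;

class MatchesVsLookingAt {

    // Assumed pattern: one or more digits followed by a literal period.
    static final Pattern LINE_NUMBER = Pattern.compile("\\d+\\.");

    // matches() succeeds only if the ENTIRE input fits the pattern.
    static boolean wholeLineMatches(String line) {
        return LINE_NUMBER.matcher(line).matches();
    }

    // lookingAt() succeeds if the input merely STARTS with the pattern.
    static boolean startsWithLineNumber(String line) {
        return LINE_NUMBER.matcher(line).lookingAt();
    }

    public static void main(String[] args) {
        String line = "1. import java.net.*;";
        System.out.println(wholeLineMatches(line));     // false
        System.out.println(startsWithLineNumber(line)); // true
        System.out.println(wholeLineMatches("1."));     // true
    }
}
```

Pattern.matches(regex, input) behaves like the wholeLineMatches helper here, which is why it would reject a numbered source line even when the regex itself is right.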
 
Jon Camilleri
Ranch Hand
Posts: 664
Chrome Eclipse IDE
John de Michele wrote:
if (Pattern.matches("[\d]\..", line)) //this is incorrect should include X+ ...
{//match?!}


The '\d' is already a character class, so you don't need to put it in square brackets. You might have been thinking of this:


However, matches() most likely won't work, since it only succeeds when the entire string matches the pattern. David's suggestion of using String's replaceFirst() method is probably what you really want.

John.


It's a good idea, actually, thanks. Now I'm getting an error indicating something is wrong with the escape sequence.



 
David Newton
Author
Rancher
Posts: 12617
IntelliJ IDE Ruby
Backslashes ("\") in Java Strings are meaningful, and must be escaped.
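
As a small sketch of what that escaping looks like in practice (the class and constant names are hypothetical): each backslash the regex engine should see has to be written as two backslashes in the Java string literal, because the compiler consumes one level of escaping first. A literal containing a single backslash before d is rejected at compile time as an invalid escape sequence.

```java
class BackslashDemo {

    // Written with doubled backslashes in source; the regex engine
    // receives the five characters  \ d + \ .
    static final String DIGITS_THEN_DOT = "\\d+\\.";

    public static void main(String[] args) {
        System.out.println(DIGITS_THEN_DOT);                // \d+\.
        System.out.println(DIGITS_THEN_DOT.length());       // 5
        System.out.println("12.".matches(DIGITS_THEN_DOT)); // true
        System.out.println("12x".matches(DIGITS_THEN_DOT)); // false: \. is a literal period
    }
}
```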
 
Jon Camilleri
Ranch Hand
Posts: 664
Chrome Eclipse IDE
David Newton wrote:Backslashes ("\") in Java Strings are meaningful, and must be escaped.

Thanks, that's it.
 
Jon Camilleri
Ranch Hand
Posts: 664
Chrome Eclipse IDE
Jon Camilleri wrote:
David Newton wrote:Backslashes ("\") in Java Strings are meaningful, and must be escaped.

Thanks, that's it.


I'm wondering why String.replaceAll is not replacing the string.


Output

1. //line numbers are not removed??
import
java.net.*;
2.
import
java.awt.*;
3.
import
java.awt.event.*;
4.
import
java.io.*;
5.
import
java.util.*;
6.
import
javax.naming.*;
7.
import
javax.naming.directory.*;
8.
import
javax.swing.*;
9.
10.
/**


[HENRY: Deleted tons of output that is probably not relevant]
 
Henry Wong
author
Marshal
Pie
Posts: 21362
84
C++ Chrome Eclipse IDE Firefox Browser Java jQuery Linux VI Editor Windows


The pattern is one or more digits, followed by a period, followed by another character (any character). Since you are using a Scanner to extract tokens separated by whitespace, there aren't any characters after the period in the token, so the pattern never matches.

Henry
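
The effect Henry describes can be reproduced with a small sketch. The original code is not shown in the thread, so everything here is an assumption: the regex is reconstructed from his description (digits, a period, then any one character), and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class TokenVsLine {

    // Assumed pattern, reconstructed from the description above:
    // one or more digits, a literal period, then any single character.
    static final String REGEX = "\\d+\\..";

    // Reads whitespace-separated tokens, as Scanner.next() does by default.
    static List<String> replacePerToken(String source) {
        List<String> out = new ArrayList<>();
        try (Scanner scanner = new Scanner(source)) {
            while (scanner.hasNext()) {
                out.add(scanner.next().replaceAll(REGEX, ""));
            }
        }
        return out;
    }

    // Reads whole lines, so the space after the period is still present.
    static List<String> replacePerLine(String source) {
        List<String> out = new ArrayList<>();
        try (Scanner scanner = new Scanner(source)) {
            while (scanner.hasNextLine()) {
                out.add(scanner.nextLine().replaceAll(REGEX, ""));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String source = "1. import java.net.*;\n2. import java.awt.*;\n";
        // The token "1." has nothing after the period, so it is left alone.
        System.out.println(replacePerToken(source)); // [1., import, java.net.*;, 2., import, java.awt.*;]
        // A full line has a space after the period, so the pattern matches.
        System.out.println(replacePerLine(source));  // [import java.net.*;, import java.awt.*;]
    }
}
```

Even when reading by line, this pattern would still miss a line that contains only a number such as "9.", since nothing follows the period there either; dropping the trailing "any character" from the regex avoids that.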
 
Jon Camilleri
Ranch Hand
Posts: 664
Chrome Eclipse IDE
Henry Wong wrote:

The pattern is one or more digits, followed by a period, followed by another character (any character). Since you are using a Scanner to extract tokens separated by whitespace, there aren't any characters after the period in the token, so the pattern never matches.

Henry

Thanks

oops
 