Java Regex for Haskell.

 
Ranch Hand
Posts: 278
I need to write a regex to extract comments from Haskell source code.
I created a pattern, but it doesn't seem to work for nested comments.



Can someone please verify the regex I wrote?
Here, source is the Haskell code.



import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test10 {

    public static void main(String[] args) {
        // Line comment (-- ... newline) or block comment ({- ... -})
        String regex = "--[^\\n]*\\n|\\{-.*?-\\}";
        String source = "--aa\n{- longer\ncomment -}";
        Pattern p = Pattern.compile(regex);

        Matcher m = p.matcher(source);

        System.out.println(regex);
        while (m.find()) {
            // Print each match followed by its start offset
            System.out.println(m.group() + " " + m.start());
        }
    }
}


 
lowercase baba
Posts: 13081
Lucky,

A little tip: if you wrap your Java in "code" tags, it is much easier to read, and folks are more likely to help if you do so.
 
Lucky J Verma
Ranch Hand
Posts: 278
Yes Fred :-)

 
fred rosenberger
lowercase baba
Posts: 13081
well...I guess that doesn't help too much if you don't have your original code formatted very well...I cleaned it up for you.
 
Marshal
Posts: 76817
Question too difficult for "beginning Java™". Moving discussion.
 
Ranch Hand
Posts: 276
Hi Lucky,

The regex does not match multi-line comments because, by default, "." matches any character except line terminators: \n (newline), \r (carriage return), \u0085 (next line), \u2028 (line separator) and \u2029 (paragraph separator). So the \n inside your block comment is never matched.
To make "." match every character, use the two-argument overload Pattern.compile(String regex, int flags) and pass Pattern.DOTALL.

When the DOTALL flag is set, "." matches any character, including line terminators. This should solve your problem.
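A minimal sketch of that fix, reusing the regex and test string from the original post. The class name DotAllDemo and the helper findComments are made up for illustration; only the Pattern.DOTALL argument is the actual change.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DotAllDemo {

    // Hypothetical helper: returns every match of the poster's regex in source.
    static List<String> findComments(String source) {
        String regex = "--[^\\n]*\\n|\\{-.*?-\\}";
        // Pattern.DOTALL lets "." match line terminators as well
        Pattern p = Pattern.compile(regex, Pattern.DOTALL);
        Matcher m = p.matcher(source);
        List<String> found = new ArrayList<>();
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        for (String comment : findComments("--aa\n{- longer\ncomment -}")) {
            System.out.println(comment);
        }
    }
}
```

With the flag set, the two-line block comment is found as a single match alongside the line comment.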



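Alternatively, the regex itself can switch DOTALL on for just the block-comment part: use ".*" for the line comment and wrap the block-comment alternative in "(?s)" ... "(?-s)".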

Here, in the first part of the regex, ".*" won't match the \n character, so you don't need the "[^\\n]*" part.
In the second part, we have used "(?s)". This is an embedded flag: from the place where "(?s)" appears up to "(?-s)", "." matches everything in its way. If we leave out "(?-s)", DOTALL stays on for the rest of the regex.
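The regex posted at this point did not survive, so what follows is only a reconstruction from the description, not the original code. The class name InlineFlagDemo and the helper findComments are hypothetical. The extra "--bb" line in the test string is there to show that the first alternative still stops at the end of the line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class InlineFlagDemo {

    // Assumed reconstruction: DOTALL is on only between (?s) and (?-s),
    // so "." in the line-comment alternative still stops at \n.
    static final String REGEX = "--.*\\n|(?s)\\{-.*?-\\}(?-s)";

    // Hypothetical helper: returns every match of REGEX in source.
    static List<String> findComments(String source) {
        Matcher m = Pattern.compile(REGEX).matcher(source);
        List<String> found = new ArrayList<>();
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String source = "--aa\n{- longer\ncomment -}\n--bb\n";
        for (String comment : findComments(source)) {
            System.out.println(comment);
        }
    }
}
```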
Beware: setting the flag in the Pattern.compile() call applies it to the entire regex.
Embedded flags like this come in handy when you need to apply DOTALL, CASE_INSENSITIVE or any other flag to only part of a regex.
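One caveat about the nested comments mentioned in the question: Haskell block comments can nest, and java.util.regex has no recursive patterns, so "\\{-.*?-\\}" will always stop at the first "-}" it sees, with or without DOTALL. A small hand-written scanner that counts nesting depth is one way around that. The sketch below is not a regex solution; the class and method names are hypothetical, and it deliberately ignores string literals and operators that merely contain "--".

```java
import java.util.ArrayList;
import java.util.List;

public class NestedCommentScanner {

    // Hypothetical helper: collects line comments and (possibly nested) block comments.
    // Simplification: string literals and operators such as "-->" are not recognised.
    static List<String> comments(String src) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            if (src.startsWith("{-", i)) {
                int start = i;
                int depth = 0;
                // Walk forward until the opening "{-" is balanced by a "-}"
                while (i < src.length()) {
                    if (src.startsWith("{-", i)) {
                        depth++;
                        i += 2;
                    } else if (src.startsWith("-}", i)) {
                        depth--;
                        i += 2;
                        if (depth == 0) {
                            break;
                        }
                    } else {
                        i++;
                    }
                }
                // An unterminated comment runs to the end of the input
                out.add(src.substring(start, i));
            } else if (src.startsWith("--", i)) {
                int end = src.indexOf('\n', i);
                if (end < 0) {
                    end = src.length();
                }
                out.add(src.substring(i, end));
                i = end;
            } else {
                i++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String source = "--aa\n{- outer {- inner -} still outer -}\nx = 1";
        for (String comment : comments(source)) {
            System.out.println(comment);
        }
    }
}
```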
 