regular expression

 
Greenhorn
Posts: 22
Hello,
I am parsing a file that looks like this:
l.1 : 452658524.542586;425.6;875.8475;4587.5478
l.2 : ;0;0;0
l.3 : 0;;;
l.4 : 458.23;475.24;4214.98;47859.69

I want to skip l.2 and l.3, that is, the lines that contain:
--> only ";" characters, OR
--> only ";" characters AND (some "0" characters (in any order and any number) OR some spaces (in any order and any number))

Do you have any idea which regular expression I could use for this, please?
Thanks in advance,
 
Marshal
Posts: 76453
Why do you think you need a regular expression?
 
jacques dusieur
Greenhorn
Posts: 22
I need to detect lines like these so that I can ignore them. If there is a way to do it without a regex I am happy to use it, of course, but I can't think of one. Can you?
 
Campbell Ritchie
Marshal
Posts: 76453
You could read the line and split it on the semicolon. You could read the line and take its length.
 
Bartender
Posts: 5068
Hi jacques,

A bit of Boolean algebra shows that A or (A and (B or C)) is equivalent to A (the absorption law).
So, as written, you just need to look for some ';'.

Is this what you really intend?
 
Marshal
Posts: 8419
In this file those two lines are the problem, but I suspect there could be more cases.
You first need to define what should be removed and what should not, AND you should think not only about this file but also about other possible cases, such as:
l.3 : 0;;0;
l.3 : 0;0;;
l.3 : 0;;;0
l.3 : ;;;0
l.3 : ;;;
l.3 : 0;0;;
l.3 : 0;1;0;0

Regular expressions are powerful and at the same time very dangerous, as you can remove a lot of data by mistake.
 
jacques dusieur
Greenhorn
Posts: 22
Actually I am comparing 2 files, and I allow a certain percentage of difference between each pair of values. But when a line in one file is of the type I have described, no corresponding line is generated in the other file, so I need to skip it and go on to the next line.
When I have data on the same line in both files, then yes, I use split and compare value by value, but I need to not do this when no line will be generated in the other file.
So that's why I think a regular expression is the best way to detect when the content of my ReadLine needs to be ignored. I am just not good with regular expressions and hoped someone could help me "design" the right pattern for the lines I need to ignore.
 
Liutauras Vilda
Marshal
Posts: 8419

jacques dusieur wrote: When I have data on the same line in both files


l.3 : 0;0;0;1 <-- do you consider this line to be data for your task?
 
Campbell Ritchie
Marshal
Posts: 76453
You should start worrying when anybody says they think XYZ is the best way to do things. Especially if they haven't definitely specified what they are trying to do.
 
jacques dusieur
Greenhorn
Posts: 22

Liutauras Vilda wrote:

jacques dusieur wrote: When I have data on the same line in both files


l.3 : 0;0;0;1 <-- do you consider this line to be data for your task?


Hello,
This line should not match the regexp (because of the 1, or any other non-zero number for that matter).
All the others in your list should match the regexp.
 
Liutauras Vilda
Marshal
Posts: 8419
So every column must contain at least a 0 for you to be able to compare, right?
And if one of the columns is empty, you cannot compare, right?

I think CR already suggested a solution with the split method; that would work.
 
Liutauras Vilda
Marshal
Posts: 8419

jacques dusieur wrote: When I have data on the same line in both files, then yes, I use split and compare value by value, but I need to not do this when no line will be generated in the other file.

For Java this is a perfectly valid line:
l.3 : ;;;

Please define those points clearly for yourself: what is comparable and what is not.
 
jacques dusieur
Greenhorn
Posts: 22
I will try to clarify:
I have File1 with 5 lines, for instance:
3.4444444;8.9;6.7;6.1;1.2
;0;;;0
0;0;;;0
;;0;;0
5.6;7.4;5.6;2.2;6.3

Then File2 with 2 lines:
3.4444445;8.9;6.7;6.1;1.2
5.6;7.4;5.6;2.2;6.3

I loop through File1 and must skip lines l2, l3 and l4, because I detect that they follow a certain pattern which I know will not generate a matching line to compare in File2.
You can see that l1 is slightly different in the two files, but I want to compare them to say whether there is a difference (or not, if I have a parameter saying that differences below 1% are considered valid).
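
The tolerance comparison described above could look roughly like the sketch below. The class and method names are made up for illustration, and measuring the difference relative to the larger of the two magnitudes is an assumption; the thread does not say how the percentage is defined. It uses nothing newer than Java 7.

```java
// Hypothetical helper: compares two data lines field by field with a relative tolerance.
final class ToleranceCompare {

    // True when a and b differ by at most maxRelativeDiff (0.01 means 1%).
    // Assumption: the difference is measured relative to the larger magnitude.
    static boolean withinTolerance(double a, double b, double maxRelativeDiff) {
        if (a == b) {
            return true; // also covers 0 compared with 0, avoiding a division by zero
        }
        double scale = Math.max(Math.abs(a), Math.abs(b));
        return Math.abs(a - b) / scale <= maxRelativeDiff;
    }

    // Assumes both lines are real data lines, i.e. every field parses as a number.
    // Double.parseDouble throws NumberFormatException on an empty field.
    static boolean linesMatch(String line1, String line2, double maxRelativeDiff) {
        String[] f1 = line1.split(";", -1); // -1 keeps trailing empty fields
        String[] f2 = line2.split(";", -1);
        if (f1.length != f2.length) {
            return false;
        }
        for (int i = 0; i < f1.length; i++) {
            double a = Double.parseDouble(f1[i].trim());
            double b = Double.parseDouble(f2[i].trim());
            if (!withinTolerance(a, b, maxRelativeDiff)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String file1Line = "3.4444444;8.9;6.7;6.1;1.2";
        String file2Line = "3.4444445;8.9;6.7;6.1;1.2";
        System.out.println(linesMatch(file1Line, file2Line, 0.01)); // true
        System.out.println(linesMatch(file1Line, file2Line, 0.0));  // false
    }
}
```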
 
Liutauras Vilda
Marshal
Posts: 8419
OK, so you can (1) read the line, (2) split on the semicolon, (3) check whether the length of the array is the same as that of the split line from the second file; if yes, go ahead and compare the values.
If the lengths are not the same, read another line from the file where the array was shorter and do the same as above. If I understand you correctly, the second file is always clean, so you know it has no lines with unpopulated values.
Think also about whether spaces could appear between those semicolons. Then you might need to use the trim() method in addition to split.
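
As a small illustration of split plus trim(), the sketch below shows what String.split returns for lines like the ones in this thread. The class and the fields method are hypothetical names. Worth knowing: the one-argument split drops trailing empty strings, so the array length depends on whether the last column is populated; passing a limit of -1 keeps them.

```java
// Hypothetical demo of how String.split treats empty fields.
final class SplitDemo {

    // Splits on ';', keeps trailing empty fields, and trims spaces around each value.
    static String[] fields(String line) {
        String[] parts = line.split(";", -1);
        for (int i = 0; i < parts.length; i++) {
            parts[i] = parts[i].trim();
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println("0;;;".split(";").length);     // 1: trailing empty fields are dropped
        System.out.println("0;;;".split(";", -1).length); // 4: limit -1 keeps them
        System.out.println(";;;".split(";").length);      // 0: every field is empty
        System.out.println(";0;;;0".split(";").length);   // 5: the last field is not empty
        System.out.println(fields("0; ;0;").length);      // 4
    }
}
```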
 
jacques dusieur
Greenhorn
Posts: 22
This won't work: when I split my lines, they will always have the same number of elements (so the same array length). It is just that some elements will contain "" and others "0" when I am dealing with a line I need to skip.
I really see a regex as the best solution, I just don't know how to write it :-( ...
Otherwise I could write a little function to check whether the elements of my arrays consist only of "" and "0", but this would not be very efficient coding compared to a regex.
 
Liutauras Vilda
Marshal
Posts: 8419
Agh, you could be right: if the last "column" is not empty, then split won't give the desired result. If you're using Java 8, try Streams.
 
jacques dusieur
Greenhorn
Posts: 22
Unfortunately I am using Java 1.7...
 
Campbell Ritchie
Marshal
Posts: 76453

jacques dusieur wrote: . . . I really see a regex as the best solution . . . little function to check whether the elements of my arrays consist only of "" and "0", but this would not be very efficient coding compared to a regex.

Why are you perseverating on the regex? For a start, a check of whether a String is empty or equals "0" probably runs much faster than a regex.
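
A minimal sketch of such a check, assuming a line should be skipped when every field is empty (after trimming spaces) or exactly "0". The names LineFilter and isSkippable are invented for this example, and it compiles on Java 1.7.

```java
// Hypothetical regex-free check for lines that contain only empty or "0" fields.
final class LineFilter {

    static boolean isSkippable(String line) {
        // Limit -1 keeps trailing empty fields so every column is inspected.
        String[] parts = line.split(";", -1);
        for (String part : parts) {
            String value = part.trim();
            if (!value.isEmpty() && !value.equals("0")) {
                return false; // found a real value, so this is a data line
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isSkippable(";0;;;0"));              // true
        System.out.println(isSkippable("0;;;"));                // true
        System.out.println(isSkippable("0;1;0;0"));             // false
        System.out.println(isSkippable("5.6;7.4;5.6;2.2;6.3")); // false
    }
}
```

Note that under this assumption a completely empty line also counts as skippable, and a field such as "00" does not; whether that is wanted depends on the real data.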
 
jacques dusieur
Greenhorn
Posts: 22
I've found it:
^[0,;]+$
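
For reference, here is a rough sketch of how that pattern could be applied in Java 1.7. Inside a character class the comma is a literal comma, so the posted pattern accepts lines made of "0", "," and ";" but rejects a line containing a space. The second pattern, which swaps the comma for a space, is only an assumption about what the original question asked for; the class and constant names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical wrapper around the pattern from the post and a space-tolerant variant.
final class SkipPattern {

    // Exactly the pattern from the post: one or more of '0', ',' or ';'.
    static final Pattern POSTED = Pattern.compile("^[0,;]+$");

    // Assumed variant: one or more of '0', ';' or a space.
    static final Pattern WITH_SPACES = Pattern.compile("^[0; ]+$");

    static boolean shouldSkip(String line) {
        // matches() already requires the whole line to match; the anchors are harmless.
        return WITH_SPACES.matcher(line).matches();
    }

    public static void main(String[] args) {
        System.out.println(POSTED.matcher(";0;;;0").matches());  // true
        System.out.println(POSTED.matcher("0;1;0;0").matches()); // false
        System.out.println(POSTED.matcher("0; ;0").matches());   // false: space not in the class
        System.out.println(shouldSkip("0; ;0"));                 // true
    }
}
```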
 