regex - Using /^\s*$/ to match and replace multiple blank lines fails and gives a confusing result

June 15, 2010

Perl version: 5.16.01

I am reading a book about regex that is based on Perl 5.8.

The book states that s/^\s*$/blabla/mg can match and replace multiple blank lines. But when I tried it, I got confusing results.

Code:

    $text = "c\n\n\n\n\nb";
    $text =~ s/^\s*$/<p>/mg;
    print "$text";

Here is the result:

    C:\Users\Administrator\Desktop\regex>perl t2h.pl
    c
    <p><p>
    b

I want to know why I got two <p> between 'c' and 'b' rather than a single one. Did something about /$/ change after 5.8?
   
The lesson here is to beware of regular expressions that can match a zero-width pattern, because you can get unexpected results.

We can see what is happening by displaying the prematch, match, and postmatch of both replacements:

    use strict;
    use warnings;

    my $text = "c\n\n\n\nb";

    # /e runs the replacement as code: print the three parts, then return "<p>"
    $text =~ s{^\s*$}{
        printf qq{<"%s" - "%s" - "%s">\n}, map s/\n/\\n/gr, ($`, $&, $');
        "<p>"
    }emg;

    # Make the remaining newlines visible
    $text =~ s/\n/\\n/g;
    print qq{Result: "$text"};

Output, in the form <"prematch" - "match" - "postmatch">:

    <"c\n" - "\n\n" - "\nb">
    <"c\n\n\n" - "" - "\nb">
    Result: "c\n<p><p>\nb"
Basically, the regex first matches positions 2 to 4, capturing 2 newline characters. After that replacement, it resumes searching from position 4 and matches a zero-width pattern, which adds a second <p>.

The reason this is not straightforward is that our regex replaced the \n\n at positions 2-3 with a <p>. However, the lookbehind assertion (of which ^ is a special case) treats the string as though it had not been modified by the previous passes of the /g regex. Therefore, when it matches at position 4, the regex is looking at c\n\n\n instead of c\n<p> (as shown in our output above), and so it matches ^ followed immediately by $ with nothing in between.

The solution is to avoid relying on a zero-width pattern by using + instead of * in this case.
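As a rough sketch of that fix, the snippet below runs the question's string through both quantifiers so the two results can be compared side by side. The variable names are illustrative and not from the original post, and the /r flag assumes Perl 5.14 or later.

```perl
use strict;
use warnings;

# Same input as in the question: five consecutive newlines between c and b.
my $text = "c\n\n\n\n\nb";

# Original pattern: \s* can match the empty string, so a second,
# zero-width match follows the first one.
my $with_star = $text =~ s/^\s*$/<p>/mgr;

# Suggested fix: \s+ needs at least one whitespace character,
# so the zero-width match is no longer possible.
my $with_plus = $text =~ s/^\s+$/<p>/mgr;

# Make newlines visible before printing each result.
for my $result ($with_star, $with_plus) {
    (my $shown = $result) =~ s/\n/\\n/g;
    print qq{"$shown"\n};
}
```

With the + version, the run of blank lines should collapse into a single <p>, which is what the book's description leads one to expect.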
Secondary Example

Another example of this is the following, simpler regex:

    my $text = "caab";
    $text =~ s/a*/<p>/g;
    print $text;

Output:

    <p>c<p><p>b<p>

The positional breakdown of this match is as follows:

    0 c - matches a zero-width pattern
    1 a - matches a 2-character pattern, positions 1 to 2
    3 b - matches a zero-width pattern
    4 $ - matches a zero-width pattern

So, once again, the lesson is simply to beware of regexes that can match a zero-width pattern.
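One way to check a positional breakdown like this yourself is to walk the matches with m//g in a loop and record where each one starts and ends. This is only a sketch: the @matches array and the output format are made up for illustration, while @- and @+ are Perl's built-in match offset arrays.

```perl
use strict;
use warnings;

my $text = "caab";
my @matches;

# In scalar context, m//g resumes where the previous match ended.
# $-[0] is the start offset of the current match and $+[0] is the
# offset just past its end, so a zero-width match has equal offsets.
while ($text =~ /a*/g) {
    push @matches, sprintf('%d-%d "%s"', $-[0], $+[0], $&);
}

print "$_\n" for @matches;
```

This should report four matches: an empty one at offset 0, "aa" spanning offsets 1 to 3 (end offset exclusive), and empty ones at offsets 3 and 4, which lines up with the four <p> in the output above.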

 


















