PHP: get HTML comments in a string and wrap them in a <pre> tag. Regex or DOM?


I want to find comment tags in a string that are not inside a <pre> tag, and wrap them in a <pre> tag.

It seems there's no way of 'finding' comments using PHP DOM.

I'm already using regex for some of the processing, but I'm unfamiliar with (have yet to grasp or understand) lookaheads and lookbehinds in regex.

For instance, I may have the following code:

<!-- comment 1 -->  <pre>     <div class="some_html"></div>     <!-- comment 2 --> </pre> 

I want to wrap comment 1 in <pre> tags, but not comment 2, as it already resides in a <pre>.

How can this be done in regex?

Here's what I've understood of negative lookarounds, and my attempt at one. I'm obviously doing something wrong!

(?<!<pre>.*?)<!--.*-->(?!.*?</pre>)

You should use a DOM parser if you are planning on re-using this code. Every regex approach will fail horribly sooner rather than later when presented with real-world HTML.
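As a rough sketch of the DOM route (not part of the original answer): PHP's DOMXPath can select comment nodes with the XPath node test comment(), and the predicate not(ancestor::pre) leaves out those that already sit inside a <pre>. The helper name wrapCommentsInPre and the wrap-root container id are made up for illustration, and the input is assumed to be a plain ASCII fragment; other encodings would need extra handling before loadHTML().

```php
<?php
// Sketch: wrap every comment that has no <pre> ancestor in a new <pre> element.
// wrapCommentsInPre() is a hypothetical helper name, not from the original post.
function wrapCommentsInPre(string $html): string
{
    $doc = new DOMDocument();
    // Collect libxml parse warnings internally instead of emitting them.
    libxml_use_internal_errors(true);
    // Wrap the fragment in a known container so it can be extracted again.
    $doc->loadHTML('<div id="wrap-root">' . $html . '</div>');
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // comment() selects comment nodes; the predicate skips those inside <pre>.
    $comments = $xpath->query('//comment()[not(ancestor::pre)]');

    // Copy the node list first, since the tree is modified inside the loop.
    foreach (iterator_to_array($comments) as $comment) {
        $pre = $doc->createElement('pre');
        $comment->parentNode->replaceChild($pre, $comment);
        $pre->appendChild($comment);
    }

    // Serialize only the children of the container, not the added html/body.
    $root = $xpath->query('//div[@id="wrap-root"]')->item(0);
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo wrapCommentsInPre('<!-- comment 1 -->  <pre> <!-- comment 2 --> </pre>');
```

The serialized output may differ from the input in whitespace and tag normalization, since the parser rebuilds the markup from its tree.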

Having said that, here's what you could (but should not, see above) do:

First, identify the comments, e.g. using

<!-- (?:(?!-->).)*--> 

The negative lookahead block ensures that the .* does not run past the end of the comment block.
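In PHP, that comment pattern could be tried out roughly like this. The ~ delimiters and the s modifier (so that . also matches newlines in multi-line comments) are additions for this sketch, not part of the pattern given above.

```php
<?php
// The pattern from above, wrapped in "~" delimiters; "s" lets "." match newlines.
$commentPattern = '~<!-- (?:(?!-->).)*-->~s';

$html = '<!-- comment 1 -->  <pre> <!-- comment 2 --> </pre>';
preg_match_all($commentPattern, $html, $matches);

// $matches[0] holds the full text of each comment that was found,
// regardless of whether it sits inside a <pre> block.
foreach ($matches[0] as $comment) {
    echo $comment, PHP_EOL;
}
```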

Now, you need to figure out whether this comment is inside a <pre> block. The key observation here is that there is an even number of <pre> or </pre> tags following every comment that is not included in one.

So, run through the rest of the text, always in pairs of <pre> tags, and check whether you arrive at the end.
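The counting idea can be checked on its own before folding it into a lookahead. This is only a sketch: isOutsidePre is a hypothetical helper, and it assumes well-formed, non-nested <pre> tags without attributes.

```php
<?php
// isOutsidePre() is a hypothetical helper: it counts the <pre> and </pre>
// tags in the text that follows a given byte offset.
function isOutsidePre(string $html, int $offset): bool
{
    $rest = substr($html, $offset);
    $tagCount = preg_match_all('~</?pre>~', $rest);
    // An even count means the remaining tags pair up as <pre>...</pre>,
    // so the offset itself is not inside a <pre> block.
    return $tagCount % 2 === 0;
}

$html = '<!-- comment 1 -->  <pre> <!-- comment 2 --> </pre>';
var_dump(isOutsidePre($html, strpos($html, '<!-- comment 1'))); // two tags follow
var_dump(isOutsidePre($html, strpos($html, '<!-- comment 2'))); // one tag follows
```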

This would look like

(?=(?:(?!</?pre>).)*(?:</?pre>(?:(?!</?pre>).)*</?pre>(?:(?!</?pre>).)*)*$) 

So, put together, it would be

<!-- (?:(?!-->).)*-->(?=(?:(?!</?pre>).)*(?:</?pre>(?:(?!</?pre>).)*</?pre>(?:(?!</?pre>).)*)*$) 

A hurray for write-only code =)

The prominent building block of this expression is (?:(?!</?pre>).), which matches every character that is not the starting bracket of a <pre> or </pre> sequence.
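Applied from PHP, the combined expression might be used with preg_replace as in the sketch below. The ~ delimiters, the s modifier and the <pre>$0</pre> replacement string are assumptions made for this example. On large inputs the nested lookaheads can hit PCRE's backtracking limit, in which case preg_replace returns null, so the result is worth checking.

```php
<?php
// Combined pattern from above; "s" lets "." cross line breaks.
$pattern = '~<!-- (?:(?!-->).)*-->'
    . '(?=(?:(?!</?pre>).)*(?:</?pre>(?:(?!</?pre>).)*</?pre>(?:(?!</?pre>).)*)*$)~s';

$html = '<!-- comment 1 -->  <pre>     <div class="some_html"></div>     <!-- comment 2 --> </pre> ';

// $0 is the whole matched comment; only comments followed by an even
// number of <pre>/</pre> tags match, so comment 2 is left untouched.
$result = preg_replace($pattern, '<pre>$0</pre>', $html);

if ($result === null) {
    // preg_replace() returns null on a PCRE error such as a backtrack limit.
    fwrite(STDERR, 'regex failed: ' . preg_last_error() . PHP_EOL);
} else {
    echo $result, PHP_EOL;
}
```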

Allowing attributes on <pre> and proper escaping are left as an exercise for the reader. See it in action at regexr.

