regex - Using sed or awk to get the matching substring instead of replacing it


Hi, how can I use sed or awk to extract a substring that matches a regular expression?

I have seen several examples that modify or change a substring, but I only want the matching part.

My data looks like this:

<loc>http://www.a.com/sitemap1.gz</loc>
<loc>http://www.a.com/sitemap2.gz</loc>
<loc>http://www.a.com/sitemap3.gz</loc>
<loc>http://www.a.com/sitemap4.gz</loc>
<loc>http://www.a.com/sitemap5.gz</loc>
<loc>http://www.a.com/sitemap6.gz</loc>
<loc>http://www.a.com/sitemap7.gz</loc>
<loc>http://www.a.com/sitemap8.gz</loc>

The output should be:

http://www.a.com/sitemap1.gz
http://www.a.com/sitemap2.gz
http://www.a.com/sitemap3.gz
....

I tried:

cat data | sed 's/'http.*gz'//'  

but this command removes the part I want to keep. Thanks.

A simple option is grep -o, which prints only the matching part of each line:

$ grep -o 'http[^<]*' file
http://www.a.com/sitemap1.gz
http://www.a.com/sitemap2.gz
http://www.a.com/sitemap3.gz
http://www.a.com/sitemap4.gz
http://www.a.com/sitemap5.gz
http://www.a.com/sitemap6.gz
http://www.a.com/sitemap7.gz
http://www.a.com/sitemap8.gz
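In case the <loc> elements all sit on one line rather than one per line, grep -o should still work, because it prints every match on its own output line. A small sketch, assuming a grep that supports -o (GNU and BSD grep do); the variable names are just for illustration:

```shell
# Two <loc> elements on a single input line (sample data from the question).
line='<loc>http://www.a.com/sitemap1.gz</loc> <loc>http://www.a.com/sitemap2.gz</loc>'

# -o prints each match separately, so one input line can yield several output lines.
grep_out=$(printf '%s\n' "$line" | grep -o 'http[^<]*')
printf '%s\n' "$grep_out"
```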

With awk you can split each line on < and > and print the third field (this assumes one <loc> element per line):

$ awk -F'[<>]' '{print $3}' file
http://www.a.com/sitemap1.gz
http://www.a.com/sitemap2.gz
http://www.a.com/sitemap3.gz
http://www.a.com/sitemap4.gz
http://www.a.com/sitemap5.gz
http://www.a.com/sitemap6.gz
http://www.a.com/sitemap7.gz
http://www.a.com/sitemap8.gz
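Since the question asked about sed and about matching a regular expression, here is a rough sketch of two more ways to do it. The sed version keeps the part captured between the tags instead of deleting it, and the awk version uses match() with substr(). Both assume one <loc> element per line; sample.txt and the variable names are made up for the example.

```shell
# Hypothetical sample file, one <loc> element per line as in the question.
printf '%s\n' \
  '<loc>http://www.a.com/sitemap1.gz</loc>' \
  '<loc>http://www.a.com/sitemap2.gz</loc>' > sample.txt

# sed: -n suppresses the default output; \(...\) captures the text between the
# tags, \1 replaces the whole line with it, and the p flag prints only lines
# where the substitution succeeded.
sed_out=$(sed -n 's/.*<loc>\(.*\)<\/loc>.*/\1/p' sample.txt)
printf '%s\n' "$sed_out"

# awk: match() sets RSTART and RLENGTH for the first match on the line,
# and substr() extracts exactly that part.
awk_out=$(awk 'match($0, /http[^<]*/) { print substr($0, RSTART, RLENGTH) }' sample.txt)
printf '%s\n' "$awk_out"
```

The original attempt removed the URL because s/http.*gz// replaces the match with nothing; capturing the match and replacing the whole line with the capture does the opposite.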
