regex - Use sed or awk to get the matching substring instead of replacing it
Hi, how can I use sed or awk to extract a substring that matches a regular expression?
I have seen several examples that modify or change the substring, but I only want the matching part.
My data looks like this:
<loc>http://www.a.com/sitemap1.gz</loc>
<loc>http://www.a.com/sitemap2.gz</loc>
<loc>http://www.a.com/sitemap3.gz</loc>
<loc>http://www.a.com/sitemap4.gz</loc>
<loc>http://www.a.com/sitemap5.gz</loc>
<loc>http://www.a.com/sitemap6.gz</loc>
<loc>http://www.a.com/sitemap7.gz</loc>
<loc>http://www.a.com/sitemap8.gz</loc>
The output should be:
http://www.a.com/sitemap1.gz
http://www.a.com/sitemap2.gz
http://www.a.com/sitemap3.gz
....
I tried:
cat data | sed 's/'http.*gz'//'
but this command removes the part I want to keep. Thanks.
A simple grep with the -o option (print only the matching part of each line) will do:
$ grep -o 'http[^<]*' file
http://www.a.com/sitemap1.gz
http://www.a.com/sitemap2.gz
http://www.a.com/sitemap3.gz
http://www.a.com/sitemap4.gz
http://www.a.com/sitemap5.gz
http://www.a.com/sitemap6.gz
http://www.a.com/sitemap7.gz
http://www.a.com/sitemap8.gz
With awk, you can set the field separator to either angle bracket, so that the URL becomes the third field:
$ awk -F'[<>]' '{print $3}' file
http://www.a.com/sitemap1.gz
http://www.a.com/sitemap2.gz
http://www.a.com/sitemap3.gz
http://www.a.com/sitemap4.gz
http://www.a.com/sitemap5.gz
http://www.a.com/sitemap6.gz
http://www.a.com/sitemap7.gz
http://www.a.com/sitemap8.gz
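Since the question asked about sed specifically, here is one possible sed version as well. It is only a sketch, and it assumes each <loc> element sits on its own line. The attempt in the question deleted the match because its replacement was empty. The usual trick is to match the whole line, capture the part you want in \( \), and put the capture back with \1. The file name sitemap_sample.txt is made up for this illustration.

```shell
# Hypothetical sample file mirroring the data in the question.
cat > sitemap_sample.txt <<'EOF'
<loc>http://www.a.com/sitemap1.gz</loc>
<loc>http://www.a.com/sitemap2.gz</loc>
EOF

# -n turns off automatic printing; the trailing p prints only the lines
# where the substitution succeeded. \( \) captures the URL, \1 re-inserts it.
# '|' is used as the s-command delimiter so the slashes need no escaping.
extracted=$(sed -n 's|.*<loc>\(http[^<]*\)</loc>.*|\1|p' sitemap_sample.txt)
printf '%s\n' "$extracted"
```

Lines without a <loc> element print nothing, so this also filters out the rest of the file.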