optimization - How to quickly replace many matching items with a single replacement in BASH?
I have a file, "items.txt", containing a list of 100,000 items that I need to remove from the file "text.txt" and replace with "111111111".
I wrote a script that works as I intend:
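#!/bin/bash
a=0
b=`wc -l < ./items.txt`   # total number of items, used for the progress message
while read -r line
do
    a=`expr $a + 1`
    # Rewrites all of text.txt in place once for every item.
    sed -i "s/$line/111111111/g" text.txt
    echo "Removed ($a/$b)."
done < ./items.txt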
This script looks at each line in "items.txt", then uses sed to remove each line from "text.txt".
This script is very slow though. By my estimate, it will take more than 1 week to remove all of the items from the file on my computer. Is there a more efficient way to replace all of the items quickly?
I am using Bash 4.1.5.
Use sed to build a sed script that replaces all of the items:
sed 's/^/s=/;s/$/=111111111=g/' items.txt | sed -f- text.txt
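To make the two stages of that pipeline visible, here is a minimal sketch on made-up sample data (the item names and the temporary directory are assumptions for illustration only). The first sed turns every line of items.txt into one substitution command that uses = as its delimiter; the second sed reads those commands from standard input via -f- and applies them all in a single pass over text.txt. The sketch assumes GNU sed, and items that contain neither = nor regular expression metacharacters.

```shell
#!/bin/bash
# Work in a throwaway directory so no real files are touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Hypothetical sample data standing in for the real files.
printf '%s\n' apple banana > items.txt
printf '%s\n' 'one apple' 'two banana' 'three cherry' > text.txt

# Stage 1: each item becomes one sed command, e.g. "s=apple=111111111=g".
generated=$(sed 's/^/s=/;s/$/=111111111=g/' items.txt)
echo "$generated"

# Stage 2: the generated script is read from stdin (-f-) and applied
# to text.txt in one pass. The result goes to stdout, not back to the file.
result=$(sed 's/^/s=/;s/$/=111111111=g/' items.txt | sed -f- text.txt)
echo "$result"
```

Because the second sed writes to standard output, text.txt itself is left unchanged here; redirect the output to a new file if the result should be kept.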
Update: The following Perl script seems to be faster:
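#!/usr/bin/perl
use warnings;
use strict;

# Read all items and strip their trailing newlines.
open my $items, '<', 'items.txt' or die "items.txt: $!";
my @items = <$items>;
chomp @items;

# Join the items into one alternation and compile it once.
my $regex = join '|', @items;
$regex = qr/$regex/;

# Scan text.txt once, printing each modified line to standard output.
open my $text, '<', 'text.txt' or die "text.txt: $!";
while (<$text>) {
    s/$regex/111111111/g;
    print;
}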
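The Perl script joins every item into a single alternation (item1|item2|...) so that text.txt is scanned only once. As a rough shell analogue of that idea, and not the script above, the sketch below builds the same kind of alternation with paste and hands it to sed -E. The sample data is made up, and the sketch assumes the items contain no regular expression metacharacters and no / character. With 100,000 items the pattern would likely be too long to pass on a command line, so this is only meant to show the technique on small input.

```shell
#!/bin/bash
# Work in a throwaway directory so no real files are touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Hypothetical sample data standing in for the real files.
printf '%s\n' apple banana > items.txt
printf '%s\n' 'one apple' 'two banana' 'three cherry' > text.txt

# Join all items with "|" to form one alternation: apple|banana
regex=$(paste -sd'|' items.txt)
echo "$regex"

# One pass over text.txt; write to a new file instead of editing in place.
sed -E "s/($regex)/111111111/g" text.txt > text.new
replaced=$(cat text.new)
echo "$replaced"
```

Writing to text.new first keeps the original file intact until the result has been checked.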