Patrick Patoray's Blog...

How to Find and Replace text across 20,000 files

Posted March 03, 2006
to: Things I Learned Today

So I needed to remove HTML headers from about 20,000 files. I pretty much suck at shell scripting, but GUI editors choke hard on 20,000 files in a project directory. I tried Xcode and TextMate, and knew from past experience that GoLive wasn't even worth opening up for that many files.

After a bit of Googling, I came up with the following snippet of code to remove the title tags but keep the title text itself (the $1):

perl -0777 -pi -e 's/\t*<TITLE.*>(.*)<\/TITLE>/$1/gi'

After a bit of playing around and examining the files for exactly what I needed, I was able to create the following shell script:

I just had to place it in the root directory that I was working with and let it run through the files one by one. It's actually running right now. I'm guessing that it's going to take about an hour and a half to do its thing.

I'm sure that there are better-optimized ways to do this, so feel free to offer any suggestions.