
Hello, first-time poster. I have two issues that I can't seem to track down.

1. I'm looking for a rule that will flag the following. I want to find all of my web pages that have some snippet of code before the <!DOCTYPE. The <!DOCTYPE should be the very first thing in the HTML on my web pages, but I've seen individuals sneak in code like this: <!-- saved from --> <!DOCTYPE HTML Public ... etc. Is there a regex construct that will fail if any characters are found before <!DOCTYPE?

2. Secondly, I'm looking to find ONLY the PDFs inside the test.com domain. I want to match a pattern like http://www.test.com/snow/squall/index.pdf . I know to start my regex with http://www\.test\.com, but how do I ignore all the directory stuff and key in on the .pdf extension?

Thanks
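
For illustration, here is a minimal sketch of one way both checks might look. The post does not name a regex flavor, so Python's `re` module is an assumption, and the function names `starts_with_doctype` and `is_test_com_pdf` are hypothetical. The first pattern anchors at the very start of the input, so a leading comment or even a space makes it fail. The second skips the directory part and requires `.pdf` at the end of the URL.

```python
import re

# Assumption: Python's re module; the post does not say which regex flavor is in use.
# \A matches only at the very start of the string, so any character
# before <!DOCTYPE (a comment, whitespace, anything) makes the match fail.
DOCTYPE_AT_START = re.compile(r"\A<!DOCTYPE", re.IGNORECASE)

# [^\s?#]* consumes the directory and file name without crossing into a
# query string or fragment; \.pdf must then close out the URL.
PDF_ON_TEST_COM = re.compile(r"http://www\.test\.com/[^\s?#]*\.pdf", re.IGNORECASE)


def starts_with_doctype(html: str) -> bool:
    """Return True only if the page text begins with <!DOCTYPE."""
    return DOCTYPE_AT_START.match(html) is not None


def is_test_com_pdf(url: str) -> bool:
    """Return True only if the whole URL is a .pdf under www.test.com."""
    return PDF_ON_TEST_COM.fullmatch(url) is not None


if __name__ == "__main__":
    print(starts_with_doctype("<!-- saved from --> <!DOCTYPE HTML Public"))
    print(is_test_com_pdf("http://www.test.com/snow/squall/index.pdf"))
```

`fullmatch` is used for the URL check so the pattern has to cover the entire string; in flavors without it, wrapping the pattern in `^` and `$` is the usual equivalent. The case-insensitive flag is a design choice here so that `.PDF` and `<!doctype` are treated the same as their other-case forms.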