Check for 404 errors in bulk using this simple Python script and a list of URLs

Eli Williams
2 min read · Jun 15, 2020

Continuing the theme of publishing the quick-and-dirty scripts I use, here's one for when I need to check the status of a bunch of URLs without overthinking it.

import requests

def get_url_status(urls):
    # checks the status for each url in the list urls
    for url in urls:
        try:
            r = requests.get(url)
            print(url + "\tStatus: " + str(r.status_code))
        except Exception as e:
            print(url + "\tNA FAILED TO CONNECT\t" + str(e))
    return None

def main():
    urls = ["https://www.foundryoutdoors.com"]
    get_url_status(urls)

if __name__ == "__main__":
    main()

Simply save the script, replace my urls list with your own, open a terminal, cd into the directory with the script, and run it. If you don't have the requests library, install it with the following at your command prompt:

pip install requests
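
For example, assuming you saved the file as check_urls.py (name it whatever you like), the terminal steps look something like this:

cd path/to/your/script
python check_urls.py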

If you own a website with thousands of pages, including a bunch of dead ones, this script can help you find which pages are returning 404 (Not Found) errors so you can get to work redirecting or deleting them. As you may know, an HTTP 200 status code means the request succeeded.
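
If all you care about is collecting the dead pages, a small variation on the same idea works; collect_404s below is just an illustrative name, not part of the script above:

import requests

def collect_404s(urls):
    # return only the URLs that respond with a 404 status code
    dead = []
    for url in urls:
        try:
            r = requests.get(url)
            if r.status_code == 404:
                dead.append(url)
        except Exception:
            # skip URLs that fail to connect at all
            continue
    return dead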

The actual version I use in practice imports from a .csv and writes to another .csv upon completion, but for simplicity's sake the above version takes in a Python list of URLs and prints each URL and its status on its own line, separated by a tab. You should be able to copy and paste the output straight into an Excel or Google Sheets spreadsheet.
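
Here's a rough sketch of that CSV idea (not the exact script I use); the file names urls.csv and results.csv, and the assumption that the URLs live in the first column, are just for illustration:

import csv
import requests

def check_urls_from_csv(in_path, out_path):
    # read URLs from the first column of in_path and write url,status rows to out_path
    with open(in_path, newline="") as infile, open(out_path, "w", newline="") as outfile:
        reader = csv.reader(infile)
        writer = csv.writer(outfile)
        writer.writerow(["url", "status"])
        for row in reader:
            url = row[0]
            try:
                r = requests.get(url)
                writer.writerow([url, r.status_code])
            except Exception as e:
                writer.writerow([url, "FAILED: " + str(e)])

check_urls_from_csv("urls.csv", "results.csv")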

This script does not handle errors very robustly, so use it with caution. It also requires the requests library to be installed. Finally, note that some websites will spit back errors if they have taken measures to prevent scraping (e.g. Amazon).
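
If you hit that, one thing worth trying (no guarantees, it depends on the site) is sending a browser-style User-Agent header, plus a timeout so one slow page doesn't hang the whole run:

import requests

# a browser-style User-Agent; some sites reject the default one requests sends
HEADERS = {"User-Agent": "Mozilla/5.0"}

def get_url_status(urls):
    for url in urls:
        try:
            # timeout keeps a single unresponsive page from stalling everything
            r = requests.get(url, headers=HEADERS, timeout=10)
            print(url + "\tStatus: " + str(r.status_code))
        except Exception as e:
            print(url + "\tNA FAILED TO CONNECT\t" + str(e))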

If you liked this, follow me on Twitter @elitwilliams.
