This is a website for the Pro Python System Administration book, written by Rytis Sileika and published by Apress. Some of the projects described in the book are available on GitHub.

  Buy from Amazon (US) »   Buy from Amazon (UK) »

Preview on Google Books →

Latest blog post

Quick way to check for URL redirects

13 Mar 2013

Recently I had to go through a bunch of URLs and identify whether or not they redirect to another URL. Since the list was rather long (several hundred entries), I decided to write a script to do this for me.

I used the Requests library to make the initial request for a given URL. If the response status code was 3XX, I assumed the URL redirects elsewhere and reported the target from the Location header. The script also attempts to handle a couple of edge cases: timeouts and generic connection errors, caused by either a non-existent domain or a network error.
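The status-code check described above can be sketched as a small pure function, which makes it easy to try out without touching the network. The name `redirect_target` and the use of a plain dict for the headers are assumptions made for illustration; they are not part of the original script. One detail worth noting: some 3XX responses (304 Not Modified, for example) carry no Location header, so this sketch looks the header up with `.get()` instead of indexing.

```python
def redirect_target(status_code, headers):
    """Return the redirect target for a response, or a bracketed label.

    Hypothetical helper: `headers` is assumed to be a mapping with
    lower-case keys (Requests itself offers case-insensitive lookup).
    """
    if 300 <= status_code < 400:
        # Not every 3XX response has a Location header (e.g. 304),
        # so fall back to a label instead of raising KeyError.
        return headers.get('location', '[3XX without location]')
    return '[no redirect]'


if __name__ == '__main__':
    print(redirect_target(301, {'location': 'http://www.example.com/'}))
    print(redirect_target(200, {}))
    print(redirect_target(304, {}))
```

With Requests, the same function could be fed `r.status_code` and `r.headers` from a call made with `allow_redirects=False`.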

The script accepts a single parameter, which should be a filename containing URL strings. If the filename is not provided, the script will look for domains.txt in the current directory.

#!/usr/bin/env python

import sys
import requests

def check_for_redirects(url):
    try:
        # Do not follow redirects; only the first response is inspected.
        r = requests.get(url, allow_redirects=False, timeout=0.5)
        if 300 <= r.status_code < 400:
            return r.headers['location']
        else:
            return '[no redirect]'
    except requests.exceptions.Timeout:
        return '[timeout]'
    except requests.exceptions.ConnectionError:
        return '[connection error]'

def check_domains(urls):
    for url in urls:
        url_to_check = url if url.startswith(('http://', 'https://')) else "http://%s" % url
        redirect_url = check_for_redirects(url_to_check)
        print("%s => %s" % (url_to_check, redirect_url))

if __name__ == '__main__':
    fname = 'domains.txt'
    try:
        fname = sys.argv[1]
    except IndexError:
        # No filename given; fall back to domains.txt.
        pass
    urls = (l.strip() for l in open(fname).readlines())
    check_domains(urls)
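The way the script fills in a missing protocol can also be pulled out into its own helper. The sketch below is one possible variant, not the original code: the names `with_scheme` and `clean_lines` are hypothetical, and `example.com` is only a placeholder. It tests for a full `http://` or `https://` prefix, because a bare host name such as `httpbin.org` also starts with the letters "http", and it skips blank lines from the input file.

```python
def with_scheme(url, default_scheme='http'):
    """Prefix `url` with a scheme unless it already has http:// or https://."""
    url = url.strip()
    if url.startswith(('http://', 'https://')):
        return url
    return '%s://%s' % (default_scheme, url)


def clean_lines(lines):
    """Yield scheme-qualified URLs, skipping blank lines."""
    for line in lines:
        if line.strip():
            yield with_scheme(line)


if __name__ == '__main__':
    sample = ['example.com\n', '\n', 'https://example.org/\n', 'httpbin.org\n']
    for url in clean_lines(sample):
        print(url)
```

Keeping this step separate from the network call means the input handling can be checked on its own, independently of timeouts or DNS failures.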

Below is an example list of URLs. If the protocol is not specified, http is assumed.

And the results: => => => => => => [connection error]
