Snippet Get Web Page in Python

TODO: Update this with the other post for `requests`

TODO: I think this is a duplicate post. Need to remove one or the other if that's the case.

Here's the simple snippet I use to scrape basic web pages in Python.

Code

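import urllib.error
import urllib.request


def get_web_page(url):
    """Return the page at url as text, or "" if the request fails."""
    try:
        with urllib.request.urlopen(url) as response:
            return response.read().decode("utf-8")
    except urllib.error.URLError:
        # Covers HTTPError (4xx/5xx responses) and connection failures.
        return ""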

Most of the examples I see only have the middle part:

Code

with urllib.request.urlopen(url) as response:
    return response.read().decode("utf-8")

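The problem is that `urlopen` raises an `HTTPError` when the server sends back an error status code (a 404 or 500, for example), and that crashes the script if nothing catches it. Wrapping the call in a `try` lets you handle that. Here, a failed request just returns an empty string.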
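If you want to know why a request failed instead of only getting an empty string back, something like the following sketch might help. The function name `get_web_page_verbose` and the 10 second `timeout` are my own assumptions for illustration, not part of the original snippet. It relies on `HTTPError` being a subclass of `URLError`, so the more specific handler has to come first.

```python
import urllib.error
import urllib.request


def get_web_page_verbose(url, timeout=10):
    """Return the page body as text, or "" after printing why the request failed.

    Hypothetical variant of get_web_page; the name and timeout are assumptions.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8")
    except urllib.error.HTTPError as e:
        # The server answered, but with an error status such as 404 or 500.
        print(f"HTTP error {e.code} for {url}")
    except urllib.error.URLError as e:
        # No usable answer at all: DNS failure, refused connection, and so on.
        print(f"Could not reach {url}: {e.reason}")
    return ""
```

The return value is the same as before (page text or an empty string), so it should drop in wherever `get_web_page` is used.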

Here's a full sample:

Code

#!/usr/bin/env python3

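import urllib.error
import urllib.request


def get_web_page(url):
    """Return the page at url as text, or "" if the request fails."""
    try:
        with urllib.request.urlopen(url) as response:
            return response.read().decode("utf-8")
    except urllib.error.URLError:
        # Covers HTTPError (4xx/5xx responses) and connection failures.
        return ""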

if __name__ == "__main__":
    html_doc = get_web_page("https://www.alanwsmith.com/")
    print(html_doc)