Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

Comments

  1. Prefer the third-party requests library over urllib unless you need to keep third-party dependencies to a minimum.

  2. You must explicitly import urllib.request in order to use it in Python 3; importing urllib alone does not make the submodule available. See https://bugs.python.org/issue36701 for more discussion. This is how Python 3 packages work in general: submodules are not imported automatically. There are a few exceptions, such as os.path.

import urllib.request

urllib.request.urlopen

r = urllib.request.urlopen("https://github.com/dclong/dsutil/releases/latest")
r.url
'https://github.com/dclong/dsutil/releases/tag/v0.10.0'
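The cell above relies on urlopen following HTTP redirects and exposing the final address as the `url` attribute of the response. A minimal sketch of the same idea wrapped in a function is shown below. The name `resolve_final_url` and the `timeout` default are assumptions made for illustration; they are not part of urllib.

```python
import urllib.request


def resolve_final_url(url: str, timeout: float = 10.0) -> str:
    """Return the URL that `url` resolves to after redirects are followed.

    Hypothetical helper for illustration; not part of urllib.
    """
    # urlopen follows redirects by default. The response's `url` attribute
    # holds the address of the resource that was actually retrieved.
    # Using the response as a context manager closes the connection promptly.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.url
```

Calling `resolve_final_url("https://github.com/dclong/dsutil/releases/latest")` should return the URL of whatever release is tagged as the latest at the time of the call.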

urllib.request.urlretrieve

urllib.request.urlretrieve downloads a file from a URL to a local path. It returns a tuple of the local file name and the response headers.

file, obj = urllib.request.urlretrieve(
    "http://www.legendu.net/media/download_code_server.py",
    "/tmp/download_code_server.py",
)
file
'/tmp/download_code_server.py'
obj
<http.client.HTTPMessage at 0x7fe9efc404a8>
!ls /tmp/download_code_server.py
/tmp/download_code_server.py
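The Python documentation lists urlretrieve among the legacy functions that might be deprecated in the future. As a possible alternative, the sketch below streams a response from urlopen into a local file using shutil.copyfileobj. The function name `download` and its `timeout` default are assumptions made for illustration, and unlike urlretrieve it returns only the local path, not the headers.

```python
import shutil
import urllib.request
from pathlib import Path


def download(url: str, dest: str, timeout: float = 30.0) -> Path:
    """Stream the resource at `url` into the local file `dest`.

    Hypothetical helper for illustration; not part of urllib.
    """
    path = Path(dest)
    # Open the response and the local file together so that both are
    # closed even if the copy fails part-way through.
    with urllib.request.urlopen(url, timeout=timeout) as response, path.open("wb") as out:
        # copyfileobj copies in chunks, so the body is never fully held in memory.
        shutil.copyfileobj(response, out)
    return path
```

For example, `download("http://www.legendu.net/media/download_code_server.py", "/tmp/download_code_server.py")` would write to the same local path as the urlretrieve call above.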
type(obj)
http.client.HTTPMessage
dir(obj)
['__bytes__', '__class__', '__contains__', '__delattr__', '__delitem__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_charset', '_default_type', '_get_params_preserve', '_headers', '_payload', '_unixfrom', 'add_header', 'as_bytes', 'as_string', 'attach', 'defects', 'del_param', 'epilogue', 'get', 'get_all', 'get_boundary', 'get_charset', 'get_charsets', 'get_content_charset', 'get_content_disposition', 'get_content_maintype', 'get_content_subtype', 'get_content_type', 'get_default_type', 'get_filename', 'get_param', 'get_params', 'get_payload', 'get_unixfrom', 'getallmatchingheaders', 'is_multipart', 'items', 'keys', 'policy', 'preamble', 'raw_items', 'replace_header', 'set_boundary', 'set_charset', 'set_default_type', 'set_param', 'set_payload', 'set_raw', 'set_type', 'set_unixfrom', 'values', 'walk']
obj.as_string()
'Server: GitHub.com\nContent-Type: application/octet-stream\nLast-Modified: Fri, 24 Jan 2020 20:21:29 GMT\nETag: "5e2b51c9-2de"\nAccess-Control-Allow-Origin: *\nExpires: Fri, 24 Jan 2020 20:34:29 GMT\nCache-Control: max-age=600\nX-Proxy-Cache: MISS\nX-GitHub-Request-Id: 6ACA:869A:42BECA:4B481B:5E2B527D\nContent-Length: 734\nAccept-Ranges: bytes\nDate: Fri, 24 Jan 2020 22:14:08 GMT\nVia: 1.1 varnish\nAge: 13\nConnection: close\nX-Served-By: cache-sea4477-SEA\nX-Cache: HIT\nX-Cache-Hits: 2\nX-Timer: S1579904049.754540,VS0,VE0\nVary: Accept-Encoding\nX-Fastly-Request-ID: c6c2ef45f576ba81de6fa160a79b67dfda5beaac\n\n'
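Rather than parsing the string returned by as_string(), individual fields can be read directly from the header object, since http.client.HTTPMessage is a subclass of email.message.Message. The sketch below picks out a few fields; the function name `summarize_headers` and the choice of fields are assumptions made for illustration.

```python
from email.message import Message


def summarize_headers(headers: Message) -> dict:
    """Pick a few commonly used fields out of a response header object.

    Hypothetical helper for illustration; not part of urllib or http.client.
    """
    # Header lookup is case-insensitive; get() returns None for a missing field.
    length = headers.get("Content-Length")
    return {
        # get_content_type() returns the lowercased media type without parameters.
        "content_type": headers.get_content_type(),
        "content_length": int(length) if length is not None else None,
        "last_modified": headers.get("Last-Modified"),
    }
```

Passing the `obj` returned by urlretrieve above to `summarize_headers` should yield a dictionary with the content type, the length as an integer, and the Last-Modified string from that response.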