Packaging Python Dependencies for PySpark Using python-build-standalone
You can build a portable Python environment by following the steps below.
- Install python-build-standalone.
- Install Python packages using the pip of the installed python-build-standalone distribution.
- Pack the whole python-build-standalone directory into a compressed file, e.g., env.tar.gz.
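The packing step can be sketched with the standard-library tarfile module. This is a minimal illustration, not the method used by the repository mentioned below: the helper name pack_env and the throwaway directory standing in for the real python-build-standalone distribution are assumptions made for the example.

```python
import tarfile
import tempfile
from pathlib import Path


def pack_env(env_dir: Path, archive: Path) -> Path:
    """Pack env_dir into a gzip-compressed tarball such as env.tar.gz.

    pack_env is a hypothetical helper, not part of python-build-standalone.
    """
    with tarfile.open(archive, "w:gz") as tar:
        # arcname="." stores paths relative to the environment root,
        # so the archive does not embed the absolute build path.
        tar.add(env_dir, arcname=".")
    return archive


# Demonstration on a throwaway directory that stands in for the real distribution.
with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "python"
    (env_dir / "bin").mkdir(parents=True)
    (env_dir / "bin" / "python3").write_text("placeholder\n")

    archive = pack_env(env_dir, Path(tmp) / "env.tar.gz")

    # List what ended up in the archive.
    with tarfile.open(archive) as tar:
        packed_names = sorted(tar.getnames())
    print(packed_names)
```

For a real environment, point env_dir at the directory where python-build-standalone was unpacked, after installing packages with that distribution's own pip.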
The GitHub repo dclong/python-portable has good examples of building portable Python environments leveraging …
Packaging Python Dependencies for PySpark Using conda-pack
python-build-standalone is a better alternative to conda-pack for managing Python dependencies for PySpark. Please refer to Packaging Python Dependencies for PySpark Using python-build-standalone for a tutorial on how to use python-build-standalone to manage Python dependencies for PySpark.
Build Portable Python Environments Using conda-pack
Please refer to the GitHub repo dclong/conda_environ …
The list Collection in Python
Tips and Traps
- list is essentially a resizable array of objects in Python.
- Almost all methods of list are in-place.
- list.pop is in-place and returns the removed element.
- To get unique elements in a list, you can first coerce the list to a set and then convert the set back to a list (note that the original order is not preserved):
unique_list = list(set(alist))
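The tips above can be illustrated with a short sketch. The sample values are made up, and the dict.fromkeys variant is an addition for cases where the original order matters; it relies on dict keys keeping insertion order (Python 3.7+).

```python
alist = [3, 1, 3, 2, 1]

# list.pop mutates the list in place and returns the removed element.
last = alist.pop()  # removes the trailing 1
assert last == 1 and alist == [3, 1, 3, 2]

# In-place methods such as sort return None rather than a new list.
scratch = alist.copy()
assert scratch.sort() is None

# set() removes duplicates but does not keep the original order.
unique_unordered = list(set(alist))

# dict keys keep insertion order, so this variant preserves first occurrences.
unique_ordered = list(dict.fromkeys(alist))
print(unique_ordered)  # [3, 1, 2]
```

Both approaches require the elements to be hashable.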
Working with Iterators in Python
Iterator vs Generator
A generator is a special case of an iterator.
Generators are easy and convenient to use, but they come at an additional cost (memory and speed).
If you need performance, use plain iterators (with the help of the itertools module).
If you need convenience and concise code, use generators.
Please refer to Python Generator vs Iterator for more detailed discussions.
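As a small illustration of the two styles, the sketch below produces the same sequence once with a generator function and once by composing itertools building blocks. The function name squares_gen is hypothetical, and the example does not measure the performance difference claimed above; profile your own workload before choosing.

```python
import itertools


def squares_gen(n):
    """Generator function: lazily yields the first n squares."""
    for i in range(n):
        yield i * i


gen = squares_gen(5)

# A generator is itself an iterator: iter() returns the same object.
assert iter(gen) is gen

# The same sequence built from itertools pieces, without a generator function.
squares_iter = itertools.islice(map(lambda i: i * i, itertools.count()), 5)

from_gen = list(gen)
from_iter = list(squares_iter)
print(from_gen)   # [0, 1, 4, 9, 16]
print(from_iter)  # [0, 1, 4, 9, 16]
```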
Install Python Packages Using pip
PyPI Statistics
You can check download statistics of Python packages on PyPI at https://pypistats.org/. This is especially helpful when you have to choose among multiple packages.
Prefer pip
pip is preferred over OS tools (e.g., apt-get, yum, wajig, aptitude) for managing Python packages.
If you are …