Ben Chuanlong Du's Blog

And let it direct your passion with reason.

Hands on the Python Library pexpect

Tips and Traps

  1. The command-line tools of some applications (e.g., network applications) can be slow to authenticate. If you use pexpect to automate such a tool, it is best to wait for some time after sending the password with child.sendline(passwd). If the authentication prints output on both success and failure, a smarter approach is to wait for the success or failure message to appear.
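
The "wait for the success or failure message" idea can be sketched as below. This is a minimal sketch rather than a definitive recipe: the helper name send_password, the patterns and the timeout are illustrative assumptions, and child is assumed to behave like a pexpect.spawn object (only its sendline and expect methods are used).

```python
def send_password(child, passwd, success_pattern, failure_pattern, timeout=30):
    """Send a password, then block until the tool reports success or failure.

    `child` is assumed to behave like a pexpect.spawn object.
    Returns True if the success pattern matched, False if the failure
    pattern matched. With a real pexpect child, pexpect.TIMEOUT or
    pexpect.EOF is raised if neither pattern shows up in time.
    """
    child.sendline(passwd)
    # expect() blocks until one of the patterns matches and returns its index.
    index = child.expect([success_pattern, failure_pattern], timeout=timeout)
    return index == 0


# Possible usage (hypothetical command and messages, adapt them to your tool):
#
#   import pexpect
#   child = pexpect.spawn("some-network-tool login", encoding="utf-8")
#   child.expect("Password:")
#   ok = send_password(child, "secret", "Login succeeded", "Login failed")
```

Waiting on an explicit message avoids guessing how long a fixed sleep needs to be, and the timeout still bounds the wait if the tool hangs.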

Spark Issue: Shell Related

Symptom 1

/bin/sh: hdfs: command not found

Possible Causes of Symptom 1

The command hdfs is not on the search path.

Possible Solutions to Symptom 1

  1. Use the full path to the command.
  2. Configure the environment variable PATH before you use the command.
  3. Find other alternatives to the command …
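
The first two solutions can be combined in a small helper. The sketch below is an illustration under stated assumptions: the helper name resolve_command is made up for this example, and the directory /opt/hadoop/bin in the usage comment is a hypothetical install location, not a known path on your cluster.

```python
import os
import shutil


def resolve_command(name, extra_dirs=()):
    """Return the full path to `name`, searching PATH and then `extra_dirs`."""
    path_dirs = [d for d in os.environ.get("PATH", "").split(os.pathsep) if d]
    search_path = os.pathsep.join(path_dirs + list(extra_dirs))
    full_path = shutil.which(name, path=search_path)
    if full_path is None:
        raise FileNotFoundError(f"{name} not found on: {search_path}")
    return full_path


# Possible usage (the extra directory is a hypothetical location):
#
#   import subprocess
#   hdfs = resolve_command("hdfs", extra_dirs=["/opt/hadoop/bin"])
#   subprocess.run([hdfs, "dfs", "-ls", "/"], check=True)
```

Resolving the command up front turns the shell's "command not found" into a Python exception that lists the directories that were searched, which is usually easier to debug.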

Hands on the Python module subprocess

General Tips

  1. The function subprocess.run is preferred over the older high-level APIs (subprocess.call, subprocess.check_call and subprocess.check_output). The class subprocess.Popen (which powers the high-level APIs) can be used if you need more advanced control. When running a shell command using subprocess.run,

    1. Avoid using the system shell (i.e., avoid shell=True) for two reasons. First, it protects you from shell injection attacks. Second, you do not need to manually escape special characters in the command.
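
As a small illustration of both reasons, the sketch below passes the command as a list of arguments, so no shell ever parses it. The helper name echo_argument is made up for this example, and the child process is simply the current Python interpreter so that the snippet runs anywhere.

```python
import subprocess
import sys


def echo_argument(arg):
    """Run a child Python process that prints its first argument verbatim."""
    result = subprocess.run(
        # A list of arguments: each item reaches the child process as-is.
        [sys.executable, "-c", "import sys; print(sys.argv[1])", arg],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.rstrip("\n")


# Spaces, semicolons and dollar signs need no escaping and are not interpreted.
tricky = "a b; echo injected $HOME"
print(echo_argument(tricky))
```

With shell=True and string formatting, the same input would need careful quoting, and the part after the semicolon could run as a separate command.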