Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!
There is no direct way to select all columns except a few from a table using SQL. However, this is easy to do with DataFrame APIs (pandas, Spark/PySpark, etc.).
import pandas as pd
import findspark

findspark.init("/opt/spark-3.0.1-bin-hadoop3.2/")
from pyspark.sql import SparkSession, DataFrame
from pyspark.sql.functions import *
from pyspark.sql.types import StructType

spark = SparkSession.builder.appName("Case/When").enableHiveSupport().getOrCreate()

df = spark.createDataFrame(
pd.DataFrame(
data=(
("Ben", "Du", 1),
("Ben", "Du", 2),
("Ben", "Tu", 3),
("Ben", "Tu", 4),
("Ken", "Xu", 1),
("Ken", "Xu", 9),
),
columns=("fname", "lname", "score"),
)
)
df.show()
+-----+-----+-----+
|fname|lname|score|
+-----+-----+-----+
| Ben| Du| 1|
| Ben| Du| 2|
| Ben| Tu| 3|
| Ben| Tu| 4|
| Ken| Xu| 1|
| Ken| Xu| 9|
+-----+-----+-----+
Select all columns except score.
df.select(*[col for col in df.columns if col not in ("score",)]).show()
+-----+-----+
|fname|lname|
+-----+-----+
| Ben| Du|
| Ben| Du|
| Ben| Tu|
| Ben| Tu|
| Ken| Xu|
| Ken| Xu|
+-----+-----+
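Note that the tuple must be written as `("score",)` (with the trailing comma); `col not in ("score")` tests substring membership in the string `"score"`, which happens to work here but would wrongly exclude a column named, e.g., `core`. An alternative to enumerating the columns to keep is to drop the unwanted ones: PySpark's `DataFrame.drop("score")` does this directly, and pandas offers the same via `DataFrame.drop`. A minimal pandas sketch using the same toy data:

```python
import pandas as pd

# Same toy data as above, in a pandas DataFrame.
pdf = pd.DataFrame(
    data=(
        ("Ben", "Du", 1),
        ("Ben", "Du", 2),
        ("Ben", "Tu", 3),
        ("Ben", "Tu", 4),
        ("Ken", "Xu", 1),
        ("Ken", "Xu", 9),
    ),
    columns=("fname", "lname", "score"),
)

# Drop the unwanted column(s) instead of enumerating the kept ones.
res = pdf.drop(columns=["score"])
print(res.columns.tolist())  # ['fname', 'lname']
```

`drop` returns a new DataFrame by default, so the original `pdf` keeps its `score` column.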