Ben Chuanlong Du's Blog

It is never too late to learn.

Select All Columns Except a Few from a Table

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

Comments

There is no (direct) way of select all columns except a few from a table using SQL. However, this is easily doable with DataFrame APIs (pandas, Spark/PySpark, etc.).

In [1]:
import pandas as pd
import findspark

findspark.init("/opt/spark-3.0.1-bin-hadoop3.2/")

from pyspark.sql import SparkSession, DataFrame
from pyspark.sql.functions import *
from pyspark.sql.types import StructType

spark = SparkSession.builder.appName("Case/When").enableHiveSupport().getOrCreate()
In [2]:
df = spark.createDataFrame(
    pd.DataFrame(
        data=(
            ("Ben", "Du", 1),
            ("Ben", "Du", 2),
            ("Ben", "Tu", 3),
            ("Ben", "Tu", 4),
            ("Ken", "Xu", 1),
            ("Ken", "Xu", 9),
        ),
        columns=("fname", "lname", "score"),
    )
)
df.show()
+-----+-----+-----+
|fname|lname|score|
+-----+-----+-----+
|  Ben|   Du|    1|
|  Ben|   Du|    2|
|  Ben|   Tu|    3|
|  Ben|   Tu|    4|
|  Ken|   Xu|    1|
|  Ken|   Xu|    9|
+-----+-----+-----+

Select all columns except score.

In [4]:
df.select(*[col for col in df.columns if col not in ("score")]).show()
+-----+-----+
|fname|lname|
+-----+-----+
|  Ben|   Du|
|  Ben|   Du|
|  Ben|   Tu|
|  Ben|   Tu|
|  Ken|   Xu|
|  Ken|   Xu|
+-----+-----+

Comments