Ben Chuanlong Du's Blog

And let it direct your passion with reason.

Quickly Create a Scala Project Using Gradle in IntelliJ IDEA

Easy Way

  1. Create a directory (e.g., demo_proj) for your project.

  2. Run gradle init --type scala-library in a terminal from that directory.

  3. Import the directory as a Gradle project in IntelliJ IDEA. Alternatively, add apply plugin: 'idea' to build.gradle and then run ./gradlew openIdea to …

Union RDDs in Spark

Spark 2.1.0+ does not deduplicate when unioning RDDs or DataFrames (for efficiency), so duplicate rows are kept.
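
To see that union keeps duplicate rows, here is a minimal sketch. It assumes Spark SQL is on the classpath and runs against a local session; the object name, the helper unionCounts, the column name id, and the sample values are all made up for illustration.

```scala
import org.apache.spark.sql.SparkSession

object UnionKeepsDuplicates {
  // Hypothetical helper: returns (row count after union, row count after union + distinct).
  def unionCounts(spark: SparkSession): (Long, Long) = {
    import spark.implicits._

    val df1 = Seq(1, 2, 3).toDF("id")
    val df2 = Seq(3, 4).toDF("id")

    // union appends the rows of df2 to df1 without removing duplicates.
    val unioned = df1.union(df2)

    // distinct() removes the duplicates explicitly when set semantics are needed.
    (unioned.count(), unioned.distinct().count())
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; app name and master are assumptions.
    val spark = SparkSession.builder()
      .appName("union-demo")
      .master("local[*]")
      .getOrCreate()

    val (withDuplicates, deduplicated) = unionCounts(spark)
    println(s"union: $withDuplicates rows, union + distinct: $deduplicated rows")
    // prints "union: 5 rows, union + distinct: 4 rows" (the value 3 appears in both inputs)

    spark.stop()
  }
}
```

Deduplication requires a shuffle, which is why it is left as an explicit step rather than done on every union.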

  1. Union 2 RDDs.

    df1.union(df2)
    // or for old-fashioned RDD
    rdd1.union(rdd2)
    
  2. Union multiple RDDs.

    val df = Seq(df1, df2, df3).reduce(_ union _) // SparkSession has no union method, so union the DataFrames pairwise
    // or for old-fashioned RDD
    rdd …