How to create a Singleton SparkSession in a Scala Application?

Hi Friends,

Today I am going to explain how to create a singleton SparkSession in a Scala application. The idea is to define a method in a class that builds the SparkSession only once, returns it, and is then called from the main class.
Below is the code for the same, in which I've also used a Logger to log the info.

import java.util.Properties
import org.apache.log4j.PropertyConfigurator
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession
import org.slf4j.LoggerFactory

/** Created by anamika_singh on 2/13/2020. */

class SingletonSparkSession {

  val logger = LoggerFactory.getLogger(classOf[SingletonSparkSession])

  // Read the properties file and keep the properties available for use during processing.
  // The same Properties object is also passed to log4j's PropertyConfigurator, so any log4j settings in the file take effect.
  val connectionParam = new Properties
  connectionParam.load(getClass.getResourceAsStream("/generic.properties"))
  PropertyConfigurator.configure(connectionParam)

  // Get the required details from the properties file to avoid hardcoding values inside the code.
  val jobName = connectionParam.getProperty("jobName")
  val defaultParallelism = connectionParam.getProperty("defaultParallelism")

  @transient private var sparkSession: SparkSession = null

  /* Build the SparkSession on the first call and return the same instance on every subsequent call. */
  def getSparkSession(): SparkSession = {

    val conf = new SparkConf()
      .setAppName(jobName)
      .set("spark.default.parallelism", defaultParallelism)
      .setIfMissing("spark.master", "local[*]")

    if (sparkSession == null) {
      sparkSession = SparkSession.builder().config(conf).getOrCreate()
      logger.info("[=============== Spark Session Created ===============]")
    }

    sparkSession
  }
}
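
The generic.properties file itself isn't shown in this post, but based on the property names read in the class above and the values printed in the output further down, it could look something like this (the log4j entries are only an illustrative assumption, included because the same Properties object is also handed to log4j's PropertyConfigurator):

# generic.properties (example contents; values taken from the output below, log4j lines are illustrative)
jobName=SparkSession-Testing
defaultParallelism=2

# Optional log4j settings, picked up because the same file is passed to PropertyConfigurator
log4j.rootLogger=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%c{1}: %m%n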

// Now we can use the above SparkSession across the application as shown below:
object CallingSparkSession extends App {

  // Create an instance of the SingletonSparkSession class and call getSparkSession, which returns the SparkSession.
  val singletonSparkSession = new SingletonSparkSession()
  val spark = singletonSparkSession.getSparkSession()

  // Print the variables read from the properties file to verify them.
  println("Spark App Name  : " + singletonSparkSession.jobName)
  println("Number of Parallelism for Spark job : " + singletonSparkSession.defaultParallelism)

  // Now we can use the SparkSession as per requirement.
  // For example, below I am running a simple Spark SQL query to test it.
  val testDF = spark.sql("select 1, 2")
  testDF.show()

}

Output: Below is the output of the above application:

SingletonSparkSession: [=============== Spark Session Created ===============]
Spark App Name  : SparkSession-Testing
Number of Parallelism for Spark job : 2

+---+---+
|  1|  2|
+---+---+
|  1|  2|
+---+---+
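
On a side note, SingletonSparkSession above is an ordinary class, so each new SingletonSparkSession() carries its own sparkSession variable; the single-instance behavior really comes from the null check together with SparkSession.builder().getOrCreate(), which returns the already created session if one exists. If you want something that is a singleton by construction, a Scala object with a lazy val achieves the same result. The sketch below is only a minimal illustration (the name SparkSessionProvider is my own choice, not code from this post) and reuses the same property names as above:

import java.util.Properties
import org.apache.log4j.PropertyConfigurator
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// A minimal sketch: the JVM creates a Scala object only once, and the lazy val
// builds the SparkSession on first access and reuses it afterwards.
object SparkSessionProvider {

  private val connectionParam = new Properties
  connectionParam.load(getClass.getResourceAsStream("/generic.properties"))
  PropertyConfigurator.configure(connectionParam)

  val jobName: String = connectionParam.getProperty("jobName")
  val defaultParallelism: String = connectionParam.getProperty("defaultParallelism")

  lazy val sparkSession: SparkSession = {
    val conf = new SparkConf()
      .setAppName(jobName)
      .set("spark.default.parallelism", defaultParallelism)
      .setIfMissing("spark.master", "local[*]")
    SparkSession.builder().config(conf).getOrCreate()
  }
}

// Usage from anywhere in the application:
// val spark = SparkSessionProvider.sparkSession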

So this is how we can create a singleton SparkSession and use it across a Scala application. Here I've read the required variables' values from a properties file. If you want to explore how to read values from a properties file in Scala, please refer to the blog below:

How to read the value from properties file in Scala to make a Generic Application



If you like this blog, please like and share it. Your feedback and suggestions are most welcome.

Thank You!
