How to Convert JSON Data/String to a DataFrame in Spark


Hi Friends,

Today, I'd like to show how we can convert JSON data (held as a string) into a DataFrame for further use.
This is a small and simple use case, but it can be helpful for learners.


Input JSON string: {"file_name":"test_20200202122754.json","object_class":"Monitor","object_class_instance":"Monitor","relation_tree":"Source~>HD_Info~>Monitor","Monitor":{"Index":"0","Vendor_Data":"58F5Y","Monitor_Type":"Lenovo Monitor","HInfoID":"650FEC74"}}

Expected Output DataFrame :



Below is the code for the same:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SparkSession


object ConvertJsonToDataFrame extends App {

  //Creating SparkSession
  lazy val conf = new SparkConf().setAppName("json-to-DataFrame").set("spark.default.parallelism", "2")
    .setIfMissing("spark.master", "local[*]")
  lazy val sparkSession = SparkSession.builder().config(conf).getOrCreate()
  lazy val sparkContext: SparkContext = sparkSession.sparkContext
  import sparkSession.implicits._

  //Raw Json Data to test
  val jsonStr = """{"file_name":"test_20200202122754.json","object_class":"Monitor","object_class_instance":"Monitor","relation_tree":"Source~>HD_Info~>Monitor","Monitor":{"Index":"0","Vendor_Data":"58F5Y","Monitor_Type":"Lenovo Monitor","HInfoID":"650FEC74"}}"""

  //Loading the Json Data to create a DataFrame
  val jsonDF = sparkSession.read.json(Seq(jsonStr).toDS)

  jsonDF.show(false)

  // Select all top-level columns plus the fields of the nested Monitor struct, then drop the original nested column.
  val getAllColumns = jsonDF.select($"*", $"Monitor.*").drop("Monitor")
  getAllColumns.show(false)

}
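
A common variation is that the JSON string is already sitting in a column of an existing DataFrame rather than in a local variable. The following is only a minimal sketch of that case, using `from_json` with an explicit schema. The object name `JsonColumnExample`, the method name `parseAndFlatten` and the column name `raw` are assumptions made for illustration; the schema simply mirrors the sample input above, where every value is a string.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

object JsonColumnExample {

  // Schema of the nested "Monitor" object (all values are strings in the sample input).
  val monitorSchema: StructType = StructType(Seq(
    StructField("Index", StringType),
    StructField("Vendor_Data", StringType),
    StructField("Monitor_Type", StringType),
    StructField("HInfoID", StringType)
  ))

  // Schema of the whole JSON document.
  val schema: StructType = StructType(Seq(
    StructField("file_name", StringType),
    StructField("object_class", StringType),
    StructField("object_class_instance", StringType),
    StructField("relation_tree", StringType),
    StructField("Monitor", monitorSchema)
  ))

  // Assumes rawDF has a string column named "raw" (hypothetical name) holding the JSON text.
  def parseAndFlatten(rawDF: DataFrame): DataFrame =
    rawDF
      .select(from_json(col("raw"), schema).as("data")) // parse the string into a struct
      .select("data.*")                                  // promote the struct fields to columns
      .select(col("*"), col("Monitor.*"))                // add the nested Monitor fields
      .drop("Monitor")                                   // drop the original nested column

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("json-column-to-DataFrame")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val jsonStr = """{"file_name":"test_20200202122754.json","object_class":"Monitor","object_class_instance":"Monitor","relation_tree":"Source~>HD_Info~>Monitor","Monitor":{"Index":"0","Vendor_Data":"58F5Y","Monitor_Type":"Lenovo Monitor","HInfoID":"650FEC74"}}"""

    parseAndFlatten(Seq(jsonStr).toDF("raw")).show(false)
    spark.stop()
  }
}
```

With an explicit schema the column order follows the schema definition, and Spark does not need an extra pass over the data to infer types.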

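
The code above flattens one known nested column by name. If the nested column names are not known in advance, the same idea can be applied generically by walking the DataFrame schema. This is a rough sketch only: `FlattenOneLevel` and `flattenStructs` are hypothetical names, the helper handles just one level of nesting, and it assumes field names contain no dots or other characters that would need backticks. The JSON in `main` is a shortened version of the sample input.

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.StructType

object FlattenOneLevel {

  // Hypothetical helper: expands every top-level struct column into
  // "<parent>_<child>" columns and keeps all other columns unchanged.
  def flattenStructs(df: DataFrame): DataFrame = {
    val cols: Seq[Column] = df.schema.fields.toSeq.flatMap { field =>
      field.dataType match {
        case struct: StructType =>
          // Prefix child names with the parent name to avoid collisions.
          struct.fieldNames.toSeq.map { child =>
            col(s"${field.name}.$child").as(s"${field.name}_$child")
          }
        case _ =>
          Seq(col(field.name))
      }
    }
    df.select(cols: _*)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("flatten-one-level")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Shortened version of the sample input used in this post.
    val jsonStr = """{"file_name":"test_20200202122754.json","Monitor":{"Index":"0","Vendor_Data":"58F5Y"}}"""

    val df = spark.read.json(Seq(jsonStr).toDS)
    flattenStructs(df).show(false)
    spark.stop()
  }
}
```

Prefixing the child columns with the parent name is a design choice: it keeps the result unambiguous when a nested field shares its name with a top-level column.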





Highlighted Output as expected :




I hope this post was helpful. Please like, comment, and share.
Thank you!
