How to Convert JSON Data/String to a DataFrame in Spark


Hi Friends,

Today, I'd like to show how we can convert JSON data held in a string into a Spark DataFrame for further use.
This is a small and simple use case, but it can be helpful for learners.


Input JSON String : {"file_name":"test_20200202122754.json","object_class":"Monitor","object_class_instance":"Monitor","relation_tree":"Source~>HD_Info~>Monitor","Monitor":{"Index":"0","Vendor_Data":"58F5Y","Monitor_Type":"Lenovo Monitor","HInfoID":"650FEC74"}}

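Expected Output DataFrame : a single flat row in which the fields nested under Monitor (Index, Vendor_Data, Monitor_Type, HInfoID) appear as top-level columns next to file_name, object_class, object_class_instance and relation_tree.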



Below is the code for the same :

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SparkSession


object ConvertJsonToDataFrame extends App {

  //Creating SparkSession
  lazy val conf = new SparkConf().setAppName("json-to-DataFrame").set("spark.default.parallelism", "2")
    .setIfMissing("spark.master", "local[*]")
  lazy val sparkSession = SparkSession.builder().config(conf).getOrCreate()
  lazy val sparkContext: SparkContext = sparkSession.sparkContext
  import sparkSession.implicits._

  //Raw Json Data to test
  val jsonStr = """{"file_name":"test_20200202122754.json","object_class":"Monitor","object_class_instance":"Monitor","relation_tree":"Source~>HD_Info~>Monitor","Monitor":{"Index":"0","Vendor_Data":"58F5Y","Monitor_Type":"Lenovo Monitor","HInfoID":"650FEC74"}}"""

  //Loading the Json Data to create a DataFrame
  val jsonDF = sparkSession.read.json(Seq(jsonStr).toDS)

  jsonDF.show(false)

  //Select every top-level column plus the fields nested under Monitor, then drop the now-redundant nested Monitor column.
  val getAllColumns = jsonDF.select($"*", $"Monitor.*").drop("Monitor")
  getAllColumns.show(false)

}
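
The code above lets Spark infer the schema from the JSON string. If you already know the structure, another option is to declare the schema yourself and parse the string with from_json. Below is a minimal sketch of that idea, not the only way to do it. The object name JsonWithExplicitSchema, the helper flatten, and the intermediate column names raw and rec are hypothetical names chosen for this illustration, and every field is assumed to be a string because that is how the values appear in the sample record.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Hypothetical helper object for this sketch; not part of the original post.
object JsonWithExplicitSchema {

  // Schema of the nested Monitor object (assumption: all values are strings)
  val monitorSchema: StructType = StructType(Seq(
    StructField("Index", StringType),
    StructField("Vendor_Data", StringType),
    StructField("Monitor_Type", StringType),
    StructField("HInfoID", StringType)
  ))

  // Schema of the whole record, with Monitor as a nested struct
  val recordSchema: StructType = StructType(Seq(
    StructField("file_name", StringType),
    StructField("object_class", StringType),
    StructField("object_class_instance", StringType),
    StructField("relation_tree", StringType),
    StructField("Monitor", monitorSchema)
  ))

  // Parse one JSON string with the declared schema and flatten the Monitor struct
  def flatten(spark: SparkSession, jsonStr: String): DataFrame = {
    import spark.implicits._
    Seq(jsonStr).toDF("raw")                                  // one row, one string column
      .select(from_json(col("raw"), recordSchema).as("rec")) // string -> struct
      .select("rec.*")                                        // promote the record fields
      .select(col("*"), col("Monitor.*"))                     // add the nested Monitor fields
      .drop("Monitor")                                        // drop the nested column itself
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("json-with-explicit-schema")
      .master("local[*]")
      .getOrCreate()

    val jsonStr = """{"file_name":"test_20200202122754.json","object_class":"Monitor","object_class_instance":"Monitor","relation_tree":"Source~>HD_Info~>Monitor","Monitor":{"Index":"0","Vendor_Data":"58F5Y","Monitor_Type":"Lenovo Monitor","HInfoID":"650FEC74"}}"""

    flatten(spark, jsonStr).show(false)
    spark.stop()
  }
}
```

With a declared schema the columns come out in the order you wrote them in the schema, and Spark does not need an extra pass over the data to infer the types. The trade-off is that the schema must be kept in sync with the JSON by hand.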






Highlighted Output as expected :




I hope this post was helpful. Please do like, comment and share.
Thank you!
