How to Convert JSON Data/String to a DataFrame in Spark
Hi Friends,
Today, I'd like to show how we can convert JSON data (held in a string) to a DataFrame for further use.
This is a very simple and small use case, but it can be helpful for learners.
Input JSON string: {"file_name":"test_20200202122754.json","object_class":"Monitor","object_class_instance":"Monitor","relation_tree":"Source~>HD_Info~>Monitor","Monitor":{"Index":"0","Vendor_Data":"58F5Y","Monitor_Type":"Lenovo Monitor","HInfoID":"650FEC74"}}
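Expected output DataFrame: a single row in which the top-level fields and the fields nested under Monitor all appear as separate, flat columns.
Below is the code for the same: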
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SparkSession

object ConvertJsonToDataFrame extends App {

  // Create the SparkSession (runs locally unless spark.master is already set)
  lazy val conf = new SparkConf()
    .setAppName("json-to-DataFrame")
    .set("spark.default.parallelism", "2")
    .setIfMissing("spark.master", "local[*]")
  lazy val sparkSession = SparkSession.builder().config(conf).getOrCreate()
  lazy val sparkContext: SparkContext = sparkSession.sparkContext

  // Needed for the $"..." column syntax and for Seq(...).toDS
  import sparkSession.implicits._

  // Raw JSON data to test with
  val jsonStr = """{"file_name":"test_20200202122754.json","object_class":"Monitor","object_class_instance":"Monitor","relation_tree":"Source~>HD_Info~>Monitor","Monitor":{"Index":"0","Vendor_Data":"58F5Y","Monitor_Type":"Lenovo Monitor","HInfoID":"650FEC74"}}"""

  // Load the JSON string into a DataFrame; Spark infers the schema from the data
  val jsonDF = sparkSession.read.json(Seq(jsonStr).toDS)
  jsonDF.show(false)

  // Select every top-level column plus every field of the nested Monitor struct,
  // then drop the original nested column now that its fields have been extracted
  val getAllColumns = jsonDF.select($"*", $"Monitor.*").drop("Monitor")
  getAllColumns.show(false)
}
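The code above lets Spark infer the schema from the JSON itself. As a variation, here is a minimal sketch that passes an explicit schema to the reader instead, which fixes the column names, types and order up front. The object name ExplicitSchemaJson, the helper load and the two schema values are hypothetical names introduced only for this illustration; every field is assumed to be a string, as in the sample input.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Hypothetical helper object, used only for this illustration
object ExplicitSchemaJson {

  // Schema of the nested "Monitor" object (all fields assumed to be strings)
  val monitorSchema: StructType = StructType(Seq(
    StructField("Index", StringType),
    StructField("Vendor_Data", StringType),
    StructField("Monitor_Type", StringType),
    StructField("HInfoID", StringType)
  ))

  // Schema of the whole record, with Monitor as a nested struct
  val recordSchema: StructType = StructType(Seq(
    StructField("file_name", StringType),
    StructField("object_class", StringType),
    StructField("object_class_instance", StringType),
    StructField("relation_tree", StringType),
    StructField("Monitor", monitorSchema)
  ))

  // Parse one JSON string with the explicit schema, then flatten Monitor
  def load(spark: SparkSession, jsonStr: String): DataFrame = {
    import spark.implicits._
    val jsonDF = spark.read.schema(recordSchema).json(Seq(jsonStr).toDS)
    jsonDF.select($"*", $"Monitor.*").drop("Monitor")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("json-to-DataFrame-explicit-schema")
      .master("local[*]")
      .getOrCreate()

    val jsonStr = """{"file_name":"test_20200202122754.json","object_class":"Monitor","object_class_instance":"Monitor","relation_tree":"Source~>HD_Info~>Monitor","Monitor":{"Index":"0","Vendor_Data":"58F5Y","Monitor_Type":"Lenovo Monitor","HInfoID":"650FEC74"}}"""

    load(spark, jsonStr).show(false)
    spark.stop()
  }
}
```

With an explicit schema the columns follow the order declared in the schema, and Spark does not need to scan the data to work out the types. A field that is missing from the input simply comes back as null, so this approach suits feeds where the structure is known in advance.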
Highlighted output, as expected:
I hope this post was helpful. Please do like, comment and share.
Thank you!