# SparkContext vs SparkSession

SparkContext and SparkSession are both entry points into Apache Spark, but they sit at different levels of the API and serve different purposes.

# SparkContext

* SparkContext (sc) is the entry point for interacting with Spark and represents the connection to a Spark cluster.
    
* It was the main entry point in Spark 1.x. Spark 2.0 introduced SparkSession as the preferred entry point, but SparkContext is still available in Spark 2.x and 3.x, and every SparkSession runs on top of one.
    
* SparkContext provides access to the underlying Spark functionality and allows you to create RDDs (Resilient Distributed Datasets), which are the fundamental data structure in Spark.
    
* However, with the introduction of the DataFrame and Dataset APIs, SparkContext is considered a lower-level API. New applications usually start from SparkSession and use SparkContext directly only when they need RDDs or other low-level features.
    

# SparkSession

* SparkSession is the entry point for working with structured data in Spark and has been the recommended entry point for applications since Spark 2.0.
    
* It encapsulates SparkContext (still reachable through the session, for example as `spark.sparkContext` in PySpark) and provides a higher-level API that supports both structured and unstructured data processing.
    
* SparkSession provides a unified interface for working with different data sources, such as CSV, Parquet, JSON, databases, etc.
    
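* It enables the use of DataFrames and Datasets, which are distributed collections of data organized into named columns. Because Spark knows the schema, it can optimize queries on them, which makes them a more efficient and expressive way to work with structured data. (The typed Dataset API is available in Scala and Java; Python and R use DataFrames.)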
    
* SparkSession also includes various utility functions for working with data, such as reading, writing, querying, and manipulating data.
    

In summary, while SparkContext is the entry point for interacting with Spark and creating RDDs, SparkSession is the higher-level entry point for structured data processing, providing a unified API and supporting DataFrames and Datasets. SparkSession is generally the preferred choice for new Spark applications, as it provides more powerful abstractions and simplifies the overall development experience.

