anand-lab-172/PySpark

PySpark

  • PySpark is the Python API for Apache Spark. It is widely used in industry as an ETL tool for heavy data-processing tasks on big data.
  • In this repository, I have done some basic and intermediate PySpark work.

Operations like:

  • Creating a Spark session
  • Importing the data
  • Filter operations
  • withColumn
  • SQL using PySpark
  • Advanced group by and aggregation
