deniswhite77

Untitled

Feb 16th, 2022
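from pyspark.sql import SparkSession

APP_NAME = "DataFrames"
SPARK_URL = "local[*]"

# Start (or reuse) a local Spark session with the console progress bar disabled.
spark = SparkSession.builder.appName(APP_NAME) \
        .master(SPARK_URL) \
        .config('spark.ui.showConsoleProgress', 'false') \
        .getOrCreate()

# Load the CSV with a header row and let Spark infer the column types.
taxi = spark.read.load('/datasets/pickups_terminal_5.csv',
                       format='csv', header='true', inferSchema='true')

# Replace missing values in numeric columns with 0.
taxi = taxi.fillna(0)

# registerTempTable is deprecated since Spark 2.0; createOrReplaceTempView replaces it.
taxi.createOrReplaceTempView("taxi")

# Count the distinct days that have at least one record with more than 200 pickups.
# show() prints the result and returns None, so it is not wrapped in print().
spark.sql("SELECT COUNT(DISTINCT(CAST(date as DATE))) FROM taxi WHERE pickups > 200").show()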