NLinker

pyspark wordcount

Dec 7th, 2019
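from pyspark import SparkContext

sc = SparkContext.getOrCreate()  # reuses an existing context (e.g. the pyspark shell's sc)
# Read the input file and split each line into words
words = (sc.textFile("D:/workspace/spark/input.txt")
           .flatMap(lambda line: line.split(" ")))
# Pair each word with 1, then sum the counts per word
wordCounts = (words.map(lambda word: (word, 1))
                   .reduceByKey(lambda a, b: a + b))
wordCounts.saveAsTextFile("D:/workspace/spark/output/")  # fails if the output directory already exists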