- Doc in:
- https://docs.google.com/document/d/14KiTyaq0c1jCPqgmyYuyr4lwDP3xpVrf8DXx60AJVjw/edit?usp=sharing
- // Average word length per first letter - "normal" code (groupByKey)
- val f = sc.textFile("file:///Users/talfranji/Dropbox/research/langmodel.py")
- val avglens = f.flatMap(_.split(" ")).filter(_.length > 0).
- map(word => (word(0), word.length)).
- groupByKey.
- map { case (k, v) => (k, v.sum.toFloat / v.size) }
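- // Average word length per first letter - better performance:
- // reduceByKey combines (total, count) pairs on each partition before the shuffle,
- // whereas groupByKey ships every value across the network first
- val avglens = f.flatMap(_.split(" ")).filter(_.length > 0).
- map(word => (word(0), (word.length, 1))).
- reduceByKey { case ((tot1, count1), (tot2, count2)) => (tot1 + tot2, count1 + count2) }.
- mapValues { case (tot, count) => tot.toFloat / count }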
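
The (total, count) trick above can be tried without a cluster. The following is a minimal sketch using plain Scala collections instead of an RDD, so it runs in any Scala REPL. The `sample` input and the names `words`, `totals` and `avgByLetter` are made up for illustration and are not part of the original paste. The fold plays the role of `reduceByKey`: it never keeps the list of lengths, only a running pair per key.

```scala
// Hypothetical input standing in for the lines of the text file.
val sample = Seq("spark streams", "scala  sums", "rdd")

// Same tokenization as the Spark version: split on spaces, drop empty strings.
val words = sample.flatMap(_.split(" ")).filter(_.nonEmpty)

// Fold every word into a running (totalLength, count) pair keyed by first letter.
// This mirrors the combine function passed to reduceByKey.
val totals: Map[Char, (Int, Int)] =
  words.foldLeft(Map.empty[Char, (Int, Int)]) { (acc, word) =>
    val (tot, count) = acc.getOrElse(word(0), (0, 0))
    acc.updated(word(0), (tot + word.length, count + 1))
  }

// Final step, equivalent to mapValues: divide total by count.
val avgByLetter: Map[Char, Float] =
  totals.map { case (k, (tot, count)) => (k, tot.toFloat / count) }
```

Because addition of pairs is associative and commutative, partial results from different partitions can be merged in any order, which is what lets Spark pre-aggregate before shuffling.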