nbatothemax

spark profiling

Feb 19th, 2019
============================================================
Profile of RDD<id=6>
============================================================
         3422798588 function calls (3422798076 primitive calls) in 12176.581 seconds

   Ordered by: internal time, cumulative time

       ncalls    tottime  percall     cumtime  percall filename:lineno(function)
   1703581040   7212.163    0.000    9940.447    0.000 <stdin>:1(is_valid_english_word)
   1703581040   2728.283    0.000    2728.283    0.000 {method 'union' of 'set' objects}
      1756553   2048.382    0.001   11988.828    0.007 <stdin>:1(drop_invalid_tokens)
      1756553    120.803    0.000     120.803    0.000 {method 'split' of 'str' objects}
          128     31.282    0.244   12176.576   95.129 serializers.py:264(dump_stream)
       351464     19.571    0.000      19.571    0.000 {method 'read' of 'file' objects}
       175668      3.950    0.000       3.950    0.000 {cPickle.loads}
      1756553      3.354    0.000       5.319    0.000 <stdin>:1(standardize_tokens)
      1756553      1.846    0.000     122.649    0.000 <stdin>:1(tokenize_document)
      1756553      1.581    0.000       2.102    0.000 <stdin>:9(<lambda>)
      3688774      1.341    0.000       1.341    0.000 {len}
      1756553      1.231    0.000       1.231    0.000 {range}
       175796      1.068    0.000      25.826    0.000 serializers.py:160(_read_with_length)
       175796      0.601    0.000       1.249    0.000 serializers.py:554(read_int)
       175796      0.568    0.000      26.394    0.000 serializers.py:141(load_stream)
       175668      0.301    0.000       4.251    0.000 serializers.py:433(loads)
       175796      0.250    0.000       0.250    0.000 {_struct.unpack}
      640/128      0.002    0.000       0.003    0.000 rdd.py:2406(pipeline_func)
          128      0.001    0.000   12176.581   95.130 worker.py:167(process)
          128      0.001    0.000       0.001    0.000 serializers.py:222(load_stream)
          640      0.001    0.000       0.001    0.000 rdd.py:317(func)
          128      0.000    0.000       0.000    0.000 serializers.py:225(_load_stream_without_unbatching)
          128      0.000    0.000       0.001    0.000 rdd.py:345(func)
          128      0.000    0.000       0.000    0.000 rdd.py:395(func)
          128      0.000    0.000       0.000    0.000 {built-in method from_iterable}
          128      0.000    0.000       0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
          128      0.000    0.000       0.000    0.000 {iter}
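
In the profile above, `{method 'union' of 'set' objects}` is called exactly as many times as `is_valid_english_word` (1703581040), which suggests the combined word set is rebuilt on every call. The paste does not include the function bodies, so the following is only a sketch under that assumption: the word lists, the `_slow`/`_fast` variants, and the `profile_call` helper are hypothetical, and it is written for Python 3 rather than the Python 2 runtime the profile appears to come from. It uses the standard `cProfile`/`pstats` modules with the same ordering as the output above (internal time, then cumulative time).

```python
import cProfile
import pstats

# Hypothetical word lists; the real ones are not shown in the paste.
COMMON_WORDS = {"the", "spark", "profile"}
EXTRA_WORDS = {"rdd", "token"}


def is_valid_english_word_slow(word):
    # Assumed shape of the original: one set 'union' call per token.
    return word in COMMON_WORDS.union(EXTRA_WORDS)


# Build the combined vocabulary once, outside the per-token function.
ALL_WORDS = frozenset(COMMON_WORDS | EXTRA_WORDS)


def is_valid_english_word_fast(word):
    # Same membership test, without rebuilding the set each time.
    return word in ALL_WORDS


def drop_invalid_tokens(tokens, is_valid):
    # Keep only the tokens accepted by the supplied predicate.
    return [t for t in tokens if is_valid(t)]


def profile_call(func, *args):
    # Run func under cProfile and return its result plus sorted stats.
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stats = pstats.Stats(profiler)
    stats.sort_stats("tottime", "cumtime")
    return result, stats


tokens = ["the", "xyzzy", "rdd", "qwerty"] * 1000
slow_result, slow_stats = profile_call(
    drop_invalid_tokens, tokens, is_valid_english_word_slow)
fast_result, fast_stats = profile_call(
    drop_invalid_tokens, tokens, is_valid_english_word_fast)
```

Calling `slow_stats.print_stats(5)` and `fast_stats.print_stats(5)` prints the top entries of each run for comparison. If the assumption holds, the `union` row should appear only in the slow run.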