fahadkalil

criar_csv_from_rdd_spark

Nov 1st, 2019
# Sample contents of matriculas.csv:
# id,matricula,disciplina
# 1,123,44
# 2,234,44


## Build an RDD from the CSV file (no header row)
## `sc` is assumed to be an existing SparkContext (e.g. the one the pyspark shell provides)

data = sc.textFile(r"your File Path\matriculas.csv")
data = data.map(lambda x: x.split(","))  # split each line into a list of fields


## ALTERNATIVE (when the file has a header row)

data = sc.textFile('path_to_data')
header = data.first()  # extract the header line
data = data.filter(lambda row: row != header)  # filter out the header
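
To see what the header-skipping steps are meant to produce for the sample file, here is a minimal sketch in plain Python, with no Spark involved. It is only a stand-in: the list `lines` and the names `rows` and `fields` are hypothetical, and the list operations imitate `first()`, `filter()` and `map()` rather than calling the RDD API.

```python
# Plain-Python stand-in for the RDD steps (assumption: no Spark, illustrative names only).
lines = [
    "id,matricula,disciplina",
    "1,123,44",
    "2,234,44",
]

header = lines[0]                                  # plays the role of data.first()
rows = [line for line in lines if line != header]  # plays the role of data.filter(...)
fields = [row.split(",") for row in rows]          # plays the role of data.map(...)

print(fields)  # [['1', '123', '44'], ['2', '234', '44']]
```

Note that comparing each line against the header drops every line identical to it, not only the first one, which is usually acceptable for a CSV file.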