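# Assumes a notebook environment (e.g. Databricks) where `sc`, `sqlContext`, and `display` are predefined.
from pyspark.sql import Row

# Sample data: (name, age) tuples.
lista = [('Joao', 25), ('Paulo', 22)]
rdd = sc.parallelize(lista)
# Map each tuple to a Row with named fields so Spark can infer the schema.
pessoa = rdd.map(lambda x: Row(nome=x[0], idade=int(x[1])))
schemaPessoa = sqlContext.createDataFrame(pessoa)
display(schemaPessoa)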