SparkSQL List Example

刘超 ⋅ 7 days ago

>>> df = spark.createDataFrame([[[['a','b','c'], ['d','e','f'], ['g','h','i']]]],["col1"])
>>> df.show(20, False)
+---------------------------------------------------------------------+
|col1                                                                 |
+---------------------------------------------------------------------+
|[WrappedArray(a, b, c), WrappedArray(d, e, f), WrappedArray(g, h, i)]|
+---------------------------------------------------------------------+
>>> from pyspark.sql.functions import explode
>>> out_df = df.withColumn("col2", explode(df.col1)).drop('col1')
>>>
>>> out_df.show()
+---------+
|     col2|
+---------+
|[a, b, c]|
|[d, e, f]|
|[g, h, i]|
+---------+
>>> out_df.select(out_df.col2[0].alias('c1'), out_df.col2[1].alias('c2'), out_df.col2[2].alias('c3')).show()
+---+---+---+
| c1| c2| c3|
+---+---+---+
|  a|  b|  c|
|  d|  e|  f|
|  g|  h|  i|
+---+---+---+
>>>
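
For readers without a Spark session at hand, the two steps in the transcript above (explode, then positional indexing) can be modelled in plain Python. This is only a sketch of the semantics, not how Spark implements them; the names `rows`, `exploded`, and `result` are made up for illustration and do not appear in the original session.

```python
# Plain-Python model of the transcript above; no Spark required.
# One input row whose "col1" holds an array of three inner arrays.
rows = [{"col1": [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"]]}]

# explode(col1): emit one output row per element of the array column,
# then drop col1 so that only the new column col2 remains.
exploded = [{"col2": inner} for row in rows for inner in row["col1"]]

# col2[0], col2[1], col2[2] with aliases c1, c2, c3:
# pick elements by position and give each one its own column.
result = [
    {"c1": r["col2"][0], "c2": r["col2"][1], "c3": r["col2"][2]}
    for r in exploded
]

for r in result:
    print(r["c1"], r["c2"], r["c3"])
```

Each inner array becomes its own row after the explode step, which is why the final table has three rows and three columns.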

Note: This article belongs to the author and may not be reproduced without the author's permission.
