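Spark Dataset Operations (5): Multi-Table Operations with join
Let's skip the preamble and go straight to the code.
First, here are the definitions of the two source tables: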
scala> val df1 = spark.createDataset(Seq(("aaa", 1, 2), ("bbb", 3, 4), ("ccc", 3, 5), ("bbb", 4, 6))).toDF("key1","key2","key3")
df1: org.apache.spark.sql.DataFrame = [key1: string, key2: int ... 1 more field]

scala> val df2 = spark.cre...
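To make the starting point concrete, here is a minimal sketch of an inner join on a shared column. It reuses the rows of df1 shown above. The second table, `right`, with its column `value` and its rows, is a made-up stand-in for illustration only; it is not the article's df2, whose definition is cut off in the text. The names `left`, `right`, and `joined` are likewise assumptions. The snippet can be pasted into spark-shell, where `getOrCreate()` simply reuses the existing session.

```scala
import org.apache.spark.sql.SparkSession

// Reuse the running session if there is one (as in spark-shell), else start a local one.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("join-sketch")
  .getOrCreate()

// Same rows and column names as df1 in the text.
val left = spark.createDataFrame(
  Seq(("aaa", 1, 2), ("bbb", 3, 4), ("ccc", 3, 5), ("bbb", 4, 6))
).toDF("key1", "key2", "key3")

// Hypothetical second table: NOT the article's df2, just illustrative data.
val right = spark.createDataFrame(
  Seq(("aaa", 10), ("bbb", 20), ("ddd", 30))
).toDF("key1", "value")

// Inner join on the column name both tables share.
// Passing the name as Seq("key1") keeps a single key1 column in the result.
val joined = left.join(right, Seq("key1"), "inner")

joined.show()
```

With this sample data the result has three rows: one for "aaa" and two for "bbb", because both "bbb" rows on the left match the single "bbb" row on the right. "ccc" and "ddd" have no partner and are dropped by the inner join. Row order in the output is not guaranteed.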