In Scala, you can use Apache Spark's SQL module to run SQL queries and get the results back as a DataFrame. First, add the following dependencies to your build.sbt file:

```scala
libraryDependencies += "org.apache.spark" %% "spark-core" % "3.2.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.2.0"
```
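Because `%%` appends your project's Scala binary version to the artifact name, the scalaVersion you set must be one that Spark 3.2.0 is published for (2.12 or 2.13). A minimal build.sbt sketch, with an illustrative project name:

```scala
// build.sbt -- minimal sketch; the project name is a placeholder
name := "scala-sql-query-example"
scalaVersion := "2.12.15" // Spark 3.2.0 is published for Scala 2.12 and 2.13

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "3.2.0",
  "org.apache.spark" %% "spark-sql"  % "3.2.0"
)
```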
Next, create a SparkSession, the entry point to Spark SQL:

```scala
import org.apache.spark.sql.SparkSession

// Build a SparkSession; "local" runs Spark in-process with a single worker thread
val spark = SparkSession.builder()
  .appName("Scala SQL Query Example")
  .master("local")
  .getOrCreate()
```
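With `.master("local")` Spark uses a single thread; for local development it is common to use all logical cores instead. A small variant, purely illustrative:

```scala
// Variant: local[*] spawns as many worker threads as there are logical cores
val sparkAllCores = SparkSession.builder()
  .appName("Scala SQL Query Example")
  .master("local[*]")
  .getOrCreate()
```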
Then define an explicit schema and read your CSV file into a DataFrame:

```scala
import org.apache.spark.sql.types._

// Explicit schema: id and age as nullable integers, name as a nullable string
val schema = StructType(Array(
  StructField("id", IntegerType, true),
  StructField("name", StringType, true),
  StructField("age", IntegerType, true)
))

// Read the CSV file, applying the schema instead of inferring types
val df = spark.read
  .schema(schema)
  .csv("path/to/your/data.csv")
```
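As written, the reader treats every line as data. If your file starts with a header row, tell the reader to skip it; a sketch, assuming the file begins with an `id,name,age` line:

```scala
// Variant: skip a header row (assumes the first line is "id,name,age")
val dfWithHeader = spark.read
  .option("header", "true") // treat the first line as column names, not data
  .schema(schema)           // still apply the explicit schema
  .csv("path/to/your/data.csv")
```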
Register the DataFrame as a temporary view and query it with SQL; spark.sql already returns its result as a DataFrame:

```scala
// Expose the DataFrame to Spark SQL under the name "people"
df.createOrReplaceTempView("people")

// The query result is itself a DataFrame
val sqlResult = spark.sql("SELECT id, name FROM people WHERE age > 30")
sqlResult.show()
```
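The same query can be expressed directly with the DataFrame API, without registering a view; a minimal equivalent sketch:

```scala
import org.apache.spark.sql.functions.col

// Equivalent to: SELECT id, name FROM people WHERE age > 30
val apiResult = df.filter(col("age") > 30).select("id", "name")
apiResult.show()
```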
Finally, stop the session to release its resources:

```scala
spark.stop()
```

This example shows how to run a SQL query with Apache Spark SQL in Scala; since spark.sql returns a DataFrame directly, no separate conversion step is needed. Adjust the code as needed for your own data and queries.