
[Dev] Saving a Spark DataFrame to MySQL

가을기_ 2021. 6. 2. 13:05

Spark writes to MySQL over JDBC, so a JDBC driver is required.

Since this project uses SBT, add Maven's mysql-connector-java to build.sbt, for example as shown below.
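A minimal build.sbt sketch for pulling in the driver; the coordinates are the standard Maven ones for mysql-connector-java, but the version shown is only an example and should match your MySQL server.

// build.sbt -- MySQL JDBC driver (version is an example; pick one matching your server)
libraryDependencies += "mysql" % "mysql-connector-java" % "8.0.28"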

 

import java.util.Properties
import org.apache.spark.sql.SaveMode

val jdbcUrl = "jdbc:mysql://{host}:{port}/{db_name}"

// JDBC connection properties; the driver class matches mysql-connector-java 8.x.
val connectionProperties = new Properties()
connectionProperties.put("user", "{user}")
connectionProperties.put("password", "{password}")
connectionProperties.put("driver", "com.mysql.cj.jdbc.Driver")

val df = spark.table("...")
println(df.rdd.partitions.length)
// Check the partition count above, then coalesce() or repartition() to control the number of JDBC connections.
df.repartition(10)
  .write.mode(SaveMode.Append)
  .jdbc(jdbcUrl, "product_mysql", connectionProperties)
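
The same append can also go through the generic DataFrameWriter options API. This is only a sketch: the option names (url, dbtable, user, password, driver, batchsize) are standard Spark JDBC writer options, while the concrete values, the batch size, and the coalesce target of 4 are illustrative assumptions.

// Alternative sketch: same append via format("jdbc") options; values are illustrative.
df.coalesce(4)                               // fewer partitions -> fewer concurrent connections
  .write
  .format("jdbc")
  .mode(SaveMode.Append)
  .option("url", jdbcUrl)
  .option("dbtable", "product_mysql")
  .option("user", "{user}")
  .option("password", "{password}")
  .option("driver", "com.mysql.cj.jdbc.Driver")
  .option("batchsize", "5000")               // rows per JDBC batch insert (default 1000)
  .save()

Each write partition opens its own connection to MySQL, so coalescing before the write is the usual way to cap the number of concurrent connections.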