This article shows how to read from and write to an Azure Storage Account (Data Lake Storage Gen2) from Databricks.
Links used:
Files under the DBFS root:
https://learn.microsoft.com/zh-cn/azure/databricks/dbfs/root-locations
Script to connect to the Data Lake:
Configure the connection to Data Lake Gen2 using OAuth with a service principal; the client secret is read from a Key Vault-backed secret scope.
# Read the service principal's client secret from the Key Vault-backed secret scope
service_credential = dbutils.secrets.get(scope="key-vault-scope", key="sp-key")

# OAuth settings for the storage account "databriilakke":
# auth type, token provider class, client id, client secret, and token endpoint
spark.conf.set("fs.azure.account.auth.type.databriilakke.dfs.core.windows.net", "OAuth")
spark.conf.set("fs.azure.account.oauth.provider.type.databriilakke.dfs.core.windows.net", "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set("fs.azure.account.oauth2.client.id.databriilakke.dfs.core.windows.net", "07471450-8553-49ed-9d18-c2017eef9b69")
spark.conf.set("fs.azure.account.oauth2.client.secret.databriilakke.dfs.core.windows.net", service_credential)
# The GUID in the endpoint URL is the Azure AD tenant (directory) id
spark.conf.set("fs.azure.account.oauth2.client.endpoint.databriilakke.dfs.core.windows.net", "https://login.microsoftonline.com/c023101b-be0d-4a03-991a-824f9032469a/oauth2/token")
Read the built-in sample JSON file:
df = spark.read.json("/databricks-datasets/iot/iot_devices.json")
Write the DataFrame to Data Lake Gen2 (save() with no explicit format uses the cluster's default source format):
df.write.save("abfss://iotcontainer@databriilakke.dfs.core.windows.net/iot/jsondata")
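The abfss:// URI in the write above follows a fixed pattern: abfss://&lt;container&gt;@&lt;account&gt;.dfs.core.windows.net/&lt;path&gt;. A small helper (hypothetical; the name and signature are my own) makes that pattern explicit and avoids typos when building paths:

```python
def abfss_uri(container, account, path=""):
    """Build an ABFS (ADLS Gen2) URI from container, account, and object path."""
    base = f"abfss://{container}@{account}.dfs.core.windows.net"
    # Strip any leading slash so "iot/jsondata" and "/iot/jsondata" are equivalent
    return f"{base}/{path.lstrip('/')}" if path else base
```

For example, abfss_uri("iotcontainer", "databriilakke", "iot/jsondata") yields the exact URI passed to df.write.save above.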
List the files written to the Data Lake:
dbutils.fs.ls("abfss://iotcontainer@databriilakke.dfs.core.windows.net/iot/jsondata")
Read the files back from the Data Lake:
df2 = spark.read.load("abfss://iotcontainer@databriilakke.dfs.core.windows.net/iot/jsondata")