Access unmanaged (external) Azure Databricks Hive table via JDBC
I am using Azure Databricks with Databricks Runtime 5.2 and Spark 2.4.0. I have set up external Hive tables in two different ways (a sketch of both setups follows below):
- a Databricks Delta table whose data is stored in Azure Data Lake Storage (ADLS) Gen 2; the table was created with a LOCATION setting that points to a mounted directory in ADLS Gen 2.
- a regular DataFrame saved as a table to ADLS Gen 2, this time not using the mount but instead the OAuth2 credentials I've set at the cluster level via spark.sparkContext.hadoopConfiguration.
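For reference, this is roughly how the two tables were created. The table names, container, storage account, and paths here are placeholders, not the real ones:

```scala
// Scenario 1: Delta table with a LOCATION under the ADLS Gen 2 mount (placeholder path).
spark.sql("""
  CREATE TABLE IF NOT EXISTS events_delta
  USING DELTA
  LOCATION '/mnt/xxx/yyy/zzz'
""")

// Scenario 2: a plain DataFrame written directly to ADLS Gen 2 (no mount) and
// registered as an external table in the metastore. `df` stands for any DataFrame.
df.write
  .format("parquet")
  .option("path", "abfss://container@account.dfs.core.windows.net/yyy/zzz")
  .saveAsTable("events_parquet")
```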
Both the mount point and the direct access (hadoopConfiguration) have been configured with OAuth2 credentials and an Azure AD Service Principal that has the necessary access rights to the Data Lake (sketched below).
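In case it matters, here is approximately how the mount and the cluster-level credentials were configured. This is only an illustrative sketch: the secret scope name, keys, container, and storage account are placeholders, and the real values live in a Databricks secret scope.

```scala
// Service Principal credentials (placeholder secret scope / keys).
val clientId     = dbutils.secrets.get(scope = "adls", key = "client-id")
val clientSecret = dbutils.secrets.get(scope = "adls", key = "client-secret")
val tenantId     = dbutils.secrets.get(scope = "adls", key = "tenant-id")

val oauthConfigs = Map(
  "fs.azure.account.auth.type"              -> "OAuth",
  "fs.azure.account.oauth.provider.type"    -> "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
  "fs.azure.account.oauth2.client.id"       -> clientId,
  "fs.azure.account.oauth2.client.secret"   -> clientSecret,
  "fs.azure.account.oauth2.client.endpoint" -> s"https://login.microsoftonline.com/$tenantId/oauth2/token"
)

// Mount used by the Delta table (scenario 1).
dbutils.fs.mount(
  source       = "abfss://container@account.dfs.core.windows.net/",
  mountPoint   = "/mnt/xxx",
  extraConfigs = oauthConfigs
)

// Direct access used when saving the DataFrame (scenario 2): the same OAuth settings
// applied to the cluster's Hadoop configuration instead of going through a mount.
oauthConfigs.foreach { case (k, v) =>
  spark.sparkContext.hadoopConfiguration.set(k, v)
}
```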
Both tables show up correctly in Databricks UI and can be queried.
Both tables are also visible in a BI tool (Looker), where I have successfully configured a JDBC connection to my Databricks instance. After this the differences begin:
1) The table configured via the mount point does not allow me to run a DESCRIBE operation from the BI tool, let alone a query. Everything fails with the error "com.databricks.backend.daemon.data.common.InvalidMountException: Error while using path /mnt/xxx/yyy/zzz for resolving path '/yyy/zzz' within mount at '/mnt/xxx'."
2) The table configured without the mount point allows me to run a DESCRIBE operation, but a query fails with the error "java.util.concurrent.ExecutionException: java.io.IOException: There is no primary group for UGI (Basic token) (auth:SIMPLE)".
Connecting over JDBC and querying a managed table in Databricks from the BI tool works fine.
As far as I can tell, there isn't anything I could have configured differently when creating the external tables, setting up the mount point, or setting the OAuth2 credentials. It seems that when going through JDBC the mount is not visible at all, so the request to the underlying data source (ADLS Gen 2) cannot succeed. The second scenario (number 2 above) is more puzzling: it looks like something is failing deep under the hood, and I have no idea what to do about it.
Another peculiar thing is my username, which shows up in the scenario 2 error. I don't know where it comes from, as it is not involved when setting up ADLS Gen 2 access with the Service Principal.
jdbc hive azure-databricks
asked Mar 7 at 11:38 – mikkark (484)
Did you find a solution for this? I am having the same issue. – chathux, Mar 23 at 7:02
I do not have a solution yet. I am trying to get someone from the Microsoft product group to come back with some insights, and I will report any findings here. This is pretty critical for our customer project as well. In the meantime I was going to try the latest Databricks cluster (5.3 Beta) to see if that makes a difference. – mikkark, yesterday
I tried with a new 5.3 Beta cluster; no difference, it still does not work. – mikkark, yesterday