FAQ
When I run a query from the Hive command-line client, I can see that it is
being run as me (for example, in the HDFS log I see INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=koert).

But when I do anything through the Thrift interface, my username is lost (I see
ugi=thrift in the HDFS logs). Is there a way in the Thrift interface to
communicate/preserve the username?
And if this is possible in Thrift, then what about JDBC? I tried creating a
JDBC connection with a username and password passed in, but as far as I can see
they are ignored (ugi=thrift again in the HDFS logs).

  • Ashutosh Chauhan at Sep 6, 2011 at 3:47 pm
    Hey Koert,

    I am assuming 'thrift' is the name of the user that the Thrift metastore is
    running as. I also assume you are running in unsecure mode. If you run with
    security turned on, meaning a secure Hadoop cluster with a secure Thrift
    server, you will see the name of the original user. This is because in secure
    mode the metastore server proxies the original user through doAs(), which
    preserves the identity; that is not the case in unsecure mode.
    Through the Hive client you see the usernames correctly even in unsecure mode
    because it is the Hive client process (run as koert) that performs the
    filesystem operations.

    Hope it helps,
    Ashutosh
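
The difference described in the reply above can be illustrated with a small toy model. This is only a sketch in plain Python, not Hadoop or Hive code: `ToyMetastore`, `do_as`, `effective_user`, and the `mkdirs` command are hypothetical names made up for illustration, and the `secure` flag simply stands in for "the caller's identity was authenticated". It shows why the audit line carries the server's own user unless the server wraps the filesystem call in a doAs-style block.

```python
import contextlib
import threading

# Toy model only: none of these names are real Hadoop or Hive APIs.
_current = threading.local()


def effective_user(process_user):
    """Identity a filesystem call is attributed to: the proxied user if set,
    otherwise the user the server process itself runs as."""
    return getattr(_current, "user", process_user)


@contextlib.contextmanager
def do_as(user):
    """Run the enclosed block with `user` as the effective identity."""
    previous = getattr(_current, "user", None)
    _current.user = user
    try:
        yield
    finally:
        # Restore whatever identity was in effect before the block.
        if previous is None:
            del _current.user
        else:
            _current.user = previous


class ToyMetastore:
    """Stand-in for a server process that does filesystem work for callers."""

    def __init__(self, process_user, secure):
        self.process_user = process_user
        self.secure = secure
        self.audit_log = []

    def _touch_filesystem(self, cmd):
        user = effective_user(self.process_user)
        self.audit_log.append(f"ugi={user} cmd={cmd}")

    def handle_request(self, caller, cmd):
        if self.secure:
            # Assumption: the caller was authenticated, so the server
            # performs the operation on the caller's behalf.
            with do_as(caller):
                self._touch_filesystem(cmd)
        else:
            # No proxying: the operation runs as the server's own user.
            self._touch_filesystem(cmd)


unsecure = ToyMetastore("hive", secure=False)
unsecure.handle_request("koert", "mkdirs")

secure = ToyMetastore("hive", secure=True)
secure.handle_request("koert", "mkdirs")

print(unsecure.audit_log[0])  # ugi=hive cmd=mkdirs
print(secure.audit_log[0])    # ugi=koert cmd=mkdirs
```

In the unsecure case the caller's name never reaches the filesystem layer, which matches the ugi of the server process showing up in the HDFS audit log.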
  • Koert Kuipers at Sep 6, 2011 at 4:09 pm
    The metastore is running as user "hive", and we are indeed running in
    unsecured mode.
    Do I understand correctly that the Thrift interface does provide a way
    to communicate the identity, but in unsecured mode it is not used?
    And does this mean that if I care about seeing the correct user execute the
    query in the logs, I have to use secure Hadoop (with Kerberos)?
    Does secure mode support Hive JDBC?
    Thanks! Koert
  • Ashutosh Chauhan at Sep 7, 2011 at 3:29 pm
    I assume that when you say Thrift interface, you mean a separate metastore
    process running. If so:

    > Do I understand correctly that the Thrift interface does provide a way
    > to communicate the identity, but in unsecured mode it is not used?
    Yes. A better way to say this is that identity is communicated only in
    secure mode.

    > And does this mean that if I care about seeing the correct user execute
    > the query in the logs, I have to use secure Hadoop (with Kerberos)?
    Yes. Technically it is possible to achieve this even without secure
    Hadoop, but that is not the case currently, mainly because logging
    identities in an unsecure environment is useless anyway: one user can
    easily impersonate another, and then the whole point of logging is lost.

    > Does secure mode support Hive JDBC?
    I am not sure about this. Do you mean users and their roles as they exist in
    the Hive metastore, and making a JDBC connection using credentials stored in
    it?

    By the way, I am still confused about user "thrift". Is there any process
    run by user "thrift"?

    Hope it helps,
    Ashutosh
  • Koert Kuipers at Sep 7, 2011 at 5:16 pm
    Ah, sorry, the user "thrift" was a typo on my part. It actually says ugi=hive
    in my logs. I missed that the first time you asked.

    Regarding JDBC, I was mainly interested in whether JDBC would be able to
    make a connection to a secure cluster at all and, if so, whether it uses
    the credentials of the JDBC connection.

    Thanks again! Best, Koert

Discussion Overview
group: user @
categories: hive, hadoop
posted: Sep 6, '11 at 3:22p
active: Sep 7, '11 at 5:16p
posts: 5
users: 2
website: hive.apache.org
