HUE-8882 [impala] Fix invalidate delta when hive is missing.

Review Request #14033 - Created June 20, 2019 and submitted

Jean-Francois Desjeans Gauthier
hue
master
HUE-8882
hue
jgauthier, romain
commit c55cc84b407b9585cc7a23e09e858a7838e367aa
Author: Jean-Francois Desjeans Gauthier <jf.desjeans.gauthier@gmail.com>
Date:   Thu Jun 20 14:25:08 2019 -0700

    HUE-8882 [impala] Fix invalidate delta when hive is missing.

:100644 100644 be90bc711e... 9a6a8b321a... M	apps/impala/src/impala/dbms.py


  • 0
  • 0
  • 14
  • 0
  • 14
Description From Last Updated
  1. We should update the UI at some point to not show it or do all the logic in the backgrond but already a nice improvement with this.

  2. apps/impala/src/impala/dbms.py (Diff revision 1)
     
     
     

    How about moving all the new logic to desktop.models in a utility?

    e.g.

    def has_hive_metastore():
    ...

    that way we keep it clean here, and it will be easier to unit test and evolve it with the connector revamps. And later we could check for SparkSql etc in the utility function without leaking.

  3. apps/impala/src/impala/dbms.py (Diff revision 1)
     
     

    Move diretly to _get_beeswax_tables() ?

    (more isolated this way and same we can clean-up more easily with the connector refactoring)

  4. apps/impala/src/impala/dbms.py (Diff revision 1)
     
     

    If more than 10 tables, how about just failing with a message saying to do the full invalidate as 'too many tables (N) to invalidate...'?

  5. 
      
  1. 
      
  2. apps/impala/src/impala/dbms.py (Diff revision 3)
     
     

    Let me clean-up the API so that we don't need to do this

  3. apps/impala/src/impala/dbms.py (Diff revision 3)
     
     

    return?

  4. apps/impala/src/impala/dbms.py (Diff revision 3)
     
     

    How about failing and saying: please do a full refresh instead?
    (better than doing something else than ask in the UI, which might also fail if not full INVALIDATE perms anyway too)

    (and at some point we update the UI to adequately look)

  5. apps/impala/src/impala/dbms.py (Diff revision 3)
     
     

    How about renaming

    has_hive_metastore --> get_hive_metastore_interepreters():

    then just pick yp the first from there.

  6. apps/impala/src/impala/tests.py (Diff revision 3)
     
     

    This might fail as not builtin in py2 AFAIK

  7. apps/impala/src/impala/tests.py (Diff revision 3)
     
     

    This should break if not mocked?

    Also testing only full db invalidate

  8. desktop/core/src/desktop/models.py (Diff revision 3)
     
     
     

    Combine in one line with

    [ .... if interpreter == 'hive' or interpreter == 'hms', interpreters)]

    in the first list comprehension?

    Then returning the list

  9. 
      
  1. Nice!

    Maybe a few ideas to simplify a bit. Also using assert_called_once() in the Mock() might be great here (as we could prove there is no more 10s of call)?

  2. apps/impala/src/impala/dbms.py (Diff revision 6)
     
     

    nit:

    %s --> %d
    drop 'str()

    ?

  3. apps/impala/src/impala/dbms.py (Diff revision 6)
     
     
     

    How about just:

    interpreters = Cluster(self.client.user).get_app_config().get_hive_metastore_interpreters()]

    and we manage the case interpreter == 'hive' directy inside beeswax_query_server_config()?

    e.g.
    if name == 'hive':
    name == 'beeswax'

    And we can see later for the ordering of the hive connectors.

  4. apps/impala/src/impala/tests.py (Diff revision 6)
     
     

    assert_equal the hql query or make sure it was called once?

    assert_called_once()
    assert_called_once_with()

    https://docs.python.org/3/library/unittest.mock.html

  5. desktop/core/src/desktop/models.py (Diff revision 6)
     
     

    Simpler?

    return [interpreter['type'] for interpreter in get_ordered_interpreters(self.user) if interpreter == 'hive' or interpreter == 'hms']

  6. 
      
Review request changed

Status: Closed (submitted)

Loading...