HUE-8587 [jobbrowser] Fix JB queries to handle multiple coordinators

Review Request #13383 - Created Sept. 24, 2018 and updated

Chris Conner
hue
master
HUE-8587
hue
enricoberti, jgauthier, johan, ranade, romain, roohi, weixia, yingc
commit f0307a0667b121aab771c37b6de9c2ba0ebefd7c
Author: Chris Conner <cconner@cloudera.com>
Date:   Mon Sep 24 14:57:37 2018 -0400

    HUE-8587 [jobbrowser] Fix JB queries to handle multiple coordinators

:100644 100644 e8562d5376... 8e613279e9... M	apps/impala/src/impala/server.py
:100644 100644 4236184644... ecb9995ab3... M	apps/jobbrowser/src/jobbrowser/api2.py
:100644 100644 cb930aa695... 7a91dc31ef... M	apps/jobbrowser/src/jobbrowser/apis/base_api.py
:100644 100644 bbbe18e3a4... 78b1843d05... M	apps/jobbrowser/src/jobbrowser/apis/query_api.py
:100644 100644 246284e503... d1b770fc0a... M	desktop/core/src/desktop/lib/thrift_util.py

Tested to make sure we don't lose any jobs on the MR Jobs page or the Oozie jobs page. Validated that we see queries from multiple coordinators and that they show up in reverse sorted order. Ran under load to make sure a large volume of queries shows up.

Description | From | Last Updated
cf. above, tricky to add some states without a common django.cache among Hue's instance, but there is not much choice ... | Romain Rigaux |
Needed in both get and return? (not just one?) | Romain Rigaux |
  1. 
      
  2. apps/jobbrowser/src/jobbrowser/api2.py (Diff revision 1)

    Would it make sense to run this in parallel?

    1. In practice, +1 to keep it simple as it is now, without any multithreading (we are supposed to be in one cherrypy thread, and the same with gunicorn). AFAIK we should get no more than 5 coordinators in big setups, and the job browser refresh timeout kicks in after the initial call has returned.
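
    The sequential approach favored in this reply can be pictured roughly as follows. This is only a minimal sketch, not the code in the diff: `fetch_queries`, `list_queries`, the coordinator addresses, and the `start_time` field are all hypothetical names, and an in-memory dict stands in for the real calls to the coordinators.

```python
# Hypothetical sketch: poll each coordinator in turn and merge the results.
# None of these names come from the Hue code base.

def fetch_queries(coordinator, fake_backend):
    """Stand-in for the per-coordinator call; returns a list of query dicts."""
    return fake_backend.get(coordinator, [])


def list_queries(coordinators, fake_backend):
    """Query every coordinator sequentially and return newest-first results."""
    merged = []
    for coordinator in coordinators:  # one at a time, no threads
        merged.extend(fetch_queries(coordinator, fake_backend))
    # Reverse sort so the most recently started queries come first.
    return sorted(merged, key=lambda q: q["start_time"], reverse=True)


backend = {
    "coord-1:25000": [{"id": "a", "start_time": 10}, {"id": "b", "start_time": 30}],
    "coord-2:25000": [{"id": "c", "start_time": 20}],
}
queries = list_queries(["coord-1:25000", "coord-2:25000"], backend)
print([q["id"] for q in queries])
```

    With only a handful of coordinators, the total latency is roughly the sum of the individual calls, which is the cost accepted here in exchange for avoiding threads.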

  1. Not bad, just a few nits!

  2. apps/jobbrowser/src/jobbrowser/api2.py (Diff revision 1)

    Could we move this inside https://github.com/cloudera/hue/blob/master/apps/jobbrowser/src/jobbrowser/apis/query_ ? (so that we keep the main API clean)

  3. apps/jobbrowser/src/jobbrowser/api2.py (Diff revision 1)

    If one of the coordinators is gone, should we catch the exception and remove it from the list?

  4. cf. above, tricky to add some state without a common django.cache among Hue's instances, but there is not much choice IIRC

    1. This one has me a bit confused: do I need to make a change here, or are you just making a note?

  5. nit: usually global vars are upper case

  6. nit: maybe with also

    ENABLE_SMART_THRIFT_POOL.get() and

  7. Needed in both get and return? (not just one?)

    1. After a restart, it would take a while before the get was called, and sometimes the return would be called sooner; other times the get would be called first. So I did both, to populate the coordinator_list as quickly as possible and not miss queries. Do you think this is OK? Otherwise I can do it only in the return, as it seems to get called first slightly more often.
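
    One way to picture the coordinator tracking discussed in comments 3 and 7: record a coordinator whenever a pooled connection is handed out or handed back, and, if the answer to comment 3 is yes, forget it when a call to it fails. The sketch below is an illustration under assumptions, not the patch itself; `CoordinatorRegistry`, `TrackingPool`, `poll`, and their methods are hypothetical names and are not taken from `thrift_util.py`.

```python
import threading


class CoordinatorRegistry:
    """Hypothetical in-process set of known coordinators (not shared across Hue instances)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._coordinators = set()

    def add(self, host):
        with self._lock:
            self._coordinators.add(host)

    def discard(self, host):
        with self._lock:
            self._coordinators.discard(host)

    def snapshot(self):
        with self._lock:
            return sorted(self._coordinators)


class TrackingPool:
    """Hypothetical pool wrapper that registers the host on both get and return."""

    def __init__(self, registry):
        self.registry = registry

    def get_client(self, host):
        self.registry.add(host)   # seen on checkout
        return {"host": host}     # placeholder for a real client object

    def return_client(self, client):
        self.registry.add(client["host"])  # seen on check-in as well


def poll(registry, call):
    """Call every known coordinator; drop the ones that raise."""
    results = []
    for host in registry.snapshot():
        try:
            results.extend(call(host))
        except ConnectionError:
            registry.discard(host)  # coordinator is gone, stop polling it
    return results


def fake_call(host):
    # Simulated backend: coord-2 is unreachable.
    if host == "coord-2":
        raise ConnectionError(host)
    return [host + "-query"]


registry = CoordinatorRegistry()
pool = TrackingPool(registry)
pool.return_client({"host": "coord-1"})  # a return arrives before any get
pool.get_client("coord-2")
results = poll(registry, fake_call)
print(results, registry.snapshot())
```

    Adding to a set is idempotent, so registering on both paths costs little, and whichever of get or return fires first after a restart populates the list. As comment 4 points out, state like this lives in a single process and is not shared among Hue instances.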
