HUE-7126 [oozie] Improve Spark action to pick up the hive-site.xml easier

Review Request #11436 - Created Aug. 25, 2017 and submitted

Ying Chen
hue
master
HUE-7126
hue
enricoberti, jgauthier, johan, krish, ranade, romain, subrata
commit 5a396ed743ec76caaa41b12bb7c36700c35d8193
Author: Ying Chen <yingchen@cloudera.com>
Date:   Fri Aug 25 14:35:24 2017 -0700

    HUE-7126 [oozie] Improve Spark action to pick up the hive-site.xml easier

:100644 100644 5c1d07a27d... c4069b4d24... M    apps/beeswax/src/beeswax/hive_site.py
:100644 100644 16564fd663... a0f9292b5b... M    desktop/libs/liboozie/src/liboozie/submission2.py


  2. Move the logic to deploy() ?

    https://github.com/cloudera/hue/blob/master/desktop/libs/liboozie/src/liboozie/submission2.py#L195

    run() should only be for starting the job execution.

  3. from hadoop.fs.hadoopfs import Hdfs

    Should be moved with the top imports

  4. Usually simpler:

    if [f for f in node.data.get('properties').get('files', []) if f.get('value').endswith('hive-site.xml')]:

  5. How about adding a

    get_hive_site_content():

    in

    https://github.com/cloudera/hue/blob/master/apps/beeswax/src/beeswax/hive_site.py#L187

    ?
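
    A minimal sketch of the "is hive-site.xml already attached?" check suggested above, pulled into a helper. The name has_hive_site and the plain-dict input are assumptions made for illustration; they are not part of the patch, which works on node.data directly.

```python
def has_hive_site(node_data):
    """Return True if any attached file path ends with 'hive-site.xml'.

    node_data is assumed to be shaped like the action data in the review:
    {'properties': {'files': [{'value': '<path>'}, ...]}}
    """
    files = node_data.get('properties', {}).get('files', [])
    # any() stops at the first match instead of building a full list.
    return any(f.get('value', '').endswith('hive-site.xml') for f in files)
```

    Using any() with default values also avoids an AttributeError when 'properties' or 'value' is missing, which the one-line list comprehension would raise.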

  2. apps/beeswax/src/beeswax/hive_site.py (Diff revision 2)
     
     

    How about having this one only return the content of the file?

    return file(hive_site_path, 'r').read()

  3. How about splitting it so we check
    1. are we on a Spark action (in the future we might have additional logic to add for the action)
    2. check the parameters first instead of HDFS
    3. build the HDFS path once and check

    elif action.data['type'] == 'spark':
      if not [f for f in action.data.get('properties').get('files', []) if f.get('value').endswith('hive-site.xml')]:
        hive_site_lib = Hdfs.join(self.job.deployment_dir + '/lib/', 'hive-site.xml')
        if not exist on HDFS already:
          create
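
    The three checks above could be sketched roughly as follows. This is only an illustration under stated assumptions: InMemoryFs is a hypothetical stand-in for the HDFS client, posixpath.join is used in place of Hdfs.join, and ensure_hive_site is an invented name rather than a function in submission2.py.

```python
import posixpath


class InMemoryFs(object):
    """Hypothetical stand-in for an HDFS client; not Hue's real filesystem API."""

    def __init__(self):
        self.files = {}

    def exists(self, path):
        return path in self.files

    def create(self, path, data):
        self.files[path] = data


def ensure_hive_site(action_data, deployment_dir, fs, hive_site_content):
    """Copy hive-site.xml into <deployment_dir>/lib for a Spark action if needed.

    Returns the path that was written, or None when nothing was written.
    """
    # 1. Only Spark actions are handled here.
    if action_data.get('type') != 'spark':
        return None

    # 2. Check the action parameters first: cheaper than asking HDFS.
    files = action_data.get('properties', {}).get('files', [])
    if any(f.get('value', '').endswith('hive-site.xml') for f in files):
        return None

    # 3. Build the target path once, then create the file only if it is missing.
    hive_site_lib = posixpath.join(deployment_dir, 'lib', 'hive-site.xml')
    if fs.exists(hive_site_lib):
        return None
    fs.create(hive_site_lib, hive_site_content)
    return hive_site_lib
```

    Ordering the checks this way means the filesystem is only contacted when the action is a Spark action and the user has not already attached their own hive-site.xml.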

  2. apps/beeswax/src/beeswax/hive_site.py (Diff revision 3)
     
     
     

    One thing: could we return '' if the file does not exist?

    (and log as LOG.warn)

  3. Move to above line #29?

    (with the other Hue lib imports)

  4. And then we could skip writing the file if get_hive_site_content() is empty
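
    A rough sketch of what the comments in this round describe: return '' and log a warning when the file is missing, then skip the write when the content is empty. Taking the path as a parameter and the maybe_write_hive_site helper are assumptions for illustration; the actual hive_site.py resolves the path from Hue's configuration.

```python
import logging
import os

LOG = logging.getLogger(__name__)


def get_hive_site_content(hive_site_path):
    """Return the text of hive-site.xml, or '' if the file does not exist."""
    if not os.path.exists(hive_site_path):
        LOG.warning('hive-site.xml not found at %s', hive_site_path)
        return ''
    with open(hive_site_path, 'r') as f:
        return f.read()


def maybe_write_hive_site(content, write):
    """Call write(content) only when there is something to write.

    write is a hypothetical callable standing in for the HDFS create call.
    """
    if not content:
        return False
    write(content)
    return True
```

    Returning an empty string instead of raising lets the caller treat "no hive-site.xml on this host" as a normal case and simply skip the upload.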

  1. Ship It!
Review request changed

Status: Closed (submitted)
