HUE-1660 [core] Export/import all stored scripts

Review Request #4292 — Created March 25, 2014 and updated

Submitter: abec
Repository: old-hue-rw
Bugs: HUE-1660
Groups: hue
Reviewers: enricoberti, romain
commit fb8d9e4a9b2d618c418460d8ced46f57fe5bea81
Author: Abraham Elmahrek <abraham@elmahrek.com>
Date:   Tue Mar 25 16:45:17 2014 -0700

    HUE-1660 [core] Export/import all stored scripts
    
    Use python pickle libraries for serialization.
    Create hierarchy of relationships by providing wrapper objects that store
    m2m, m2o, and gfk information.

:100644 100644 c2c16b4... 042f75d... M	desktop/core/src/desktop/api.py
:100644 100644 387da65... 8618d7f... M	desktop/core/src/desktop/api_tests.py
:000000 100644 0000000... 393f2fc... A	desktop/core/src/desktop/forms.py
:100644 100644 3c6085d... 6bb878b... M	desktop/core/src/desktop/lib/django_util.py
:000000 100644 0000000... 4a28696... A	desktop/core/src/desktop/lib/document_serializers.py
:000000 100644 0000000... 4d3791e... A	desktop/core/src/desktop/migrations/0008_auto__add_field_document_uid.py
:000000 100644 0000000... 1a548a4... A	desktop/core/src/desktop/migrations/0009_initial_uuid_values.py
:100644 100644 4638ad7... 758d6d0... M	desktop/core/src/desktop/models.py
:100644 100644 042904f... 50d8b59... M	desktop/core/src/desktop/templates/home.mako
:100644 100644 f565b40... 35bbc84... M	desktop/core/src/desktop/urls.py
:100644 100644 4e60ade... 5709572... M	desktop/core/src/desktop/views.py
Prototype #4: Use Python pickle to serialize and deserialize objects, with the hierarchy defined via inheritance in wrapper objects.

Could serialize/deserialize an Oozie workflow and a Hive saved query.
Can serialize/deserialize Oozie examples.

pushed to https://github.com/cloudera/hue/tree/HUE-1660-abe
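
A minimal sketch of the idea behind prototype #4, not the code in this diff: wrapper objects record relationships by a stable uid instead of a database primary key, and the result is pickled. The `Document` and `DocumentWrapper` classes, the `depends_on` relation, and both function names are hypothetical stand-ins for illustration; no Django models are involved, and the wrapper state is pickled as plain dicts, which is an assumption of this sketch.

```python
import pickle
import uuid


class Document:
    """Hypothetical stand-in for a stored script; `pk` is local to one database."""

    def __init__(self, pk, name, body):
        self.pk = pk
        self.uid = str(uuid.uuid4())  # stable identifier that survives export/import
        self.name = name
        self.body = body
        self.depends_on = []  # many-to-many relation, held as object references


class DocumentWrapper:
    """Captures a document's fields plus its relationships, keyed by uid."""

    def __init__(self, doc):
        self.doc = doc

    def state(self):
        doc = self.doc
        return {
            "uid": doc.uid,
            "fields": {"name": doc.name, "body": doc.body},
            "m2m": {"depends_on": [other.uid for other in doc.depends_on]},
        }


def export_documents(docs):
    # Primary keys are deliberately left out of the payload.
    return pickle.dumps([DocumentWrapper(doc).state() for doc in docs])


def import_documents(payload, first_pk=1):
    states = pickle.loads(payload)
    by_uid = {}
    # Pass 1: recreate every document with a fresh primary key.
    for offset, state in enumerate(states):
        doc = Document(first_pk + offset, **state["fields"])
        doc.uid = state["uid"]
        by_uid[doc.uid] = doc
    # Pass 2: wire relationships once all targets exist, so cycles are fine.
    for state in states:
        by_uid[state["uid"]].depends_on = [
            by_uid[uid] for uid in state["m2m"]["depends_on"]
        ]
    return [by_uid[state["uid"]] for state in states]


workflow = Document(10, "workflow", "...")
query = Document(11, "query", "SELECT 1")
workflow.depends_on.append(query)
query.depends_on.append(workflow)  # circular dependency

restored = import_documents(export_documents([workflow, query]), first_pk=100)
```

Splitting the import into two passes is what sidesteps the ordering problem: no object needs its related objects to exist at creation time. Note that `pickle.loads` should only be fed payloads from a trusted source.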
abec
Review request changed

Testing Done:

  + Can serialize/deserialize oozie examples.

Diff:

Revision 2 (+854 -3)

romain
  1. Some small comments here and there. It looks better now but wondering if we could avoid some of the logic.
    
    
    #0
    http://django-extensions.readthedocs.org/en/latest/dumpscript.html
    
    https://github.com/jsonpickle/jsonpickle
    
    http://stackoverflow.com/questions/6578986/how-to-convert-json-data-into-a-python-object
    
    
    Other idea #1: a bit like how we massage and send JSON to the UI, we serialize all the content of an object manually. Pros: simple logic. Cons: a bit tedious and manual.
    doc 1: {
      'uuid': 'AAAA',
      'last_modified': '2014-01-01',
      'type': 'hql',
      'hql': 'SELECT * FROM sample'
    }
    
    doc 2: {
      'uuid': 'BBBB',
      'last_modified': '2014-01-01',
      'nodes': {
        'node1': {.... 'ok': 'node2'},
        'node2': {.... 'ok': 'node3'},
      }
    }
    
    #2
    Is there a way to load all the related objects of a workflow and then use jsonpickle?
    
    Or a mix of above
    
    #3 or we do a dumpdata but use a modified loader that checks whether some PKs are already used and builds a mapping of id <--> uuid; then we load the JSON, replace the ids, and loaddata it
    
    #4 or natural keys everywhere with uuids (but not backward compatible)
    
    1. #0: jsonpickle can replace pickle in this case I think. I need to check out dumpscript!
      
      #1: It's much simpler, but some generic logic makes sense to me. What we have now is even too specific IMO.
      
      #2: JSON Pickle and Pickle perform the same task really.
      
      #3: We run into the same ordering problems as we had in prototypes 1 through 3.
      
      #4: I tried this, but it didn't work for some reason. I think it's because of the circular dependencies. Will check again to be sure though. Also, this doesn't take care of primary keys... so we actually run into a similar problem that was resolved in prototype #3.
      
      I really need to investigate dumpscript... this might be the generic solution we're looking for!
    2. #0: Unfortunately, dumpscript doesn't support generic foreign keys and circular dependencies (not well enough at least).
      
      #2: This seems like a great idea and so far is working in testing!
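
A rough sketch of what idea #1 (manual serialization) might look like, following the two example documents in the comment above. The function names are hypothetical, and the `'type': 'workflow'` key is an assumption added here so that a loader could tell the two shapes apart; the original example gives no type for the workflow document.

```python
import json


def serialize_query(uid, last_modified, hql):
    """Build the 'doc 1' shape by hand: one saved HQL query."""
    return {
        "uuid": uid,
        "last_modified": last_modified,
        "type": "hql",
        "hql": hql,
    }


def serialize_workflow(uid, last_modified, transitions):
    """Build the 'doc 2' shape; `transitions` maps a node name to its 'ok' target."""
    nodes = {name: {"ok": target} for name, target in transitions.items()}
    return {
        "uuid": uid,
        "last_modified": last_modified,
        "type": "workflow",  # assumed key, not in the original example
        "nodes": nodes,
    }


def dump_documents(docs):
    # Nodes refer to each other by name and documents by uuid, so no PKs leak out.
    return json.dumps(docs, indent=2, sort_keys=True)


def load_documents(text):
    # Index by uuid so an importer can look up documents that already exist.
    return {doc["uuid"]: doc for doc in json.loads(text)}


docs = [
    serialize_query("AAAA", "2014-01-01", "SELECT * FROM sample"),
    serialize_workflow("BBBB", "2014-01-01", {"node1": "node2", "node2": "node3"}),
]
loaded = load_documents(dump_documents(docs))
```

The cost noted in the comment shows up directly: each document type needs its own hand-written function.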
  2. apps/beeswax/src/beeswax/models.py (Diff revision 2)
     
     
    JFI: this might be a problem with history, and when people have named 2 queries with the same name. Maybe use "Name (id)" instead.
    
    .hive --> .hql
  3. desktop/core/src/desktop/api_tests.py (Diff revision 2)
     
     
    powerful :)
  4. desktop/core/src/desktop/models.py (Diff revision 2)
     
     
    aka history
  5. desktop/core/src/desktop/models.py (Diff revision 2)
     
     
    document.is_historic() ? instead
  6. desktop/core/src/desktop/models.py (Diff revision 2)
     
     
    Should we add a last modified date in the doc and only update the old ones?
    
    This is good to delete everything but could be risky too
  7. desktop/core/src/desktop/models.py (Diff revision 2)
     
     
    we lose all the current tags and perms?
    
    (probably ok for now until we get the good solution)
  8. desktop/core/src/desktop/models.py (Diff revision 2)
     
     
    for a query, we need to save/load all the settings, design params etc too, not just the SQL?
    1. For this one, I just made the HQL modifiable. Maybe this feature isn't really needed? Or maybe a JSON structure is preferred? From my perspective the user will want to see just the HQL though.
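
One possible reading of the earlier suggestion to add a last-modified date and only update the older documents, sketched on plain dicts keyed by uuid. The function name and the `hql` payloads are hypothetical, and dates are assumed to be ISO `YYYY-MM-DD` strings as in the example documents in this review.

```python
from datetime import date


def merge_documents(existing, incoming):
    """Return a merged mapping of uuid -> document.

    An incoming document replaces the stored one only when it is strictly
    newer; unknown uuids are added; nothing is deleted.
    """
    merged = dict(existing)
    for uid, doc in incoming.items():
        current = merged.get(uid)
        if current is None:
            merged[uid] = doc
            continue
        incoming_date = date.fromisoformat(doc["last_modified"])
        current_date = date.fromisoformat(current["last_modified"])
        if incoming_date > current_date:
            merged[uid] = doc
    return merged


existing = {
    "AAAA": {"uuid": "AAAA", "last_modified": "2014-01-01", "hql": "SELECT 1"},
    "CCCC": {"uuid": "CCCC", "last_modified": "2014-03-01", "hql": "SELECT 3"},
}
incoming = {
    "AAAA": {"uuid": "AAAA", "last_modified": "2014-02-01", "hql": "SELECT 2"},
    "CCCC": {"uuid": "CCCC", "last_modified": "2014-01-15", "hql": "SELECT 0"},
    "DDDD": {"uuid": "DDDD", "last_modified": "2014-01-01", "hql": "SELECT 4"},
}
merged = merge_documents(existing, incoming)
```

Compared with deleting everything before an import, this keeps local documents that are newer than the imported copy, at the price of never removing anything.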
romain
  1. Some small comments here and there. It looks better now but wondering if we could avoid some of the logic.
    
    EDIT: all my inline comments are gone!
    
    1. Actually, it worked above!