HUE-7402 [core] Truncate SEARCH Column in Document2 for Oracle database

Review Request #11692 — Created Oct. 2, 2017 and submitted

yingc
hue
master
HUE-7402
hue
enricoberti, jgauthier, johan, krish, romain, weixia
commit 41746e058ab63eb10dbf64260be601dd301f91e2
Author: Ying Chen <yingchen@cloudera.com>
Date:   Mon Oct 2 14:25:29 2017 -0700

    HUE-7402 [core] Truncate SEARCH Column in Document2 for Oracle database
    
      Searching saved documents in Oracle returns following error:
        DatabaseError: ORA-06502: PL/SQL: numeric or value error: character string buffer too small
      This is caused by DBMS_LOB.SUBSTR(%s, 4000) in Django framework django/db/backends/oracle/base.py
      Django has a ticket for this issue but unfixed: https://code.djangoproject.com/ticket/11580.
      Buffer size 4000 limit the length of field equals or less than 2000 characters.

:000000 100644 0000000000... 593a415514... A    desktop/core/src/desktop/migrations/0027_truncate_documents.py
:100644 100644 aef7652dfe... 99e586238c... M    desktop/core/src/desktop/models.py
:100644 100644 dac1a9aa8c... 2dafdd9e2a... M    desktop/core/src/desktop/settings.py


  • 0
  • 0
  • 8
  • 0
  • 8
Description From Last Updated
romain
  1. Could we have 2000 has a constant? Also explaining why 2000?

    We also need to truncate when we set the field. e.g. for the editor this is https://github.com/cloudera/hue/blob/master/desktop/libs/notebook/src/notebook/api.py#L335

    (might want to do it here explicitely or in the save methode of Document2 model?)

  2. In a comment, could you add reference to the Django bug and when we hit it?

    (and add it to the jira description too)

  3. 
      
yingc
romain
  1. 
      
  2. Shouldn't we do it for any DB to not slow down the search anyway?

    Then maybe better in the Document2.save method?

    1. Document2.save got called twice. The first time is from https://github.com/cloudera/hue/blob/master/desktop/libs/notebook/src/notebook/api.py#L321 and the search field was empty.

    2. The second time is from #L338, and the search field is updated by getting from _get_statement(notebook). So maybe current changes is better than Document2.save.
  3. 
      
yingc
romain
  1. 
      
  2. Didn't we said we would move this to document2 save method?

  3. 
      
yingc
yingc
romain
  1. Just renaming the constant and adding a comment and looks good!

  2. We only truncate for Oracle, but this is good enough (other fields will be truncated over time via the save method).

  3. desktop/core/src/desktop/settings.py (Diff revision 4)
     
     

    Also add: for performance reasons and to avoid searching in huge fields, we also trunctae to a max length

  4. desktop/core/src/desktop/settings.py (Diff revision 4)
     
     

    ORACLE_DOCUMENT2_SEARCH_LENGTH --> DOCUMENT2_SEARCH_MAX_LENGTH
    ?

  5. 
      
yingc
romain
  1. 
      
  2. desktop/core/src/desktop/settings.py (Diff revision 5)
     
     
    Add a new line to split the sections?
    
    ############################################################
    # Searching saved documents in Oracle returns following error:
    #   DatabaseError: ORA-06502: PL/SQL: numeric or value error: character string buffer too small
    # This is caused by DBMS_LOB.SUBSTR(%s, 4000) in Django framework django/db/backends/oracle/base.py
    # Django has a ticket for this issue but unfixed: https://code.djangoproject.com/ticket/11580.
    # Buffer size 4000 limit the length of field equals or less than 2000 characters.
    #
    # For performance reasons and to avoid searching in huge fields, we also trunctae to a max length
  3. desktop/core/src/desktop/settings.py (Diff revision 5)
     
     

    trunctae --> truncate

  4. 
      
yingc
yingc
Review request changed

Status: Closed (submitted)

Loading...