[Debug] Code 502: Error when sending POST requests iteratively to an OCR API deployed on Google App Engine

OCR Service Brief:

POST a file -> OCR service interface API (on App Engine): encodes the file as a base64 string -> OCR backend engine
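
The encoding step in the middle of this pipeline can be illustrated with a minimal sketch. The function names and the JSON field names ("file_name", "content") are assumptions made for illustration; the actual interface of the service is not shown in this post.

```python
import base64
import json
from typing import Tuple


def build_ocr_payload(file_name: str, file_bytes: bytes) -> str:
    """Pack a file into a JSON string whose content is base64-encoded."""
    # base64 output is plain ASCII, so it can travel safely inside JSON.
    encoded = base64.b64encode(file_bytes).decode("ascii")
    # "file_name" and "content" are hypothetical field names.
    return json.dumps({"file_name": file_name, "content": encoded})


def read_ocr_payload(payload: str) -> Tuple[str, bytes]:
    """Reverse build_ocr_payload: recover the file name and the raw bytes."""
    data = json.loads(payload)
    return data["file_name"], base64.b64decode(data["content"])
```

With this shape the file name and the file content travel in the same JSON body, so the receiver either gets both or fails to parse the request.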

Issue Description:

An OCR service (Flask REST API) deployed on GAE (Google App Engine) fails to respond correctly when a user sends POST requests iteratively, one right after another. Sample code is shown below:

for path in path_list:
    file_path = str(path)
    perform_ocr(file_path)  # sends one POST request to the OCR API
    #sleep(5)

Debugging:

  1. Calling the service one file at a time, separately, does not produce errors.
  2. The service works well on a local laptop but has issues once deployed on Google App Engine.
  3. Calling sleep() between requests alleviates the issue but does not entirely solve it. The service still gets stuck sometimes.
  4. Tracing the code shows that the error occurs because the OCR API server receives the name of the POSTed file but not its content. The OCR backend engine therefore never gets the file to process and responds with a 500 code.
  5. The error ('Connection aborted.', OSError("(32, 'EPIPE')")) occurs sometimes.

Solutions:

  1. Sleep for n seconds (i.e., sleep(n)) between two consecutive requests, and resend the file for as long as the server fails to receive its content properly. Increase the sleep period gradually (i.e., n+=1) each time the request fails again.
  2. Catch the exception ('Connection aborted.', OSError("(32, 'EPIPE')")) and send the request again.
  3. Transmitting files as a JSON string seems more robust. We should probably change the interface so that users must send base64 strings instead of raw files.
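
Solutions 1 and 2 can be combined into one retry loop. The following is only a sketch under stated assumptions: send_with_retry and its parameters are hypothetical names, and send stands for any callable that performs one POST and returns the HTTP status code. It catches OSError because EPIPE surfaces as an OSError, and the connection errors raised by the requests library are also subclasses of OSError.

```python
import time
from typing import Callable


def send_with_retry(
    send: Callable[[], int],
    max_attempts: int = 5,
    initial_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call send() until it returns 200, sleeping longer after each failure."""
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            if send() == 200:
                return True
        except OSError:
            # Covers ('Connection aborted.', OSError("(32, 'EPIPE')")) and
            # similar connection failures; fall through and retry.
            pass
        if attempt < max_attempts:
            sleep(delay)
            delay += 1  # n += 1: wait one second longer after each failure
    return False
```

A caller might wrap the real request in a small function that reopens the file on every attempt and returns the response status code, then pass that function as send. The retry limit keeps the loop from running forever if the server never recovers.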
