for the user

adventures into the land of the command line

uploading files in a flask/gunicorn/nginx app

In this stack, nginx acts as a reverse proxy in front of the gunicorn web server. First, in your nginx vhost config, raise the size of the buffers nginx uses when reading responses from the proxied server:

server {
    .
    .
    location / {
        .
        .
        proxy_buffer_size       128k;
        proxy_buffers           4 256k;
        proxy_busy_buffers_size 256k;
    }
    .
    .
}

If you don’t do this, you will see a 502 error in your browser when you eventually try to upload a file to your server, and your nginx error log will show something like:

2017/04/14 15:14:53 [error] 5675#0: *5 upstream sent too big header while reading response header from upstream, client: 192.168.50.1, server: dev.myapp.com, request: "POST /dataimport HTTP/1.1", upstream: "http://0.0.0.0:5001/dataimport", host: "dev.myapp.com", referrer: "http://dev.myapp.com/

Next, in your gunicorn start command, add the --limit-request-line option and set it higher than the default (4094):

command=/usr/local/bin/gunicorn -w 1 -k eventlet -b 0.0.0.0:5001 -t 60 --limit-request-line 8190 index:app

If not, you will see a 400 Bad Request in your browser coming from gunicorn, with some additional information on the page like:

Request Line is too large (6060 > 4094)
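
The request line is the first line of the HTTP request (method, target including any query string, and protocol version), and this limit applies to that line alone. A rough sketch of estimating its length before sending a request, assuming the 4094 default shown in the error above; `request_line_length` is a hypothetical helper, not part of gunicorn, and gunicorn's own count may differ slightly:

```python
from urllib.parse import urlencode

# gunicorn's default request-line limit, as reported in the error above
DEFAULT_LIMIT = 4094


def request_line_length(method, path, params=None, version="HTTP/1.1"):
    """Approximate byte length of 'METHOD target VERSION' (hypothetical helper)."""
    target = path
    if params:
        # the query string is part of the request target, so it counts too
        target += "?" + urlencode(params)
    return len(f"{method} {target} {version}".encode("ascii"))


# A long query string is one common way to exceed the limit
short = request_line_length("POST", "/dataimport")
long_ = request_line_length("POST", "/dataimport", {"q": "x" * 6000})
print(short, short <= DEFAULT_LIMIT)
print(long_, long_ <= DEFAULT_LIMIT)
```

If the estimate comes out above the limit, either raise --limit-request-line or move the data out of the URL and into the request body.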

OK, lastly, the actual code. In your flask app, do something like this:

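import os

from flask import Flask, request, Response
from werkzeug.utils import secure_filename

upload_folder = './imports'  # must already exist
allowed_extensions = ['.txt']

app = Flask(__name__)
app.config['UPLOAD_FOLDER'] = upload_folder

@app.route('/dataimport', methods=['POST', 'OPTIONS'])
def dataimport():
    if request.method == 'POST':
        # reject requests that carry no file under the 'fileupload' field
        if 'fileupload' not in request.files:
            return Response('no file uploaded', status=400)
        the_file = request.files.getlist('fileupload')[0]
        # the 4-character slice matches extensions like '.txt'
        if the_file.filename[-4:] in allowed_extensions:
            # strip path separators and other unsafe characters from the name
            filename = secure_filename(the_file.filename)
            full_save_path = os.path.join(app.config['UPLOAD_FOLDER'], filename)
            the_file.save(full_save_path)
            the_file.close()
            # read the saved copy back, then delete it
            with open(full_save_path, 'r') as saved:
                data = saved.read()
            os.remove(full_save_path)
            return Response(data)
        else:
            return Response('file type not allowed', status=400)
    else:
        return Response('only POST is handled here', status=405)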
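
The extension check in the handler could be factored into a helper that copes with extensions of any length and ignores case. A minimal sketch using `os.path.splitext`; `has_allowed_extension` and `ALLOWED_EXTENSIONS` are names made up for this example:

```python
import os

ALLOWED_EXTENSIONS = {".txt"}


def has_allowed_extension(filename, allowed=ALLOWED_EXTENSIONS):
    """Return True if filename ends in one of the allowed extensions."""
    # splitext returns ('name', '.ext'); lower() makes the check case-insensitive
    _, ext = os.path.splitext(filename)
    return ext.lower() in allowed


print(has_allowed_extension("data.txt"))        # True
print(has_allowed_extension("DATA.TXT"))        # True
print(has_allowed_extension("archive.tar.gz"))  # False
print(has_allowed_extension("txt"))             # False
```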

It’s important to note this line in the code:

the_file = request.files.getlist('fileupload')[0]

request.files holds the files uploaded via an html form. It is an ImmutableMultiDict, and ’.getlist’ returns a list of every file submitted under the given field name. Each list element is a werkzeug FileStorage object, which exposes attributes and methods like filename, save, close, etc.
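
To see how these structures behave without running a server, they can be built by hand. This is only a sketch, assuming werkzeug is installed; the file names and contents are invented for the example, and in a real request Flask populates `request.files` for you:

```python
import io

from werkzeug.datastructures import FileStorage, ImmutableMultiDict

# Two files submitted under the same form field name, as a multi-file input would send
first = FileStorage(stream=io.BytesIO(b"alpha"), filename="a.txt", name="fileupload")
second = FileStorage(stream=io.BytesIO(b"beta"), filename="b.txt", name="fileupload")
files = ImmutableMultiDict([("fileupload", first), ("fileupload", second)])

# getlist returns every value stored under the key
uploads = files.getlist("fileupload")
names = [f.filename for f in uploads]

# plain indexing returns only the first value for the key
first_only = files["fileupload"].filename

# a missing key gives an empty list rather than raising
missing = files.getlist("nope")

print(names, first_only, missing)
```

Because a missing key yields an empty list, indexing the result with [0] raises an IndexError when nothing was uploaded under that name.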

To test it, you will need an html form whose file input is named ’fileupload’.

If you don’t use an html form, you can change the line to:

the_file = request.files['file']

(You may have to edit some of the code below it too.)

Then you can test it out with something like curl:

$ curl -X "POST" -F file=@file_to_import.txt http://dev.myapp.com/dataimport

Where ‘file’ is the key that request.files looks for.
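
For reference, -F makes curl send a multipart/form-data body in which each part carries a field name, and that name is the key request.files is indexed by. A rough stdlib-only sketch of building such a body by hand; `build_multipart` is a hypothetical helper, and the file name and contents are illustrative:

```python
import uuid


def build_multipart(field, filename, content, content_type="text/plain"):
    """Build a single-file multipart/form-data body (hypothetical helper).

    Returns (body_bytes, content_type_header_value).
    """
    # a random boundary string separates the parts of the body
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


body, header = build_multipart("file", "example.txt", b"hello\n")
print(header)
print(body.decode("utf-8"))
```

The body and header could then be sent with `urllib.request.Request(url, data=body, headers={'Content-Type': header})`, which issues a POST because data is supplied.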

One other caveat when POSTing files with curl: for some reason, giving it a relative or absolute path to the file makes it behave strangely. As it’s just for testing, just change into the directory of the file you want to upload and run curl from there.