Monday, December 14, 2015

Google Drive: Uploading & Downloading files with Python

UPDATE: Since this post was published, the Google Drive team released a newer version of their API. After reading this one, go to the next post to learn about migrating your app from v2 to v3 and to find a link to my video, which walks through the code samples in both posts.

Introduction

So far in this series of blog posts covering authorized Google APIs, we've used Python to access Google Drive, Gmail, and Google Calendar. Today, we're revisiting Google Drive with a small snippet that uploads plain text files to Drive, with & without conversion to a Google Apps format (Google Docs), then exports & downloads the converted one as PDF.

Earlier posts demonstrated the structure of, and how to use, Google APIs in general, so more recent posts, including this one, focus on solutions, apps, and the use of specific APIs. Once you've reviewed the earlier material, you're ready to start with authorization scopes, then see how to use the API itself.

    Google Drive API Scopes

    Google Drive features numerous authorization scopes. As usual, we recommend you use the most restrictive scope that still allows your app to do its work. You'll request fewer permissions from your users (which makes them happier), and it also makes your app more secure, possibly preventing it from modifying, destroying, or corrupting data, or from inadvertently going over quotas. Since we need to upload/create files in Google Drive, the minimum scope we need is:
    • 'https://www.googleapis.com/auth/drive' — Read/write access to Drive
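To make the "most restrictive scope" advice concrete, here is a minimal sketch that picks a scope from what an app needs to do. The scope URLs are real Drive scopes, but the `DRIVE_SCOPES` table and the `pick_scope()` helper are hypothetical names made up for this illustration; they are not part of any Google library. Depending on your app, a narrower scope such as `drive.file` (access only to files the app creates or opens) may be enough; the script in this post uses the full `drive` scope.

```python
# Hypothetical lookup table: a few real Drive scope URLs keyed by a short name.
SCOPE_BASE = 'https://www.googleapis.com/auth/'
DRIVE_SCOPES = {
    'readonly': SCOPE_BASE + 'drive.readonly',  # read files, no changes
    'file': SCOPE_BASE + 'drive.file',          # only files this app creates/opens
    'full': SCOPE_BASE + 'drive',               # read/write access to Drive
}

def pick_scope(needs_write, all_files):
    """Return the narrowest scope in the table above that covers the need.

    This is an illustrative helper, not an official recommendation engine.
    """
    if not needs_write:
        return DRIVE_SCOPES['readonly']
    if all_files:
        return DRIVE_SCOPES['full']
    return DRIVE_SCOPES['file']

# The scope used by the script in this post:
SCOPES = pick_scope(needs_write=True, all_files=True)
```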

    Using the Google Drive API

    Let's get going with our example today that uploads and downloads a simple plain text file to Drive. The file will be uploaded twice, once as-is, and the second time, converted to a Google Docs document. The last part of the script will request an export of the (uploaded) Google Doc as PDF and download that from Drive.

    Since we've fully covered the authorization boilerplate in earlier posts and videos, we're going to skip that here and jump right to the action: creating a service endpoint to Drive. The API name is (of course) 'drive', and the current version of the API is 2, so use the string 'v2' in this call to the apiclient.discovery.build() function:

    DRIVE = build('drive', 'v2', http=creds.authorize(Http()))

    Let's also create a FILES array object (tuple, list, etc.) which holds 2-tuples of the files to upload. These pairs are made up of a filename and a flag indicating whether or not you wish the file to be converted to a Google Apps format:
    FILES = (
        ('hello.txt', False),
        ('hello.txt', True),
    )
    Since we're uploading a plain text file, a conversion to Apps format means Google Docs. (You can imagine that if it was a CSV file, the target format would be Google Sheets instead.) With the setup complete, let's move on to the code that performs the file uploads.
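As a rough sketch of that idea, the snippet below guesses which Google format a converted upload would land in, based only on the file extension. The `CONVERSION_TARGETS` table and `expected_mimetype()` are hypothetical names for illustration, and the table covers just the two cases mentioned above; Drive itself decides the real target format.

```python
import os

# Assumed mapping (illustration only) from file extension to the Google
# format a converted upload would typically become.
CONVERSION_TARGETS = {
    '.txt': 'application/vnd.google-apps.document',     # Google Docs
    '.csv': 'application/vnd.google-apps.spreadsheet',  # Google Sheets
}

def expected_mimetype(filename, convert):
    """Return the MIME type we'd expect after a converted upload.

    Returns None when no conversion is requested or the extension is not
    in our small table, meaning "left as uploaded / unknown".
    """
    ext = os.path.splitext(filename)[1].lower()
    if convert and ext in CONVERSION_TARGETS:
        return CONVERSION_TARGETS[ext]
    return None
```

You could compare this guess against the mimeType that the API returns to confirm that a conversion really happened.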

    We'll loop through FILES, cycling through each filename-flag pair and calling the files.insert() method to perform the upload. The four parameters needed are: 1) the conversion flag, 2) the file metadata, which is only the filename (see below), 3) the media_body, which is also the filename but has a different purpose (it specifies where the file content will come from, meaning the file will be opened and its data transferred to the API), and 4) a set of fields you want returned.
    for filename, convert in FILES:
        metadata = {'title': filename}
        res = DRIVE.files().insert(convert=convert, body=metadata,
                media_body=filename, fields='mimeType,exportLinks').execute()
        if res:
            print('Uploaded "%s" (%s)' % (filename, res['mimeType']))
    
    It's important to give the fields parameter because if you don't, more than 30(!) fields are returned by default from the API. There's no need to waste all that network traffic if all you need are just a couple. In our case, we only want the mimeType, to confirm what the file was saved as, and exportLinks, which we'll explore in a moment. If files are uploaded successfully, the print() lets the user know, and then we move on to the final section of the script.
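If you plan to reuse the upload step, one option is to wrap it in a small function. This is only a sketch: `upload_file()` is a hypothetical helper name, and it assumes `service` is the Drive v2 object returned by build('drive', 'v2', ...) as shown earlier. The call it makes is the same files().insert() call used in the loop above, with the field names joined into the comma-separated string the fields parameter expects.

```python
def upload_file(service, filename, convert=False,
                fields=('mimeType', 'exportLinks')):
    """Upload `filename` to Drive and return only the requested fields.

    `service` is assumed to be a Drive v2 service object; this helper is
    an illustration, not part of the client library.
    """
    metadata = {'title': filename}   # the name the file will have in Drive
    request = service.files().insert(
        convert=convert,             # True: convert to a Google Apps format
        body=metadata,
        media_body=filename,         # where the file content comes from
        fields=','.join(fields),     # partial response: just these fields
    )
    return request.execute()
```

With this in place, the loop body becomes a single call such as res = upload_file(DRIVE, filename, convert).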

    Before we dig into the last bit of code, it's important to realize that the res variable still contains the result from the second upload, the one where the file is converted to Google Docs. This matters because that response holds the download link for the format we want (res['exportLinks'][MIMETYPE]); in our case, that's the PDF version. To download the file, make an authorized HTTP GET call, passing in that link. If the download is successful, the data variable will have the payload to write to disk. If all's good, let the user know:
    if res:
        MIMETYPE = 'application/pdf'
        res, data = DRIVE._http.request(res['exportLinks'][MIMETYPE])
        if data:
            fn = '%s.pdf' % os.path.splitext(filename)[0]
            with open(fn, 'wb') as fh:
                fh.write(data)
            print('Downloaded "%s" (%s)' % (fn, MIMETYPE))
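The snippet above writes whatever payload comes back. A slightly more defensive variant, sketched below, also checks the HTTP status before saving, since an error response can carry a non-empty body too. `save_export()` is a hypothetical helper name; it assumes `http` behaves like an authorized httplib2.Http object, whose request() returns a (response, content) pair with the status code in response.status.

```python
import os

def save_export(http, export_url, source_name, ext='.pdf'):
    """Fetch `export_url` and save the payload next to `source_name`.

    `http` is assumed to be an authorized httplib2-style client. Returns
    the output filename, or None if the request did not succeed.
    """
    resp, data = http.request(export_url)   # authorized HTTP GET
    if resp.status != 200 or not data:
        return None
    # Swap the original extension for the exported one: hello.txt -> hello.pdf
    out_name = os.path.splitext(source_name)[0] + ext
    with open(out_name, 'wb') as fh:         # binary mode: PDF data is bytes
        fh.write(data)
    return out_name
```

In the context of this post's script, the call would look something like save_export(DRIVE._http, res['exportLinks'][MIMETYPE], filename).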
    
    Final note: this code sample differs from those in previous posts in two ways: 1) now that the Google APIs Client Library runs on Python 3, I'll try to produce only code samples for this blog that run unmodified under both 2.x and 3.x interpreters (the primary one-line difference being the import of the print() function), and 2) we're going to incorporate the use of the run_flow() function from oauth2client.tools and only fall back to the deprecated run() function if necessary (more info on this change is available in this earlier post).
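One possible way to express that fallback is sketched below. `get_credentials()` is a hypothetical helper, not the code from the earlier post; it assumes `tools` is the oauth2client.tools module and simply prefers run_flow() when the installed version provides it, using the same two-argument call as the script in this post.

```python
def get_credentials(tools, flow, store):
    """Run the OAuth2 flow, preferring run_flow() over the deprecated run().

    `tools` is assumed to be the oauth2client.tools module (passed in so
    the choice is easy to see and to test).
    """
    runner = getattr(tools, 'run_flow', None)
    if runner is None:
        runner = tools.run   # deprecated: only used when run_flow() is missing
    return runner(flow, store)
```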

    If you run the script and grant it access to your Google Drive (via the OAuth2 prompt that pops up in the browser), you should get output that looks like this:
    $ python drive_updown3.py # or python3
    Uploaded "hello.txt" (text/plain)
    Uploaded "hello.txt" (application/vnd.google-apps.document)
    Downloaded "hello.pdf" (application/pdf)
    

    Conclusion

    Below is the entire script for your convenience which runs on both Python 2 and Python 3 (unmodified!):
    #!/usr/bin/env python
    
    from __future__ import print_function
    import os
    
    from apiclient import discovery
    from httplib2 import Http
    from oauth2client import file, client, tools
    
    SCOPES = 'https://www.googleapis.com/auth/drive'
    store = file.Storage('storage.json')
    creds = store.get()
    if not creds or creds.invalid:
        flow = client.flow_from_clientsecrets('client_secret.json', SCOPES)
        creds = tools.run_flow(flow, store)
    DRIVE = discovery.build('drive', 'v2', http=creds.authorize(Http()))
    
    FILES = (
        ('hello.txt', False),
        ('hello.txt', True),
    )
    
    for filename, convert in FILES:
        metadata = {'title': filename}
        res = DRIVE.files().insert(convert=convert, body=metadata,
                media_body=filename, fields='mimeType,exportLinks').execute()
        if res:
            print('Uploaded "%s" (%s)' % (filename, res['mimeType']))
    
    if res:
        MIMETYPE = 'application/pdf'
        res, data = DRIVE._http.request(res['exportLinks'][MIMETYPE])
        if data:
            fn = '%s.pdf' % os.path.splitext(filename)[0]
            with open(fn, 'wb') as fh:
                fh.write(data)
            print('Downloaded "%s" (%s)' % (fn, MIMETYPE))
    
    You can now customize this code for your own needs, for a mobile frontend, sysadmin script, or a server-side backend, perhaps accessing other Google APIs. If you want to see another example of using the Drive API, check out this earlier post listing the files in Google Drive and its accompanying video as well as a similar example in the official docs or its equivalent in Java (server-side, Android), iOS (Objective-C, Swift), C#/.NET, PHP, Ruby, JavaScript (client-side, Node.js, Google Apps Script), or Go. That's it... hope you find these code samples useful in helping you get started with the Drive API!


    EXTRA CREDIT: Feel free to experiment and try something else to test your skills and challenge yourself as there's a lot more to Drive than just uploading and downloading files. Experiment with creating folders and manipulate files there, work with a folder of photos and organize them using the image metadata available to you, implement a search engine for your Drive files, etc. There are so many things you can do! 
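As a starting point for the folder idea, here is a small sketch of the metadata you would pass as the body to files().insert() in the v2 API. In Drive, a folder is a file with a special MIME type, and in v2 a file is placed in a folder by listing that folder's ID under parents. The helper names `folder_metadata()` and `file_metadata()` are hypothetical, and the folder ID in the usage note is a placeholder.

```python
# MIME type that marks a Drive file as a folder.
FOLDER_MIMETYPE = 'application/vnd.google-apps.folder'

def folder_metadata(title, parent_id=None):
    """Build the request body for creating a folder (Drive API v2 style)."""
    body = {'title': title, 'mimeType': FOLDER_MIMETYPE}
    if parent_id:
        body['parents'] = [{'id': parent_id}]   # nest inside another folder
    return body

def file_metadata(title, parent_id):
    """Build the request body for uploading a file into a folder (v2 style)."""
    return {'title': title, 'parents': [{'id': parent_id}]}
```

For example, you might create the folder with DRIVE.files().insert(body=folder_metadata('Photos'), fields='id').execute(), then pass the returned ID to file_metadata() for each upload.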

    15 comments:

    1. I used parts & pieces, including dependencies, to make up for the missing auth tools for the Google API for Deja Dup (authentication wasn't working out of the box). I'm working on Lubuntu 18.04 i686 (32-bit). More to follow as I compile the entire set from my last 12 hours. LOL

    2. Replies
      1. Learn how to use the Drive API with PHP from our quickstart page: https://developers.google.com/drive/api/v3/quickstart/php

    3. AttributeError: 'Resource' object has no attribute 'insert'
      why?

      Replies
      1. Somehow your code wasn't able to establish an endpoint to the Drive API appropriately. I would suggest getting the basics working first before this example, such as the example in the G Suite APIs intro using the Drive API at http://g.co/codelabs/gsuite-apis-intro. Once you get everything working, update this example so that it works as well.

    4. Is it possible to move a file from Google Drive to Google Cloud Storage (bucket)?

      Replies
      1. Absolutely. Interestingly, I've been working on a sample application that describes how to do exactly what you're proposing. It is part of a larger enterprise cloud-based image processing workflow application which I've open sourced at http://github.com/googlecodelabs/analyze_gsimg w/archiving files from Drive to Cloud Storage as the 1st step. Along w/the code sample is a self-paced hands-on tutorial at http://g.co/codelabs/drive-gcs-vision-sheets on building the entire application step-by-step. Moving forward, these "assets" will be publicized in various blog & social posts as well as several upcoming videos if I can get into a recording studio.

    5. can u plz help me google drive upload download files java sample code

      Replies
      1. The Drive API's Java Quickstart sample (https://developers.google.com/drive/api/v3/quickstart/java) shows developers how to list files (and IDs) on a user's Google Drive. This REST API is the only option for web *and* mobile apps as the Drive Android API (https://developers.google.com/drive/android/deprecation) has been deprecated.

      2. This is useless, Almost impossible to upload a file with Java.

      3. Your response is not helpful. What do you mean? Have you tried it? Do you have a code sample? Do you have an error screenshot you can share? You're making a blanket statement without backing it up.

    6. can u plz help me google drive upload download files C#.net sample code

      Replies
      1. Everything you need to get started is available in this SO answer I gave a few years ago: https://stackoverflow.com/a/42826839/305689

    7. can you please help me google drive download file using curl language

      Replies
      1. Heads-up this won't be easy because the OAuth security handshake requires more effort. This is why my examples have code which use the APIs, and more importantly, the client libraries. Doing it with `curl` (or `wget`) is doing it manually, and there are many more steps to make it happen. If you still wish to proceed, you'll need to learn the HTTP verbs for the Drive API: https://developers.google.com/drive/api/v3/reference ... you'll also need to know how to authenticate: https://developers.google.com/drive/api/v3/about-auth Good luck!
