Manage your Letterboxd profile with Python

2014-03-05 | python

I am currently working on a Letterboxd plugin for XBMC. Until an official Letterboxd API is available, the only option is to scrape and "reverse engineer" the workings of the Letterboxd website. For reference, the following illustrates one simple way to sign in to and manage some aspects of a Letterboxd account using Python and the Requests library.

1. Sign in

import requests

URL = 'http://letterboxd.com'
USER = 'my_username'
PASS = 'my_password'

# Session: keeps cookies between requests
client = requests.Session()

# Fetch the front page to receive the CSRF cookie
client.get(URL)
token = client.cookies['com.xk72.webparts.csrf']
params = {'__csrf': token}

# Login: the CSRF token is sent along with the credentials
auth = {'username': USER, 'password': PASS}  # optionally also 'remember': True
headers = {'Referer': URL}

r = client.post(URL + '/user/login.do', data=dict(params, **auth), headers=headers)

# Logged in

2. Methods

The client can now be used to manage the account. Some examples:

FILM = 'in-bruges' # http://letterboxd.com/film/in-bruges/

2.1. Like/unlike a movie

r = client.post('%s/s/%s/toggle-like' % (URL, FILM))

Simply toggles whether you like a movie or not. The response body, available as r.text, contains a few lines of HTML (along with some unneeded JS, omitted here):

<span class="like-link">
 <a href="#" class="has-icon icon-16 icon-liked ajax-click-action" data-data-type="html" data-action="/s/in-bruges/toggle-like"> You liked this film </a>
 <a class="likes-count" href="/film/in-bruges/likes/"> 5,004&nbsp;likes </a> 
</span>

Note that the class attribute of the first a tag contains the class icon-liked, which indicates that the film was successfully liked. This is replaced by icon-like when a movie is unliked.
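As a rough illustration of that check, the helper below looks for the icon-liked class in the returned fragment. The name is_liked is my own (it is not part of Letterboxd or Requests), and the sketch assumes the markup stays as shown above.

```python
import re


def is_liked(fragment):
    """Return True if any class attribute in the HTML fragment contains icon-liked."""
    for classes in re.findall(r'class="([^"]*)"', fragment):
        # Compare whole class names so that icon-like does not match icon-liked.
        if 'icon-liked' in classes.split():
            return True
    return False


liked = '<a href="#" class="has-icon icon-16 icon-liked ajax-click-action">x</a>'
unliked = '<a href="#" class="has-icon icon-16 icon-like ajax-click-action">x</a>'
print(is_liked(liked), is_liked(unliked))
```

With a live session this would be called as is_liked(r.text) right after the toggle-like request.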

2.2. Rate a movie

data = {'rating': 9}
r = client.post('%s/film/%s/rate/' % (URL, FILM), data=dict(params, **data))

Letterboxd represents a rating with five stars, while also allowing half-star ratings. The rating parameter therefore accepts a value from 0 to 10 (where 0 removes the rating). A rating of 5, for example, is shown as two and a half stars. The POST request returns a more useful JSON response:

{"result":true,"csrf":"...","rating":9,"watched":true,"film":{"id":47809,"name":"In Bruges"},"watchlistUpdated":false}

It's worth noting that rating a movie not yet marked as watched will, logically, also mark it as watched.
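The star-to-value mapping described above can be sketched as a small conversion function. stars_to_rating is a hypothetical helper of my own; it only assumes that the rating parameter is twice the number of stars, as explained in this section.

```python
def stars_to_rating(stars):
    """Convert a star count (0 to 5, in half-star steps) to the 0-10 rating value."""
    value = stars * 2
    if value != int(value) or not 0 <= value <= 10:
        raise ValueError('stars must be between 0 and 5 in steps of 0.5')
    return int(value)


print(stars_to_rating(2.5))  # 5
print(stars_to_rating(4.5))  # 9
```

The result can then be passed as data = {'rating': stars_to_rating(4.5)} in the request shown earlier.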

2.3. Toggle watched status

r = client.post('%s/film/%s/mark-as-watched/' % (URL, FILM), data=params)

{"result": true, "filmId": 47809, "watched": true, ...}

and

r = client.post('%s/film/%s/mark-as-not-watched/' % (URL, FILM), data=params)

{"result": true, "filmId": 47809, "watched": false, ...}

2.4. Add/remove from watchlist

r = client.post('%s/film/%s/add-to-watchlist/' % (URL, FILM), data=params)

{"result": true, "filmInWatchlist": true, "film": {"id": 47809, "name": "In Bruges"}, ... }

and

r = client.post('%s/film/%s/remove-from-watchlist/' % (URL, FILM), data=params)

{"result": true, "filmInWatchlist": false, "film": {"id": 47809, "name": "In Bruges"}, ... }

2.5. Diary

Let's do what we came here for: adding films to the diary (at least, that is what I need for my XBMC plugin). This time around you will need the film ID. (Unfortunately, this ID does not correspond to the TMDb ID, even though Letterboxd fetches its data from TMDb.) So, what would be the best way to get this ID? One way is to parse the HTML of the film page (e.g. http://letterboxd.com/film/in-bruges), which contains:

<a 
  href="#add-this-film"
  class="add-this-film modal-link"
  data-film-id="47809"
  data-film-name="In Bruges"
  data-poster-url="/film/in-bruges/image-150/"
  data-film-release-year="2008"
>
  In Bruges
</a>

The ID for In Bruges (2008) would in this case be 47809, taken from the data-film-id attribute. (Not really that efficient, though.)
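One possible way to pull the ID out of the page source is a simple regular expression. This is only a sketch: find_film_id is a name I made up, and it assumes the attribute is written with double quotes exactly as in the snippet above.

```python
import re


def find_film_id(html):
    """Return the first data-film-id value found in the HTML as an int, or None."""
    match = re.search(r'data-film-id="(\d+)"', html)
    return int(match.group(1)) if match else None


page = '''<a
  href="#add-this-film"
  class="add-this-film modal-link"
  data-film-id="47809"
  data-film-name="In Bruges"
>'''
print(find_film_id(page))  # 47809
```

With a live session the input would be something like client.get('%s/film/%s/' % (URL, FILM)).text.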

There are multiple valid parameters, not all of which are required. Passing only filmId will just mark the movie as watched. Adding liked then works as shorthand for "watched" and "liked", and the same goes for rating. This allows several operations in one request without adding the film to the diary per se.

To actually add it to the diary, a date is required: set viewingDateStr (format: YYYY-MM-DD) and set the specifiedDate boolean to true.

# Parameters
entry = {
    'filmId': 47809, 
    'specifiedDate': True, 
    'viewingDateStr': '2014-03-06', 
    'rewatch': True,
    #'review': '',
    #'containsSpoilers': False,
    #'tag': 'top10', # simply add multiple of these
    'liked': True, 
    'rating': 9, 
    #'shareOnFacebook': False
}

# Request
r = client.post('%s/s/save-diary-entry' % (URL), data=dict(params, **entry))
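The parameter rules above (filmId alone marks the film as watched, a date plus specifiedDate creates a diary entry) could be wrapped in a small builder. build_entry is a hypothetical helper, not something Letterboxd provides; it only uses the parameter names listed in this post.

```python
from datetime import date


def build_entry(film_id, viewed_on=None, **extra):
    """Build the form data for save-diary-entry.

    Without viewed_on only filmId (plus any extras) is sent.
    With viewed_on the date fields needed for a diary entry are added.
    """
    entry = {'filmId': film_id}
    if viewed_on is not None:
        entry['specifiedDate'] = True
        entry['viewingDateStr'] = viewed_on.isoformat()  # YYYY-MM-DD
    entry.update(extra)
    return entry


print(build_entry(47809))
print(build_entry(47809, date(2014, 3, 6), liked=True, rating=9))
```

The result would be merged with the CSRF token in the same way as before: data=dict(params, **build_entry(...)).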

The JSON response differs depending on whether the film was added to the diary or not. The following shows the output when adding an entry.

{
    "result": true,
    "csrf": "...",
    "messages": ["&lsquo;In Bruges&rsquo; was <a href=\"/my_username/film/in-bruges/1/\">added to your films<\/a>."], 
    "film" : "In Bruges", 
    "filmId": 47809,
    "viewingId": 4007989, 
    "viewingDate": "2014-03-06",
    "viewingDateStr": "06 Mar 2014",
    "reviewText": null,
    "rating": 9,
    "tags": "",
    "rewatch": false,
    "specifiedDate": true,
    "containsSpoilers": false,
    "liked": true,
    "didShareOnFacebook": false,
    "alreadySharedFilmWatchOnFacebook": false
}

To remove the diary entry, one would have to request the following:

r = client.post('%s/s/viewing:4007989/delete' % (URL), data=params)

In this example, the number 4007989 is the viewingId returned when adding the diary entry (see above). Luckily, these viewing IDs are also present in the diary page source. There is no response data here, besides a redirect back to the diary page. The redirect can be prevented by passing allow_redirects=False to post().
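To tie the two steps together, the delete URL can be built directly from the JSON returned by save-diary-entry. The sketch below uses a shortened sample response instead of a live request, and delete_url is a name of my own choosing.

```python
import json

URL = 'http://letterboxd.com'


def delete_url(response_text):
    """Build the delete URL from the JSON text returned when adding a diary entry."""
    viewing_id = json.loads(response_text)['viewingId']
    return '%s/s/viewing:%d/delete' % (URL, viewing_id)


# Shortened sample of the response shown above
sample = '{"result": true, "filmId": 47809, "viewingId": 4007989}'
print(delete_url(sample))

# With a live session (not run here):
# r = client.post(delete_url(r.text), data=params, allow_redirects=False)
```

With allow_redirects=False, Requests returns the redirect response itself instead of following it to the diary page.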

All of the above is of course subject to change at Letterboxd's whim.