Since FriendFeed was bought by Facebook, it has been doomed, both for itself and for its users. Yes, there is still a team maintaining it and fixing issues, but I would say it has become a zombie. I stayed until two months ago, when I removed all the services I had added and removed it from yjl.im. Zombie that it is, it kept grabbing my stuff from a few sources for a few weeks afterwards. I didn't report it because I didn't care.

A few days ago, I decided to remove some entries. I had wanted to do that for a long time, ever since I saw many links from FriendFeed (to this blog) reported in Webmaster Tools. Of course I would have those links; I added my blog to FriendFeed myself. It's the same reason I often don't link to my blog posts from the screenshots I upload to Flickr: I feel like I am spamming myself. Time to fix it.

I am not trying to remove all entries, only those that have never been commented on or liked. I also don't remove native FriendFeed entries, for example entries written or files and images uploaded directly on FriendFeed. The idea is to keep everything original intact: blog entries, YouTube favorites, Last.fm favorites, and so on will still exist on their source websites after I remove them from my FriendFeed account, but likes and comments on FriendFeed are original content, so I keep any entry that has them.
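The full script below implements this rule; as a standalone predicate it is roughly this (a sketch over the API v1 entry structure the script works with):

def should_delete(entry):
  # entries created directly on FriendFeed are original content: keep them
  if entry['service']['id'] == 'internal':
    return False
  # otherwise, delete only if nobody ever commented on or liked it
  return len(entry.get('comments', [])) + len(entry.get('likes', [])) == 0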

Unfortunately, there is an API rate limit on deletion (or on write operations in general). Of course I use the API; you don't expect me to delete 9,430 entries (out of 10,091 in total in my account) by mouse clicks, do you? I don't know the exact rate; it seems to be about 100 requests per a few hours, or per day, I am not sure. Conservatively, at 100 deletions per day, 9,430 entries make it roughly a 100-day job. I wrote to the API team asking for details about the rate limit, but I haven't gotten a response.

Since this is a one-time script, I will just post the code here.

#!/usr/bin/env python


import datetime
import getpass
import shelve
import sys
import time
from urllib2 import HTTPError
from friendfeed import FriendFeed

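# entries to fetch per feed request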
NUM = 100
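# seconds to wait between each deletion request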
DELETION_INTERVAL = 30


def print_eta(n, extra=0, est_only=False):

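  # Silenced on purpose: the exact rate limit is unknown, so any ETA would be a guess.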
  return
  eta = 3600 * n / 100 + extra
  est = datetime.datetime.now() + datetime.timedelta(seconds=eta)
  if est_only:
    print '[%s]' % est
  else:
    print 'Estimated time to complete: %d seconds, at %s' % (eta, est)


def main():

  ff = FriendFeed()

  nickname = raw_input('Your FriendFeed Nickname: ')
  data = shelve.open('%s.data' % nickname)
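  # data['entries'] maps entry id -> (marked for deletion, already deleted);
  # data['start'] is the next feed offset to fetch, or -1 once retrieval is done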

  if 'start' not in data:
    start = 0
  else:
    start = data['start']
  if start == -1:
    # Stage two: retrieval is finished, delete the marked entries
    entries = data['entries']
    marked = len([True for v in entries.values() if v[0]])
    total = len(entries)
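    # entries marked for deletion but not yet deleted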
    del_queue = [entry for entry, value in entries.items() if value[0] and not value[1]]
    print '%d out of %d entries marked for deletion.' % (marked, total)
    print '%d deleted, %d left to delete.' % (marked - len(del_queue), len(del_queue))
    print
    if not del_queue:
      return
    print 'You can find your Remote Key at http://friendfeed.com/remotekey'
    print
    remote_key = getpass.getpass('Please enter your remote key [no echo]: ')
    ff = FriendFeed(nickname, remote_key)
    print
    print_eta(len(del_queue), extra=5)
    print 'Starting deletion (one request every %d seconds) in 5 seconds...' % DELETION_INTERVAL
    print
    time.sleep(5)
    del_count = 0
    try:
      while del_count < len(del_queue):
        e_id = del_queue[del_count]
        try:
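          # API v1 delete endpoint; ff._fetch authenticates with the remote key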
          result = ff._fetch('/api/entry/delete', {'entry': e_id})
        except HTTPError, e:
          data['entries'] = entries
          data.sync()
          if e.code == 403 and 'limit-exceeded' in e.read():
            print
            print 'Failed to delete [%s], reached the rate limit.' % e_id
            print_eta(len(del_queue) - del_count, extra=10*60)
            print 'Sleeping for 10 minutes...'
            time.sleep(10 * 60)
            print
            continue
          raise
        if result['success']:
          entries[e_id] = (True, True)
        else:
          print
          print 'Failed to delete [%s]: ' % e_id, result
          print 'Continuing anyway.'
        sys.stdout.write('#')
        sys.stdout.flush()
        del_count += 1
        if del_count % 50 == 0 or del_count == len(del_queue):
          sys.stdout.write(' %d \n' % del_count)
          print_eta(len(del_queue) - del_count, extra=10*60)
          data['entries'] = entries
          data.sync()
        time.sleep(DELETION_INTERVAL)
    except Exception:
      data['entries'] = entries
      data.sync()
      raise
    print 'Done.'
  else:
    # Stage one: retrieve all entries and mark deletion candidates
    entries = data.get('entries', {})
    while True:
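      # page through the feed, NUM entries at a time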
      feed = ff.fetch_user_feed(nickname, start=start, num=NUM, maxcomments=1, maxlikes=1, hidden=1)
      ids = [entry['id'] for entry in feed['entries']]
      for e_id in ids:
        if e_id not in entries:
          break
      else:
        # no new ids on this page: everything has been seen, retrieval is complete
        print 'Retrieval is done.'
        break
      for entry in feed['entries']:
        if entry['id'] in entries:
          continue
        if entry['service']['id'] == 'internal':
          # entries posted directly on FriendFeed are original content, keep them
          entries[entry['id']] = (False, False)
        elif len(entry.get('comments', [])) + len(entry.get('likes', [])) == 0:
          entries[entry['id']] = (True, False)
        else:
          entries[entry['id']] = (False, False)
      print 'start=%d, entries=%d' % (start, len(entries))

      start += NUM
      data['entries'] = entries
      data['start'] = start
      data.sync()

      time.sleep(5)

    data['start'] = -1
    data.sync()
    marked = len([True for v in entries.values() if v[0]])
    total = len(entries)
    print '%d out of %d entries marked for deletion.' % (marked, total)

  data.close()


if __name__ == '__main__':
  main()

(I silenced print_eta() with the early return: since I don't have exact rate limit information, I cannot give a meaningful ETA.)

It's a two-stage design. The first run collects entries and marks the ones that should be deleted; you will only be asked for your FriendFeed nickname. The collected data is stored in a nickname.data file.

% ./remove_lonely_entries.py
Your FriendFeed Nickname: livibetter
start=0, entries=100
start=100, entries=200
start=200, entries=300
start=300, entries=400
[...]
start=9600, entries=9691
start=9700, entries=9791
start=9800, entries=9891
start=9900, entries=9991
start=10000, entries=10091
Retrieval is done.
9430 out of 10091 entries marked for deletion.
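If you are curious about what was collected, the nickname.data file is an ordinary shelve database, so it is easy to inspect from Python; a quick sketch (the (marked, deleted) tuple layout is the one the script above uses):

import shelve

data = shelve.open('livibetter.data')
entries = data['entries']
print 'start offset:', data['start']
print 'total entries:', len(entries)
print 'marked for deletion:', len([1 for v in entries.values() if v[0]])
print 'already deleted:', len([1 for v in entries.values() if v[1]])
data.close()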

The second stage deletes the entries; this time you will also be asked for your remote key. I still use API v1: API v2 uses OAuth, and I am not sure whether it supports three-legged OAuth. No need to trouble myself with that. Once you enter the key, deletion starts in five seconds, with one deletion request sent every 30 seconds. If the script gets a rate-limit-exceeded response, it sleeps for 10 minutes and then tries again.
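For reference, ff._fetch() boils down to a single authenticated POST. If you would rather not depend on the friendfeed library, something like this should do the same thing for one entry (a sketch: the endpoint and the entry parameter come from the script above, and I am assuming API v1's HTTP Basic authentication with nickname and remote key; the entry id is made up):

import urllib
import urllib2

nickname = 'livibetter'
remote_key = 'xxxxxxxx'  # from http://friendfeed.com/remotekey
entry_id = '00000000-0000-0000-0000-000000000000'  # hypothetical entry id

# API v1 is assumed to accept HTTP Basic authentication
passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
passman.add_password(None, 'http://friendfeed.com/api/', nickname, remote_key)
opener = urllib2.build_opener(urllib2.HTTPBasicAuthHandler(passman))

body = urllib.urlencode({'entry': entry_id})
print opener.open('http://friendfeed.com/api/entry/delete', body).read()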

http://farm6.static.flickr.com/5211/5425880613_bc70f3e465_z.jpg

It should be safe to interrupt this script at any time; it will pick up where you forced it to leave off (stage one resumes from the saved start offset, and stage two skips entries already recorded as deleted). There is no option to tell the script which stage to perform; it knows from the saved state. Simply run it without options and follow the instructions, and it will be fine.

Meh, I still have 8,887 entries to delete!