Get torrent info like seeds/peers/completed from tracker (UDP) aka scraping torrent

The previous script I made adds trackers to a .torrent file. After I made that, I thought it would be nice if I could remove all the dead torrents by checking how many seeds/peers are available according to a particular tracker. So, the following script fetches seeds/peers information given a tracker URL and a torrent hash. It also looks up the torrent name from the torrent hash using torrentz.me.

You can read more about the protocol here:

http://bittorrent.org/beps/bep_0015.html#udp-tracker-protocol
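
The handshake from that page boils down to one 16-byte packet in each direction. Here is a minimal sketch of the packet layout, assuming Python 3 (the script in this post is Python 2). The helper names build_connect_request and parse_connect_response are made up for illustration, and the response is built by hand so nothing touches the network.

```python
import struct
from random import randrange

PROTOCOL_ID = 0x41727101980  # magic constant that BEP 15 requires in a connect request
ACTION_CONNECT = 0


def build_connect_request(transaction_id):
    """Pack protocol_id (64 bit), action (32 bit) and transaction_id (32 bit)."""
    return struct.pack(">QLL", PROTOCOL_ID, ACTION_CONNECT, transaction_id)


def parse_connect_response(data, expected_transaction_id):
    """Return the connection_id carried by a 16-byte connect response."""
    if len(data) < 16:
        raise ValueError("connect response too short")
    action, transaction_id, connection_id = struct.unpack(">LLQ", data[:16])
    if action != ACTION_CONNECT:
        raise ValueError("unexpected action %d" % action)
    if transaction_id != expected_transaction_id:
        raise ValueError("transaction_id mismatch")
    return connection_id


# Round trip against a hand-built response (hypothetical connection_id).
tid = randrange(0, 2 ** 32)
request = build_connect_request(tid)
fake_response = struct.pack(">LLQ", ACTION_CONNECT, tid, 0x1122334455667788)
print(len(request), hex(parse_connect_response(fake_response, tid)))
```

Checking the echoed transaction_id is how a client can tell that a UDP datagram really answers its own request.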

Code:

"""
Author: shadyabhi [email protected]
For protocol description(not mine), check http://bittorrent.org/beps/bep_0015.html#udp-tracker-protocol
"""

import socket
import struct   
from random import randrange #to generate random transaction_id
from urllib import urlopen
import re

tracker = "tracker.istole.it"
port = 80
torrent_hash = ["3ebde329f208b9e2e81c8e0f80d14384d5f416e4", "3ac9002ce1a7d5dde2c02b7cf9dc9e0f15eda7cb", "00e058f6629a19b42458af4dea5f6b9e2ebe8e25"]
torrent_details = {}

def get_torrent_name(infohash):
	url = "http://torrentz.me/" + infohash
	p = urlopen(url)
	page = p.read()
	c = re.compile(r'<h2><span>(.*?)</span>')
	return c.search(page).group(1)

def pretty_show(infohash):
	print "Torrent Hash: ", infohash
	try:
		print "Torrent Name (from torrentz): ", get_torrent_name(infohash)
	except Exception:
		print "Couldn't find torrent name"
	print "Seeds, Leechers, Completed", torrent_details[infohash] 
	print

#Create the socket
clisocket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
clisocket.connect((tracker, port))

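#Magic constant that BEP 15 requires as the connection_id of a connect request
connection_id=0x41727101980
#Random transaction_id; the tracker echoes it back in its response
transaction_id = randrange(1,65535)

#Connect request: connection_id (64 bit), action 0 = connect (32 bit), transaction_id (32 bit)
packet=struct.pack(">QLL",connection_id, 0,transaction_id)
clisocket.send(packet)
#Connect response: action, transaction_id, then the connection_id to use for the scrape
res = clisocket.recv(16)
action,transaction_id,connection_id=struct.unpack(">LLQ",res)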

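#Scrape request body: the 20-byte binary form of every info hash, concatenated
packet_hashes = ""
for infohash in torrent_hash:
	packet_hashes = packet_hashes + infohash.decode('hex')

#Scrape request header: connection_id from the connect response, action 2 = scrape, transaction_id
packet = struct.pack(">QLL", connection_id, 2, transaction_id) + packet_hashes

clisocket.send(packet)
#Scrape response: 8-byte header (action, transaction_id) plus 12 bytes per info hash
res = clisocket.recv(8 + 12*len(torrent_hash))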

#Skip the 8-byte header; each 12-byte record holds seeders, completed, leechers (in that order)
index = 8
for infohash in torrent_hash:
	seeders, completed, leechers = struct.unpack(">LLL", res[index:index+12])
	torrent_details[infohash] = (seeders, leechers, completed)
	pretty_show(infohash)
	index = index + 12
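
The script trusts whatever comes back from the tracker. As a rough sketch of a more defensive version of the scrape step, the helpers below (hypothetical names, written for Python 3, not part of the script above) build the scrape request and check the response header before unpacking it. The response in the demo is hand-built from the figures in the sample output further down, so no tracker is contacted.

```python
import struct
from binascii import unhexlify

ACTION_SCRAPE = 2
ACTION_ERROR = 3  # BEP 15: an error response carries a message after the header


def build_scrape_request(connection_id, transaction_id, infohashes):
    """Header (connection_id, action, transaction_id) plus 20 raw bytes per hash."""
    header = struct.pack(">QLL", connection_id, ACTION_SCRAPE, transaction_id)
    return header + b"".join(unhexlify(h) for h in infohashes)


def parse_scrape_response(data, transaction_id, infohashes):
    """Map each info hash to (seeders, leechers, completed)."""
    if len(data) < 8:
        raise ValueError("scrape response too short")
    action, tid = struct.unpack(">LL", data[:8])
    if tid != transaction_id:
        raise ValueError("transaction_id mismatch")
    if action == ACTION_ERROR:
        raise ValueError("tracker error: " + data[8:].decode("utf-8", "replace"))
    if action != ACTION_SCRAPE:
        raise ValueError("unexpected action %d" % action)
    if len(data) < 8 + 12 * len(infohashes):
        raise ValueError("scrape response truncated")
    details = {}
    for i, infohash in enumerate(infohashes):
        # Wire order is seeders, completed, leechers.
        seeders, completed, leechers = struct.unpack_from(">LLL", data, 8 + 12 * i)
        details[infohash] = (seeders, leechers, completed)
    return details


hashes = ["3ebde329f208b9e2e81c8e0f80d14384d5f416e4"]
request = build_scrape_request(0x1122334455667788, 42, hashes)
fake_response = struct.pack(">LLLLL", ACTION_SCRAPE, 42, 10297, 172274, 1051)
print(len(request), parse_scrape_response(fake_response, 42, hashes))
```

Returning the tuple as (seeders, leechers, completed) keeps it in the same order the script prints it.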

Usage (the script above has 3 hashes for demonstration; you can change them):

shadyabhi@archlinux ~ $ python2 check_trackers.py 
Torrent Hash:  3ebde329f208b9e2e81c8e0f80d14384d5f416e4
Torrent Name (from torrentz):  House.S08E02.HDTV.XviD-LOL.avi
Seeds, Leechers, Completed (10297, 1051, 172274)

Torrent Hash:  3ac9002ce1a7d5dde2c02b7cf9dc9e0f15eda7cb
Torrent Name (from torrentz):  Dexter.S06E02.Once.Upon.a.Time.HDTV.XviD-FQM.avi
Seeds, Leechers, Completed (10962, 1328, 248032)

Torrent Hash:  00e058f6629a19b42458af4dea5f6b9e2ebe8e25
Torrent Name (from torrentz):  Breaking.Bad.S04E13.Face.Off.HDTV.XviD-FQM.avi
Seeds, Leechers, Completed (7751, 495, 183809)

shadyabhi@archlinux ~ $
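
Since the whole point was to weed out dead torrents, here is one possible way to use the collected numbers. This is only a sketch in Python 3: split_dead and the min_seeders threshold are my own illustrative choices, and the second entry in the sample dictionary is a made-up placeholder, not a real torrent.

```python
def split_dead(torrent_details, min_seeders=1):
    """Split info hashes into (alive, dead) lists based on the seeder count."""
    alive, dead = [], []
    for infohash, (seeders, leechers, completed) in torrent_details.items():
        if seeders >= min_seeders:
            alive.append(infohash)
        else:
            dead.append(infohash)
    return sorted(alive), sorted(dead)


# Same shape as torrent_details in the script: hash -> (seeders, leechers, completed)
sample = {
    "3ebde329f208b9e2e81c8e0f80d14384d5f416e4": (10297, 1051, 172274),
    "0000000000000000000000000000000000000000": (0, 2, 15),  # placeholder entry
}
alive, dead = split_dead(sample)
print("alive:", alive)
print("dead:", dead)
```

Keep in mind that a torrent with zero seeds on one tracker may still be alive on another, so it is probably worth checking more than one tracker before deleting anything.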