Decipher YouTube’s “signatureCipher”

Note: This is a translation of an old post from my old Chinese-language blog.

Since no single Invidious instance runs reliably around the clock, I decided to rewrite my YouTube stream API to parse YouTube's web page directly. However, I soon realized that the API responses for some copyright-protected videos do not have a url field. Instead, they have a signatureCipher field, which holds several URL query parameters, one of which contains the URL of the stream.

Investigation

signatureCipher is a URL-encoded query string with several parameters:

  1. s probably stands for signature. Its value looks like a variant of URL-safe Base64 with the = padding preserved.
  2. sp probably stands for signature parameter. Its value is always sig.
  3. url is probably the main part of the full URL. Opening this URL directly only returns a 403 error.
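To make the three parameters concrete, here is a minimal sketch of splitting a signatureCipher value with the standard URLSearchParams class. The sample string and the parseSignatureCipher name are made up for illustration; real values are much longer.

```javascript
// Hypothetical signatureCipher value, shortened for readability.
const signatureCipher = 's=AbC-dEf_123%3D%3D&sp=sig&url=https%3A%2F%2Fexample.com%2Fvideoplayback%3Fid%3D1'

// Hypothetical helper: split the field into its three parameters.
function parseSignatureCipher(cipher) {
    // URLSearchParams splits on "&" and percent-decodes every value.
    const params = new URLSearchParams(cipher)
    return {
        s: params.get('s'),     // the scrambled signature
        sp: params.get('sp'),   // the signature parameter name
        url: params.get('url'), // the stream URL without a valid signature
    }
}

const parsed = parseSignatureCipher(signatureCipher)
console.log(parsed.s)   // AbC-dEf_123==
console.log(parsed.sp)  // sig
console.log(parsed.url) // https://example.com/videoplayback?id=1
```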

Since I am probably not the first person to reverse engineer signatureCipher, I searched online for similar issues to save myself some time, and indeed I found some valuable information. The answers to this Stack Overflow question gave me two important points:

  1. The code to decipher signatureCipher is in the main player.
  2. It probably always starts with a=a.split("")

Reverse Engineering

After searching for the main player in the HTML, I got this file:

/s/player/${HEX_DIGITS}/player_ias.vflset/${LOCALE}/base.js

And there is indeed a function that includes a=a.split(""):

ara=function(a){a=a.split("");Yx.nO(a,64);Yx.qV(a,1);Yx.pR(a,11);return a.join("")};

But it depends on a helper object to process the array. Thankfully, the helper is in the same file:

var Yx={nO:function(a,b){var c=a[0];a[0]=a[b%a.length];a[b%a.length]=c},
qV:function(a,b){a.splice(0,b)},
pR:function(a){a.reverse()}}
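To see what these two snippets do together, the sketch below copies them as-is (only whitespace and comments added) and runs them on a made-up ten-character input. Real signatures are much longer, and the function names and numeric arguments will differ between player versions.

```javascript
// Helper object copied from the base.js sample above.
const Yx = {
    // Swap the first element with the one at index b (modulo the array length).
    nO: function (a, b) { var c = a[0]; a[0] = a[b % a.length]; a[b % a.length] = c },
    // Drop the first b elements.
    qV: function (a, b) { a.splice(0, b) },
    // Reverse in place; the second argument passed by the caller is ignored.
    pR: function (a) { a.reverse() },
}

// Main function copied from the base.js sample above.
const ara = function (a) { a = a.split(""); Yx.nO(a, 64); Yx.qV(a, 1); Yx.pR(a, 11); return a.join("") }

// Made-up input, traced step by step:
//   swap index 0 with index 64 % 10 = 4  -> ebcdafghij
//   drop the first element               -> bcdafghij
//   reverse                              -> jihgfadcb
console.log(ara('abcdefghij')) // jihgfadcb
```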

Implementation

This is the most difficult part: creating an implementation that extracts these functions automatically and uses them in my codebase.

First, I tried extracting both functions and simply running them through eval(), but that is both insecure and slow.

Then, after searching a little more, I decided to extract the body of each individual function, create new functions with Function() in a class constructor, and redirect the helper calls in the main function to this.

Now, when I need to decipher a signatureCipher, I can just call the decrypt() method of the new class.
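The core of that idea can be sketched in a few lines. The snippet below is only an illustration: the function bodies are hard-coded (shortened from the sample above) instead of being extracted from base.js, and the entryBody, helpers and decryptor names are made up.

```javascript
// Bodies as they might be extracted from base.js (hard-coded here for illustration).
const entryBody = 'a=a.split("");Yx.qV(a,1);Yx.pR(a,11);return a.join("")'
const helpers = {
    qV: ['a,b', 'a.splice(0,b)'],
    pR: ['a', 'a.reverse()'],
}

const decryptor = {}
for (const [name, [args, body]] of Object.entries(helpers)) {
    // Function(args, body) compiles the body in the global scope,
    // so the generated code cannot see any local variables.
    decryptor[name] = Function(args, body)
}

// Replace the helper name with "this" so the calls resolve to
// the methods that were just attached to the same object.
decryptor.decrypt = Function('a', entryBody.replace(/Yx/g, 'this'))

// Drops the first character, then reverses the rest.
console.log(decryptor.decrypt('abcdef')) // fedcb
```

Because the generated functions live on an ordinary object, dropping that object also makes them eligible for garbage collection.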

Here is my final result:

const playerRegex = /src="(\/s\/player\/[^"]+\/player_ias\.vflset\/[^"]+\/base\.js)"/
const playerEntryRegex = /function\(a\){(a=a\.split\(""\);(\w+).+;return a\.join\(""\))}/
const playerHelperExtraRegex = /([a-zA-Z0-9"]+):function\(((?:a|a,b))\){([a-zA-Z0-9.,()\[\]=%; ]+)}/
class Decryptor {
    constructor(entry, helperName, helperContent) {
        // Point the helper calls (e.g. Yx.nO(a,64)) at this instance instead.
        const helperNameRegex = new RegExp(helperName, 'g')
        this.decrypt = Function('a', entry.replace(helperNameRegex, 'this'))
        // Turn every "name:function(args){body}" entry of the helper object into a method.
        for (const item of helperContent.split(',\n')) {
            const [name, args, body] = item.match(playerHelperExtraRegex).slice(1, 4)
            this[name] = Function(args, body)
        }
    }
}
async function fetchDecryptData(html) {
    // Locate base.js in the page HTML (the captured path already starts with "/") and download it.
    const response = await fetch(`https://www.youtube.com${html.match(playerRegex)[1]}`)
    const player = await response.text()
    let entry
    try {
        // entry[0]: body of the main function, entry[1]: name of the helper object
        entry = player.match(playerEntryRegex).slice(1, 3)
    } catch (e) {
        throw new Error('Unable to fetch entry decryption function.')
    }
    const playerHelperRegex = new RegExp(`(var ${entry[1]}={([a-zA-Z0-9"]+:function\\((?:a|a,b)\\){.+}(?:,\\n)?)+)};`)
    let helper
    try {
        helper = player.match(playerHelperRegex)[1]
    } catch (e) {
        throw new Error('Unable to fetch helper decryption function.')
    }
    return [entry[0], entry[1], helper]
}
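
The code above stops at the deciphered signature. As a rough sketch of the remaining step, the snippet below assumes that the playable URL is the url parameter plus one extra query parameter named by sp whose value is the deciphered s. The post itself does not spell this out, so treat it as an assumption. buildStreamUrl and the sample values are made up for illustration.

```javascript
// Assumption: final URL = `url` + "&<sp>=<deciphered s>".
function buildStreamUrl(signatureCipher, decipher) {
    const params = new URLSearchParams(signatureCipher)
    const streamUrl = new URL(params.get('url'))
    // searchParams.set percent-encodes the value where needed.
    streamUrl.searchParams.set(params.get('sp'), decipher(params.get('s')))
    return streamUrl.toString()
}

// With the code above it might be wired up like this (needs network access,
// so it is not run here; pageHtml and format are hypothetical variables):
//   const decryptor = new Decryptor(...(await fetchDecryptData(pageHtml)))
//   const url = buildStreamUrl(format.signatureCipher, s => decryptor.decrypt(s))

// Stand-in decipher function so the example runs offline.
const reverse = s => s.split('').reverse().join('')
const sample = 's=cba&sp=sig&url=https%3A%2F%2Fexample.com%2Fvideoplayback%3Fid%3D1'
console.log(buildStreamUrl(sample, reverse))
// https://example.com/videoplayback?id=1&sig=abc
```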

This method is still somewhat insecure, but I think the risk of Google adding malware to stop people from streaming videos outside the official YouTube apps is very low. Compared with eval(), it is faster, and it also lets the garbage collector remove old, unused functions from memory.

Note

  • YouTube regularly rerolls the parameters in the main decipher function and in signatureCipher, so the decipher functions need to be refreshed whenever that happens.
  • Since I was too lazy to do the deciphering with RegExp alone, this approach won't work in other languages, but I won't be porting my YouTube stream API to any other language anytime soon.