Base62


Base62, like Base10 (decimal) and Base16 (hexadecimal), is a number system. It uses 62 ASCII characters as digits: 0 – 9, a – z and A – Z, so it can represent large numbers in short strings. This has two main advantages. First, a shorter representation lowers the risk of human typing errors and can be entered faster. Second, it suits applications where the length of a URL, name or identifier is limited.

The length of a Base62 encoding can be estimated by the formula:

$$l_{62} \approx l_b \cdot \frac{\ln b}{\ln 62}$$

where $l_{62}$ is the estimated length of the Base62 encoding and $l_b$ is the number of digits of the number in base $b$.
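As a quick numeric check of the length estimate (the digit count in base b scaled by ln b / ln 62), here is a minimal sketch. The function name `estimate_base62_length` is made up for this illustration, and rounding up with `ceil` is an assumption that turns the estimate into an upper bound.

```python
import math

def estimate_base62_length(digits, b=10):
    # Hypothetical helper: number of digits in base b, scaled by ln(b) / ln(62),
    # rounded up so the result is an upper estimate.
    return int(math.ceil(digits * math.log(b) / math.log(62)))

# A 20-digit decimal number needs at most 12 Base62 characters.
print(estimate_base62_length(20, 10))
# A 32-digit hexadecimal number (128 bits) needs at most 22.
print(estimate_base62_length(32, 16))
```

Because of the rounding, small numbers with the same digit count can come out one character shorter than the estimate.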

Base62 is often used in web applications to make long URLs shorter, for example by ROT47 URL.

The following Python functions (GitHub) convert between Base10 and Base62. Both use the same base table, in which the index of each character is its digit value. To convert to Base62, repeat: divide the number by 62, take the remainder, and prepend the table character at that index to the result, until the quotient becomes zero. The other direction is even simpler: loop over the Base62 characters from left to right, multiply the result by 62 and add the index of the character in the table, until the end of the string.

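#!/usr/bin/env python
# https://helloacm.com
# base62 convert

# Digit table: the index of each character is its digit value.
BASE = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'

def toBase(num, b = 62):
    # Convert a non-negative integer to its string form in base b (2..62).
    if b < 2 or b > 62:
        raise ValueError("base must be between 2 and 62")
    if num < 0:
        raise ValueError("num must be non-negative")
    res = BASE[num % b]
    q = num // b  # integer division stays exact for large numbers
    while q:
        res = BASE[q % b] + res  # prepend the next more significant digit
        q = q // b
    return res

def to10(num, b = 62):
    # Convert a string in base b back to an integer.
    res = 0
    for ch in num:
        digit = BASE.find(ch)
        if digit < 0 or digit >= b:
            raise ValueError("invalid digit for this base: " + ch)
        res = b * res + digit
    return res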
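To make the divide-and-remainder loop concrete, the sketch below traces one conversion step by step. It is only an illustration: `trace_to_base62` and `TABLE` are names invented here, and the digit order (0 – 9, a – z, A – Z) is assumed to match the table used in this post.

```python
TABLE = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'

def trace_to_base62(num):
    """Return the Base62 string and the (quotient, remainder) pair of each step."""
    steps = []
    res = ''
    while True:
        num, r = divmod(num, 62)   # one division step
        steps.append((num, r))
        res = TABLE[r] + res       # prepend: remainders arrive lowest digit first
        if num == 0:
            break
    return res, steps

encoded, steps = trace_to_base62(12345)
for q, r in steps:
    print("quotient %d, remainder %d -> '%s'" % (q, r, TABLE[r]))
print(encoded)  # 3d7
```

Reading the result back from left to right gives (3 * 62 + 13) * 62 + 7 = 12345, which is exactly the multiply-and-add loop of the reverse direction.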

The accuracy can be tested with a round trip: convert each number to Base62, convert the result back to base 10, and compare it with the original:

if __name__ == "__main__":
    # Round-trip check: every number must survive encode + decode unchanged.
    for x in range(100000):
        y = toBase(x)
        z = to10(y)
        if x != z:
            print("error, " + str(x))
    print("end")

With these two functions, we can convert a number from any base x to any base y (both up to 62), using base 10 as the intermediate base: convert the number to base 10 first and then convert that result to base y.
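The two-step idea can be sketched as follows. This is a self-contained illustration, not the code from the repository: `decode`, `encode` and `convert` are hypothetical names, and the digit table is assumed to be the same one used above.

```python
DIGITS = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'

def decode(text, base):
    # String in the given base -> integer (the "base 10" intermediate).
    value = 0
    for ch in text:
        d = DIGITS.find(ch)
        if d < 0 or d >= base:
            raise ValueError("invalid digit %r for base %d" % (ch, base))
        value = value * base + d
    return value

def encode(value, base):
    # Non-negative integer -> string in the given base.
    if value == 0:
        return DIGITS[0]
    out = []
    while value:
        value, r = divmod(value, base)
        out.append(DIGITS[r])
    return ''.join(reversed(out))

def convert(text, src, dst):
    # Base src -> integer -> base dst.
    return encode(decode(text, src), dst)

print(convert('ff', 16, 2))    # 11111111
print(convert('3d7', 62, 10))  # 12345
```

Strictly speaking the intermediate value is a plain integer; base 10 is only how it is displayed when printed.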

The JavaScript version can be found on GitHub.

// https://helloacm.com
// https://rot47.net
// base62.js
// provides conversion between base10 and base62

var Base62 = (function(){                
  var table = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
  
  function _to10(num)
  {
    var limit = num.length;
    var res = 0;
    for (var i = 0; i < limit; i ++)
    {
      res = 62 * res + table.indexOf(num.charAt(i));
    }
    return res;  
  }
  
  function _toBase(num)
  {
    var r = num % 62;
    var res = table.charAt(r);
    var q = Math.floor(num / 62);
    while (q)
    {
      r = q % 62;
      q = Math.floor(q / 62);
      res = table.charAt(r) + res;
    }
    return res;
  }
  
  return {
    FromBase10: function()
    {
      var r = [];
      for (var i = 0; i < arguments.length; i ++)
      {
        var num = parseInt(arguments[i], 10); // always parse the input as decimal
        r.push(_toBase(num));
      }
      return r;
    },
    
    FromBase62: function()
    {
      var r = [];
      for (var i = 0; i < arguments.length; i ++)
      {
        var num = arguments[i].toString();
        if (num.length)
        {
          r.push(_to10(num));
        }
      }
      return r;    
    }
  } 
})();

References:

[1] http://de.wikipedia.org/wiki/Base62
