If you haven’t seen Will YouTube Ever Run Out Of Video IDs?, it’s worth a watch, not only because Tom Scott recorded the 5-minute video in a single take, but also because in the middle of it he managed to recite seventy-three quintillion, seven hundred eighty-six quadrillion, nine hundred seventy-six trillion, two hundred ninety-four billion, eight hundred thirty-eight million, two hundred six thousand, four hundred sixty-four from memory.

There’s only one problem: he recited the wrong number.

Actually, there are a few problems *, but first, let me provide some background. The day after I saw the video, I needed to write some code at work to encode a random number in a non-decimal base for similar purposes. The function I was using to generate random numbers (openssl_random_pseudo_bytes) required a single parameter: the length of the random number to generate in bytes.

Initially I was planning on encoding the random number in hexadecimal (base 16) because every hexadecimal digit (0-9, A-F) corresponds to a sequence of 4 binary digits (or bits), as shown in the following table. Thus every byte (8 bits) returned by the random number generator can be encoded perfectly using 2 hexadecimal digits. For example, the byte 11010011 in binary equals D3 in hexadecimal (or 211 in decimal). The largest byte, 11111111, equals the largest 2-digit hexadecimal number: FF (or 255 in decimal). This is what I mean by a perfect byte encoding: there are no greater values possible with 2 hexadecimal digits.

But what if I encoded the random number in base 64? Turns out base 64 doesn’t encode bytes as neatly as hexadecimal. Every base 64 digit corresponds to a sequence of 6 bits, so the only time base 64 perfectly encodes whole byte values is when the number of bits is divisible by 6. This is the case for 3 bytes (24 bits), 6 bytes (48 bits), 9 bytes (72 bits), and so on. You can certainly encode a number of bytes that is not a multiple of 3 in base 64, but you would leave some possible base 64 values unused. Here’s a table that shows the relationship between bytes, bits, and base 16 and 64 digits.

![]()

Why did YouTube choose an 11-digit base 64 number for their video IDs, and not 8 or 12 or 16 digits? The answer has to do with that function for generating random numbers I referenced earlier. How long a random number, in bytes, would one need to generate so that the number, encoded in base 64, would require 11 digits? Based on the table above, 8 bytes (64 bits) require 10⅔ digits in base 64, and since you can’t have ⅔ of a digit, 11 digits would be necessary to encode all possible 8-byte values. This discovery was unsurprising because 8 bytes is a common size for storing and calculating very large numbers. For instance, in a relational database, where one might store a video ID as an integer, the BIGINT data type requires 8 bytes.
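The byte-to-digit arithmetic in this post can be checked with a short sketch. It is written in Python rather than the PHP of the original code, so `secrets.token_bytes` stands in for `openssl_random_pseudo_bytes`, and the helper names (`base64_digits_needed`, `random_id`) are made up for illustration. It is not a claim about how YouTube actually generates its IDs.

```python
import base64
import secrets
from fractions import Fraction


def exact_base64_digits(num_bytes: int) -> Fraction:
    """Bits divided by 6, as an exact fraction (8 bytes gives 32/3, i.e. 10 2/3)."""
    return Fraction(8 * num_bytes, 6)


def base64_digits_needed(num_bytes: int) -> int:
    """Whole base 64 digits needed to hold every value of num_bytes bytes."""
    bits = 8 * num_bytes
    return (bits + 5) // 6  # integer ceiling of bits / 6


def random_id(num_bytes: int = 8) -> str:
    """Hypothetical helper: random bytes rendered as unpadded URL-safe base 64."""
    raw = secrets.token_bytes(num_bytes)  # stand-in for openssl_random_pseudo_bytes
    # Note: standard base 64 pads the final digit with zero bits, so this is
    # not identical to writing the integer in base 64, but the digit count
    # is the same.
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


if __name__ == "__main__":
    # Recompute the bytes / bits / base 16 digits / base 64 digits relationship.
    for n in range(1, 10):
        print(n, 8 * n, 2 * n, exact_base64_digits(n))
    print(random_id())
```

Only byte counts that are multiples of 3 come out as whole numbers in the last column; for 8 bytes the exact value is 32/3, which rounds up to 11 digits.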