Viewing 7 posts - 1 through 7 (of 7 total)
  • Author
    Posts
  • #38979
    Anonymous
    Inactive

    Hello everybody,
    I just created this to see who else, apart from us, doesn’t (well, didn’t, before!) understand how frequency analysis works. Please tell me if you are as confused as our group was. We asked a teacher, and even they were confused and couldn’t help us!

    [EDIT, Harry: Dear CipherGirl1, frequency analysis is not that scary! The idea is that some letters in the alphabet appear a lot more often than others. In particular, because E is a common vowel and we use the word “the” a lot, you tend to find that the most common letter in a longish bit of English text is the letter E. Not always: sometimes T beats it, but those two are usually the most common. If you have a ciphertext encrypted so that the letter e is replaced by X, then there is a pretty good chance X will be the most common letter in the ciphertext. If you count how often each letter appears and see that X is the most common, that is a strong hint that it might be the replacement for e, so you can assume that and see what else you can deduce. If you see a lot of three-letter words ending in X, you can guess that they are all copies of the word “the”, and that helps you to see how t and h are encrypted. This is particularly useful if you are trying to decrypt a substitution cipher with the word shapes left in! If you look this up on the web you will find loads about it, and we talk about it in the beginner’s guide too.

    Good luck,

    Harry]
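A minimal sketch of the counting step Harry describes, in Python. The function names `letter_counts` and `three_letter_words_ending_in` are made up for illustration, and the sample ciphertext is a toy Caesar-shifted sentence, far too short for the statistics to be reliable. It counts letters and then lists the three-letter words ending in a chosen letter, which are candidates for “the”.

```python
import re
from collections import Counter

def letter_counts(text):
    """Count how often each letter A-Z appears, ignoring case and non-letters."""
    return Counter(ch for ch in text.upper() if "A" <= ch <= "Z")

def three_letter_words_ending_in(text, letter):
    """Return the three-letter words that end in the given cipher letter."""
    words = re.findall(r"[A-Z]+", text.upper())
    return [w for w in words if len(w) == 3 and w.endswith(letter.upper())]

# Toy ciphertext (an assumption for this example), word shapes left in.
ciphertext = "MAX VTM LTM HG MAX FTM"

counts = letter_counts(ciphertext)
for letter, n in counts.most_common(3):
    print(letter, n)

# Candidates for the word "the" if we guess that X stands for e.
print(three_letter_words_ending_in(ciphertext, "X"))
```

In a text this short the most common cipher letter can easily stand for T rather than E, which is the caveat Harry mentions, so treat the top letter as a guess to test rather than a certainty.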

    • This topic was modified 6 years, 1 month ago by Harry.
    #39249
    Anonymous
    Inactive

    Thanks Harry

    #39250
    Anonymous
    Inactive

    Hi Harry
    Sorry, it’s just that I wanted to help. I’m not saying it’s totally a nightmare. Sorry I forgot to mention that at the end of my first post.
    The Encryptic Enterprise

    #39375
    Anonymous
    Inactive

    It is also helpful to consider frequency analysis alongside its cousins: bigram, trigram and quadgram analysis. These give progressively more useful ways of scoring how English-like a piece of text is, and they are good for very short sections or sections that aren’t quite perfect.

    Bigram, trigram, and quadgram frequency text files can be found here:

    http://practicalcryptography.com/media/cryptanalysis/files/english_bigrams.txt
    http://practicalcryptography.com/media/cryptanalysis/files/english_trigrams.txt.zip
    http://practicalcryptography.com/media/cryptanalysis/files/english_quadgrams.txt.zip
    (http://practicalcryptography.com/media/cryptanalysis/files/english_quintgrams.txt.zip)
    (the last one is barely more useful than quadgrams and is certainly slower, so I wouldn’t recommend it)

    #39385
    Anonymous
    Inactive

    Hi thanks! We will certainly consider that!
    The Encryptic Enterprise

    #39392
    Anonymous
    Inactive

    The simplest way to use single-letter (monogram) frequencies is just to line up the most frequent letter of the ciphertext with the most frequent letter of English, the second most frequent with the second, and so on. But as Harry pointed out, this is a bad idea because E and T sometimes vie for first place, and the others shuffle around too.

    The second easiest thing to do with single-letter frequencies is to take the dot product of the letter frequencies in your candidate plaintext with the letter frequencies of English. Your math teacher should know what a dot product is, if you do not. Basically, decrypt with a candidate key to get a candidate plaintext, find the frequencies of the letters in it, then find the sum freq(A in candidate)*freq(A in English) + freq(B in candidate)*freq(B in English) + … The candidate with the highest sum is the winner. This works very well with Caesar shift and affine ciphers, and can crack one in a split second by looping over all possible keys and keeping the solution with the highest sum.
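The dot-product approach can be sketched in Python for the Caesar case. This is only an illustration under stated assumptions: the `ENGLISH_FREQ` table holds approximate, commonly quoted percentages (different sources give slightly different values), and the function names and sample sentence are made up for this example.

```python
import string
from collections import Counter

# Approximate English letter frequencies in percent (an assumption;
# published tables differ slightly from one another).
ENGLISH_FREQ = {
    "A": 8.17, "B": 1.49, "C": 2.78, "D": 4.25, "E": 12.70, "F": 2.23,
    "G": 2.02, "H": 6.09, "I": 6.97, "J": 0.15, "K": 0.77, "L": 4.03,
    "M": 2.41, "N": 6.75, "O": 7.51, "P": 1.93, "Q": 0.10, "R": 5.99,
    "S": 6.33, "T": 9.06, "U": 2.76, "V": 0.98, "W": 2.36, "X": 0.15,
    "Y": 1.97, "Z": 0.07,
}

def caesar_decrypt(text, shift):
    """Shift every letter back by `shift` places; leave other characters alone."""
    out = []
    for ch in text.upper():
        if ch in string.ascii_uppercase:
            out.append(chr((ord(ch) - 65 - shift) % 26 + 65))
        else:
            out.append(ch)
    return "".join(out)

def dot_product_score(text):
    """Sum of freq(letter in text) * freq(letter in English) over A-Z."""
    letters = [ch for ch in text.upper() if ch in string.ascii_uppercase]
    if not letters:
        return 0.0
    counts = Counter(letters)
    total = len(letters)
    return sum(counts[ch] / total * ENGLISH_FREQ[ch]
               for ch in string.ascii_uppercase)

def crack_caesar(ciphertext):
    """Try all 26 shifts and keep the one whose plaintext scores highest."""
    best = max(range(26),
               key=lambda s: dot_product_score(caesar_decrypt(ciphertext, s)))
    return best, caesar_decrypt(ciphertext, best)

# Demo: encrypt a sample sentence with shift 3 (a negative shift encrypts),
# then recover the shift without being told what it was.
sample = ("FREQUENCY ANALYSIS IS NOT THAT SCARY ONCE YOU SEE THAT "
          "SOME LETTERS APPEAR MORE OFTEN THAN OTHERS")
secret = caesar_decrypt(sample, -3)
shift, plaintext = crack_caesar(secret)
print(shift, plaintext)
```

An affine cipher could be handled the same way by looping over its keys instead of the 26 shifts. On very short ciphertexts the highest-scoring candidate can be wrong, so it is worth printing the top few.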

    For polyalphabetic ciphers, or for a generic substitution cipher that is not as constrained as Caesar or affine, you need to use di-, tri- or tetra-gram (bigram, trigram or quadgram) frequencies to decide whether a candidate is good. But the idea is similar: score each candidate plaintext against English and keep the best one.
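One common way to score a candidate with n-gram frequencies is to add up the log-probabilities of every n-gram in it, which is a variant of the sum described above rather than exactly the same calculation. The sketch below assumes a frequency file where each line is an n-gram followed by a count (for example `TH 100`); that format, the function names and the tiny made-up counts in the demo are all assumptions for illustration.

```python
import math

def load_ngram_counts(lines):
    """Parse lines of the assumed form 'NGRAM COUNT' into a dict."""
    counts = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            counts[parts[0].upper()] = int(parts[1])
    return counts

def make_scorer(counts):
    """Build a function that scores text by summed log10 n-gram probability."""
    n = len(next(iter(counts)))          # n-gram length, taken from the data
    total = sum(counts.values())
    log_probs = {g: math.log10(c / total) for g, c in counts.items()}
    floor = math.log10(0.01 / total)     # penalty for n-grams never seen

    def score(text):
        letters = "".join(ch for ch in text.upper() if "A" <= ch <= "Z")
        return sum(log_probs.get(letters[i:i + n], floor)
                   for i in range(len(letters) - n + 1))
    return score

# Demo with made-up bigram counts; a real run would read a full
# frequency file instead, e.g. load_ngram_counts(open(path)).
demo_counts = load_ngram_counts(["TH 100", "HE 80", "IN 60", "ER 50"])
score = make_scorer(demo_counts)
print(score("THE"), score("XQZ"))
```

Scores are negative, and the candidate closest to zero is the most English-like. A solver for a general substitution cipher would typically change the key a little at a time and keep changes that raise this score.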

    #39422
    Anonymous
    Inactive

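    For polyalphabetic ciphers you will also need to look at the distances between repeated n-grams in the ciphertext to find the key length. The key length usually divides those distances (this is known as Kasiski examination).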
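A rough sketch of that key-length search in Python. The function names are made up for illustration, and the demo string is an artificial example rather than a real ciphertext. It records the gaps between repeats of each trigram, then counts how many gaps each possible key length divides.

```python
from collections import Counter, defaultdict

def repeated_ngram_distances(ciphertext, n=3):
    """Return the gaps between consecutive occurrences of each repeated n-gram."""
    letters = "".join(ch for ch in ciphertext.upper() if "A" <= ch <= "Z")
    positions = defaultdict(list)
    for i in range(len(letters) - n + 1):
        positions[letters[i:i + n]].append(i)
    distances = []
    for pos in positions.values():
        distances.extend(b - a for a, b in zip(pos, pos[1:]))
    return distances

def candidate_key_lengths(distances, max_len=20):
    """Count, for each key length 2..max_len, how many gaps it divides."""
    votes = Counter()
    for d in distances:
        for k in range(2, max_len + 1):
            if d % k == 0:
                votes[k] += 1
    return votes.most_common()

# Artificial example: "ABC" repeats at positions 0, 6 and 15.
demo = "ABCDEFABCGHIJKLABC"
gaps = repeated_ngram_distances(demo)
print(gaps)
print(candidate_key_lengths(gaps))
```

Some repeats happen by chance, so the most-voted length is a candidate to test, and small lengths such as 2 collect extra votes simply because they divide many numbers.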
