Red Sox pitcher Curt Schilling throws fastballs 52% of the time, curveballs 10%, sliders 13%, changeups 8%, and splitters 17%. (10 pts) a.    The scorer wants to write down his pitch sequence in binary using a code that will make the record as short as possible. Find such a code. (Hint: huffman code) (5 pts) b.    Roughly how much more efficient is this code than the code that simply assigns a different 3-bit codeword to each of the five possible pitches

Respuesta :

Explanation:

a. Rearranging the list to descending order, we have ...

  • fastballs - 52%
  • splitters - 17%
  • sliders - 13%
  • curveballs - 10%
  • changeups - 8%

We can combine the two least-frequent pitches and assign those leaves to a node in the code tree. Together, their frequency is 18%, so exceeds both of the next two pitches. We can combine sliders and splitters into another node of the code tree. Together, their frequency is 30%, so the two nodes we just created now have the lowest two frequencies. Those nodes can be combined to make another node in the code tree. Finally that node and the fastballs leaf can be combined at the top of the tree. Assigning a 1 to the most frequent branch from each node, we get these codes ...

 fastballs: 1

  splitters: 011

  sliders: 010

  curveballs: 001

  changeups: 000

This coding scheme is chosen so that more frequent pitches have codes that favor 1s over 0s, making them easier to write.

__

b. The average number of bits written per pitch is ...

  0.52×1 + 0.48×3 = 1.96 . . . bits per pitch

A straight binary coding scheme would write 3 bits per pitch, so this scheme uses about 34.7% fewer bits.

_____

I found it interesting that my first simulation of 1000 random pitches required exactly 1960 bits for the record. (I would have expected the total to be off by a few bits in one direction or the other. Another running of the same simulation required a record of 1956 bits.)

ACCESS MORE