|Speaker:||Dr. Alon Orlitsky, University of California, San Diego|
Standard information-theoretic results show that data over small, typically binary, alphabets can be compressed to Shannon's entropy limit. Yet most practical sources, such as text, audio, or video, have essentially infinite support. Compressing such sources requires estimating probabilities of unlikely, even unseen, events, a problem considered by Laplace. Of existing estimators, an ingenious if cryptic one derived by Good and Turing while deciphering the Enigma code works best yet not optimally. Hardy and Ramanujan's celebrated results on the number of integer partitions yield an asymptotically optimal estimator that compresses arbitrary-alphabet data patterns to their entropy. The same approach generalizes Fisher's seminal work estimating the number of butterfly species and its extension authenticating a poem purportedly written by The Bard. The talk covers these topics and is self contained.
Alon Orlitsky received B.Sc. degrees in Mathematics and Electrical Engineering from Ben Gurion University in 1980 and 1981, and M.Sc. and Ph.D. degrees in Electrical Engineering from Stanford University in 1982 and 1986.
|Presented On:||Nov 3rd, 2006|
|Video:||QuickTime Streaming Video|