Bryn Mawr College
CS 380: Recent Advances in Computer Science
Topic: Science of Information
Fall 2012
BMC Class Number: 1214
Course Materials
General Information
Instructor: Deepak Kumar, 246B Park Hall, 526-7485
E-Mail: dkumar at cs brynmawr dot edu
Tweet: @bmcdeepak
WWW: http://cs.brynmawr.edu/~dkumar
Lecture Hours: Tuesdays & Thursdays, 2:15p to 3:45p
Room: Park Science Building, Room 336
Laboratories:
- Computer Science Lab Room 231 (Science Building)
Course Description:
Claude Shannon's foundations of information theory have paved the way for data storage, compression, encoding, and transmission for the Internet, CDs, DVDs, MP3 players, JPEGs, WiFi, iPods, mobile phones, and a whole host of applications underlying today's information technologies. The past six decades have brought information theory to the crossroads of several traditional disciplines: mathematics, statistics, computer science, physics, neurobiology, and electrical engineering. This course introduces students to the fundamentals of Information Theory and leads them to a broader understanding of the concept of “information” that transcends boundaries between disciplines, especially between physical and life sciences, communication, and knowledge extraction from massive datasets. Students from several disciplines will be able to draw upon the latest discoveries from multiple disciplines, replicate and discuss recent research, and learn to apply the techniques and tools of information-based inquiry in their lives.
The course will run in a seminar format in which students participate in and lead discussions and present results from readings and computational experiments. The course requires Junior or Senior standing. Students from ALL disciplines are encouraged to enroll.
Texts & Software
Information: A Very Short Introduction, by Luciano Floridi, Oxford University Press, 2010.
Topics
The course plans to cover the following topics:
- What is information?
- The Information Revolution
- The Mathematical Theory of Information
- Semantic & Physical Information
- Biological Information
- Economic Information
- The Ethics of Information
- Information Retrieval & Big Data
- Information Visualization
- Quantum Information
- ...
Readings
Important Dates
September 4: First Meeting
December 13: Last Meeting
Assignments
- Assignment#1 (Due on Tuesday, September 25): Click here for details.
- Assignment#2 (Due on Thursday, October 25): Click here for details. Files you will need: blosum80.txt, coding.txt, cytochrome_c-10.txt, cytochrome_c.txt, gencode.py, genetics.py
Lectures
- Week 1 (September 4, 6)
September 4: Course Introduction. What is information? A class discussion.
Fellowships available. Click here for details.
Slides: Information
Read: Chapter 1 from Floridi.
September 6: Information: An Overview. Defining Information: Information = Data + Meaning. Understanding Data. Types of Data. Floridi's taxonomy.
Slides: Defining Information.
Read: Chapter 2 from Floridi. The Data Deluge (from The Economist, February 25, 2010).
Watch: Luciano Floridi: The Fourth Revolution (TEDxMaastricht, 2011)
- Week 2 (September 11, 13)
September 11: Broader perspectives on "information". Information in and for other disciplines. The five E-s of Information (Entropy, Economics, Encryption, Extraction, Emission). Towards a science of information: structure, time, space, semantics, cooperation, etc.
Read: Chapter 3 from Floridi. Paul Nurse: Life, Logic and Information, Nature 454, July 24, 2008.
Slides: Information: Broader Perspectives.
Homework: Reflect upon the materials from the last three classes and do a presentation on the study of "information" from your perspective/discipline of study. Suggest topics and/or ideas for inclusion that will be useful to you.
September 13: Presentations by students on their perspectives. Introduction to Information Theory (MTC).
Watch: Short IBM ad on the Data Deluge (thanks to Paul Ruvolo).
- Week 3 (September 18, 20)
September 18: Defining Shannon Information: Motivations and background. "Figure 1". Information. Entropy. Information Source. Entropy in Thermodynamics.
Read: Chapter 3 from Floridi.
Shannon 1948.
Slides: Introduction to Information Theory, Part 1.
Assignment#1 (Due on Tuesday, September 25): Click here for details.
September 20: Information (I), Entropy (H), Source Coding basics and terminology. Source Coding Theorem (Shannon's First Theorem). Huffman Encoding.
Slides: Introduction to Information Theory, Part 2
Written Work: Compute the Huffman Codes for S = {A, B}, P = {0.75, 0.25}. First, what is H(S)? Next, design Huffman Codes for (1) A, B (2) AA, AB, BA, BB (3) AAA, AAB, ABA, BAA, ABB, BAB, BBA, BBB. For each case, what is the average code length?
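One way to check the hand calculation in the Written Work above is a minimal sketch like the following. It assumes a memoryless source, so the probability of a block is the product of its symbol probabilities. The function names (entropy, huffman_code_lengths, average_bits_per_symbol) are illustrative and not part of the course materials.

```python
import heapq
import itertools
import math


def entropy(probs):
    """Shannon entropy H in bits for a list of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def huffman_code_lengths(probs):
    """Return the Huffman codeword length for each entry of probs."""
    # Heap entries: (weight, unique tiebreak, symbol indices in this subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    tiebreak = len(probs)
    while len(heap) > 1:
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        for s in s1 + s2:
            lengths[s] += 1
        heapq.heappush(heap, (w1 + w2, tiebreak, s1 + s2))
        tiebreak += 1
    return lengths


def average_bits_per_symbol(p_a, block_size):
    """Average Huffman code length per source symbol when coding blocks."""
    p = {"A": p_a, "B": 1 - p_a}
    blocks = ["".join(t) for t in itertools.product("AB", repeat=block_size)]
    # Assumption: symbols are independent, so block probabilities multiply.
    probs = [math.prod(p[c] for c in block) for block in blocks]
    lengths = huffman_code_lengths(probs)
    return sum(pr * n for pr, n in zip(probs, lengths)) / block_size


if __name__ == "__main__":
    print("H(S) =", entropy([0.75, 0.25]))
    for n in (1, 2, 3):
        print(n, average_bits_per_symbol(0.75, n))
```

By the Source Coding Theorem, the per-symbol average for blocks of length n can never fall below H(S) and stays within 1/n bit of it, which is a useful sanity check on the printed values.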
- Week 4 (September 25, 27)
September 25: Huffman Codes, contd. Lossless compression of English Text. Zipf's Law. Lempel-Ziv encoding.
Slides: Introduction to Information Theory, Part 3
Read: A Tale of Many Cities, Edward L. Glaeser, New York Times, April 10, 2010. The Long Tail of Search, Alan Rimm-Kaufman, September 18, 2007. Power Laws, Pareto Distributions and Zipf's Law, MEJ Newman, May 2006. A Universal Algorithm for Sequential Data Compression, Jacob Ziv & Abraham Lempel, IEEE Trans. on Information Theory, May 1977. Compression of Individual Sequences via Variable-Rate Coding, Jacob Ziv & Abraham Lempel, IEEE Trans. on Information Theory, September 1978. Peter Shor's proof on efficiency of LZ.
September 27: Lempel-Ziv Compression Algorithms. Hands-on class exercises. Parameters to keep in mind for implementation. Lossless compression.
Exercises for class: click here.
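As a companion to the Lempel-Ziv sessions above, here is a minimal sketch of dictionary-based compression in the style of the 1978 Ziv & Lempel paper (commonly called LZ78). It is a simplified illustration, not the class exercise: it emits (dictionary index, next character) pairs and ignores the implementation parameters mentioned above, such as dictionary size limits and bit-level packing of the output. The function names are illustrative.

```python
def lz78_encode(text):
    """Encode text as a list of (phrase index, next character) pairs."""
    dictionary = {"": 0}  # phrase -> index; index 0 is the empty phrase
    output = []
    phrase = ""
    for ch in text:
        if phrase + ch in dictionary:
            # Keep extending the current phrase while it is still known.
            phrase += ch
        else:
            # Emit the longest known prefix plus the new character,
            # then record the extended phrase as a new dictionary entry.
            output.append((dictionary[phrase], ch))
            dictionary[phrase + ch] = len(dictionary)
            phrase = ""
    if phrase:
        # The input ended inside a known phrase: emit it as prefix + last char.
        output.append((dictionary[phrase[:-1]], phrase[-1]))
    return output


def lz78_decode(pairs):
    """Rebuild the original text from (phrase index, next character) pairs."""
    phrases = [""]
    pieces = []
    for index, ch in pairs:
        phrase = phrases[index] + ch
        phrases.append(phrase)
        pieces.append(phrase)
    return "".join(pieces)


if __name__ == "__main__":
    pairs = lz78_encode("ABABABA")
    print(pairs)
    print(lz78_decode(pairs))
```

The decoder rebuilds the same dictionary as the encoder while it reads the pairs, so no dictionary needs to be transmitted. Compression is lossless: decoding returns the input exactly.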
- Week 5 (October 2, 4)
October 2: Channels, noise, discrete channels, conditional and joint entropy, mutual information, Shannon's Second Theorem, Error correcting codes.
Read: Data Compression: Something for Nothing, Error-Correcting Codes: Mistakes That Fix Themselves, from John MacCormick's 9 Algorithms That Changed the Future.
Slides: Introduction to MTC, Part 4.
October 4: No class today as most folks are at Grace Hopper Conference.
Read: Chapter 6 (Biological Information) from Floridi.
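The October 2 topics (joint entropy, mutual information, noisy discrete channels) can be illustrated with a small sketch. It computes the mutual information I(X;Y) from a joint probability table and applies it to a binary symmetric channel, chosen here only as a standard textbook example. The function names and the flip probability are assumptions for illustration.

```python
import math


def mutual_information(joint):
    """I(X;Y) in bits, where joint[i][j] = P(X = i, Y = j)."""
    px = [sum(row) for row in joint]          # marginal of X (row sums)
    py = [sum(col) for col in zip(*joint)]    # marginal of Y (column sums)
    return sum(
        pxy * math.log2(pxy / (px[i] * py[j]))
        for i, row in enumerate(joint)
        for j, pxy in enumerate(row)
        if pxy > 0
    )


def bsc_joint(flip_prob, p0=0.5):
    """Joint distribution of input and output bits for a binary symmetric
    channel that flips each bit with probability flip_prob; p0 = P(X = 0)."""
    return [
        [p0 * (1 - flip_prob), p0 * flip_prob],
        [(1 - p0) * flip_prob, (1 - p0) * (1 - flip_prob)],
    ]


if __name__ == "__main__":
    for f in (0.0, 0.1, 0.5):
        print(f, mutual_information(bsc_joint(f)))
```

With equally likely input bits, a noiseless channel (flip probability 0) carries 1 bit per use, and a channel that flips bits half the time carries none, since the output is then independent of the input.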
- Week 6 (October 9, 11)
October 9: Biological Information: Introduction to Molecular Biology.
Slides: Introduction to Molecular Biology, Part 1.
Read:
Chapter 6 (Biological Information) from Floridi.
The original Watson & Crick paper (1953), Crick's Central Dogma article (1970)
October 10: Special Event (all students should attend): FREE Screening of film: Alan Turing: Codebreaker from 7p at Bryn Mawr Film Institute. This special, pre-release screening will be presented by Patrick Sammon, Executive Producer and Creator. For more information, visit: TuringFilm.com
October 11: Discussion on CODEBREAKER. Term Project ideas. Biological Information: Protein Synthesis, an overview of Bioinformatics and its challenges.
Read: Cell Communication: The Inside Story, by John D. Scott and Tony Pawson, Scientific American, June 2000. Available here. Also, Foundations for the Design and Implementation of Synthetic Genetic Circuits, by Adrian L. Slusarczyk, Allen Lin, and Ron Weiss, Nature Reviews Genetics, June 2012. Available here.
Slides: Introduction to Molecular Biology, Part 2.
Assignment#2 (Due on Thursday, October 25): Click here for details. Files you will need: blosum80.txt, coding.txt, cytochrome_c-10.txt, cytochrome_c.txt, gencode.py, genetics.py
- Week 7 (October 16, 18)
No classes, Fall Break!!
- Week 9 (October 23, 25)
October 23: Biological Information: Signalling Pathways, Synthetic Biology. A Special Guest Lecture by Professor Karen Greif (Department of Biology).
Slides: Biological Information Pathways
Read: Cell Communication: The Inside Story, by John D. Scott and Tony Pawson, Scientific American, June 2000. Available here. Also, Foundations for the Design and Implementation of Synthetic Genetic Circuits, by Adrian L. Slusarczyk, Allen Lin, and Ron Weiss, Nature Reviews Genetics, June 2012. Available here.
October 25: Biological Information: Neuroscience. Nervous system, neurons, action potentials, spikes, information coding in spikes.
Slides: Biological Information: Neuroscience
Read: Mind Goggling, The Economist, October 29, 2011.
Paper: Reconstructing Brain Activity Evoked by Natural Movies by Nishimoto et al., Current Biology 21, 1641-1646, October 11, 2011.
October 26: CANCELLED!! Special Event (all students should attend!): James Gleick, author of the book Information: A History, A Theory, A Flood will give an invited lecture. CANCELLED!!
James Gleick and his publisher agents reneged on their booking with us at the last minute. Our apologies. We are trying to find another speaker.
October 26: Science of Information Meets the Liberal Arts. A Talk by Prof. Sanjeev Kulkarni, Princeton University. Dalton Hall Room 300 from 4:30-5:30p.
- Week 10 (October 30, November 1)
October 30: Class cancelled due to Hurricane Sandy. The college is closed.
November 1: Guest Lecture: Paul Ruvolo on neuroscience of the visual cortex.
Slides: Neuroscience of the Visual Cortex by Paul Ruvolo.
- Week 11 (November 6, 8)
November 6: Information retrieval: Search engines, indexing, inverted index/files. Ranking and relevance of search results.
Slides: Information Retrieval, Part 1
November 8: Information retrieval: Search engines, indexing, inverted index/files. Ranking and relevance of search results.
Field trip to NSA moved to next week (see below).
Slides: Information Retrieval, Part 2
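The inverted index covered in these two sessions can be sketched in a few lines. The following is a minimal illustration, not a search engine: it maps each term to the set of documents containing it and answers queries that require all terms to be present, with no ranking or relevance scoring. The tokenizer, the function names, and the sample documents are assumptions made up for the example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each term to the set of document ids whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search_all(index, query):
    """Return ids of documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


if __name__ == "__main__":
    # Hypothetical sample documents, for illustration only.
    docs = {
        1: "Shannon entropy measures information",
        2: "Huffman codes compress information",
        3: "Search engines index documents",
    }
    index = build_inverted_index(docs)
    print(search_all(index, "information entropy"))
    print(search_all(index, "information"))
```

The index is built once, so a query only touches the posting sets for its own terms instead of scanning every document. Ranking would add a score per document on top of this lookup.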
- Week 12 (November 13, 15)
November 13: Information Retrieval: Question Answering Systems. Siri, Watson. Architecture of Watson. Big Data. Microsoft Research's demo of real-time speech translation (English->Mandarin).
Slides: Information Retrieval Part 3
Read: Building Watson, Ferrucci et al, AI Magazine, 2010.
November 15: Field Trip: National Security Agency's National Cryptologic Museum in Columbia, MD. A chartered bus will depart from Pembroke Arch at 9a (return by 4p).
- Week 13 (November 20, 22)
November 20: Big Data, Data Science, Information Visualization: the visualization process, mapping numbers, time series, stacked graphs, heat maps, proportional symbols, spirals, word clouds, geographical maps, choropleth maps, etc.
Slides: Information Visualization Part 1.
November 22: Happy Thanksgiving!!
- Week 14 (November 27, 29)
November 27: Pictorial Presentation of Information: Pros & Cons: A Guest Lecture by Prof. Alan Baker, Department of Philosophy, Swarthmore College.
Slides: Pictorial Presentation of Information: Pros & Cons
November 29: Information Visualization: Advanced visualizations and techniques.
Slides: Information Visualization, Part 2
Read: An Information Theoretic Framework for Visualization (Chen & Janicke 2010),
also An Analysis of Information in Visualization (Chen & Floridi 2012),
Temporal Visualization of Boundary-based Geo-information Using Radial Projection (Drocourt et al. 2011) - The Glacier Paper.
- Week 15 (December 4, 6)
December 4: Deepak is out of town at a meeting. No class.
December 6: Term Project presentations:
1. Meagan Neal: Fourier Transform
2. Natalie Kato: Data Visualization: Word Clouds
3. Sophia Berlin: TBA
- Week 16 (December 10, 12)
December 10: Term Project presentations:
1. Cristina Cabrera: AES Algorithm
2. Sara Fielder: Amino Acid Analysis
3. Daisy Sheng: Computational Investing
December 12: Term Project presentations:
1. Caitlyn Clabaugh: MP3 Encoding
2. Peiying Wen: Recommender Systems
3. Jacy Li: Information Visualization: Networks
Grading
All graded work will receive one of the following grades: 4.0, 3.7, 3.3, 3.0, 2.7, 2.3, 2.0, 1.7,
1.3, 1.0, or 0.0. At the end of the semester, final grades will be calculated
as a weighted average of all grades according to the following weights:
Labs & Written Work: 100%
Total: 100%
Links
Created by dkumar at cs dot brynmawr dot edu on
August 22, 2012.