Schedule for 16:194:614

This schedule is subject to change.

(Legend for the readings: MBK = Meadow, Boyce and Kraft, vR = van Rijsbergen, RB = Rik Belew, SJW = Sparck-Jones and Willett)

- Week - - Topics / Activities - - Students' responsibilities (during and/or after class) -
Textbook IR: general topics

* 1 *

Fri,
Sep 02

Slides in HTML and PDF

Introduction and overview of the course.

Get familiar with the course website. Set up your course website on scils.
Send me an email with your details (use the students.xml template).

Play with Model.xls to solve the homework. If you need to, do some Excel practice.

* 2 *

Fri,
Sep 09

Slides in HTML and PDF

Introduction to IR. Information vs data retrieval.

What do we want from IR? Introduction to evaluation.

 

* 3 *

Fri,
Sep 16

Slides in HTML and PDF

IR concepts. Aboutness. Relevance.

Rationalist vs. empiricist approaches (AI vs. Stats)

Design decisions for IRS; automatic vs. manual/intellectual systems.

 

* 4 *

Fri,
Sep 23

Slides in HTML and PDF

Indexing.

Document and query representation. Manual vs. automatic indexing.

Look at an example of a document collection, a stopword list, an indexed collection, and an inverted file.
Formulate a few Boolean queries and work out the result of each Boolean search.

* 5 *

Fri,
Sep 30

Slides in HTML and PDF


Automatic indexing. Lexical analysis. Weighting. Data structures.

Lab work.

Homework (to be graded).

* 6 *

Fri,
Oct 07

Slides in HTML and PDF

Models of IR.

Interaction models. Indexing models. Language models. Topic models. User models.

Relevance estimation models.
Vector space model. Probabilistic model. Language models.

 

WebClusterLite lab work. WebClusterLite homework.

* 7 *

Fri,
Oct 14

Slides in HTML and PDF

Models of IR.

Information Retrieval as interaction.

Evaluation of interactive systems.

Lemur lab work. Lemur homework.

(Old lab work. Old homework.)

* 8 *

Fri,
Oct 21

Slides in HTML and PDF

Evaluation of IR systems.

Lab work / homework.

* 9 *

Fri,
Oct 28

Greater Philadelphia DB/IR Day
(Register in advance!)

 

* 10 *

Fri,
Nov 04

Evaluation of IR systems.

Measures of performance.

 

* 11 *

Fri,
Nov 11

Slides in HTML and PDF

Introduction to Statistics and Hypothesis Testing.

Practical evaluation.

Lab work / homework.

Advanced IR: current research topics

* 12 *

Fri,
Nov 18

INEX project work.

Optional homework.

* 13 *

Fri,
Nov 25

Thanksgiving, no class.  

* 14 *

Fri,
Dec 02

Slides in HTML and PDF

User interfaces and information visualization for IR. Part I: Interaction models. Part II: Tools and techniques.

Mediated IR (HTML and PDF).

AI and IR.

Machine learning and data mining for IR.

(Also see a tutorial on ML.)

* 15 *

Fri,
Dec 09

Topic modeling.

Web IR.

Structure. Clustering vs. classification.

Informetrics and IR.

The Semantic Web.

See Ravi Kumar's tutorial on Internet Search.

* 16 *

Fri,
Dec 16

Slides in HTML and PDF

A review of ranking models.

Cross-language IR.

Natural language processing for IR.

Collaborative and recommender systems.

Personalization and user modeling.

Implicit vs. explicit feedback.

Document summarization.

Information extraction.

Multimedia IR (image, video, music, ...).

IR for structured documents. INEX.

See Lavrenko's tutorial on Language Models.