TREC 2003

Web track - Interactive subtrack

Rutgers - Interactive Information Retrieval Group

This is an unofficial webpage maintained by Gheorghe Muresan. It simply contains links to guidelines and resources relevant to the TREC Interactive experiment, so that the people in the Rutgers group don't need to scan their email folders and their Web bookmarks looking for information.

Web track 2003

Interactive subtrack 2003

NIST/CSIRO's Panoptic search engine

The official version, used in TREC2003

Panoptic, installed at NIST, has two versions:

There is user help available.

Example of submitting a query programmatically: "http://ir.nist.gov/search/search.cgi?query=wireless+comms&collection=gov-plain". Here's a result sample. More parameters can be added: "http://ir.nist.gov/search/search.cgi?query=wireless+comms&collection=gov-plain&start_rank=61" or "http://ir.nist.gov/search/search.cgi?query=wireless+comms&collection=gov-plain&start_rank=61&num_ranks=20".
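
A minimal sketch of how such query URLs could be assembled in Java. The class and method names are hypothetical; only the base address and the four parameter names (query, collection, start_rank, num_ranks) come from the examples above. It assumes Java 10 or later for the URLEncoder overload that takes a Charset.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the group's software.
class PanopticQueryUrl {
    // Base address taken from the examples above.
    static final String SEARCH_CGI = "http://ir.nist.gov/search/search.cgi";

    /** Builds a search URL; startRank and numRanks are left out when null. */
    static String build(String query, String collection,
                        Integer startRank, Integer numRanks) {
        StringBuilder url = new StringBuilder(SEARCH_CGI);
        // URLEncoder turns spaces into '+', matching "wireless+comms" above.
        url.append("?query=").append(encode(query));
        url.append("&collection=").append(encode(collection));
        if (startRank != null) url.append("&start_rank=").append(startRank);
        if (numRanks != null) url.append("&num_ranks=").append(numRanks);
        return url.toString();
    }

    static String encode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(build("wireless comms", "gov-plain", 61, 20));
    }
}
```

Passing null for the optional arguments reproduces the shorter forms of the URL.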

Example of getting a document through Panoptic, based on its id: "http://ir.nist.gov/search/gov.cgi?id=G01-01-0000000".
In order to avoid getting onto the live Web, a webpage should be obtained through Panoptic, rather than directly: "http://ir.nist.gov/search/gov.cgi?url=http://trec.nist.gov/".
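
The two document requests above could be wrapped as follows. This is only a sketch: the class and method names are made up, and the page URL is appended unencoded because that is how the example above passes it. Whether gov.cgi also accepts a percent-encoded url value is not something we have checked.

```java
// Hypothetical helper for building document requests that go through Panoptic.
class PanopticDocUrl {
    // Base address taken from the examples above.
    static final String GOV_CGI = "http://ir.nist.gov/search/gov.cgi";

    /** Request for a document by its id, e.g. "G01-01-0000000". */
    static String byId(String docId) {
        return GOV_CGI + "?id=" + docId;
    }

    /** Request for a page by its original URL, so the live Web is never hit. */
    static String byUrl(String pageUrl) {
        // Appended as-is, mirroring the example above (an assumption).
        return GOV_CGI + "?url=" + pageUrl;
    }

    public static void main(String[] args) {
        System.out.println(byId("G01-01-0000000"));
        System.out.println(byUrl("http://trec.nist.gov/"));
    }
}
```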


If the results of the Panoptic search engine are needed, but not the user interface, XML output is preferred (the HTML displayed by their user interface is derived from the XML output of the search engine):

Example of use:

Last year's Panoptic (for testing before the new one becomes available)

Example of submitting a query in order to get XML output: "trec.panopticsearch.com/gov/padre-sw_xml.cgi?collection=gov&query=bush". A PADRE_result_packet is returned. Here's a sample.
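
Before processing a response it may be worth checking that it really is a result packet. The sketch below parses an XML string with the standard Java DOM parser and looks at the root element. It assumes that the root element is literally named PADRE_result_packet; the names of the elements inside the packet are not known here, so nothing else is inspected. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical sanity check on the XML returned by the search engine.
class PadrePacketCheck {
    /** Returns the name of the root element of the given XML text. */
    static String rootElementName(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getNodeName();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Assumption: the packet's root element is named PADRE_result_packet. */
    static boolean isResultPacket(String xml) {
        return "PADRE_result_packet".equals(rootElementName(xml));
    }
}
```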

Rutgers group

Responsibilities


Gheorghe: Coordinating the Interactive track (sub-)group
Michael & Gheorghe: Software
Morris, Yuelin & Hyuk-Jin: Instruments
Morris: Scheduling experiments
Nick: Communication with track; collaborate with other groups.

Work schedule

The schedule which we hope to follow is:

August 31 - submit results (hard deadline)
Conduct experiments
June 30 - topics distributed to sites (hard deadline)
Make changes according to pilot results
June 27 - Pilot testing completed
Pilot testing
June 20 - Initial versions of software and data collection instruments

Schedule of experiments for the final week:

Tuesday Gheorghe (Elisa) 11am, Morris 2:30pm, Yuelin 5pm
Wednesday (Xiangmin) 11am, Yuelin 2:30pm, Hyuk-Jin (Giyeong) 5pm
Thursday Gheorghe (Ulla) 9:30am, Morris 5pm
Friday  
Saturday Yuelin 2:30pm
Sunday  

 

Experimental design issues

We decided upon "Option 3"; that is, running 16 subjects in a within-subject design, comparing the linear with the hierarchically-structured interface, both using the saved documents function. If we have time and enough subjects, we will repeat the design, but replacing the saved documents function with bookmarking.

Searcher   First condition     Second condition
S1         Syst 1: 1 2 3 4     Syst 2: 5 6 7 8
S2         Syst 1: 5 6 7 8     Syst 2: 1 2 3 4
S3         Syst 2: 1 2 3 4     Syst 1: 5 6 7 8
S4         Syst 2: 5 6 7 8     Syst 1: 1 2 3 4
S5         Syst 1: 4 3 2 1     Syst 2: 8 7 6 5
S6         Syst 1: 8 7 6 5     Syst 2: 4 3 2 1
S7         Syst 2: 4 3 2 1     Syst 1: 8 7 6 5
S8         Syst 2: 8 7 6 5     Syst 1: 4 3 2 1
S9         Syst 1: 3 1 4 2     Syst 2: 7 5 8 6
S10        Syst 1: 7 5 8 6     Syst 2: 3 1 4 2
S11        Syst 2: 3 1 4 2     Syst 1: 7 5 8 6
S12        Syst 2: 7 5 8 6     Syst 1: 3 1 4 2
S13        Syst 1: 2 4 1 3     Syst 2: 6 8 5 7
S14        Syst 1: 6 8 5 7     Syst 2: 2 4 1 3
S15        Syst 2: 2 4 1 3     Syst 1: 6 8 5 7
S16        Syst 2: 6 8 5 7     Syst 1: 2 4 1 3

Syst1 is the baseline - the search results are displayed linearly.

Syst2 is the experimental system - the search results are displayed in a hierarchic structure, based on their URL.
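
The assignment of systems and topics to the 16 searchers follows a regular pattern: four orderings of topics 1-4 (with topics 5-8 in the same order, shifted by 4), each crossed with the four combinations of which system and which topic block come first. The sketch below regenerates the rows from that pattern, which can be handy for checking what a given subject id should see. The class name and the output layout are made up for illustration.

```java
// Hypothetical generator for the counterbalanced design described above.
class DesignTable {
    // The four topic orderings that appear in the design.
    static final int[][] ORDERS = {
        {1, 2, 3, 4}, {4, 3, 2, 1}, {3, 1, 4, 2}, {2, 4, 1, 3}
    };

    /** Returns the row for searcher 1..16, first condition then second. */
    static String row(int searcher) {
        int i = searcher - 1;
        int[] order = ORDERS[i / 4];            // one ordering per group of 4
        int k = i % 4;
        int firstSystem = (k < 2) ? 1 : 2;      // which system is used first
        boolean lowBlockFirst = (k % 2 == 0);   // true: topics 1-4 come first
        String first = condition(firstSystem, order, lowBlockFirst ? 0 : 4);
        String second = condition(3 - firstSystem, order, lowBlockFirst ? 4 : 0);
        return "S" + searcher + ": " + first + " | " + second;
    }

    /** Formats one condition, shifting the topic numbers by the offset. */
    static String condition(int system, int[] order, int offset) {
        StringBuilder sb = new StringBuilder("Syst " + system + ":");
        for (int topic : order) sb.append(' ').append(topic + offset);
        return sb.toString();
    }

    public static void main(String[] args) {
        for (int s = 1; s <= 16; s++) System.out.println(row(s));
    }
}
```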

Current action list

Everyone in the group: try to find volunteers and conduct another few searches.

Gheorghe: Write software for analyzing the logs and insert the relevant columns in the search.sav SPSS file; submit results to NIST.

Morris, Yuelin, Hyuk-Jin: fill in the SPSS files with data from questionnaires.

One or two volunteers needed to conduct statistical analysis using SPSS.

Current issues

Running the system for the experiment

All you should have to do is type "java Trec <s>", where <s> is the subject id (1..16), in the trec2003int/gui folder on goumang.

However, due to some bugs the system may run out of memory and get slower or freeze during the experiment. Therefore, the recommended steps are:

  1. Reboot goumang.
  2. Login as 'mongrel' and make sure that the "Window style manager" in the CDE has the field "Raise window when made active" OFF, so that dialog messages don't disappear behind the main window.
  3. Go to trec2003int/gui and type "java Trec <s>".
  4. During the demo tell the subject that this is an experimental system that requires a bit of patience. Show that you should wait for the outcome of an action (search, loading of a hit, etc.) before clicking to request the next action.
  5. After finishing the first 4 searches, with the first interface, close the second interface (or kill the system with Ctrl-C in the terminal window) while the subject is busy filling in questionnaires. Restart the system and close the first interface; continue the experiment with the second interface. There should be no significant interruption perceived by the subject.

If the system does freeze

Try the following list (attempt the first items in the list first):

  1. Try to get around:
  2. If nothing seems to work, try to end that task (especially if the subject has searched for close to 10 mins). If both "Start next task" and "End current task" buttons are disabled, try to use the right button of the mouse in the TaskPanel to get access to those actions.
  3. If that doesn't work, try to close the interface by double-clicking the top-left corner or by pressing Alt-F4.
  4. If that doesn't work, kill the application with Ctrl-C at the terminal window.
  5. If that doesn't work, open another terminal window or logon from another machine, check the process id with "ps -Af | grep java" and kill it with "kill -9 <id>".
  6. If that doesn't work (I'd be really surprised), reboot with Stop-A and "boot".

Note. If you need to continue the experiments after a crash with system 2, you don't need to start and stop all the tasks on system 1. Just re-start the application and close the first interface.

After the experiment

Don't put the file into the filing cabinet immediately. While the experiment is fresh in your mind, make notes of what was interesting, uncommon, etc. during the experiment. Listen to the audio-tape, if necessary, for the subject's comments; it is unlikely that we will have time to re-listen to the tapes. Send an email around with these observations.

Evaluation

What do we log and measure ?

  1. Time - for each search, to formulate a query, ...
  2. The actual queries (and implicitly query length and number of iterations in each session)
  3. Number of saved docs, viewed docs, seen docs, saved docs that are not from Panoptic
  4. Scrolling ?
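
As an illustration of the query measures in item 2, here is a small sketch that derives the number of iterations and the mean query length from the list of queries issued in one session. It assumes query length is counted in whitespace-separated terms, which is only one possible definition; the class and method names are hypothetical and are not those of the log analysis software.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical per-session query measures.
class QueryStats {
    /** Query length in whitespace-separated terms (an assumption). */
    static int termCount(String query) {
        String q = query.trim();
        return q.isEmpty() ? 0 : q.split("\\s+").length;
    }

    /** Mean query length over a session; 0.0 for a session with no queries. */
    static double meanTermCount(List<String> queries) {
        if (queries.isEmpty()) return 0.0;
        int total = 0;
        for (String q : queries) total += termCount(q);
        return (double) total / queries.size();
    }

    public static void main(String[] args) {
        List<String> session = Arrays.asList("bush", "wireless comms policy");
        // The number of iterations is simply the number of queries issued.
        System.out.println("iterations: " + session.size());
        System.out.println("mean length: " + meanTermCount(session));
    }
}
```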

Old issues

Topics and user guidelines

Which are the best 8 topics out of the proposed 11 ?

How should the topics be re-formulated for the Interactive sub-track and presented to the user ? The current format (title and description) is probably not appropriate.

What instructions should the users get before starting the experiment and with each topic ? Shall we ask for "home pages" or "key resources" ?

How many documents do users have to save for each topic ?

Instruments

Should we have an exit interview or is an exit questionnaire adequate ? What questions do we want to ask ? Do we have the resources to process the interview data ?

Arranging the hits in hierarchic structure

We've decided to show only two levels, so that users don't get confused. It means that documents will be generally grouped based on the .gov domain they belong to.
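
A minimal sketch of the two-level grouping. It uses the host name of each hit's URL as a stand-in for "the .gov domain", and it keeps groups and hits in the order in which they first appear in the ranked list; both choices are assumptions made for illustration, as are the class and method names. Note that URI.create throws an IllegalArgumentException for malformed URLs.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical two-level grouping of ranked hits by host.
class HostGrouping {
    /** Groups URLs by host, preserving first-seen order of hosts and hits. */
    static Map<String, List<String>> groupByHost(List<String> rankedUrls) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String url : rankedUrls) {
            String host = URI.create(url).getHost();
            if (host == null) host = "(unknown host)";
            groups.computeIfAbsent(host, h -> new ArrayList<>()).add(url);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> hits = Arrays.asList(
                "http://trec.nist.gov/a.html",
                "http://www.fcc.gov/x.html",
                "http://trec.nist.gov/b.html");
        groupByHost(hits).forEach((host, urls) ->
                System.out.println(host + " -> " + urls));
    }
}
```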

The ranking is based on the scores provided by the Panoptic search engine:

For more inspiration see Marti Hearst's Cha-Cha system and CSIRO's paper at Interactive TREC 2002.

How many hits in each go ?

For the linear view it makes more sense to go to the next n hits; for the hierarchic view, to add another n hits.

In order to have consistent behaviour, 50 hits are requested from Panoptic, the non-HTML files are filtered out (.ps, .pdf, .doc) and the best 30 of the remaining are displayed. There is no "Next" or "More".
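
The filtering step can be sketched as below. The numbers (50 requested, 30 displayed) and the three extensions come from the paragraph above; the class and method names are hypothetical, and the sketch assumes the file type can be recognised from the end of the URL, ignoring case.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical filter applied to the ranked list returned by Panoptic.
class HitFilter {
    static final int REQUESTED = 50;   // hits requested from Panoptic
    static final int DISPLAYED = 30;   // hits shown to the user
    static final String[] EXCLUDED = { ".ps", ".pdf", ".doc" };

    /** True if the URL ends with one of the excluded extensions. */
    static boolean isExcluded(String url) {
        String lower = url.toLowerCase(Locale.ROOT);
        for (String ext : EXCLUDED) {
            if (lower.endsWith(ext)) return true;
        }
        return false;
    }

    /** Keeps the best DISPLAYED non-excluded hits, in rank order. */
    static List<String> select(List<String> rankedUrls) {
        List<String> kept = new ArrayList<>();
        for (String url : rankedUrls) {
            if (kept.size() == DISPLAYED) break;
            if (!isExcluded(url)) kept.add(url);
        }
        return kept;
    }
}
```

If fewer than 30 hits survive the filter, all of them are returned.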

The format of the results view

We used Java Swing's JList for the linear view and JTree for the hierarchic view, so that the hits are presented in "folder" form.
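
For illustration, the data behind the two views could be prepared as standard Swing models: a DefaultListModel for the JList and a DefaultTreeModel with one "folder" node per group for the JTree. This is a sketch and not the actual code of the system; the class name, method names and the "Results" root label are made up.

```java
import java.util.List;
import java.util.Map;
import javax.swing.DefaultListModel;
import javax.swing.tree.DefaultMutableTreeNode;
import javax.swing.tree.DefaultTreeModel;

// Hypothetical builders for the models behind the two result views.
class ResultModels {
    /** Linear view: one list entry per hit, in rank order. */
    static DefaultListModel<String> linearModel(List<String> hits) {
        DefaultListModel<String> model = new DefaultListModel<>();
        for (String hit : hits) model.addElement(hit);
        return model;
    }

    /** Hierarchic view: a root, one folder per group, hits as leaves. */
    static DefaultTreeModel treeModel(Map<String, List<String>> hitsByGroup) {
        DefaultMutableTreeNode root = new DefaultMutableTreeNode("Results");
        for (Map.Entry<String, List<String>> e : hitsByGroup.entrySet()) {
            DefaultMutableTreeNode folder = new DefaultMutableTreeNode(e.getKey());
            for (String hit : e.getValue()) {
                folder.add(new DefaultMutableTreeNode(hit));
            }
            root.add(folder);
        }
        return new DefaultTreeModel(root);
    }
}
```

The models can then be handed to the components with new JList<>(linearModel(hits)) and new JTree(treeModel(groups)).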

Another possibility would be to use HTML. The linear view will simply display Panoptic output, as in last year's SDD. The hierarchic view will be obtained by processing Panoptic's output and building the hierarchy with layered HTML unordered lists (<UL>).
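
A sketch of what the HTML alternative might look like: the grouped hits are written out as an outer unordered list of groups, each containing an inner list of links. The class name, the grouping map and the exact markup are assumptions for illustration; a real version would work from Panoptic's output rather than from bare URLs.

```java
import java.util.List;
import java.util.Map;

// Hypothetical renderer producing layered <UL> lists from grouped hits.
class HtmlHierarchy {
    /** Renders each group as a list item holding a nested list of links. */
    static String render(Map<String, List<String>> hitsByGroup) {
        StringBuilder html = new StringBuilder("<UL>\n");
        for (Map.Entry<String, List<String>> e : hitsByGroup.entrySet()) {
            html.append("  <LI>").append(escape(e.getKey())).append("\n    <UL>\n");
            for (String url : e.getValue()) {
                String u = escape(url);
                html.append("      <LI><A HREF=\"").append(u).append("\">")
                    .append(u).append("</A></LI>\n");
            }
            html.append("    </UL>\n  </LI>\n");
        }
        return html.append("</UL>\n").toString();
    }

    /** Escapes the characters that are special in HTML text and attributes. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }
}
```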