Abbas, J., Norris, C. & Soloway, E. (2002). Middle school children's
use of the ARTEMIS digital library. Proceedings of the Second ACM/IEEE-CS
Joint Conference on Digital Libraries, Portland, Oregon, pp. 98-105.
In this study, transaction log analysis was employed to evaluate
users' interaction with a genuine digital library. Instances of use
and time spent per instance were the key measures.
Adams, A., & Blandford, A. (2001). Digital libraries in a
clinical setting: friend or foe? ECDL'01; Proceedings of the
5th European Conference on Digital Libraries, 214-224.
To understand the social and organizational impacts of digital
libraries on clinical users, the authors conducted focus group
discussions and in-depth interviews involving 73 hospital clinicians
(nurses, doctors, surgeons, consultants, etc.). The results, based
on grounded theory, indicate that users' perceived information
needs, dissemination processes, and the impact of newly introduced
technology are associated with organizational, social and political
structures. In addition, organizational hierarchies, technological
misconceptions, and technology and information accessibility impeded
the use of digital libraries.
Bainbridge, D., Dewsnip, M., & Witten, I.H. (In press).
Searching digital music libraries. Information Processing and
Management, Retrieved on September 24, 2004 from http://www.sciencedirect.com
To compare the performance of three different melody retrieval
models (dynamic, state-based, and n-gram-based matching) and algorithms
devised under these models, the authors conducted a series of
effectiveness and efficiency evaluations employing traditional
IR metrics, including precision, recall, number of relevant items
retrieved, and query processing time. The results show that the
state-based retrieval model offers a good balance of efficiency and
effectiveness, while n-gram matching is more efficient. Nevertheless,
the hybrid approach (3-grams followed by state-based matching) has "the best
overall combination of efficiency and effectiveness." Additionally,
using both the pitch and rhythm information of melodies yields
better performance than using either alone. Whereas
the research findings are suggestive for music digital library design,
the sole reliance on topical relevance judgments and the exclusion of real
users might restrict their applicability.
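As an illustration of the traditional IR metrics named in this annotation, the following minimal sketch computes precision and recall for a single query. The function name and the melody identifiers are hypothetical and are not taken from the article.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant items retrieved / all items retrieved
    recall    = relevant items retrieved / all relevant items
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical melody identifiers, for illustration only.
retrieved = ["m1", "m2", "m3", "m4"]
relevant = ["m2", "m4", "m7"]
p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")
```

Here two of the four retrieved items are relevant, and two of the three relevant items were found.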
Baldonado, M.Q.W. (2000). A user-centered interface for information
exploration in a heterogeneous digital library. Journal of the American
Society for Information Science, 51(3): 297-310.
The research series on the evaluation of SenseMaker at Stanford University
has essentially three objectives: comparing three result interfaces,
evaluating an iterative cluster interface, and examining the fluidity
between search and browse (structure-based searching & filtering).
A number of criteria have been employed, such as time, error rate,
perceived speed, usefulness and usability.
Bergmark, D., Lagoze, C., & Sbityakov, A. (2002). Focused
crawls, tunneling, and digital libraries. Research and Advanced
Technology for Digital Libraries: Proceedings of the 6th European
Conference, ECDL'02, September 16-18, 2002, Paris, France. 91-106.
Aiming to develop large digital library collections, the
authors propose a hybrid approach of focused crawls (best-first)
and tunneling (prioritizing a given page based on a value other than
its relevance score). To test the effectiveness and efficiency of
this approach, the authors ran an automatic harvest on 500,000
unique documents downloaded from the Web and conducted statistical
analysis on the observed data. Three constructs are proposed to measure
effectiveness and efficiency: nugget (a Web document
whose cosine correlation with at least one of the collection centroids
is higher than some given threshold), dud (a Web document that
does not match any of the centroids very highly), and path length
(2 minus the number of duds in the path <the sequence of pages
and links going from one nugget to the next>). The results
show that the expanded tunneling concept can achieve highly efficient
and effective focused crawling. It should be noted that the harvest
was done off the real Web.
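The nugget/dud distinction described in this annotation can be sketched as a cosine comparison between a document vector and each collection centroid. This is only an illustrative sketch: the term-weight vectors, the threshold of 0.5, and the function names are assumptions, not values or code from the paper.

```python
import math


def cosine(u, v):
    """Cosine correlation between two equal-length term-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def classify(doc_vec, centroids, threshold=0.5):
    """Label a document a nugget if it correlates with at least one
    centroid above the threshold; otherwise label it a dud.
    The threshold value here is an assumed placeholder."""
    best = max(cosine(doc_vec, c) for c in centroids)
    return "nugget" if best > threshold else "dud"


# Hypothetical centroids and documents, for illustration only.
centroids = [[3, 1, 0, 0], [0, 0, 2, 2]]
print(classify([2, 1, 0, 0], centroids))
print(classify([0, 5, 0, 0], centroids))
```

The first document lies close to the first centroid, while the second document correlates only weakly with either centroid.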
Bertot, J.C., & McClure, C.R. (2003). Outcome assessment in the
networked environment: research questions, issues, considerations,
and moving forward. Library Trends, 51(4): 590-613.
This article identifies a number of research questions related
broadly to library outcomes assessment in a networked environment
and discusses issues affecting these research topics. It also
proposes a framework to relate traditional evaluation components
and terminology to the networked environment and identifies a
number of factors in the networked environment that affect outcomes
and other assessment methods. Meanwhile, a multi-dimensional library
service outcome assessment model is outlined.
Bishop, A. P. (2002). Measuring access, use and success in digital
libraries. Retrieved April 18, 2003, from http://www.press.umich.edu/jep/04-02/bishop.html.
This paper describes and evaluates the accessibility of DeLIver,
a testbed, and then discusses how to remedy the access
barriers. Convenience and ease of use are viewed as especially important
factors in this paper. The metrics for the evaluation include the
number of registered system users, the number of hits logged on
a digital library's home page, the number of documents viewed or
printed, the degree of penetration, and so on. Faculty and graduate
students are the major audiences of DeLIver. Focus groups, interviews,
and observation are the major methods for this evaluation. The paper
suggests that both subjective and objective access factors greatly
influence use. The results also indicate that some subjective factors
contribute more to encouraging access than some objective
factors.
Bishop, A.P. (1998). Measuring access, use, and success in digital
libraries. The Journal of Electronic Publishing, v.3 (December
1998), Retrieved on September 25 from http://www.press.umich.edu/jep
The paper outlines the evaluation efforts during the implementation
of DeLIver (a testbed collection of journal articles) across the
University of Illinois campus. The author highlights the significance
of measuring and interpreting digital library use. From
the design and evaluation, the author found that access barriers
do influence system use. Whereas accessibility can be measured
by both subjective (e.g. initial expectation of convenience, system
awareness) and objective (e.g. difficulty in accessing the system)
criteria, system use can be assessed by adapting the criteria
for evaluating the use of a physical library (e.g. library use,
material use, material access, degree of penetration). Accordingly,
a combination of different methods (e.g. log analysis, interviews)
and benchmark evaluation measures is highly recommended.
Bishop, A.P. (1999). Making digital libraries go: comparing
use across genres. Proceedings of the Fifth ACM Conference on
Digital Libraries, 94-103.
The author combined multiple research methods (e.g. in-depth
interviews, surveys, focus groups, log analysis, etc.) to compare
the information use of two different groups of users (i.e. academic
and low-income communities). She concluded that digital library
use is an assemblage activity associated with social practice,
beliefs and goals, community norms, knowledge, technology access
and proficiency, and resource constraints, and the interplay between
them. New users need not only to learn how to use IR system functions,
but also to figure out how to integrate the system into their daily
lives.
Bishop, A.P., Neumann, L.J., Star, S.L., & Merke, C. et al.
(2000). Digital libraries: situating use in changing information
infrastructure. Journal of the American Society for Information Science,
51 (4): 394-413.
Aiming to examine "how potential users approach new systems"
(DeLIver, a DLI project at Univ. of Illinois), the authors employed
various research methods, such as focus groups, surveys, interviews,
log analysis, user registration, and lab usability tests. Several
criteria, such as use, satisfaction, information convergence, and
access barriers, were used to achieve the research goal. Their
usage statistics fell within the same
range as those generated by similar full-text journal systems.
Additionally, seemingly insignificant barriers (e.g. trivial technical problems)
"became magnified in the effect of use". Moreover, the
research findings suggest that it is essential to understand
the different processes of searching and use germane to different
information worlds.
Bishop, A.P., Van House, N.A., & Buttenfield, B.P. (2003).
Digital Library Use: Social Practice in Design and Evaluation.
Cambridge, Massachusetts: The MIT Press.
The book provides rich and in-depth arguments and evidence about
the digital library as a socio-technical system, which is composed
of technology, information, carriers of information, people, and
their practice. "It is about digital libraries' interaction
with the larger world of work, institutions, knowledge, and society,
as well as with the production of knowledge." (p.1) Accordingly,
another theme of the book is to perform "technically informed
social analysis" for DL design and evaluation. The book contains
three parts and twelve chapters by a group of DL activists.
Blandford, A. & Buchanan, G. (2002). Workshop report: Usability
of Digital Libraries @ JCDL'02, ACM SIGIR Forum, 36(2): 83-89.
The authors summarize usability-related issues from the Usability of Digital Libraries
workshop @ JCDL'02. The usability test findings reported show: (1) a lack of clarity
about what usability is, although there was essential agreement
that the term includes many aspects; (2) "immature understanding
of what techniques are appropriate for addressing particular aspects
of design and evaluation" (p.85), although a number of
research techniques were proposed; (3) difficulty in identifying potential
users and their tasks; (4) little understanding of users'
assumptions about and familiarity with DL content, interface, process
and features; (5) divergence in the main concerns of different groups
of people: whereas users are more concerned about results, LIS
professionals emphasize the procedures; (6) users' limited
understanding of the role and purpose of different metadata fields.
Other findings demonstrate that users don't use systems as expected
and usually have poor information handling skills.
Blandford, A., Keith, S., Connell, I., & Edwards, H. (2004).
Analytical usability evaluation for digital libraries: A case study.
Proceedings of the 2004 Joint ACM/IEEE Conference on Digital
Libraries, pp.27-36.
Having argued for the necessity and feasibility of analytical
usability evaluation (conducted by experts using established theories
and methods) as a complement to the empirical approach,
the authors compare four different analytical approaches, namely
Heuristic Evaluation (HE), Cognitive Walkthrough (CW), Claims
Analysis (CA), and Concept-based Analysis of Surface and Structural
Misfits (CASSM). For each of the four approaches, they demonstrate
its pros and cons by conducting a case analysis using a single
Web site. The main comparison demonstrates that HE and CW can only
generate superficial data and are inadequate for the DL domain, whereas
the other two can help in identifying "conceptual difficulties"
and are more useful in DL contexts. In particular, the proposed
CASSM approach is promising in terms of integrating empirical
data with usability principles. Nevertheless, none of the approaches
fit "seamlessly with existing digital library development
practices."
Blixrud, J.C. (2002). Measures for electronic use: the ARL E-Metrics
project. Retrieved on April 12, 2004 from http://www.lboro.ac.uk/departments/dis/lisu/Blixrud.pdf
In order to determine whether the money spent on subscribing to electronic
resources is a worthwhile investment, 24 ARL members self-funded this two-year
project to "develop measures for describing the resources,
expenditures, and usage of electronic resources." The project
has three phases, namely initial (inventory of current practices
at ARL libraries as to statistics, measures, processes, and activities
that pertain to networked resources and services), second (identification
and field testing of statistics and measures, and recommendation of
measures; surveys and onsite tests were used in this stage), and
final (identification of linkage to educational outcomes and impacts,
to research, and to technical infrastructure; content analysis
of standards from education commissions).
Blocks, D., Binding, C., & Cunliffe, D. et al. (2002). Qualitative
evaluation of thesaurus-based retrieval. Research and Advanced
Technology for Digital Libraries: Proceedings of the 6th European
Conference, ECDL'02, September 16-18, 2002, Paris, France.
346-361.
The evaluation is part of the FACET project with the collection
of the National Museum of Science and Industry in the UK. The main
thesaurus is the faceted Art & Architecture Thesaurus. To illuminate
problems and inform interface design, the authors conducted this
formative evaluation (log analysis, think-aloud, screen-capture
videotaping, observation notes) to analyze users' interaction with
the interface components (e.g. thesaurus display) of the experimental
system. In particular, the evaluation focused on how the displayed
thesaurus could assist the search process, the formation of faceted
queries, and query reformulation. In total, there were eight participants
in the study: six museum professionals, one IT professional,
and one library professional. The search sessions took place in
the participants' familiar surroundings rather than in a lab. The results
show that "although the prototype interface supports basic
level operations, it does not provide non-expert searchers with
sufficient guidance on query structure and when to use the thesaurus."
Bollen, J. & Luce, R. (2002). Evaluation of digital library
impact and user communities by analysis of usage patterns. D-Lib
Magazine, 8 (6), Retrieved on July 1, 2003 from http://www.dlib.org/
Believing that user preferences and satisfaction tend to
be highly transient and specific, the authors argue for the significance
of quantitative analysis of more implicit, user-community-determined
preferred relationships among documents, derived from server logs. In their
study, document impact was established by using subjective
Journal Consultation Frequency (JCF) rather than the traditional Impact
Factor (IF). Whereas the IF is associated with definite citation,
the JCF is determined by the search patterns of a given user community.
Regrettably, the article does not provide empirical data revealing
how well the correlation between different journals' JCF can be
used as an indicator of the quality of a DL collection and the user
community's preferences.
Borgman, C.L., Leazer, G.H., & Gilliland-Swetland, A. et al.
(2004). How geography professors select materials for classroom
lectures: Implication for the design of digital libraries. Proceedings
of the 2004 Joint ACM/IEEE Conference on Digital Libraries, 179-185.
This is an HIB study aiming to "have close understanding
of process by which faculty search for and use information in
support of their teaching." To meet the research objective, the
authors interviewed nine professors of physical
and human geography. Although the research per se
does not target the DL context, several key findings are suggestive
for DL (e.g. ADEPT) design, in particular the necessity of providing
more features for concept and teaching-purpose searching, creation
and management of personalized DLs, resource sharing, and so forth.
Borgman, C.L. (2002). Challenges in building digital libraries
for the 21st century. Digital libraries: people, knowledge, and
technology: Proceedings of 5th International Conference on Asian
Digital Libraries, ICADL 2002, Singapore, December 11-14, 2002
1-13.
The paper summarizes some challenges facing the transition from
DL research/development to practice. One of the challenges is
the need for evaluation to know what works and in what contexts.
"Appropriate evaluation methods and metrics are requirements
for sustainable digital libraries that have received little attention
until recently." "Evaluation has many aspects and can
address a variety of goals, such as usability, maintainability,
interoperability, scalability, and economic viability." Moreover,
"consistent evaluation methods will enable comparison between
systems and services." (p.8) However, the outlook is pessimistic.
"Despite the advances in digital library technology, we have
insufficient understanding of their utility for most applications,
and we lack appropriate evaluation methods, metrics, and test
beds for determining their effectiveness relative to various benchmarks."
Borgman, C.L., & Gilliland-Swetland, A.J. (2000). Evaluating digital
libraries for teaching and learning in undergraduate education:
a case study of the Alexandria Digital Earth Prototype (ADEPT).
Library Trends, 49 (2): 228-250.
This study employed ethnographic observation of course content
and teaching style to examine genuine users' information needs and behavior.
Borgman, C.L., Leazer, G.H., et al. (2001). Iterative design
and evaluation of a geographic digital library for university students:
a case study of the Alexandria Digital Earth Prototype (ADEPT).
Proceedings of the Fifth European Conference on Research and
Advanced Technology for Digital Libraries, Darmstadt, Germany,
pp. 390-401.
The paper reports the use of multiple methods (e.g. surveys, structured interviews,
classroom observation, laboratory-based usability studies, etc.)
in evaluating the ADEPT digital library. User needs, content, modes
of searching and using resources, usability, transparency, and
diversity/extensibility of metadata were examined.
Borgman, C. L. (2002). Final report to National Science Foundation
computer and information science directorate information and intelligent
systems division: Workshop goals, outcomes, and recommendations.
Fourth DELOS Workshop: Evaluation of digital libraries: Testbed,
Measurement, and Metrics. Retrieved June 1, 2003, from http://www.dli2.nsf.gov/internationalprojects/working_group_reports/evaulation.html.
This report is a summary of the fourth DELOS workshop. Basic
issues, such as the importance and the content of digital library
evaluation, are discussed. U.S. and E.U. research activities
on DL evaluation are summarized. The themes of the workshop are
presented. In the end, the workshop offers recommendations for
future DL evaluation.
Bosman, F.J.M., Bruza, P.D., & van de Weide, Th.P. et al.
(1998). Documentation, cataloging, and query by navigation: A practical
and sound approach. Research and Advanced Technology for Digital
Libraries: Proceedings of the Second European Conference, ECDL'98,
September 21-23, 1998, Heraklion, Crete, Greece. 459-478.
To test how effectively the HyperIndex (an index organized in the
form of hypertext) can help users search a collection of visual
reproductions of art subjects, the authors conducted several experiments
to compare search performance with HyperIndex and ICONCLASS (a
non-hyperlinked but standardized, well-documented classification system).
The participants were asked to find answers to predefined
questions, and their interactions with the system were logged and analyzed.
Effectiveness is measured using the traditional pair of precision
and recall. Additionally, the number of logical decisions (instances
in the log file, such as each class selected and documents viewed)
and shows (views of the current selection) were also used as indicators
of effectiveness. The results show the "advantage of HyperIndex
over ICONCLASS."
Browne, P., Gurrin, C., et al. (2001). Dublin City University
Video Track experiments for TREC 2001
The study compared three video browsers (timeline, slide show, and
hierarchical) using shot boundary detection. Traditional precision/recall, reference
transition, deletion rate, and insertion rate were the key measures.
Budhu, M. & Coleman, A. (2002). The design and evaluation
of interactivities in a digital library. D-Lib Magazine, 8
(11):
The paper summarizes an educational evaluation of learning
effects when users interact with the GROW (Geotechnical (soil), rock
and water engineering) collection at Univ. of Arizona. The concept
of interactivity focuses on structured representations of interactive
multimedia resources. The learning effects were measured by students'
performance on tests in the lab in terms of understanding of the concepts
learned. The evaluation results are encouraging. In addition to
the preliminary evaluation effort, the plan for a further usability
test is outlined as well.
Carter, D., & Janes, J. (2001). Unobtrusive data analysis
of digital reference questions and service at the Internet Public
Library: an exploratory study. Library Trends, 49 (2): 251-265.
The authors conducted a transaction log analysis to examine information
needs, the nature of questions, and the reference service at the Internet Public
Library.
Champeny, L., Borgman, C.L., & Leazer, G.H. et al. (2004).
Developing a digital learning environment: An evaluation of design
and implementation processes. Proceedings of the 2004 Joint ACM/IEEE
Conference on Digital Libraries. 37-46.
This is a case study of designing and implementing ADEPT (Alexandria
Digital Earth Prototype) for undergraduate geography education
in a university setting (University of California, Santa Barbara)
during a period of one academic year. To ensure the implementation
reaches the goal of DLE (digital learning environment) development,
iterative evaluation, including interviews with professors, TAs,
and students, and classroom observation, was conducted with
a focus on flexibility, openness, ease of use, and learning outcomes.
Overall, the evaluation shows "modest improvements"
between the Fall and Spring semesters, primarily with respect to
reliability of access and learning outcomes (tests of graph understanding
and hypothesis generation). However, learning difficulties were
still perceived by students. Furthermore, there are noticeable
functional dichotomies between designers and users, as well as between
teachers and students. For instance, whereas designers showed great
interest in applying advanced technology and expected users to explore
functionality by themselves, users expected to see explanatory
documents and tutorials. Similarly, whereas teachers were "generally
tolerant of system flaws," students were uncomfortable with
their lack of understanding and control of the system.
Choudhury, S., Hobbs, B., & Lorie, M. (2002). A framework
for evaluating digital library services. D-Lib Magazine,
8 (7/8):
The authors applied a multi-attribute, stated-preference technique
to the cost-benefit evaluation of a hypothetical digital library service,
namely Comprehensive Access to Printed Materials (CAPM) at Johns
Hopkins University. The target users (students, faculty, and staff)
were asked to participate in an online survey, which was designed
using the results of focus group discussions. Cost-benefit
was measured by users' preferred pair of service level and price
from 36 possible options. The survey results show the users' hypothetical
willingness to pay for the remotely accessible full-text service
provided by CAPM.
Covey, D.T. (2002). Usage and usability assessment: library
practices and concerns. Washington, D.C.: Digital Library Federation
Council on Library and Information Resources. Retrieved on 4/12/2004
from http://www.clir.org/pubs/reports/pub105/contents.html
The report provides research findings from interviews with 71
individuals at 24 of the 26 DLF (Digital Library Federation) member
institutions (representing an 86 percent response rate at the
24 institutions) conducted from November 2000 through February
2001. A standard set of open-ended questions was used to examine
the DL collection and service evaluation they had done with respect
to methods, results, experiences and lessons, and so forth. Follow-up
questions varied, based on the work being done at the institution;
in effect, the interviews tracked the efforts and experiences
of those being interviewed. In general, focus groups, survey,
user protocol, heuristic usability test, paper prototype/scenario,
as well as card sorting tests are used for user studies and log
analysis for usage study.
Cox, I.J., Miller, M.L., Minka, T.P., Papathomas, T.V. &
Yianilos, P.N. (2000). The Bayesian image retrieval system, PicHunter:
Theory, Implementation and Psychophysical experiments. IEEE Transactions
on Image Processing, 9(1): 20-37.
The authors propose a Bayesian framework for content-based image
retrieval that uses users' similarity judgments to direct
a search. To assess the performance of PicHunter, which employs
the framework, the authors conducted a series of experiments.
The performance is measured by "the average number of images
required to converge to the desired specific target". The
results indicate that users "attend to the semantic content
of images in judging similarity".
Cullen, R. (2003). Evaluating digital libraries in the health
sector. Part 1: measuring inputs and outputs. Health Information
and Libraries Journal, 20(4): 195-204
Cullen, R. (2004). Evaluating digital libraries in the health
sector. Part 2: measuring impacts and outcomes. Health Information
and Libraries Journal, 21(1): 3-13.
Dillon, A. (1999?). Evaluating on time: a framework for the
expert evaluation of digital interface usability. Retrieved on 4/12/2004
from http://www.ischool.utexas.edu/~adillon/publications/evaluating.html
Based on nine years of investigations of human information usage
from an HCI viewpoint, the author proposes the TIME model aiming
to address key human factors for digital library evaluation. "The
intention of the framework is to provide those developing digital
information resources a way to conceptualize the human factors
influencing the usability of the created artifact." The multi-leveled
framework highlights the interplay among the key human factors,
namely Task (reflect user's need and use), Information
model (user's mental representation of the information space),
Manipulation (support physical use of materials), and Ergonomics
(variables influencing the perceptual processing of words and
images).
Dorward, J., Reinke, D. & Recker, M. (2002). An evaluation
model for a digital library service tool. Proceedings of the
2nd ACM/IEEE-CS Joint Conference on Digital Libraries, 322-323.
This paper summarizes a series of educational evaluation activities
for an NSF educational DL project, Instructional Architect (IA),
which enables users to discover, select, reuse, sequence, and
annotate digital learning objects. The highlighted evaluation
strategies are iterative, user-centered, and rapid-prototyping.
The evaluation model focuses on both project process and outcome.
As such, the authors employed multiple research methods at different
implementation stages for various purposes, such as (1) pilot survey
of audience (for needs assessment in terms of teachers' level
of online teaching resource use, and their perception of the utility
of the IA system), (2) expert review of interface (a group of
graduate students and external professionals in instructional
technology critiqued the interface design and content plan
based on a scenario walkthrough), (3) prototype testing (26 pre-service
elementary teachers were asked to locate learning objects for
2 search scenarios from the SMETE Open Federation Digital Library
using the software developed. Observation, post-session focus
groups and post-session surveys were used to collect data on their
perception of utility).
Entlich, R., Garson, L., Lesk, M., & Normore, L. et al.
(1996). Testing a digital library: user response to the CORE project.
Library Hi Tech, 14 (4): 99-118.
The paper focuses on user response (expectation, perception and usage) to CORE
(The Chemical Online Retrieval Experiment), a project aiming to deliver scholarly
journals in electronic form. The authors employed various evaluation techniques
(detailed transaction logs, online questionnaires, online comments,
in-person interviews, and anecdotes) for the user study. The interview
and transaction log results were compared to contrast users'
perceptions with their actual behavior. The results show that
about 47% were repeat users. Other data analyses of
CORE usage include article viewing, printing, reading habits, searching,
and so forth.
Fox, E.A., Hix, D., et al. (1993). Users, user interfaces, and
objects: Envision, a digital library. Journal of the American
Society for Information Science, 44(8): 480-491.
Envision is a digital library project aiming to provide users
with computer science literature. The authors interviewed potential
users, librarians, and computer/information scientists of the
Envision. The collected data was used to come up with nine principles
for digital library development under three categories, namely
representation, architecture, and interfacing. A usability test
illustrated that the application of the nine principles to Envision
implementation was successful.
Fuhr, N., Hansen, P., Mabe, M., Micsik, A., & Solvberg,
I. (2001). Digital libraries: A generic classification and evaluation
scheme. Research and Advanced Technology for Digital Libraries.
Proceedings of the 5th European Conference on Digital Libraries,
ECDL 2001 (Lecture Notes in Computer Science Vol. 2163), Darmstadt,
Germany, pp. 187-199.
This article describes a holistic approach to digital library evaluation.
A scheme is presented, which includes four major dimensions: data/collection,
system/technology, users, and usage. The evaluation criteria and
metrics with respect to the dimensions are addressed. In order to
test this scheme, an online survey was conducted. Only 3%-4% of the
targeted audience responded, and 70% of the respondents were from
the research domain. The results indicate that the proposed classification
scheme seems to be appropriate for DL characterization.
Fuhr, N., Klas, C.P., Schaefer, A. & Mutschke, P. (2002).
DAFFODIL: an integrated desktop for supporting high-level search
activities in federated digital libraries. Research and advanced
technology for digital libraries: Proceedings of 6th European conference,
ECDL'02, Paris, France, September 16-18, 2002, 597-612.
DAFFODIL (Distributed Agents for user-friendly Access of Digital
Libraries) is a federated DL system that offers a rich set of
functions across a heterogeneous set of DLs. To examine how usable
this system is, the authors conducted a heuristic evaluation and
questionnaire interviews. During the heuristic evaluation, perceptions
and problems reported during the participants' interface exploration
were recorded. The findings include irritation due to long waiting
times, uncertainty about the cause of empty results, the need for assistance
in employing new concepts (e.g. author network browser), and so
forth. Meanwhile, the questionnaire interviews on the usability
of DAFFODIL, held after participants had tried it, show the difficulty of interpreting
results and making relevance judgements due to less precise and very large
result sets.
Greenberg, J., Bullard, K.A., & James, M.L. et al. (2002).
Student comprehension of classification applications in a science
education digital library. Research and Advanced Technology for
Digital Libraries: Proceedings of the 6th European Conference, ECDL'02,
September 16-18, 2002, Paris, France. 560-567.
To explore whether children have the same comprehension of
scientific classification in educational digital libraries as
they have in physical libraries, the authors conducted an experiment
to compare sixth-grade students' understanding of the botany classification
scheme in UNC's Plant Information Center (PIC) and in a physical library.
The students were asked to assign a taxonomic name to a given plant,
and to complete a survey with a series of questions about how objects
are grouped in physical and digital environments. Whereas the classification
tasks were successfully completed in both settings, the understanding
of the classification structures in the digital setting seemed
to have diminished compared with physical libraries.
Han, J.W. & Guo, L. (2003). A shape-based image retrieval
method using salient edges. Signal Processing: Image Communication,
18: 141-156.
The authors present a novel five-stage image retrieval approach
based on automatic salient edge detection and similarity matching
among the detected edges. To examine how good the proposed approach
is in terms of effectiveness and efficiency, a preliminary experiment
was conducted to compare the system using the approach with two
other image retrieval models. The results show that the new approach
has the highest accuracy but the longest retrieval time.
Hartland-Fox, B., & Dalton, P. (2003). eVALUEd - an evaluation
model for e-library developments, Ariadne, 31. Retrieved June 1,
2003, from http://www.ariadne.ac.uk/issue31/evalued/.
This article describes the eVALUEd project and discusses issues
in e-library evaluation. These issues include: "what
techniques are being employed, who uses the data collected, how
evaluation can inform decisions, and what evaluation could be conducted
given more time, resources, staffing etc." An online survey was conducted
for this study.
Hauptmann, A., Jin, R., et al. (2001). Video retrieval with
the Informedia Digital Video Library System. TREC'01,
Retrieved on June 1, 2002 from http://trec.nist.gov/pubs/
The study evaluated the effectiveness of information extraction
techniques employed in the Informedia Digital Video Library at Carnegie
Mellon. Average Reciprocal Rank (ARR) and Recall were used to measure
the effectiveness. Meanwhile, a usability test was conducted to evaluate
the interface.
Hee, M., Ik, Y.Y., & Kim, K.C. (1999). Unified video retrieval
system supporting similarity retrieval. Proceedings of Tenth
International Workshop on Database and Expert Systems Applications,
pp.884-888.
The study compared the effectiveness of integrated feature-based
and annotation-based similarity retrieval. The traditional P/R measures were
modified as 'user-defined relevant scenes ∩ retrieved relevant scenes'/'user-defined
relevant scenes' (R) and 'user-defined relevant scenes
∩ retrieved relevant scenes'/'retrieved scenes' (P) respectively.
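The modified measures above can be sketched as a set computation. This is
only an illustration, not the authors' implementation: it assumes that each
scene has a unique identifier and that the numerator is the intersection of
the user-defined relevant scenes with the scenes the system retrieved. The
function name and the scene identifiers are hypothetical.

```python
def scene_precision_recall(relevant_scenes, retrieved_scenes):
    """Return (precision, recall) for one query, treating scenes as set members.

    Assumption: scenes are identified by hashable IDs, and the numerator of
    both measures is the overlap between the two sets.
    """
    relevant = set(relevant_scenes)
    retrieved = set(retrieved_scenes)
    hits = relevant & retrieved  # scenes that are both relevant and retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical scene identifiers, for illustration only.
user_relevant = {"s1", "s2", "s3", "s4"}
system_retrieved = {"s2", "s3", "s9"}
precision, recall = scene_precision_recall(user_relevant, system_retrieved)
print(f"P = {precision:.2f}, R = {recall:.2f}")  # P = 0.67, R = 0.50
```

Empty sets are mapped to 0.0 here to avoid division by zero; that convention
is a choice made for the sketch, not something stated in the study.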
Hidaka, T., Abe, T., & Kokogawa, T. (2001). NetLibra: an
advanced digital library system based on CORBA. TREC'01,
Retrieved on June 1, 2002 from http://trec.nist.gov/pubs/
The study compared the efficiency of searching distributed digital
libraries via CORBA networking technology. Retrieval time was the
key measure.
Hill, L.L., Carver, L. et al. (2000). Alexandria Digital Library:
user evaluation studies and system design. Journal of the American
Society for Information Science, 51 (3): 246-259.
An online survey, ethnographic observations and target user groups
were employed to examine information needs and information seeking
patterns. The authors also conducted a usability test.
Hill, L.L., Dolin, R., et al. (1997). User evaluation: summary
of the methodologies and results for the Alexandria Digital Library,
University of California at Santa Barbara. In C. Schwartz et al. (Eds.)
Proceedings of the American Society for Information Science (ASIS)
Annual Meeting, Washington DC, November 1997 (http://www.asis.org/annual-97/alexia.htm)
(pp.225-243, 369). Medford, NJ: Information Today.
Huang, Z., Chung, W.Y., Ong, T.H. & Cheng, H.C. (2002).
A graph-based recommendation system for digital library. Proceedings
of the Second ACM/IEEE-CS Joint Conference on Digital Libraries,
Oregon: Portland, pp.65-73.
The study evaluated the effectiveness (how accurately the predictions
reflect the real tendency) of hybrid search (i.e. content + association)
for a book recommendation system. A modified version of the traditional
P/R was the key measure.
Huxley, L. (2002). Renardus: following the Fox from project
to service. Research and Advanced Technology for Digital Libraries:
Proceedings of the 6th European Conference on Digital Libraries,
ECDL'02, September 16-18, 2002, Paris, France. 218-229.
Renardus is a collaborative pan-European project, which provides
"a single, multilingual user interface for cross-searching
and cross-browsing distributed metadata collections held by 12
participating subject gateways." To examine how end users
perceive Renardus, an online survey in five languages was delivered
to potential users via the project Web site, email newsletter
and relevant mailing lists across Europe. Whereas the response
distributions for most questions are centralized, the ratings of the ease
of use of the browsing functionality diverge. Regarding
the quality and adequacy of metadata information, the results
show that users outside the LIS domain tended to find it difficult
to understand the order of metadata element display.
Janssen, O. (2004). The European Library user survey of Gabriel,
Gateway to Europe's National Libraries. Retrieved on September 2,
2004, from http://www.bl.uk/gabriel/index.html
The online survey report states: "This report gives insight into
the background of the respondents, their use of the internet,
their use of the Gabriel website and their opinions about this
site. The respondents were also asked if they would use a shared
catalogue of all the national libraries in Europe if that was
to be created." The short and long PDF versions of the report
are archived at http://www.kb.nl/gabriel/surveys/results2003/gabriel_survey_short.pdf
and http://www.kb.nl/gabriel/surveys/results2003/gabriel_survey_long.pdf
respectively. A screenshot of the online survey form is available
at http://www.bl.uk/gabriel/surveys/results2003/screenshots.html
Jewell, T. D. (1998). The ARL "investment in electronic resources"
study: Final report to the council on library and information resources.
Retrieved on April 13, 2003, from http://www.arl.org/stats/specproj/jewell.html.
This article examines expenditures on electronic resources,
the organization and accessibility of resources, and outcomes, i.e.,
how the availability of electronic resources affects users. Surveys
are the major method in this study. The results are summarized in
this article. The major conclusion is that solid information about
expenditures on electronic resources is still needed.
Jones, G.J.F., & Lam-Adesina, A.M. (2002). An investigation
of mixed-media information retrieval. Research and Advanced Technology
for Digital Libraries: Proceedings of the 6th European Conference,
ECDL'02, September 16-18, 2002, Paris, France. 463-478.
To investigate how effective the mixed-media retrieval (text,
document image, and spoken document in this study) would be if
compared with mono-media search, the authors conducted an experiment
using the existing TREC text, spoken and scanned image collections
along with the spoken document retrieval task. Meanwhile, the
systems with and without pseudo relevance feedback (PRF) were
also compared at mono and mixed collection levels. The results
show that the query expansion via summary-based PRF "provided large
improvements in performance for spoken documents, good improvements
for TR, but surprisingly could lead to significant reduction in
performance for document image retrieval." In general, "mixed-media
retrieval performs well without compensation for media specific
indexing problems."
Jones, M.L.W., Gay, G.K. & Rieger, R.H. (1999). Project
soup: comparing evaluations of digital collection efforts. D-Lib
Magazine, 5 (11) Retrieved on 3/12/2003 from http://www.dlib.org/
The authors from the Human-Computer Interaction Group at Cornell
University investigated digital collection evaluation efforts
across five different DL prototype projects with emphasis on "backstage"
concerns (e.g., metadata, copyright and intellectual property
issues), collection maintenance and access (e.g., decisions regarding
collection scope and the maintenance of a consistent quality and
fidelity of digital records) and usability findings. The authors
argue for the necessity of Establishing an Effective Content Base:
that is, finding a balance between quantity ("Achieving Critical
Mass"), quality ("Ensuring Fidelity and Accuracy")
and usability ("Meeting the Goals and Objectives of the
End-User"). Additionally, the significance of usability in
digital collection evaluation is emphasized with the statement
that "a digital collection can contain a critical mass of
high quality, copyright-cleared content all organized around a
solid metadata foundation, and still prove to be a failure."
Jones, S., Cunningham, S.J. et al. (2000). A transaction log
analysis of a digital library. International Journal of Digital
Libraries, 3:152-169.
The authors conducted a transaction log analysis to examine human
information behavior regarding query formulation and reformulation
and the use of the Computer Science Technical Reports Collection
in New Zealand.
Jones, S. & Paynter, G.W. (2002). Automatic extraction of
document keyphrases for use in digital libraries: evaluation and
application. Journal of the American Society for Information
Science & Technology, 53 (8): 653- 677.
The authors evaluated the Kea automatic keyphrase extraction
techniques used in the New Zealand Digital Library. There are
essentially two evaluation purposes: (1) to compare three different
Kea techniques in terms of how well the automatically extracted
keyphrases match human-identified keyphrases and (2) to compare the
evaluation results based on authors' keyphrases with those based
on human users' selections. 28 human users were asked to rate
the suitability of each phrase (automatically extracted, author
assigned, or user assigned) as a keyphrase of the document on
an 11-point Likert scale. A modified pair of precision/recall measures and
other statistical analysis methods (e.g. Kappa statistic K & Kendall
Coefficient of Concordance W for the level of inter-person agreement)
were employed for the comparison. The results show that "in
general Kea produces keyphrases that are rated positively by human
assessors."
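As an illustration of the inter-person agreement statistics mentioned above,
Kendall's Coefficient of Concordance W can be sketched as follows. This is a
simplified textbook form, not the authors' procedure: it assumes each assessor
supplies a complete ranking of the same items with no tied ranks, whereas
ratings on an 11-point scale would normally contain ties and need a tie
correction. The function name and the sample rankings are hypothetical.

```python
def kendalls_w(rankings):
    """Kendall's W for m raters who each rank the same n items (ranks 1..n).

    Assumption: no tied ranks, so the uncorrected formula
    W = 12 * S / (m^2 * (n^3 - n)) applies, where S is the sum of squared
    deviations of the per-item rank sums from their mean.
    """
    m = len(rankings)       # number of raters
    n = len(rankings[0])    # number of items ranked
    rank_sums = [sum(rater[i] for rater in rankings) for i in range(n)]
    mean_sum = sum(rank_sums) / n
    s = sum((r - mean_sum) ** 2 for r in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Three hypothetical assessors ranking four candidate keyphrases.
sample = [
    [1, 2, 3, 4],
    [1, 3, 2, 4],
    [2, 1, 3, 4],
]
print(f"W = {kendalls_w(sample):.3f}")  # W = 0.778
```

W ranges from 0 (no agreement) to 1 (identical rankings), which is why it is a
convenient single figure for summarizing agreement among many assessors.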
Kapidakis, S., Terzis, S., Sairameshi, J., & Nikolaou, C.
et al. (1998). A management architecture for measuring and monitoring
the behavior of digital libraries. Research and Advanced Technology
for Digital Libraries: Proceedings of Second European conference
on Digital Libraries, ECDL '98, Heraklion, Crete, Greece, September
21-23, 1998, 1513: 95-114.
The authors propose a management architecture which can be used
for monitoring and balancing query load in distributed digital
libraries so as to improve efficiency of performance. The efficiency
is measured in the following four aspects: local search response
time, local index database processing time, remote processing
time of the index database, and remote search response time (includes
network delay + remote processing time). The lab experiment demonstrates
that the architecture is promising with respect to the expected effect,
although further testing in a real setting is required.
Kassim, A. R.C., & Kochtanek, T. R. (2003). Designing, implementing,
and evaluating an educational digital library resource. Online
Information Review, 27(3): 160-168.
The authors report their iterative and interwoven design/evaluation
efforts on Project i-DLR, a Web based educational resource on
digital libraries (http://www.coe.missouri.edu/~rafee/idigital_libraryR).
To ensure the site is well-implemented for its target users (both
beginners and experts in DL research/professional fields), they
conducted a series of studies using different qualitative and
quantitative research methods: focus group interviews, Web log
analysis, a Web survey, and remote usability evaluation. Whereas the
group interviews and remote usability evaluation identified a set of problems
users encountered while interacting with the system, the log
analysis shows some noticeable HIB (human information behavior),
such as a preference for browsing over searching and for simple search
queries over the use of Boolean operators. In particular, the authors
advocate for the strengths of a remote usability evaluation in
terms of convenience for users, more natural, authentic and unobtrusive
setting, more variety of system environment, as well as cost-effectiveness.
Kenney, A.R., Sharpe, L.H., & Berger, B. (1998). Illustrated
book study: digital conversion requirements of printed illustration.
Research and Advanced Technology for Digital Libraries: Proceedings
of the Second European Conference on Digital Libraries, ECDL'98,
September 21-23, 1998, Heraklion, Crete, Greece. 279-293.
This collaboration project between the LC and Cornell U. Dept.
of Preservation and Conservation reports the exploration of the
"best means for digitizing the vast array of illustrations
used in 19th and early 20th century publications." The authors
evaluated the quality of digitized works at the following three
levels: essence (how well the digital version has captured essence
by using the unaided eye and within normal distance), detail (how
well the digital version can represent the smallest significant
part of the original as viewed closer or with slight magnification),
and structure (how well the digital version can convey the information
necessary to distinguish one illustration process type from another
under various levels of magnification).
Khoo, C.S.G., Poo, D.C.C., & Toh, T.K. et al. (1998). E-referencer:
a prototype expert system Web interface to online catalog. Research
and Advanced Technology for Digital Libraries: Proceedings of the
Second European Conference on Digital Libraries, ECDL'98, September
21-23, 1998, Heraklion, Crete, Greece. 316-333.
The authors developed E-referencer to help users search
the OPAC effectively. E-referencer essentially has two components:
initial search strategies (composed of keyword/phrase search in
all fields as strategy I and subject heading search as strategy
II) and a relevance feedback strategy. Based on 12 search topics
selected from submissions by university staff and students, they
compared search performance by the expert system and an experienced
librarian (one of the authors). No improvement, or even worse
performance, was observed with respect to precision and average
number of relevant records retrieved after using the E-referencer.
Khoo, M. (2001). Ethnography, evaluation, and design as integrated
strategies: a case study from WES. ECDL'01; Proceedings of the
5th European Conference on Digital Libraries, 263-274.
The case study illustrates how ethnographic observation and documentation
analysis can be used to assess The Water in the Earth System (WES)
collection, a sub-project of DLESE. The theory of user-centered
design and technological frames theory were employed to guide
the analysis. From the observation of design meetings, a correlation
was detected between institutional features of WES community and
project centers. However, no investigation was conducted regarding
how good the WES is and how it can help the user community to
learn and locate desired information.
Kwak, B.H., Jun, W., Gruenwald, L. & Hong, S.K. (2002).
A study on the evaluation model for university libraries in digital
environments. Research and Advanced Technology for Digital Libraries:
Proceedings of the 6th European Conference on Digital Libraries,
ECDL'02, September 16-18, 2002, Paris, France. 204-217.
Having argued the necessity of a new evaluation model for university
libraries in the digital age, the authors develop a model in two phases.
In the first phase, an initial model was constructed based on the
opinions of library experts and on an analysis of previous work on the
evaluation of both traditional and digital libraries.
In the second phase, three-round Delphi surveys of a total of
50 digital library-related professors, researchers, and university
librarians were applied to develop a valid evaluation model. As
a result, a new model, which consists of 7 categories (goal
setting/vision, library specialization, information resources,
information usability environment, information sharing, information
services, and human resources & budget), 35 items, and 92
indicators, was finalized (pp. 213-215).
Ma, Y.F., Sheng, J. et al. (2001). MSR-Asia at TREC-10 Video
Track: shot boundary detection task, Retrieved on June 1, 2003 from
http://trec.nist.gov/pubs/
The study evaluated the effectiveness & efficiency of boundary
detection techniques. Reference transit, deletion rate, insertion
rate, traditional P/R, and test/normal time were used as measures.
MacCall, S. L., Cleveland, A. D., & Gibson, I. E. (1999). Outline
and preliminary evaluation of the Classical Digital Library Model.
Proceedings of the American Society for Information Science. Retrieved
on June 1, 2003, from http://www.bama.ua.edu/~smaccall/cdlm.html.
This article describes and evaluates an alternative to the
database retrieval model, namely, the classical digital library
model (CDLM). The number of "Clickthroughs" per month is used as
a critical metric. Library and information professionals and end users
involved with primary care medicine were recruited as subjects to
answer a series of questions. The results indicate that use of the
digital library saves users time in retrieving information.
Marchionini, G. (2001). Evaluating digital libraries: a longitudinal
& multifaceted view. Library Trends, 49 (2): 304-333.
The author employed multiple methods (e.g. observations, semi-structured
& group interviews, surveys, document analysis, learning effect
analysis, etc.) to examine information needs, information use, system
performance, and the educational effect of the Perseus Digital Library.
Marchionini, G., Plaisant, C., & Komlodi, A. (2003). The
people in digital libraries: multifaceted approaches to assessing
needs and impact. In Ann P. Bishop et al. (Eds.) Digital Library
Use: Social Practice in Design and Evaluation. Massachusetts,
Cambridge: The MIT Press. pp.119-160.
The authors used observation, interview, document analysis (syllabi,
reading room handouts, reference emails), and questionnaires to compare
the learning and teaching effects, system performance, and information
needs in two digital libraries, namely, Perseus DL and Baltimore
Learning Community.
Melucci, M. (2004). Making digital libraries effective: Automatic
generation of links for similarity search across hyper-textbooks.
Journal of the American Society for Information Science and Technology,
55(5): 414-430.
The author devised an automatic generation and insertion approach
for similarity search across hyper-textbooks (HTB), which is based
on statistical clustering algorithms. To assess the performance
of the approach, he conducted a small test to see whether crosslinks
among clusters from different HTBs are effective. Specifically,
he used two textbooks on information retrieval published in different
years as test HTBs, randomly drew 10 clusters from one,
and then compared them to the ones from the other. The results
demonstrate a high intra-homogeneity between the two clusters.
It should be noted that evaluation was conducted without the involvement
of real users.
Meyyappan, N., Foo, S., & Chowdhury, G.G. (2004). Design
and evaluation of a task-based digital library for the academic
community. Journal of Documentation, 60(4): 449-475.
The authors report their evaluation efforts on how effective
the proposed task-based information organization technique is
in comparison with two other conventional techniques (i.e. alphabetical
and subject-based) in terms of helping university community members
to locate task-related information in the DWE (digital work environment)
at Nanyang Technological University. In addition to the resources included
in conventional DLs, the DWE also collects other informal resources,
such as course calendars, university statutes, etc. To address
the evaluation question, an experiment was conducted with the
participation of 60 information science students. The students
were asked to search for information for two sets of tasks on interfaces
with different information organization techniques. Time to complete
a single task and perceived usefulness were employed as criteria.
The results show that (1) the "task-based approach took the
least time in identifying information resources"; and (2)
the hybrid approach and the task-based approach "were considered
better than the other approaches for almost all the tasks."
Monopoli, M., Nicholas, D., Georgiou, P. & Korfiati, M.
(2002). A user-oriented evaluation of digital libraries: case study
the "electronic journals" service of the library and information
service of the University of Patras, Greece. Aslib Proceedings-New
Information Perspectives, 54(2): 103-117.
To examine the use of e-journal services at a university in Greece,
the authors conducted an online survey focusing on who the
primary users are, how often they use the service and for what
purpose, and what their search strategies are. 246 community members
(out of a total of 13,000 members) finished the survey.
In general, the e-journal service was used by a wide age range,
and the majority of respondents used the service on a weekly or
daily basis for a variety of reasons, such as writing papers for
class, publication, degree, supporting lectures, and "keeping
up with the progress in the relevant subject area," and so
forth. Additionally, keywords were the most popular search method
followed by author names, and the online help function was used
by all age and occupation groups. Compared with their print counterparts,
e-journals were considered to be easy to search and use, quick
to access, and readily manipulated.
Orio, N. (2002). Alignment of performance with scores aimed
at content-based music access and retrieval. Research and advanced
technology for digital libraries: Proceedings of 6th European Conference
on Digital Libraries, ECDL'02, Paris, France, September 16-18,
2002, 479-492.
The paper reports an approach that allows music performances to be
retrieved through an automatic alignment of acoustic recordings with the
corresponding score stored in a musical DL. To test how good the
proposed approach is, the author conducted a preliminary test
using a small collection of acoustic and synthetic performances,
during which an expert musician was asked to supervise the screen
and report any mismatch found. The results are encouraging, with
a reasonably low mismatching rate (8.2%).
Paliouras, G., Papatheodorou, C., & Karkaletsis, V. et al.
(1998). Learning user communities for improving the services of
information providers. Research and Advanced Technology for Digital
Libraries: Proceedings of the Second European Conference on Digital
Libraries, ECDL'98, September 21-23, 1998, Heraklion, Crete,
Greece. 316-333.
Aiming to examine how well the proposed automatic grouping algorithms
perform in constructing meaningful user communities, the authors
employed two criteria, namely overlap and coverage. Whereas overlap
is measured by "the amount of overlap between the constructed
descriptions," (ratio between the total number of categories
in the description and the number of distinct categories that are
covered), coverage is indicated by the proportion of news categories
covered by the constructed user community descriptions. The results
are very encouraging.
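The two criteria described above can be sketched in a few lines. This is an
illustration under stated assumptions, not the authors' code: each community
description is modeled as a set of news category names, overlap is taken
literally as the ratio given in the annotation (total categories across
descriptions divided by distinct categories covered), and the function name
and category labels are hypothetical.

```python
def overlap_and_coverage(descriptions, all_categories):
    """Compute (overlap, coverage) for a list of community descriptions.

    Assumptions: each description is a set of category names; overlap is
    total / distinct categories (1.0 means no category is shared);
    coverage is the share of all news categories that appear in at least
    one description.
    """
    total = sum(len(d) for d in descriptions)
    distinct = set().union(*descriptions)
    overlap = total / len(distinct) if distinct else 0.0
    coverage = len(distinct & set(all_categories)) / len(all_categories)
    return overlap, coverage


# Hypothetical news categories and community descriptions.
categories = {"sports", "politics", "finance", "culture", "science"}
communities = [{"sports", "politics"}, {"politics", "finance"}]
overlap, coverage = overlap_and_coverage(communities, categories)
print(f"overlap = {overlap:.2f}, coverage = {coverage:.2f}")
# overlap = 1.33, coverage = 0.60
```

Under this reading, an overlap close to 1 means the communities are well
separated, while a coverage close to 1 means few news categories are left
out of every description.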
Park, S. (2000). Usability, user preferences, effectiveness,
and user behaviors when searching individual and integrated full-text
databases: implications for digital libraries. Journal of the
American Society for Information Science, 51(5): 456-468.
The author modified the TREC-5 interactive task and collection to compare
users' interaction with integrated and common interface in DL. Aspectual
recall was the key measure.
Purcell, G.P., Rennels, G.D. & Shortliffe, E.H. (1997).
Development and evaluation of a context-based document representation
for searching the medical literature. International Journal on
Digital Libraries, 1 (3): 288-296.
To test the proposed context model, the authors compared inter-subject
consistency when the subjects (medical students) were asked to
assign a document structure name to each sentence/paragraph in medical
articles from 4 leading medical journals. The inter-subject consistency
was measured by the kappa coefficient of agreement for nominal
scales. The results show that there is substantial agreement among
the subjects, and hence the model seems to provide an effective indexing
scheme. Unfortunately, the evaluation was based on
paper documents rather than those in electronic formats.
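For readers unfamiliar with the kappa coefficient, a minimal sketch of the
two-rater form (Cohen's kappa) follows. The study involved several subjects,
and the annotation does not say how their agreement was aggregated, so the
two-rater version and the sample labels below are assumptions made purely
for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters assigning nominal labels to the same items.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement)
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    # Chance agreement: product of the raters' marginal label proportions.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical structure labels assigned to six sentences by two subjects.
rater_1 = ["intro", "method", "method", "result", "result", "result"]
rater_2 = ["intro", "method", "result", "result", "result", "method"]
print(f"kappa = {cohens_kappa(rater_1, rater_2):.2f}")  # kappa = 0.45
```

Kappa discounts the agreement expected by chance, which is why it is
preferred over raw percentage agreement for nominal labeling tasks like
this one.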
Rui, Y., Gupta, A., & Acero, A. (2000). Automatically extracting
highlights for TV baseball programs. ACM Multimedia 2000,
105-115.
This study evaluated the effectiveness & efficiency of the automatic
segment extraction techniques. Segment overlap and access time were
the key measures employed.
Salampasis, M., Tait, J. & Bloor, C. (1998). Evaluation
of information-seeking performance in hypermedia digital libraries.
Interacting With Computers, 10: 269-284.
The study evaluated the effectiveness of information seeking performance
by employing relative distance relevance (RDR) as the measure in
hypermedia network contexts.
Sanderson, M. & Crestani, F. (1998). Mixing and merging
for spoken document retrieval. Research and Advanced Technology
for Digital Libraries: Proceedings of the Second European Conference
on Digital Libraries, ECDL'98, September 21-23, 1998, Heraklion,
Crete, Greece. 397-407.
To identify the most effective technique for spoken document
retrieval, two main bodies of experiments were conducted on the
TREC-6 SDR collection (1451 broadcast news stories along with
49 known item topics) together with the Abbot generated transcript.
One used a merged collection and the other a mixed collection (manually
vs. automatically generated transcripts). Effectiveness is
the criterion, measured by mean rank, mean reciprocal rank, and
the number of queries where the relevant document is found in the
top n ranks, where n is 1, 5, or 10. The results reveal the
shortcomings of the tf-idf weighting approach for this type of
retrieval.
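The three known-item measures named above can be computed from the rank at
which each query's single relevant document was retrieved. The sketch below
is illustrative only: it assumes every query's target document appears
somewhere in the ranking, and the function name and the sample ranks are
hypothetical rather than taken from the study.

```python
def known_item_summary(ranks, cutoffs=(1, 5, 10)):
    """Summarize known-item retrieval results.

    ranks: for each query, the 1-based rank of its single relevant document.
    Returns mean rank, mean reciprocal rank, and, for each cutoff n, the
    number of queries whose relevant document is in the top n.
    """
    n_queries = len(ranks)
    return {
        "mean_rank": sum(ranks) / n_queries,
        "mean_reciprocal_rank": sum(1 / r for r in ranks) / n_queries,
        "found_in_top": {k: sum(r <= k for r in ranks) for k in cutoffs},
    }


# Hypothetical ranks for five known-item queries.
summary = known_item_summary([1, 3, 12, 5, 2])
print(summary["mean_rank"])                        # 4.6
print(round(summary["mean_reciprocal_rank"], 3))   # 0.423
print(summary["found_in_top"])                     # {1: 1, 5: 4, 10: 4}
```

Mean rank is sensitive to a few very poor rankings, whereas mean reciprocal
rank rewards placing the target near the top, so reporting both alongside
the top-n counts gives a fuller picture.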
Saracevic, T. (2000). Digital library evaluation: toward an
evolution of concepts. Library Trends, 49 (2): 350-369.
Having acknowledged the lack, the difficulty as well as the necessity
of digital library evaluation, the author proposes a conceptual
framework that outlines the constructs and contexts of digital library
development. Meanwhile, the author also suggests an adaptation approach
borrowing existing criteria from conventional library, information
retrieval, and interface evaluation.
Seadle, M. & Peters, T.A. (2000). Project ethnography:
an anthropological approach to assessing digital library services.
Library Trends, 49 (2): 370-385.
In the paper, the authors argue that for digital library evaluation,
"anthropology can provide the initial understanding, the
intellectual basis, on which informed choices about population,
survey design, or focus group selection can reasonably be made."
In light of the argument, nine target user samples of the National
Gallery of the Spoken Word (NGSW) were identified. The micro-cultures
and characteristics of the samples were examined. The implications
of the cultural interpretation results to further evaluation are
discussed.
Smeaton, A.F., Paul, O. et al. (2001). The TREC-2001 Video Track
report. Retrieved on June 1, 2003 from http://trec.nist.gov/pubs/
The study evaluated the effectiveness of shot boundary detection
for known topics & general topics. Traditional P/R ratio and
the amount of relevance disagreement among assessors were the key
measures.
Solvberg, I. T. (2002). Report of breakout group on metrics
and testbeds. 4th DELOS workshop on DL evaluation. Retrieved on
June 1, 2003, from http://www.sztaki.hu/conferences/deval/presentations/Breakout_metrics.doc.
This report discusses the metrics and testbeds for digital library
evaluation. The author proposes four questions that should be asked at
the beginning of the evaluation process: who needs this? what shall
be evaluated? why is this needed? and how can it be done? The basic
components of a testbed are listed, which include the collections
of documents, DL-system components, and user components. Directions for
future research are proposed.
Sumner, T. & Dawe, M. (2001). Looking at digital library
usability from a reuse perspective. Proceedings of the First
ACM/IEEE-CS Joint Conference on Digital Libraries. June 24-28, 2001,
Roanoke, VA. New York: ACM., pp.416-425.
The authors conducted interviews and observations to assess the usability
of a digital library. The key measures included information needs,
reuse intent, resource location, comprehension, modification and
sharing effects.
Sumner, T., Khoo, M., & Recker, M. et al. (2003). Digital
libraries in the classroom: Understanding educator perceptions of
"quality" in digital libraries. Proceedings of the
third ACM/IEEE-CS joint conference on Digital libraries, 269-279.
To examine educators' expectations and perceptions of DL (resource)
quality, the authors conducted a series of five focus groups with
37 practicing teachers, pre-service teachers, and science librarians,
drawn from different educational contexts (i.e., K-5, 6-12, College).
The informants were presented with diverse representative digital
educational resources (see p.273 for the list). The findings show
that the informants need "high quality" teaching and
learning resources as well as additional contextual information
beyond that in the resource. Additionally, perceptions of scientific
accuracy, bias, advertising, design and usability, and the potential
for student distraction should also be taken into account.
Tolle, K.M., & Chen, H. (2000). Comparing noun phrasing
techniques for use with medical digital library tools. Journal
of the American Society for Information Science, 51 (4): 352-370.
Aiming to examine effectiveness of their proposed noun phrasing
NLP (Natural Language Processing) technique (AZ Noun Phraser),
the authors compared relative recall/precision of the technique
with three other counterpart techniques. They conducted an experiment
involving 19 domain expert participants (medical librarians, doctors,
medical students/researchers), 10 abstracts randomly selected
from CANCERLIT, and corresponding 10 alphabetically arranged noun
phrases lists derived by the four algorithms from the 10 abstracts.
The results show that the proposed technique with SPECIALIST Lexicon
"resulted in improved recall and precision".
Wesson, J., & Greunen, D.V. (2002). Visualization of usability
data: measuring task efficiency. Proceedings of the Conference
of South African Institute of Computer Scientist and Information
Technologists, SAICSIT 2002, 11-18.
The authors employed a number of measures to assess the usability
of a digital library interface, such as diversity, complementarity,
decomposition, parsimony, space/time resource optimization, self-evidence,
consistency, and attention management.
White, M.D. (2001). Digital reference services: framework for
analysis and evaluation. Library & Information Science Research,
23 (3): 211-231.
The author proposes a descriptive model for analyzing and evaluating
digital reference services. The model "consists of about
100 questions related to 18 categories in four broad areas,"
namely mission/purpose, structure/responsibility to client, core
function, and quality control. To test the model, the author analyzed
the following 6 aspects of 20 digital reference services: archive,
content, selectivity, privacy protection, access, and browsability/searchability.
The results show the usefulness of the model in illustrating the strengths
and weaknesses of each service in the examined aspects.
Whitlatch, J.B. (2001). Evaluating reference services in the
electronic age. Library Trends, 50 (2): 207 - 212.
The paper aims to examine how evaluation criteria and methods
for traditional library reference services can be used in digital
reference settings. Accordingly, different evaluation methods,
such as survey/questionnaire, observation, individual and focus
group interview, and case study, are discussed. Based on a review
of the strengths and weaknesses of these methods and of representative
studies applying them, the author concludes that they "can be
used very effectively" in the new environment. However, dramatic
changes in the reference process pose challenges for researchers
in assessing how users and librarians/reference systems
interact in a non-face-to-face environment.
Wildemuth, B., Marchionini, G., Yang, M., Geisler, G., Wilkens,
T., Hughes, A., & Gruss, R. (2003). How fast is too fast? Evaluating
fast forward surrogates for digital video. Proceedings of the
ACM/IEEE Joint Conference on Research on Digital Libraries,
TX: Houston, May 27-31, 2003, Los Alamitos, CA: IEEE. pp. 221-230.
The authors conducted an experiment comparing the effectiveness
of four speed levels of fast-forward surrogates extracted from 5 digital
videos in OpenVideo, using six measures (corresponding to six
performance tasks). Forty-five subjects were involved. The measures/tasks
include object recognition (textual), object recognition (graphical),
action recognition, linguistic gist comprehension (full text),
linguistic gist comprehension (multiple choice), and visual gist
comprehension. The measures are regarded as having face validity
because they cover multiple facets of video browsing behavior, both
conceptual and perceptual. The findings suggest that for video
browsing, it is essential to have "a range of user control
mechanisms and underlying representations for video."
Wilson, R. & Landoni, M. (2001). Evaluating electronic textbooks:
a methodology. Proceedings of 5th European Conference on Digital
Libraries, ECDL 2001, Darmstadt, Germany, September 4-9, 2001,
1-12.
The methodology was initially proposed for usability testing
in the EBONI (Electronic Books ON-screen Interface) project.
It sets out options for selecting materials, participants, tasks
and techniques, which vary in cost and level of sophistication.
No actual study was conducted to examine the quality of the proposed
methodology.
Wilson, R., Landoni, M., & Gibb, F. (2002). Guidelines for
designing electronic books. Research and Advanced Technology
for Digital Libraries: Proceedings of the 6th European Conference
on Digital Libraries, ECDL'02, September 16-18, 2002, Paris,
France, 47-60.
The authors mainly outline "the guidelines emerging
from the EBONI (Electronic Books ON-Screen Interface) Project's
evaluations of electronic books." They also provide a summary
of their evaluation work, which was guided by the specifically
developed "Ebook Evaluation Model" outlined
at the preceding year's conference. The model essentially contains
four selection-related aspects: object (three e-textbooks in psychology
differing markedly in appearance), actors (participants [100
students, lecturers, and researchers from a range of disciplines
in UK higher education], evaluators, task developers, and task
assessors), tasks (finding specific facts and performing post-tests),
and evaluation techniques (subjective satisfaction questionnaires,
think-aloud usability sessions, and interviews). Four categories
of e-book design guidelines emerged from the evaluation:
adhering to the book metaphor, adapting to the
electronic medium, hardware considerations, and accessibility.
These guidelines might serve as e-book evaluation criteria.
Yang, S.C. (2001). An interpretive and situated approach to
an evaluation of Perseus Digital Libraries. Journal of the American
Society for Information Science and Technology, 52 (14): 1210-1223.
An in-class observation method was employed to examine how the Perseus
Hypermedia Digital Library can help its users approach their regular
class assignments.
Zhang, H.J., Yong, L.C., et al. (1995). Video parsing, retrieval
and browsing: an integrated and content-based solution. ACM
Multimedia 95, Nov. 5-9, 1995, San Francisco, CA.
The study compared the effectiveness of human and automatic key
frame extraction techniques. Overlap and accuracy were the key measures.
Zhu, B., & Schatz, B. (1999). Support concept-based multimedia
information retrieval: a knowledge management approach. Proceedings
of the 20th International Conference on Information Systems,
North Carolina: Charlotte, pp.1-14.
The study compared the effectiveness of integrated techniques (keyword
& concept space-based search). Traditional precision/recall (P/R) was the key measure.
|
|