RT list: Relevance Theory and Information Science

From: Howard White <whitehd@drexel.edu>
Date: Wed Jan 13 2010 - 23:08:33 GMT
Hello, All,

Dan Sperber was kind enough to announce on this list my recent article:

White, Howard D.  "Some new tests of relevance theory in information science."

The journal Scientometrics has published this online; it will appear in print later this year. 
The slides from my presentation at the International Society for Scientometrics and
Informetrics (Rio de Janeiro, 2009) are on the Web, but not all are well reproduced.
In any case, they are hard to follow without a textual accompaniment.

For anyone interested, I would suggest the two long articles below, in which I try to mesh
relevance theory with information science across a number of fronts.  The "new tests"
of the article above are made with reference to the "old tests" in these two articles.  I
realize that the abstracts posted below may make them seem outlandish, but I assure those
on this list that they are easy enough to read, with all technical terms and quantification
explained at length and illustrated with copious examples. Applying relevance theory
to data from bibliometrics (the quantitative study of literatures) allows me to explain why
a formula long used in computerized information retrieval has been popular--a formula
for ranking documents by their predicted relevance to a user's query.  However, I also
show that RT can underpin a unified explanation of many things in library-related
information science that have heretofore not been explained, let alone integrated.  Those
on this list may not be aware that information scientists commonly hold relevance to be
the central concept of their field. What they usually have in mind is ARTIFICIAL relevance: 
the algorithmic responses of a system in delivering appropriate writings to its users. But
information scientists have not done nearly as good a job of analyzing "relevance" as S&W
and their followers. So in my work I am attempting to move insights from RT into the
study of artificial relevance in dialogues between literature-based information systems
and human beings. I use numeric measures in the SYSTEM's predictions of what
items persons will perceive as most relevant, but, with S&W, I believe that PERSONS
can only judge relevance in crude degrees of "more" or "less."  And, yes, Peter and Mary
do appear.

White, Howard D. (2007) Combining bibliometrics, information retrieval, and
relevance theory.  Part 1:  First examples of a synthesis. Journal of the
American Society for Information Science 58: 536-559.


Retrievable at:   174.129.205.30/libr202/white_1.pdf

Abstract:  In Sperber and Wilson’s relevance theory (RT),
the ratio Cognitive Effects / Processing Effort defines the
relevance of a communication. The tf*idf formula from
information retrieval is used to operationalize this ratio for
any item co-occurring with a user-supplied seed term in
bibliometric distributions. The tf weight of the item predicts
its effect on the user in the context of the seed
term, and its idf weight predicts the user’s processing effort
in relating the item to the seed term. The idf measure,
also known as statistical specificity, is shown to
have unsuspected applications in quantifying interrelated
concepts such as topical and nontopical relevance,
levels of user expertise, and levels of authority. A
new kind of visualization, the pennant diagram, illustrates
these claims. The bibliometric distributions visualized
are the works cocited with a seed work (Moby
Dick), the authors cocited with a seed author (White HD,
for maximum interpretability), and the books and articles
cocited with a seed article (S.A. Harter’s “Psychological
Relevance and Information Science,” which introduced
RT to information scientists in 1992). Pennant diagrams
use bibliometric data and information retrieval techniques
on the system side to mimic a relevance-theoretic
model of cognition on the user side. Relevance
theory may thus influence the design of new visual information
retrieval interfaces. Generally, when information
retrieval and bibliometrics are interpreted in light of
RT, the implications are rich: A single sociocognitive theory
may serve to integrate research on literature-based
systems with research on their users, areas now largely
separate.
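
As a rough illustration of the operationalization described in this abstract, the Python sketch below computes tf and idf weights for items co-occurring with a seed term and ranks them by tf*idf. The abstract does not state the exact weighting, so the log-scaled forms used here (tf = 1 + log10 of the co-occurrence count, idf = log10 of collection size over the item's total count) are an assumption, taken from one common textbook variant. The function names, the item labels A, B, and C, and all counts are hypothetical. Reading a higher idf as lower processing effort is likewise an assumption, made so that the product tf*idf behaves like the effects/effort ratio.

```python
import math


def pennant_weights(cooccurrence, totals, collection_size):
    """Return {item: (tf, idf, tf * idf)} for items co-occurring with a seed.

    cooccurrence: item -> count of co-occurrences with the seed term
    totals: item -> total count of the item in the whole collection
    collection_size: number of records in the collection
    The log-scaled weights are an assumed variant, not taken from the abstract.
    """
    weights = {}
    for item, count in cooccurrence.items():
        if count <= 0:
            continue  # items that never co-occur with the seed get no weight
        tf = 1 + math.log10(count)                      # predicted cognitive effect
        idf = math.log10(collection_size / totals[item])  # specificity (assumed: higher = less effort)
        weights[item] = (tf, idf, tf * idf)
    return weights


def rank_by_relevance(weights):
    """Order items from highest to lowest tf * idf."""
    return sorted(weights, key=lambda item: weights[item][2], reverse=True)


# Hypothetical counts, for illustration only.
cooccurrence = {"A": 100, "B": 10, "C": 10}
totals = {"A": 1000, "B": 10000, "C": 100}
weights = pennant_weights(cooccurrence, totals, collection_size=1_000_000)
ranking = rank_by_relevance(weights)
print(ranking)
```

In a pennant-style plot, each item's tf value would go on one axis and its idf value on the other. In this toy data, items B and C co-occur with the seed equally often, but C ranks higher because it is rarer in the collection as a whole and therefore more specific to the seed.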

White, Howard D. (2007) Combining bibliometrics, information retrieval, and
relevance theory.
Part 2: Some implications for information science. Journal of
the American Society for Information Science 58: 583-605.

Retrievable at:   174.129.205.30/libr202/white_2.pdf

Abstract:  When bibliometric data are converted to term
frequency (tf) and inverse document frequency (idf) values,
plotted as pennant diagrams, and interpreted according to
Sperber and Wilson’s relevance theory (RT), the results
evoke major variables of information science (IS). These
include topicality, in the sense of intercohesion and
intercoherence among texts; cognitive effects of texts in
response to people’s questions; people’s levels of expertise
as a precondition for cognitive effects; processing
effort as textual or other messages are received;
specificity of terms as it affects processing effort; relevance,
defined in RT as the effects/effort ratio; and authority
of texts and their authors. While such concerns
figure automatically in dialogues between people, they
become problematic when people create or use or judge
literature-based information systems. The difficulty of
achieving worthwhile cognitive effects and acceptable
processing effort in human-system dialogues explains
why relevance is the central concern of IS. Moreover,
since relevant communication with both systems and
unfamiliar people is uncertain, speakers tend to seek
cognitive effects that cost them the least effort. Yet hearers
need greater effort, often greater specificity, from
speakers if their responses are to be highly relevant
in their turn. This theme of mismatch manifests itself
in vague reference questions, underdeveloped online
searches, uncreative judging in retrieval evaluation trials,
and perfunctory indexing. Another effect of least effort is
a bias toward topical relevance over other kinds. RT can
explain these outcomes as well as more adaptive ones.
Pennant diagrams, applied here to a literature search
and a Bradford-style journal analysis, can model them.
Given RT and the right context, bibliometrics may predict
psychometrics.



Received on Wed Jan 13 23:08:49 2010
