**6. Summary and conclusions**
Although perceptrons have become popular categorisation models,
a formal description of the properties of such a model in the
human categorisation context seems to be lacking. This paper puts
forward a descriptive model of human categorisation behaviour
in which the single-layer perceptron (SLP) is the central part.
The model is discussed within Ashby's representation-retrieval-response
selection framework and its modelling properties are studied.
It appears to be useful to separate an average component and a
differential component in the ratio of the response probabilities
for two competing classes. The differential component exclusively
determines the location of the equal-probability boundary between
the two classes. The equal-probability boundaries of the model
are shown to be linear functions of the feature vector. The average
component effectively "scales" the contribution of the
differential component. In one extreme case of scaling the SLP-based
model is shown to be equivalent to an asymptotic instance of the
well-known similarity-choice model. It is also shown that, due
to the scaling, the model's probability functions may have local
extrema in the feature space.
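
The linearity of the equal-probability boundaries can be illustrated with a minimal sketch. It assumes that each output node applies a logistic function to a weighted sum of the features plus a bias, and that response probabilities are obtained by dividing each node output by the sum of all node outputs. The parameter values and the function names `sigmoid` and `slp_probabilities` are hypothetical and chosen only for illustration.

```python
import numpy as np

def sigmoid(x):
    """Logistic function, assumed here as the node activation."""
    return 1.0 / (1.0 + np.exp(-x))

def slp_probabilities(F, W, b):
    """Response probabilities for one feature vector F.

    W has shape (n_classes, n_features), b has shape (n_classes,).
    Normalising the node outputs by their sum is an assumption of this sketch.
    """
    s = sigmoid(W @ F + b)
    return s / s.sum()

# Hypothetical parameters: three classes, two features.
W = np.array([[2.0, -1.0], [0.5, 1.0], [-1.0, 0.5]])
b = np.array([-0.5, 0.3, 0.1])

# Classes 1 and 2 are equally probable where their node inputs are equal:
# (w_1 - w_2) . F + (b_1 - b_2) = 0, which is a straight line in feature space.
dw = W[0] - W[1]
db = b[0] - b[1]
F1 = np.array([-2.0, 0.0, 3.0])        # free coordinate along the line
F2 = -(dw[0] * F1 + db) / dw[1]        # solve the linear equation for F2
boundary_points = np.stack([F1, F2], axis=1)

probs = np.array([slp_probabilities(F, W, b) for F in boundary_points])
for F, p in zip(boundary_points, probs):
    print(F, p)  # the first two probabilities coincide at every point
```

At each printed point the probabilities of classes 1 and 2 are equal, although their common value need not be the same along the line, since the overall level of the node outputs varies.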

Connectionist models, such as our SLP-based model, generally have
a large number of parameters to be estimated, which may lead to
overfitting. This is one of the reasons why the use of connectionist
models in psychological research continues to meet with suspicion.
We propose a way in which a cross-validation technique called
the "leaving-one-out" method can be used in the context
of human classification data. After our model has been fitted
on a data set, the technique gives an estimate of the model's
generalisability, that is, the model's goodness-of-fit on data
which have not been used in the model estimation procedure. The
proposed technique is not specific to the SLP-based model and
can be used for any classification model.
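
The general shape of such a procedure can be sketched as follows. This is only an illustration under stated assumptions: one stimulus is left out at a time, goodness-of-fit is measured as the log-likelihood per held-out response, and the model is passed in as a pair of hypothetical functions `fit` and `predict`. The pooled-proportion baseline and the data are invented for the example and are not taken from the paper.

```python
import numpy as np

def leave_one_out_score(features, counts, fit, predict):
    """Mean held-out log-likelihood per response.

    features: (n_stimuli, n_features) array of stimulus features.
    counts:   (n_stimuli, n_classes) array of response counts per stimulus.
    fit(features, counts) returns a fitted model object.
    predict(model, feature_vector) returns class probabilities.
    """
    n_stimuli = len(features)
    log_lik = 0.0
    for i in range(n_stimuli):
        keep = np.arange(n_stimuli) != i           # all stimuli except i
        model = fit(features[keep], counts[keep])  # estimate without stimulus i
        p = np.clip(predict(model, features[i]), 1e-12, 1.0)
        log_lik += float(counts[i] @ np.log(p))    # score the held-out stimulus
    return log_lik / counts.sum()

# Hypothetical baseline model: ignores the features and predicts the
# pooled response proportions of the training stimuli.
def fit_pooled(features, counts):
    return counts.sum(axis=0) / counts.sum()

def predict_pooled(model, feature_vector):
    return model

# Invented example data: four stimuli, two features, two response classes.
features = np.array([[0.1, 1.2], [0.4, 0.9], [1.3, 0.2], [1.1, 0.3]])
counts = np.array([[18, 2], [15, 5], [3, 17], [6, 14]])

score = leave_one_out_score(features, counts, fit_pooled, predict_pooled)
print(score)
```

Because the model enters only through `fit` and `predict`, the same loop applies to any classification model, which is the property noted above.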

**Acknowledgements**
The research reported in this paper was carried out at the Institute
for Perception Research (IPO) in Eindhoven, The Netherlands. The
authors thank Rudi van Hoe, Don Bouwhuis, B. Yegnanarayana and
Yves Kamp for their helpful and constructive criticism, and Rene
Collier for his patience. Louis ten Bosch is with Lernout and
Hauspie Speech Products, Wemmel, Belgium.

**References**
Ashby, F.G. (1992) Multidimensional models of categorization.
In: F.G. Ashby (Ed.),

Ashby, F.G., & Perrin, N.A. (1988) Toward a unified theory
of similarity and recognition. *Psychological Review 95*,
124-150.

Blumstein, S.E., and Stevens, K.N. (1979) Acoustic invariance
in speech production: Evidence from measurements of the spectral
characteristics of stop consonants. *Journal of the Acoustical
Society of America 66*, 1001-1017.

Blumstein, S.E., and Stevens, K.N. (1980) Perceptual invariance
and onset spectra for stop consonants in different vowel environments.
*Journal of the Acoustical Society of America 67*, 648-662.

Fukunaga, K. (1972) *Introduction to statistical pattern recognition.*
New York: Academic Press.

Fukunaga, K., & Kessell, D.L. (1971) Estimation of classification
error. *IEEE Transactions on Computers 20*, 1521-1527.

Gluck, M.A., and Bower, G.H. (1988) From conditioning to category
learning: An adaptive network model. *Journal of Experimental
Psychology: General 117*, 227-247.

Halle, M., Hughes, G.W., and Radley, J.-P.A. (1957) Acoustic properties
of stop consonants. *Journal of the Acoustical Society of America
29*, 107-116.

Haykin, S. (1994) *Neural networks: A comprehensive foundation*.
New York: Macmillan College Publishing Company.

Hertz, J., Krogh, A., & Palmer, R.G. (1991) *Introduction
to the theory of neural computation*. Redwood City: Addison-Wesley.

Kewley-Port, D., Pisoni, D.B., & Studdert-Kennedy, M. (1983)
Perception of static and dynamic acoustic cues to place of articulation
in initial stop consonants. *Journal of the Acoustical Society
of America 73*, 1779-1793.

Kruschke, J.K. (1992) ALCOVE: An exemplar-based connectionist
model of category learning. *Psychological Review 99*, 22-44.

Kruskal, J.B. (1964) Multidimensional scaling by optimizing goodness
of fit to a nonmetric hypothesis. *Psychometrika 29*, 1-27.

Lippmann, R.P. (1987) An introduction to computing with neural
nets. *IEEE ASSP Magazine 4*, 4-22.

Luce, R.D. (1963) Detection and recognition. In: R.D. Luce, R.R.
Bush, and S.E. Galanter (Eds.), *Handbook of mathematical psychology*,
vol. 1, ch. 3, New York: Wiley.

Massaro, D.W. (1988) Some criticisms of connectionist models of
human performance. *Journal of Memory and Language 27*, 213-234.

McClelland, J.L., Rumelhart, D.E., and the PDP research group
(1986) *Parallel distributed processing: Explorations in the
microstructure of cognition.* Cambridge, MA: MIT Press.

Nosofsky, R.M. (1986) Attention, similarity, and the identification-categorization
relationship. *Journal of Experimental Psychology: General 115*,
39-57.

Nosofsky, R.M., and Smith, J.E.K. (1992) Similarity, identification,
and categorization: Comment on Ashby and Lee (1991). *Journal
of Experimental Psychology: General 121*, 237-245.

Oden, G.C., and Massaro, D.W. (1978) Integration of featural information
in speech perception. *Psychological Review 85*, 172-191.

Quinlan, P. (1991) *Connectionism and Psychology*. Hemel
Hempstead: Harvester Wheatsheaf.

Shepard, R.N. (1958) Stimulus and response generalization: tests
of a model relating generalization to distance in psychological
space. *Journal of Experimental Psychology 55*, 509-523.

Smits, R. (1995) *Detailed versus global spectro-temporal cues
for the perception of stop consonants.* Doctoral dissertation,
Institute for Perception Research (IPO), Eindhoven, Netherlands.

Smits, R., Ten Bosch, L., & Collier, R. (1995a) Evaluation
of various sets of acoustical cues for the perception of prevocalic
stop consonants: I. Perception experiment. Accepted for *Journal
of the Acoustical Society of America*.

Smits, R., Ten Bosch, L., & Collier, R. (1995b) Evaluation
of various sets of acoustical cues for the perception of prevocalic
stop consonants: II. Modeling and evaluation. Accepted for *Journal
of the Acoustical Society of America*.

Ten Bosch, L., & Smits, R. (1996) On error criteria in perception modeling. In preparation.

**Appendix 1**
In this appendix it is shown that the SLP and the SCM coincide
in the limit case in which the SLP-biases tend to minus infinity
and the distances of all prototypes to the origin approach infinity.
We assume that all stimulus features are normalised using Eq.
(7), so that all values are grouped around the origin.

Let us first define the SCM processing stages. Each response class
*C*_{j} has one prototype
which is a vector containing

(A1) |

where *w*_{k} is a non-negative parameter representing
the attention allocated to feature dimension

It is assumed that the similarity *s*_{ij} of stimulus

(A2) |

Finally, the probability *p*_{ij} of responding class

(A3) |

where is the response bias for category
*C*_{j}. Note that this response bias is different
from the SLP-bias.

First, it is to be shown that the distance between __F__
and is linear in

(A4) |

*W* denoting the diagonal matrix of attention weights.

Since by definition, it follows that
the distance *d*_{j} of

(A5) |

Since

(A6) |

it follows that

(A7) |

Thus, when is large, we find

(A8) |

Because and have the same direction

(A9) |

After some calculation, it follows that

(A10) |

Hence, is a linear function of __ F__.
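
The linearisation for distant prototypes can be checked numerically. The sketch below assumes a weighted Euclidean distance with diagonal attention weights, which is an assumption here because the original equations are not reproduced in this text. It compares that distance with its first-order expansion in the feature vector. All numerical values and function names are hypothetical.

```python
import numpy as np

def weighted_distance(F, P, w):
    """Weighted Euclidean distance between stimulus F and prototype P
    (assumed form; w holds the diagonal attention weights)."""
    diff = F - P
    return np.sqrt(np.sum(w * diff ** 2))

def linearised_distance(F, P, w):
    """First-order expansion for a distant prototype:
    ||P||_W - (F^T W P) / ||P||_W, which is linear in F."""
    norm_P = np.sqrt(np.sum(w * P ** 2))
    return norm_P - np.sum(w * F * P) / norm_P

w = np.array([1.0, 0.5])           # hypothetical attention weights
direction = np.array([0.6, 0.8])   # fixed direction of the prototype
F = np.array([0.3, -0.7])          # stimulus near the origin

errors = []
for radius in (1e1, 1e2, 1e3, 1e4):
    P = radius * direction
    err = abs(weighted_distance(F, P, w) - linearised_distance(F, P, w))
    errors.append(err)
    print(radius, err)  # the error shrinks as the prototype moves away
```

The approximation error falls roughly in proportion to the inverse of the prototype's distance from the origin, so the distance becomes linear in the feature vector in the limit.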

If we substitute

(A11) |

Eq. (A10) simplifies to

According to Shepard (1958) and Luce (1963), the biased similarity
*b*_{j}*s*_{j} of

(A12) |

If is large

(A13) |

If we now substitute

(A14) |

Eq. (A13) simplifies to

(A15) |

Let us now turn to the SLP. As stated in Eq. (5), the function
*s*_{j} is defined as

(A16) |

Using

(A17) |

we find that

(A18) |

Thus, we find for *b*_{j} < 0 and
|*b*_{j}| large

(A19) |

Expression (A19) is equivalent to Eq. (A15) for the similarity
of the SCM with the prototypes at infinite distance
from the origin. Thus we find that, in this limit case, the SLP-biases
*b*_{j} are equivalent to the SCM parameters ,
which stand for , and the SLP-weights
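
The limiting behaviour can also be illustrated numerically. A minimal sketch, assuming logistic node outputs normalised by their sum for the SLP, and an exponential choice rule (exponentials of a linear function of the features, normalised over classes) as the form that the SCM takes once the distances are linear in the features. All biases are shifted by a common amount towards minus infinity. The parameter values and function names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    """Logistic function, assumed here as the node activation."""
    return 1.0 / (1.0 + np.exp(-x))

def slp_probabilities(F, W, b):
    """Node outputs normalised by their sum (assumed response rule)."""
    s = sigmoid(W @ F + b)
    return s / s.sum()

def exponential_choice(F, W, b):
    """Choice rule with exp(w_j . F + b_j) as the biased similarity.
    A common shift of all biases cancels in the normalisation."""
    z = W @ F + b
    e = np.exp(z - z.max())  # subtract the maximum for numerical stability
    return e / e.sum()

W = np.array([[1.0, -0.5], [0.2, 0.8], [-0.7, 0.3]])  # hypothetical weights
beta = np.array([0.4, -0.2, 0.1])                     # relative biases
F = np.array([0.5, -1.0])

gaps = []
for shift in (0.0, 5.0, 10.0, 20.0):
    b = beta - shift  # all SLP biases become large and negative
    gap = np.max(np.abs(slp_probabilities(F, W, b) - exponential_choice(F, W, b)))
    gaps.append(gap)
    print(shift, gap)
```

For strongly negative biases the logistic function is close to an exponential, so the two sets of probabilities agree to within a difference that vanishes as the shift grows.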

**Appendix 2**

In this appendix it is shown that local extrema can
exist in the SLP-based model with an arbitrary number of classes
and an arbitrary dimension of the feature space. Only the case of *p*_{1}(__F__)
is considered. Other classes follow by symmetry.
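
Before the general argument, a one-dimensional, two-class case gives a concrete instance. The sketch assumes logistic node outputs normalised by their sum. The weights (2 and 1) and the zero biases are hypothetical values chosen so that the extremum can also be found by hand.

```python
import numpy as np

def sigmoid(x):
    """Logistic function, assumed here as the node activation."""
    return 1.0 / (1.0 + np.exp(-x))

def p1(F, w1=2.0, b1=0.0, w2=1.0, b2=0.0):
    """Probability of class 1 for a scalar feature F in a two-class model."""
    s1 = sigmoid(w1 * F + b1)
    s2 = sigmoid(w2 * F + b2)
    return s1 / (s1 + s2)

grid = np.linspace(-10.0, 10.0, 20001)  # step of 0.001
values = p1(grid)
k = int(np.argmax(values))
F_star = grid[k]

print(F_star, values[k])       # interior maximum of p1
print(values[0], values[-1])   # values at the two ends of the grid
```

Here *p*_{1} rises from near 0, passes through a maximum at a finite feature value, and then falls back towards 0.5. Setting the derivative of log(*s*_{1}/*s*_{2}) to zero gives 2(1 - *s*_{1}) = 1 - *s*_{2}, which for these values is solved by *F* = ln(1 + √2) ≈ 0.88, in agreement with the grid search.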

The SLP with *N*_{F} input nodes and

*d _{i}*(

(A20) |

Implicitly, *s*_{i} depends on the coefficients

Differentiating, we obtain

(A21) |

which should vanish for each *j*. Using the fact that *s*_{1}
> 0, , this leads to

(A22) |

for all *j*. It is our purpose to prove that the *w*_{11},
*w*_{21},…, *b*_{1} can be found
such that Eq. (A22) holds for each *j*, given the other coefficients
*w*_{ji} and

(A23) |

and

(A24) |

We obtain

(A25) |

which implies

(A26) |

which yields ()

(A27) |

So all we have to do is to choose the *w*_{j}

This shows that it is possible, in general, for *p*_{1}(__ F__)
to have a local extremum for bounded

In the limit case of , *s*_{1}
will in general tend to 0 or 1 (only in particular cases this
will not be the case). In any case for
each , since
will in general tend to 0. So, if ,
must tend to 0 to avoid degeneration, which means that every *s*_{k},

© 1996 Roel Smits and Louis Ten Bosch
