The comparative adequacy of various psychophysical procedures employed to measure visual thresholds has been investigated. Adequacy is defined by (a) reliability, (b) inferred validity, and (c) sensory-determinacy. Reliability refers to the extent to which repetitions of measurements made under presumably identical experimental conditions differ from each other.
Inferred validity refers to the extent to which measurements depend upon variables which are generally conceded to be irrelevant to the visual functions of interest. Sensory-determinacy refers to the absolute magnitude of the threshold. The lower the threshold, the more sensory-determinate it is considered to be. Procedural variables which have been studied include the response utilized by the subject to indicate discrimination; the number, spacing, and order of light intensities presented in the measurement series; the general attitude which the subject adopts; and the extent to which the subject is given knowledge of the correctness of his responses.
Psychophysical procedures have been found which appear to have optimum reliability, validity, and sensory-determinacy.
An absolute threshold is the smallest level of stimulus that can be detected, usually defined as at least half the time.
The term is often used in neuroscience and experimental research and can be applied to any stimulus that can be detected by the human senses including sound, touch, taste, sight, and smell. For example, in an experiment on sound detection, researchers may present a sound at varying levels of volume. The smallest level that a participant is able to hear is the absolute threshold. However, it is important to note that at such low levels, participants may only detect the stimulus part of the time.
For hearing, the absolute threshold refers to the smallest level of a tone that can be detected by normal hearing when there are no other interfering sounds present. An example would be measuring the level at which participants can just detect the ticking of a clock. Young children generally have a lower absolute threshold for sounds, since the ability to detect sounds at the lowest and highest ranges tends to decrease with age.
For vision, the absolute threshold refers to the smallest level of light that a participant can detect. Determining the absolute threshold for vision might involve measuring the distance at which a participant can detect the presence of a candle flame in the dark.
For example, imagine that you are a participant in a psychology experiment. You are placed in a dark room and asked to signal when you are first able to detect the presence of light at the other end of a long room. In order to determine the absolute threshold, you would go through a number of trials. During each trial, you would signal when you are first able to detect the presence of light.
The smallest level that you are able to detect half of the time is your absolute threshold for light detection. In one classic experiment, researchers found that after controlling for dark adaptation, wavelength, location, and stimulus size, the human eye was able to detect a stimulus between the range of 54 and photons.
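The "detected half of the time" criterion can be made concrete with a small sketch. It assumes hypothetical data (a list of stimulus levels and the proportion of trials on which each was detected) and linearly interpolates to the 50% point. The function name, the units, and the numbers are illustrative assumptions, not values from any study.

```python
def absolute_threshold(levels, p_detect, criterion=0.5):
    """Linearly interpolate the stimulus level detected on `criterion` of the trials.

    levels:   stimulus levels (arbitrary units, hypothetical data)
    p_detect: proportion of trials on which each level was detected
    """
    points = sorted(zip(levels, p_detect))
    for (x0, p0), (x1, p1) in zip(points, points[1:]):
        # Find the first pair of neighbouring levels that brackets the criterion.
        if p0 <= criterion <= p1 and p1 > p0:
            return x0 + (criterion - p0) * (x1 - x0) / (p1 - p0)
    raise ValueError("criterion is not bracketed by the data")


# Illustrative (made-up) detection proportions for five light levels.
levels = [1, 2, 3, 4, 5]
p_detect = [0.0, 0.1, 0.4, 0.8, 1.0]
threshold = absolute_threshold(levels, p_detect)
```

Interpolation is only one simple choice. Fitting a smooth psychometric function to the proportions is a common alternative when the data are noisy.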
For odors, the absolute threshold involves the smallest concentration that a participant is able to smell. An example of this would be to measure the smallest amount of perfume that a subject is able to smell in a large room. The absolute threshold for smell can vary considerably depending upon the type of odor used, the dilution methods, the data collection methods the researchers are utilizing, characteristics of the participants, and environmental factors.
Even the time of day that data is collected can have an influence on the absolute threshold.

We then extract reaction thresholds from reaction probabilities and from reaction times to tones of different levels and durations, compare those thresholds, and finally examine the relationship between reaction and detection thresholds directly. In this section, we show that there is substantial intersession variation in detection threshold levels in excess of intrasession variation.
We can do this, because we measured, for all listeners and in combination with RT experiment A, detection threshold levels for the same 12 stimuli in each session. The differences shown are averages across two independent groups of stimuli, viz. These data strongly point to the existence of substantial intersession variation in detection thresholds. Detection thresholds are subject to intersession variation.
Figure 2 shows that this was indeed the case in nearly every listener. Figure 2a shows that for 20 of 22 listeners included in this analysis, the observed intersession variance of the average of the detection thresholds for all 12 stimuli, V mean, is larger, and in many cases considerably so, than the one expected in the absence of systematic intersession variation, i.
Figure 2 b provides a scatterplot of V inter versus V intra , calculated with Eqs. They sum to yield V total [Eq. It is apparent that V inter can be as large as or even exceed V intra. Note that the observed variance considerably exceeds the one predicted with this assumption. Note that the intersession variance can be as large as, or even larger than, the intrasession variance. The figure includes data from 11 additional listeners who participated in measurements of detection thresholds but not in RT experiments, yielding 22 listeners in total.
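The statement that V inter and V intra sum to V total can be illustrated with a short sketch. The equations referred to in the text are not reproduced here, so the following is an assumption: it uses the standard decomposition of a population variance into a between-session and a within-session part, after removing each stimulus's mean threshold. The function name and the data layout (`thresholds[session][stimulus]`, in dB) are hypothetical.

```python
from statistics import fmean, pvariance


def variance_components(thresholds):
    """Split threshold variance into inter- and intrasession parts.

    thresholds[s][k] is the detection threshold (dB) in session s for stimulus k.
    Assumes every session contains the same stimuli (equal group sizes).
    """
    n_stim = len(thresholds[0])
    # Remove each stimulus's mean so that stimuli with different thresholds can be pooled.
    stim_means = [fmean(row[k] for row in thresholds) for k in range(n_stim)]
    dev = [[row[k] - stim_means[k] for k in range(n_stim)] for row in thresholds]

    v_inter = pvariance([fmean(row) for row in dev])      # variance of session means
    v_intra = fmean(pvariance(row) for row in dev)        # mean within-session variance
    v_total = pvariance([d for row in dev for d in row])  # variance of all deviations
    return v_inter, v_intra, v_total


# Made-up thresholds: 3 sessions x 4 stimuli.
thresholds = [
    [10.0, 12.0, 8.0, 11.0],
    [12.5, 14.0, 10.5, 13.0],
    [9.0, 11.5, 7.5, 10.0],
]
v_inter, v_intra, v_total = variance_components(thresholds)
```

With equal numbers of stimuli per session, `v_inter + v_intra` equals `v_total` exactly (up to rounding), which is the additive relationship the text describes.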
We did observe a weak tendency for detection thresholds to improve with session number, consistent with findings of other researchers e.
This improvement may be attributable to perceptual learning in some listeners, but other factors are clearly involved as well. Listener L9 quit smoking halfway through experiment A, at which point her sensitivity improved rapidly by about 4 dB. In any event, these analyses clearly show that there is considerable intersession variation in a listener's sensitivity.
Such variation will inevitably broaden psychometric functions when computed from detection responses collected on different days relative to functions obtained from responses on a given day. Reaction probabilities and reaction times also exhibited substantial intersession variation. To demonstrate this intersession variation and to analyze it quantitatively, we exploit the fact that the reference stimulus Corresponding data for listener L5 are shown in Figure 4. Data obtained in a given session are represented by the same symbol and are connected by lines, and in the following are referred to as RP L and RQ L functions.
The panels also allow one to appreciate that the shapes of the RP L functions and of the RQ L functions obtained from a given listener on different days are rather similar. Thus, most of the variation appears to be attributable to simple displacements of the individual functions relative to each other along the level axis, and in the case of RQ also along the RQ axis.
Consequently, it should be possible to bring these functions into close register by appropriate corrections for such displacements. Indeed, this proves to be the case, as shown next.

Figure 3. Reaction probability RP (a, c) and RQ (b, d) as functions of level L. Each function was derived from a single session. In the top row (a, b), L is the SPL corresponding to the stimulus maximum (plateau) amplitude. These are the raw data without any corrections applied (CP0). In the bottom row, the same data are shown after application of correction procedure 2 (CP2) for RP (c) and of correction procedure 3 (CP3) for RQ (d), as described in the text. Note the close alignment of the functions from different days.

Figure 4. Data as in Figure 3, but for listener L5. Same format as in Figure 3.

We explored three correction procedures, termed CP1, CP2, and CP3, and quantified their effect on a measure of distance between the functions of different days. This measure was derived by first obtaining the differences between all possible pairs of functions at supporting points for each pair.
These points were equally spaced along the level axis over the range from the lowest to the highest stimulus levels shared by both functions of a pair.
We then averaged the squared differences for each pair of functions and divided the sum of these averages by the number of pairs of functions. Medians symbols and interquartile ranges error bars , across all listeners and experiments, of the measures of distance resulting from the different correction procedures and normalized with respect to those for the original RP L and RQ L functions i.
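The distance measure described above (pairwise differences at equally spaced supporting points over the shared level range, squared and averaged per pair, then averaged over pairs) can be sketched as follows. The number of supporting points, the use of linear interpolation between measured levels, and all names are assumptions made for illustration. Each function is a list of `(level, value)` pairs with strictly increasing levels.

```python
from itertools import combinations


def interp(func, x):
    """Piecewise-linear interpolation of a function given as sorted (level, value) pairs."""
    for (x0, y0), (x1, y1) in zip(func, func[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x lies outside the sampled level range")


def distance_measure(functions, n_points=50):
    """Mean over all pairs of the mean squared difference between two functions."""
    pair_means = []
    for f, g in combinations(functions, 2):
        lo = max(f[0][0], g[0][0])    # lowest level shared by both functions
        hi = min(f[-1][0], g[-1][0])  # highest level shared by both functions
        # Equally spaced supporting points; min() guards against rounding past hi.
        xs = [min(hi, lo + (hi - lo) * i / (n_points - 1)) for i in range(n_points)]
        squared = [(interp(f, x) - interp(g, x)) ** 2 for x in xs]
        pair_means.append(sum(squared) / len(squared))
    return sum(pair_means) / len(pair_means)


# Two made-up sessions whose functions differ by a constant offset of 0.1.
session_a = [(0.0, 0.0), (10.0, 1.0)]
session_b = [(0.0, 0.1), (10.0, 1.1)]
d = distance_measure([session_a, session_b])
```

For two functions separated by a constant vertical offset, the measure reduces to the squared offset, which gives a quick sanity check.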
The second correction procedure CP2 shifted each RP L and RQ L function obtained on a given day along the level axis by the amount required to minimize the measure of distance without altering the mean SPL across all days. For RP, the correction produced by this procedure was essentially optimal. For RQ, the correction produced by CP2 was not yet optimal. As expected, the measures of distance between these functions are smaller than with CP2 Fig.
For reasons of space, we restrict all following analyses to the displacement estimates derived from those correction procedures that yielded the optimal alignment of RP L and RQ L functions from different days, viz. The relative displacements of the RP L and the RQ L functions along the level axis estimated with these procedures viz. First, a straight line was fit to the data. We used a perpendicular regression for this purpose because both variables are subject to error; an ordinary linear regression, which minimizes only the vertical distances of the data points from the regression line, would therefore be inappropriate.
The perpendicular regression, which minimizes the sum of the squared perpendicular distances of the data points from the regression line, is well suited here because both axes have the same units. It yielded a regression line with an intercept of 0 and a slope of 1. Their mean was 0 dB and the SD was 0. A Kolmogorov-Smirnov test revealed that the null hypothesis of a normal distribution of the differences around zero could not be rejected.
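A perpendicular (orthogonal) regression of the kind described can be computed in closed form. The sketch below is one minimal way to do it, assuming equal error variance on both axes; the function name and the sample data are illustrative and not taken from the study.

```python
import math


def perpendicular_regression(x, y):
    """Fit y = slope * x + intercept by minimizing squared perpendicular distances."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxy == 0:
        raise ValueError("slope is undefined when x and y are uncorrelated")
    # Closed-form orthogonal-regression slope (equal error variances on both axes).
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up displacement estimates (dB) from two measures.
x = [0.0, 1.0, 2.0, 3.0]
y = [0.1, 0.9, 2.1, 2.9]
slope_xy, intercept_xy = perpendicular_regression(x, y)
slope_yx, _ = perpendicular_regression(y, x)
```

Unlike ordinary least squares, the result is symmetric in the two variables: swapping the axes yields the reciprocal slope, so neither measure is privileged as the "independent" one.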
Note their close correlation and lack of systematic differences. Note their independence. Here, the perpendicular regression yielded a straight line with an intercept of 0 and a slope of 1. Note the close and presumably causal relationships.
Note the close correlation of the three measures. These intimate and likely causal relationships between intersession differences in detection threshold levels and in the positions of RP L and RQ L functions along the level axis are also reflected in Figure 7 c. The panel emphasizes the common fluctuations of these measures from session to session.
Of course, due to averaging these fluctuations are smaller than those seen in individual listeners cf. In summary, these analyses strongly suggest that the true relative displacements of RP L and RQ L functions along the level axis are identical, and also identical with the differences in detection thresholds.
The small differences between the estimates of these displacements and differences, viz. It is thus more likely that the drop is caused by other factors. Rather, the values appeared to fluctuate irregularly around zero. The analyses reported so far were based on RP and RQ not corrected for false alarms.
It is conceivable that intersession differences in false alarms might have biased the displacement estimates. However, this was not the case. First, we found that for all listeners but one the observed numbers of false alarms in different sessions of an experiment were compatible with binomial distributions of constant probability (binomial statistics were used because the number of false alarms on a given catch trial could only be 1 or 0).
The only exception was listener L1. She had completed 12 sessions of experiment A and 12 of B before completing another four sessions of experiment A, about a year after the initial ones. In these later sessions, her probability for producing false alarms had dropped more than fold from the previous value. Therefore, we treated the last four sessions as a separate repeat experiment. The mean difference in displacements along the RQ axis was 0 ms and the SD 3 ms. In this section, we use RP to extract reaction thresholds for stimuli of different durations.
We first focus on the RP to the reference stimuli Because these functions from different days are in close register see Figs. An additional example L4 is shown in Figure 8 filled circles. For simplicity, we refrain here from including the generally low probability of lapses see Treutwein and Strasburger as another free parameter. The normal distribution function provided excellent least-squares fits to the data continuous lines through filled circles in Fig.
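A least-squares fit of a cumulative normal distribution to reaction probabilities can be sketched as below. This is an assumption-laden illustration: it uses a plain grid search over the mean and standard deviation instead of whatever optimizer the authors used, it omits lapses (as the text does), and the data are synthetic. The mean of the fitted function is the level at which the fitted probability is 0.5; the threshold criterion used in the study is not restated here.

```python
import math


def normal_cdf(x, mu, sigma):
    """Cumulative normal distribution function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def fit_normal_cdf(levels, probs, mu_range, sigma_range, steps=200):
    """Grid-search the (mu, sigma) that minimizes the summed squared error."""
    best = None
    for i in range(steps + 1):
        mu = mu_range[0] + (mu_range[1] - mu_range[0]) * i / steps
        for j in range(steps + 1):
            sigma = sigma_range[0] + (sigma_range[1] - sigma_range[0]) * j / steps
            sse = sum((normal_cdf(x, mu, sigma) - p) ** 2 for x, p in zip(levels, probs))
            if best is None or sse < best[0]:
                best = (sse, mu, sigma)
    return best[1], best[2]


# Synthetic reaction probabilities generated from mu = 10 dB, sigma = 2 dB.
levels = [4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0]
probs = [normal_cdf(x, 10.0, 2.0) for x in levels]
mu_hat, sigma_hat = fit_normal_cdf(levels, probs, (5.0, 15.0), (0.5, 4.5))
```

The lower bound of `sigma_range` must be positive. A grid search is slow but has no dependencies and cannot diverge, which makes it convenient for a sketch.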
For reasons of space, we restrict the following analyses to RPs corrected for false alarms. The difference between the two measures increased as F FA increased, to a maximum of 2. Psychometric functions and reaction thresholds derived from response probabilities. The inset shows a blowup of the bottom left-hand section of the functions. In each experiment, we also presented stimuli with durations other than Table 1.
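The correction formula itself is not given in this passage. For orientation only, the sketch below shows one widely used textbook correction (the high-threshold "correction for guessing"), which rescales the observed probability by the false-alarm probability. Whether the authors used this exact form is an assumption, and the function name is hypothetical.

```python
def correct_for_false_alarms(p_observed, p_false_alarm):
    """Textbook correction for guessing: (p - fa) / (1 - fa), floored at 0.

    This is an assumed, generic correction and not necessarily the one used in the study.
    """
    if not 0.0 <= p_false_alarm < 1.0:
        raise ValueError("false-alarm probability must lie in [0, 1)")
    return max(0.0, (p_observed - p_false_alarm) / (1.0 - p_false_alarm))


# With a false-alarm probability of 0.2, an observed probability of 0.6 corrects to 0.5.
p_corrected = correct_for_false_alarms(0.6, 0.2)
```

Under this correction the gap between observed and corrected probabilities grows with the false-alarm probability, in line with the text's remark that the two measures diverge as F FA increases.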
Figure 9 a shows such data for listener L11 and for stimuli A to A, in addition to that for stimuli A-1 to A The data for the shorter-duration stimuli were of course less extensive and covered a psychometric function only incompletely. Therefore, they were discarded from further analyses and are not shown in Figure 9 a. The reaction thresholds obtained from the two fitting variants were virtually identical in most cases, but in some cases, we had little confidence in the fits when both parameters were free.
Reaction thresholds for stimuli of different durations. Here, we explore the possibility that a reaction threshold can also be obtained from RT and not only from RP. The idea stems from the fact, shown above, that (1) RTs, just as RPs, are exquisitely sensitive to the variation in a listener's detection threshold, and (2) reaction thresholds are a function of stimulus amplitude and duration. One component of RT, a minimum reaction time, needs to be considered before we can embark on the extraction of a reaction threshold from RT.
RT is generally thought to include a minimum time needed to execute the proper response once the stimulus has been detected. Thus, in order to obtain the stimulus-dependent component of RT, RT min should be subtracted from the RT on every trial. However, RT min or its distribution is unknown and cannot be directly measured.
But, to a first approximation, it provides a reasonable fit to the dependence of RT on stimulus intensity in several sensory systems and in general a better fit than the alternative functions that have been suggested see Pins and Bonnet , ; Bonnet et al.
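Pieron's law, which the text names further on, is commonly written as RT = RT_min + k * I^(-beta), where I is stimulus intensity. The sketch below fits that generic form to synthetic data. It is an illustration under assumptions: the equation used in the study may differ in detail, the fit works in log space with a simple grid over RT_min instead of the authors' procedure, and all names and values are made up.

```python
import math


def pieron_rt(intensity, rt_min, k, beta):
    """Generic Pieron's law: reaction time as a function of stimulus intensity."""
    return rt_min + k * intensity ** (-beta)


def fit_pieron(intensities, rts, step=1.0):
    """Grid over rt_min; for each candidate, regress log(RT - rt_min) on log(I)."""
    log_i = [math.log(i) for i in intensities]
    n = len(rts)
    mean_x = sum(log_i) / n
    sxx = sum((x - mean_x) ** 2 for x in log_i)
    best = None
    rt_min = 0.0
    while rt_min < min(rts):  # keeps RT - rt_min strictly positive
        log_t = [math.log(rt - rt_min) for rt in rts]
        mean_y = sum(log_t) / n
        slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_i, log_t)) / sxx
        intercept = mean_y - slope * mean_x
        sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(log_i, log_t))
        if best is None or sse < best[0]:
            best = (sse, rt_min, math.exp(intercept), -slope)
        rt_min += step
    return best[1], best[2], best[3]  # rt_min, k, beta


# Synthetic RTs (ms) generated with rt_min = 200 ms, k = 50, beta = 0.3.
intensities = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
rts = [pieron_rt(i, 200.0, 50.0, 0.3) for i in intensities]
rt_min_hat, k_hat, beta_hat = fit_pieron(intensities, rts)
```

Fitting in log space weights relative errors equally, unlike a direct least-squares fit in milliseconds. With noisy data the two approaches can give noticeably different estimates of RT_min.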
In analogy to the way in which we derived thresholds from RPs, we used the RQs corrected for false alarms and for intersession differences, i. The latter values were converted into pressure units Pascal before fitting Eq. We restricted the fits to data from the reference stimuli that formed the level series in each experiment.
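The conversion of levels into pressure units mentioned above follows the standard definition of dB SPL, with a reference pressure of 20 micropascals. A minimal sketch (function names are illustrative):

```python
import math

P_REF = 20e-6  # standard reference sound pressure for dB SPL, in pascals


def spl_to_pascal(level_db):
    """Convert a sound pressure level in dB SPL to pressure in pascals."""
    return P_REF * 10.0 ** (level_db / 20.0)


def pascal_to_spl(pressure_pa):
    """Convert a pressure in pascals back to dB SPL."""
    return 20.0 * math.log10(pressure_pa / P_REF)


# A 20 dB step corresponds to a tenfold change in pressure.
p0 = spl_to_pascal(0.0)
p20 = spl_to_pascal(20.0)
```

Because the relation is exponential, equal steps in dB are equal ratios in pascals, which matters when a power-law model is fitted to pressure and not to level.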
The values of RQ min that were obtained from such fits ranged from about to ms in different listeners. This range was large relative to the within-listener variation of RQ min between experiments Figs. The latter did not exceed 15 ms and thus was similar to the intersession variation of RQ, viz.
Comparison of reaction thresholds obtained from reaction times with those obtained from reaction probabilities and with detection thresholds. Each panel a — h shows data from a different listener and experiment A or B. The ordinates represent level corresponding to the mean stimulus amplitude referred to dB SPL m , derived either from the entire stimulus duration or from the time-to-threshold. The crosses represent the functions relating SPL m of the reference stimuli of the level series Note the close match of reaction thresholds obtained from reaction times and reaction probabilities and their distance to detection thresholds.
Comparison of reaction thresholds with detection thresholds. SPL refers to the maximum stimulus amplitude. Data from the four listeners who contributed more than one data point are shown with different symbols and are connected by dashed lines. The thick line represents a running average over 5 points. Note the slight increase as stimulus duration decreases.
Other conventions as in b. Note the trade-off between RQ min and accuracy across listeners but the lack of such a trade-off within listeners. The left side of Eq. That initial portion of the stimulus can thus be considered as a threshold quantity. Consequently, the maximum stimulus amplitude, the one prevalent during the plateau, may not be the best measure from which to derive a specification of threshold level.
We have previously shown Heil and Neubauer , ; Neubauer and Heil that the mean amplitude during the interval from stimulus onset to the end of the initial portion that may suffice to evoke the response provides a better measure for the type of stimuli used here. Eight representative examples from six listeners are shown, with data from both experiments shown for two listeners in Figures 10 e—h. The subtraction of RQ min causes the data points to move to the left and slightly downward, as indicated for a single pair of corresponding crosses and open circles in Figure 10 a pointer and line.
The resulting threshold estimates are also plotted in Figure 10 see key and are virtually identical with those obtained from the reference stimuli open circles.
The two threshold estimates are essentially identical for listeners L2 Fig. For the other listeners, slight discrepancies between the two threshold estimates remain Fig. And the shortest RQs measured were always shorter than the RQ min estimated from a fit of Pieron's law, and were thus unexplained by that law.
Clearly, this issue needs further attention in future studies. In summary, by and large, very similar estimates of reaction thresholds are obtained from reaction probabilities and reaction times. In this final section, we compare the reaction and the detection thresholds directly.
The open squares in each panel of Figure 10 show the detection thresholds, L T D also in units of dB SPL m , for single-burst stimuli of different durations obtained from each of the six listeners, allowing an easy comparison with the reaction thresholds.
In each case, the two threshold functions are roughly similar and parallel, but the function for the reaction threshold always lies above that for the detection threshold obtained with the adaptive 3I-3AFC procedure. Figure 11 a plots data from another three listeners. We next calculated the difference in dB between the two threshold estimates, first at the stimulus duration of Two observations on those data are remarkable.
This means that a higher SPL is required to yield a probability of 0. This covariation is also seen in three L1, L2, L5 of the four individual listeners who took part in both experiments. F FA of L4 did not change much between experiments. In Figure 11 b, data from each of those listeners are represented by the same symbol and connected by dashed lines. The F FA of L1 in the four repeat sessions of experiment A see above had dropped more than fold from its previous value.
In addition, the difference between this reaction and the detection threshold would have been negative in the case of the largest F FA observed, a rather implausible difference. Figure 11 c plots this difference as a function of stimulus duration.
We have shown here that auditory thresholds for tones in quiet can be extracted from the reaction probabilities and reaction times of listeners in a simple RT paradigm, and that such reaction thresholds are somewhat higher than the detection thresholds measured from the same listeners with an adaptive forced-choice procedure under otherwise nearly identical conditions.
In a first step, we showed that RP and RT, even to clearly audible stimuli, are intimately linked to detection thresholds by exploiting the intersession variation in RP, RQ, and detection thresholds Figs. To demonstrate this covariance, it was essential to probe the auditory system with a sufficient percentage of near-threshold stimuli to cover the steep portions of the psychometric, RP L , functions and of the RQ L functions.
The intersession variations and their magnitudes of RP, RQ, and detection thresholds are interesting in their own right, and raise questions regarding the factors that cause such variation, particularly that of detection thresholds. Variations in absolute thresholds have been previously observed in studies concerned with test—retest reliability of pure-tone thresholds see, e.
It is unlikely that differences in headphone placements on different days are a major cause of this variation. First, the output of the special circumaural earphone used is largely independent of placement, as long as it is placed correctly own measurements with an artificial ear yielded an estimate of variability of about 0.
Second, potential placement effects would tend to average out because listeners put the headphones on and took them off at least three times in the course of a session. Thus, if their exact shapes are of interest, e. A major purpose of our study was to compare reaction and detection thresholds. This comparison required the use of comparable measures.
We achieved this by determining the level of a stimulus, of given envelope and duration, which led to an RP of 0. When specified in dB SPL, both thresholds decrease in an orderly fashion with increasing stimulus duration Figs. To extract a threshold from RT, we determined RQ, the 0. We then followed the reasoning and the approach developed to extract the thresholds of single auditory neurons from their first-spike latencies (Heil and Neubauer; Heil). The approach is based on the rationale that the response (the first spike in the case of a neuron; the reaction in the case of a listener in an RT experiment) following the onset of a stimulus is evoked when the stimulus reaches the threshold of the neuron or the listener, respectively, but occurs with some delay thereafter.
In the case of a neuron, the delay is mainly a transmission delay that can be considered constant for a given neuron and for a given stimulus frequency see Heil and Neubauer , ; Heil for discussion. In the case of a listener in an RT experiment, the delay, RQ min , comprises transmission delays in both the sensory and the motor pathways.
In addition, the required motor response is under conscious control and, thus, the time of its initiation and the speed with which it is executed will be subject to more variability than a passive transmission delay in a sensory system of an anesthetized animal. Thus, the assumption of a constant RQ min for a given listener during a given session, as done here, can only be a first approach. Nevertheless, even this simple approach showed that the thresholds derived from RQ are very similar to those derived from RP Fig.
A better model may reveal that they are identical. However, when we derived the reaction threshold from RPs uncorrected for false alarms, we obtained the reverse order in the listener with the highest probability of producing false alarms. This emphasizes the need to correct RPs and RTs for false alarms, especially when the probability for false alarms is high, before meaningful conclusions can be drawn.
Already early models of simple RT include a decision stage e. As listeners may have some resistance to react, they may need to accumulate more evidence to reach the decision criterion to press the key in an unforced design than necessary for detection in a forced-choice design. Therefore, reaction thresholds can be expected to be higher than, or possibly equal to, detection thresholds.
This conclusion was also drawn by Pfingst et al. This amount is expressible in decibels and when done appears to increase slightly as stimulus duration decreases Fig.
Because this relationship is derived from RP corrected for the effects of false alarms, it indicates that a large F FA is the consequence, and not the origin, of a small amount of evidence, and vice versa. This relationship could, in some sense, also be viewed as a speed—accuracy trade-off. Thus, with everything else unchanged, the listener would produce fewer false alarms, when the required amount of evidence is high than when it is low, but would take longer to respond to a stimulus of a given SPL.
The delay would be more pronounced as the SPL becomes lower, consistent with the observed stimulus-intensity-dependent effects of speed versus accuracy emphasis on RT e.
On the other hand, however, and provided the stimuli cover the steep portion of the psychometric function, the proportion of misses will also increase when the required amount of evidence increases. This stresses the need to precisely define the term accuracy before dealing with this type of trade-off. Currently, we have no simple explanation for this phenomenon. In this context, it is interesting that Seitz and Rakerd observed that an individual's overall speed is largely modality-independent.
Simple RT to auditory stimuli is generally thought to be a measure of the loudness of the test stimuli. The idea was originally proposed by Chocholle , who studied RT as a function of sound frequency and level. A number of subsequent investigations obtained results that supported Chocholle's conclusion of a close inverse relationship between loudness and RT e. To our knowledge, only Kohfeld and colleagues Santee and Kohfeld ; Kohfeld et al.