2.4 Summary
To conclude this section, estimates of error rates are dependent upon the size of the
sample and the analysis methods used. In order to estimate error rates accurately, we
need datasets big enough or statistical measures sensitive enough to capture examples
of, and to estimate rates of, low frequency errors, short lived errors and errors in low
frequency structures. However, even if we employ such methods, we should be especially cautious about concluding that the data support our hypothesis
when we know that the methods we have used may bias the results in its favour. Those
who hypothesize that error rates will be low for a certain structure (e.g., Hyams 1986)
must recognize that overall error rates are likely to under-estimate rates of error in low
frequency parts of the system. Those who argue for high error rates in low frequency
structures (e.g., Maratsos 2000) cannot point to high error rates in individual samples
or at particular points in time as support for their predictions, unless they have also
demonstrated that such error rates cannot be attributed to chance variation.
3. Sampling and the investigation of productivity
A second issue at the heart of much recent work is the extent to which children have
productive knowledge of syntax and morphology from a very early age. Many have
claimed that children have innate knowledge of grammatical categories from the outset (e.g., Hyams 1986; Pinker 1984; Radford 1990; Valian 1986; Wexler 1998). In support is the fact that even children’s very first multi-word utterances obey the distributional and semantic regularities governing the presence and positioning of
grammatical categories.
However, others have claimed that children could demonstrate adult-like levels of
correct performance without access to adult-like knowledge, simply by applying much
narrower scope lexical and/or semantic patterns such as agent + action or even ingester
+ ingest or eater + eat. In support are studies on naturalistic data that suggest that children’s performance, although accurate, may reflect an ability to produce certain high
frequency examples of grammatical categories, rather than abstract knowledge of the
category itself (e.g., Bowerman 1973; Braine 1976; Lieven, Pine and Baldwin 1997;
Maratsos 1983). These studies suggest that we cannot attribute abstract categorical
knowledge to children until we have first ruled out the possibility that their utterances
could be produced with only partially productive lexically-specific knowledge.
This is clearly a valid argument. However, it is equally important that we do not
assume that lexical specificity in children’s productions equates simply and directly to
partial productivity in their grammar. In fact, the apparent lexical specificity of children’s speech may sometimes simply be an artefact of the fact that researchers are analysing samples of data. There are three potential problems. First, even in big samples,
we capture only a proportion of the child’s speech, which means children are unlikely
to demonstrate their full range of productions. Second, the frequency statistics of the
language itself may bias the analysis in favour of a few high frequency structures. Third,
the productivity of the child’s speech is limited by the range of lexical items they have
in their vocabulary. These three problems are illustrated below.
3.1 The effect of sample size on measures of productivity
In small samples, the presence or absence of just one or two utterance types can have a
large effect on the proportion of utterances that can be explained in terms of a small
number of lexical frames. In particular, the chance capture of just one or two tokens of
a high frequency utterance type can increase the proportion of data accounted for by
this utterance type quite dramatically. Conversely, the chance capture of one or two
tokens of a low frequency utterance type will decrease the amount of data accounted
for by high frequency types. In other words, the smaller the sample, the greater the
possibility that the analysis will either over- or under-estimate the degree of lexical
specificity in the data.
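To make the arithmetic concrete, the short Python sketch below uses invented figures (not Lara's data) to show how the chance capture of just two extra tokens of a high frequency frame shifts the estimate far more in a 25-utterance sample than in a 250-utterance one.

```python
# Hypothetical numbers (not Lara's data): the proportion of utterances accounted
# for by the top lexical frames before and after two extra frame tokens are
# captured by chance, in a small and in a larger sample.

def frame_proportion(frame_tokens, total_tokens):
    return frame_tokens / total_tokens

for total, covered in [(25, 15), (250, 150)]:           # 60% covered in both cases
    before = frame_proportion(covered, total)
    after = frame_proportion(covered + 2, total + 2)     # two chance tokens of a frequent frame
    print(f"sample of {total:>3} utterances: {before:.1%} -> {after:.1%}")

# The 25-utterance sample jumps from 60.0% to 63.0%,
# while the 250-utterance sample barely moves (60.0% -> 60.3%).
```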
Rowland and Fletcher (2006) tested the effect of sample size on estimates of lexical
specificity in English wh-question acquisition directly. The idea that children’s early
wh-questions may be based on semi-formulaic question frames dates back over 20
years to Fletcher (1985), who argued that the earliest correct wh-questions produced
by the child in his study could be explained in terms of three formulaic patterns. Rowland and Fletcher used the intensive data from Lara at age 2;8 (described in section
2.1.2) to compare the lexical specificity of wh-question data in different sized samples.
They extracted all correct object and adjunct wh-questions from the intensive sample,
and then created three further smaller sample sizes out of these data using a randomizing algorithm. The smaller samples represented sampling densities of four hours per
month, two hours per month and one hour per month. For each sample, they then
calculated how many of the child’s wh-questions could have been produced simply by
the application of the three most frequent lexical frames. A frame was defined as a wh-word + auxiliary unit (a pivot; e.g., what are, where have), which combined with a
number of lexical items (variable) to produce a pivot + variable pattern (e.g., what are
+ X; where have + X; see Rowland and Pine 2000).
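The logic of this analysis can be sketched in a few lines of Python. The sketch below is illustrative only (the toy question list, sample size and number of samples are invented; it is not Rowland and Fletcher's actual script), but it follows the procedure described above: extract the wh-word + auxiliary pivot from each question and measure how much of each random subsample the three most frequent pivots cover.

```python
# Illustrative sketch of the analysis: extract the wh-word + auxiliary "pivot"
# from each question, draw smaller random samples, and measure the proportion
# of each sample covered by its three most frequent pivots.
import random
from collections import Counter

def pivot(question):
    """Return the wh-word + auxiliary frame, e.g. 'what are' from 'what are you doing'."""
    words = question.lower().split()
    return " ".join(words[:2])

def top_frame_coverage(questions, n_frames=3):
    counts = Counter(pivot(q) for q in questions)
    covered = sum(count for _, count in counts.most_common(n_frames))
    return covered / len(questions)

def subsample_coverage(questions, sample_size, n_samples=7, seed=1):
    rng = random.Random(seed)
    return [top_frame_coverage(rng.sample(questions, sample_size))
            for _ in range(n_samples)]

# Invented toy data (the real analysis used Lara's intensive wh-question corpus):
toy_questions = (["what are you doing"] * 40 + ["where has it gone"] * 25 +
                 ["what is that"] * 20 + ["why can he jump"] * 10 +
                 ["where do they live"] * 5)
print(top_frame_coverage(toy_questions))      # coverage in the full toy "corpus"
print(subsample_coverage(toy_questions, 20))  # coverage in seven 20-question subsamples
```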
Table 4 indicates the effect of sample size on estimates of lexical specificity, based
on the same data reported in Rowland and Fletcher (2006).

Table 4. Effect of sample size on estimates of lexical specificity in Lara's wh-questions

% Questions accounted for by three most frequent lexical frames

Smaller samples:
                    Mean across       Standard          Lowest rate from    Highest rate from
                    seven samples     deviation (sd)    any individual      any individual
                    (%)                                 sample (%)          sample (%)
  4-Hour samples    78.00              5.77             70                  86
  2-Hour samples    78.29              8.90             68                  92
  1-Hour samples    76.19             14.77             50                  92

Intensive diary data: 76%

The table demonstrates that a substantial proportion (76%) of the questions recorded in the
intensive diary data could have been based on just three lexical frames. Some of the smaller
samples yielded measures of lexical specificity very similar to those gathered from the
intensive data despite being based on much smaller numbers of utterances. However,
many of the individual small samples yielded inaccurate estimates, which meant that
the chances of any one sample grossly under- or over-estimating the rate of lexical
specificity increased as sample size decreased. For example, the estimates based on
the one-hour samples varied between 50% and 92%. Thus, if Lara's questions had been
sampled for only one hour per month, the data would have been as likely to over-estimate
(92%) as to under-estimate (50%) the lexical specificity of Lara's data. In other words,
with a small sample, it would be chance that determined whether Lara’s data supported
or undermined the claim that lexical frames underlie children’s early productions.
3.2 The effect of frequency statistics on measures of productivity
A second possible confound is the effect of the frequency statistics of the language being learned on estimates of lexical specificity/productivity. The traditional measure of
lexical specificity is to calculate the proportion of children’s utterances that could be
produced using a small number of lexical frames (e.g., a + X; the + Y; Pine and Lieven
1997). However, even in adult speech, speakers tend to over-use a small number of
words (e.g., the verbs do and be), and under-use a much larger number of words (e.g.,
bounce, gobble; see e.g., Cameron-Faulkner, Lieven, and Tomasello 2003). This means
that a small number of items will tend to account for a large proportion of the observed occurrences of a grammatical category, even in speakers with abstract adult-like
knowledge of the category. Thus, analyses on naturalistic data samples are likely to
under-estimate the variety and productivity of children’s speech (Naigles 2002).
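A small simulation makes the point concrete. In the illustrative sketch below (the vocabulary size, token count and Zipf-style weights are assumptions, not figures from the studies cited), a speaker who chooses verbs with complete productivity, but with realistically skewed item frequencies, still looks 'lexically specific' on the traditional measure.

```python
# Illustrative simulation: a fully productive speaker with Zipf-skewed verb
# frequencies still shows high "top three items" coverage on the traditional measure.
import random
from collections import Counter

random.seed(0)
vocab = [f"verb{i}" for i in range(1, 51)]              # 50 verbs the speaker "knows"
weights = [1 / rank for rank in range(1, 51)]           # Zipf-like frequency skew

tokens = random.choices(vocab, weights=weights, k=500)  # 500 productively chosen verb tokens
counts = Counter(tokens)
top3 = sum(count for _, count in counts.most_common(3))
print(f"Top 3 of 50 verbs account for {top3 / len(tokens):.0%} of tokens")
# Roughly 40% of tokens come from just three verbs, even though every verb
# is equally "available" to this speaker's (fully abstract) grammar.
```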
Similarly, correlations between frequency of use in caregiver’s speech and order of
acquisition in the child’s speech have traditionally been seen as evidence that children
are first acquiring knowledge of the most highly frequent lexical constructions that
they are hearing (e.g., Diessel and Tomasello 2001). However, the correlation could
simply reflect the fact that the most frequently produced examples of a structure are
those that are most likely to occur in the early samples. For example, suppose that, in
order to investigate the order of acquisition of different verbs, we collect 100 utterances per week for five weeks. We are very likely to capture frequent verbs (e.g., verbs
that occur at least once per 100 utterances) in our very first sample (after we have collected 100 utterances). However, verbs that occur less frequently are very unlikely to
occur in our first sample. For example, it is only after two weeks (i.e. after we have collected 200 utterances) that we are likely to capture at least one example of verbs that occur once every 200 utterances. It will take us five weeks (500 utterances) before we can
expect to capture a verb that occurs once every 500 utterances. In other words,
more frequent verbs are more likely to occur in earlier samples (and thus be identified
as early acquired) than less frequent verbs, even if both verbs were acquired before the
beginning of the sampling period.
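This reasoning can be made explicit with a simple probability calculation. The sketch below is a worked example only, and assumes for illustration that utterances are sampled independently at a constant rate.

```python
# Worked example: probability that a verb of a given underlying rate appears at
# least once in the first weekly sample of 100 utterances (independence assumed).

def p_captured(rate_per_utterance, sample_size):
    return 1 - (1 - rate_per_utterance) ** sample_size

for label, rate in [("1 per 100", 1 / 100), ("1 per 200", 1 / 200), ("1 per 500", 1 / 500)]:
    print(f"{label} utterances: {p_captured(rate, 100):.0%} chance in the first 100 utterances")

# 1 per 100: ~63%; 1 per 200: ~39%; 1 per 500: ~18%. Rarer verbs therefore tend to
# surface only in later samples and so look "later acquired", even if the child
# knew them from the start of the sampling period.
```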
3.3 The effect of vocabulary size on productivity measures
The third possible confound on estimates of productivity is the fact that children’s vocabularies are smaller than those of adults. Since speakers can only produce utterances
using vocabulary items they have already learned, children are less likely than adults to
be capable of demonstrating productivity with a wide range of grammatical structures.
For example, a child who knows only two determiners will have far less opportunity to
demonstrate a sophisticated knowledge of the determiner category than a child who
knows four, even if both children have equally abstract knowledge of the category
(Pine and Lieven 1997). Thus, lexical specificity in the data could also be due to a limited knowledge of vocabulary, not to limited grammatical knowledge.
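This effect is easy to demonstrate with a toy simulation (illustrative only, and not Pine and Lieven's analysis): two simulated speakers combine determiners and nouns with equally productive grammars, but the speaker with the smaller determiner vocabulary produces fewer distinct combinations in a sample of the same size.

```python
# Toy simulation: equal productivity, different vocabulary sizes.
import random

random.seed(0)
nouns = [f"noun{i}" for i in range(20)]

def distinct_combinations(determiners, n_utterances=100):
    """Distinct determiner + noun pairings in a sample from a fully productive speaker."""
    utterances = [(random.choice(determiners), random.choice(nouns))
                  for _ in range(n_utterances)]
    return len(set(utterances))

print("2 determiners:", distinct_combinations(["a", "the"]))
print("4 determiners:", distinct_combinations(["a", "the", "my", "that"]))
# The four-determiner speaker produces more distinct combinations from the same
# 100 utterances, despite an identical (fully productive) underlying grammar.
```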
3.4 Assessing productivity: A solution
To recap, the accuracy with which any one sample assesses productivity is affected by
sample size, by the frequency statistics of the language, and by the vocabulary size of
the child. Importantly, even collecting much bigger samples will not overcome these
problems. There will still be an impact of sample size and frequency statistics on measures of productivity, no matter how many utterances are collected. In addition, children’s limited vocabulary knowledge will still affect the range and variability of the
syntactic structures they produce. In order to attribute limited productivity to children
reliably, it is important to control for the effect of sample size and vocabulary, while
taking into account the frequency statistics of the language. The best way to do this is
to use a comparison measure based on a matched sample of adult data.
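The general logic of such a comparison can be sketched as follows. This is an illustration of the approach rather than the procedure used in any particular study; the coverage measure and function names are assumptions made for the example.

```python
# Sketch of the comparison logic: compute the same coverage measure over the
# child's sample and over an adult sample matched for size and restricted to
# items the child also produces.
import random
from collections import Counter

def top_item_coverage(items, n=3):
    """Proportion of tokens accounted for by the n most frequent items."""
    counts = Counter(items)
    return sum(count for _, count in counts.most_common(n)) / len(items)

def matched_adult_sample(child_items, adult_items, seed=1):
    """Subsample the adult data to the child's sample size, using only items
    that also occur in the child's sample (controlling vocabulary)."""
    shared = set(child_items)
    candidates = [item for item in adult_items if item in shared]
    return random.Random(seed).sample(candidates, len(child_items))

# child_score = top_item_coverage(child_items)
# adult_score = top_item_coverage(matched_adult_sample(child_items, adult_items))
# Similar scores suggest the child's apparent lexical specificity reflects sampling,
# vocabulary and input frequency rather than limited grammatical knowledge.
```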
Aguado-Orea and Pine’s (Aguado-Orea and Pine 2002; Aguado-Orea 2004) study
on Spanish verb morphology provides such a comparison measure. They investigated
the productivity of children’s verb morphology in Spanish, controlling for a number of
methodological factors that could explain limited flexibility in verb inflection use.