Actuation without production bias
James Kirby, University of Edinburgh
Morgan Sonderegger, McGill University
Sound Change in Interacting Human Systems, UC Berkeley, May 31, 2014
Introduction
• Change at the population level is often claimed to be based in phonetic variation at the individual level (e.g. Ohala, 1993)
• One source of variation: production bias (e.g., coarticulation)
WGmc      Pre-OHG   OHG (NHD)
*gasti    gesti     gest (Gäste)
*lambir   lembir    lemb (Lämmer)
*fasti    festi     fest (fest)
Primary umlaut in West Germanic (after Iverson and Salmons, 2006).
• This conference: other types of bias (group membership,
cognitive endowment...)
Stability and change
• Existence of a bias does not mean change is inevitable: default is stability! (Weinreich et al., 1968; cf. Kiparsky’s “non-phonologization problem”)
• “Accumulation-of-error” approaches are often criticized for this very reason (e.g. Baker, 2008)
• Adequate account of actuation must explain:
  1. Stability of limited coarticulation in the population;
  2. Stability of full coarticulation in the population;
  3. Change from stable limited to full coarticulation.
Roadmap
• First: summary of previous work showing that one way to get both stability and change at the population level is to assume both
  1. a force promoting contrast maintenance, to keep separate phonetic categories stable; and,
  2. an external force, such as a production bias, which induces change (cf. Pierrehumbert, 2001; Wedel, 2006).
Kirby & Sonderegger (2013), Proc. CogSci
Roadmap
• Then: today’s questions
  1. Does using production bias as the external force have a unique dynamics?
  2. If not, will any kind of external force produce the same behaviour at the population level?
  3. Broader Q: can we safely assume that any proposed source of change could lead to change, iterated over time in a population?
Roadmap
• Our example scenario: phonologization of coarticulation, as in primary umlaut (WGmc > Pre-OHG > OHG (NHD)): *gasti > gesti > gest (Gäste); *lambir > lembir > lemb (Lämmer); *fasti > festi > fest (fest)
• Simple models ⇒ potentially unintuitive outcomes
Framework
• Lexicon: {V1, V2, V12}, where V12 represents V1 in the coarticulation-inducing context of V2
[Figure: F1 distributions p(F1) for /i/, /a_i/ and /a/; x-axis F1 (Hz), 300 to 900]
Framework
• Task: learn an offset parameter p: how much /a/ is produced like /i/ in the context of /i/ (/a_i/)
[Figure: F1 distributions p(F1) for /i/, /a_i/ and /a/, with the offset p marked; x-axis F1 (Hz), 300 to 900]
Framework
• Data: F1 values for /a_i/ tokens, potentially subject to production bias ℓ (assuming fixed /a/, /i/)
[Figure: F1 distributions p(F1) for /i/, /a_i/ and /a/, with the offset p and the bias ℓ marked; x-axis F1 (Hz), 300 to 900]
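A minimal sketch of how such data might be generated, assuming Gaussian tokens and a production bias that shifts F1 toward /i/ (i.e. lowers it). The function name `sample_tokens`, the /a/ mean of 700 Hz and the 50 Hz standard deviation are illustrative assumptions, not values from the talk.

```python
import random

def sample_tokens(p, ell, n, rng, mu_a=700.0, sigma=50.0):
    """Draw n F1 values (Hz) for /a_i/ tokens.

    Assumption: tokens are Gaussian around mu_a - p, and the production
    bias ell shifts every token a further ell Hz toward /i/ (lower F1).
    mu_a and sigma are hypothetical placeholders.
    """
    mean = mu_a - p - ell
    return [rng.gauss(mean, sigma) for _ in range(n)]

rng = random.Random(0)
tokens = sample_tokens(p=20.0, ell=5.0, n=1000, rng=rng)
# Sample mean of the tokens; the underlying mean is 700 - 20 - 5 = 675 Hz.
print(sum(tokens) / len(tokens))
```

With ell = 0 the learner sees unbiased tokens, so any drift in p would have to come from another source.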
Framework
• Learner’s prior: (strength of) categoricity bias (CB)
[Figure: prior P(p) over p from 0 to 200, with curves for a = 0.01, 0.1, 0.3, 0.99]
Framework
• Population structure: learners learn from (potentially)
multiple teachers
Framework
• Outcome: distribution of p in the population at time t, π_t(p)
[Figure: p (mean ± 2 SD) by generation, 0 to 100]
Effects of varying production bias (KS 2013, Model 3)
[Figure: p (mean ± 2 SD) by generation, 0 to 1000; panels labelled 2, 5, 7, 10, 12, 15]
KS (2013)
• Only model with both production and categoricity biases could achieve all 3 goals:
  - stable limited coarticulation (low ℓ)
  - stable full coarticulation (high ℓ)
  - change from one to the other (medium ℓ)
• In models with categoricity bias, dynamics are not linear and phonologization is not inevitable (cf. Baker, 2008)
Now
• Production bias is the external force most commonly invoked in models of sound change
• ... but clearly not behind all changes: many other factors invoked by (socio)phon(eticians), e.g.
  - Contact (between subpopulations)
  - Social weight (of variants, speakers, groups)
  - Interaction (convergence, divergence)
• Today’s questions:
  1. Does using production bias as the external force have a unique dynamics?
  2. If not, will any kind of external force produce the same behaviour at the population level?
Subpopulations in contact: background
• Linguistic features can spread through contact between different groups (e.g. Thomason, 2001)
• These may be different languages, dialects, or subpopulations of a single group
• Are both stability and change possible when heterogeneous groups interact?
Model 1: Subpopulations in contact
• Simple instantiation: population divided into two groups:
  - Group A has little/no coarticulation
  - Group B has extreme coarticulation
[Figure: F1 distributions p(F1) for /i/ and /a/, with the /a_i/ distributions of groups (a) and (b); x-axis F1 (Hz), 300 to 900]
Model 1: Subpopulations in contact
• aProb: P(Group B agent learns from Group A agent)
• bProb: P(Group A agent learns from Group B agent)
[Figure: F1 distributions p(F1) for /i/ and /a/, with the /a_i/ distributions of groups (a) and (b); x-axis F1 (Hz), 300 to 900]
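The two contact parameters can be sketched as a per-example choice of teacher group. This is only an illustration, assuming each training example is drawn independently; the function name `teacher_group` is hypothetical and not taken from the talk.

```python
import random

def teacher_group(learner_group, a_prob, b_prob, rng):
    """Pick the group ('A' or 'B') that a learner's teacher comes from.

    a_prob: P(Group B learner learns from a Group A teacher)
    b_prob: P(Group A learner learns from a Group B teacher)
    Assumption: one independent draw per training example.
    """
    if learner_group == "B":
        return "A" if rng.random() < a_prob else "B"
    return "B" if rng.random() < b_prob else "A"

rng = random.Random(0)
draws = [teacher_group("A", a_prob=0.0, b_prob=0.03, rng=rng)
         for _ in range(10_000)]
# Fraction of Group A learners' examples that come from Group B;
# this should be near b_prob.
print(draws.count("B") / len(draws))
```

Setting aProb = bProb = 0 recovers two isolated subpopulations.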
Model 1: Results
[Figure: p (mean ± 2 SD) by generation (0 to 50) for groups A and B; columns bProb = 0, 0.03, 0.06, 0.09; rows aProb = 0, 0.03, 0.06, 0.09]
Model 1: Discussion
• All three outcomes possible
• Stability can be preserved in both groups even when there is some interaction between them
• But: obtaining just 5% of training examples from a different group can be enough to induce the entire population to converge to a single group’s mean
Social weighting: Background
• From the pool of synchronic variation, certain linguistic features can spread due to association with
  - particular variants
  - individuals
  - groups
  (e.g. Labov, 2001)
• Are both stability and change possible in the presence of social value associated with any of the following?
  - more coarticulated variants (nearer to [i])
  - speakers who coarticulate more
  - groups
Models 2–4: social weighting
• Each token y_i has a social weight w_i ∈ [1, w_max]
• Higher social weight associated with:
  - Model 2: more coarticulated tokens (nearer to [i])
  - Model 3: tokens from high-coarticulation group
  - Model 4: tokens from teachers who coarticulate more
• Learner estimates p using weighted average of the y_i
  - tokens which are {more coarticulated, from teachers/group which coarticulate more} have more influence
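The weighted-average step can be sketched as follows. This is a simplified illustration: the tokens are treated directly as offsets, the categoricity prior is left out, and the linear weight function `variant_weight` (with its `max_offset` of 200) is an assumption made for the example rather than the form used in the talk.

```python
def variant_weight(offset, w_max, max_offset=200.0):
    """Weight in [1, w_max] that rises with coarticulation (Model 2 style).

    Assumption: linear in the offset, capped at max_offset (hypothetical).
    """
    frac = min(max(offset / max_offset, 0.0), 1.0)
    return 1.0 + (w_max - 1.0) * frac

def estimate_p(offsets, weights):
    """Weighted average of observed offsets (categoricity prior omitted)."""
    return sum(w * y for w, y in zip(weights, offsets)) / sum(weights)

offsets = [0.0, 50.0, 100.0]
flat = estimate_p(offsets, [1.0] * len(offsets))
tilted = estimate_p(offsets, [variant_weight(y, w_max=1.1) for y in offsets])
# With w_max > 1 the weighted estimate is pulled toward the more
# coarticulated tokens, so tilted is larger than flat.
print(flat, tilted)
```

With all weights equal to 1 the estimate reduces to the ordinary mean, which corresponds to no social weighting.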
Model 2: social weighting by variant
• Start with a single population, little coarticulation
• Parameter: w_max (preference for coarticulated variants)
[Figure: F1 distributions p(F1) for /i/, /a/ and /a_i/; x-axis F1 (Hz), 300 to 900]
Model 2: Results
[Figure: p (mean ± 2 SD) by generation (0 to 250); panels for w_max = 1.02, 1.06, 1.08, 1.10, 1.20, 1.30]
Model 2: Discussion
• All three outcomes possible
• Stability can be preserved even when coarticulated variants are socially valued
• But: a social value for the coarticulated variant just 10% higher than for the uncoarticulated variant can be enough to induce change to full coarticulation in the whole population!
Model 3: social weighting by group
• Same architecture as Model 1, but with additional parameters:
  - aWeight: weight of data from group A for learner in group B
  - bWeight: weight of data from group B for learner in group A
[Figure: F1 distributions p(F1) for /i/ and /a/, with the /a_i/ distributions of groups (a) and (b); x-axis F1 (Hz), 300 to 900]
Model 3: Results
Fix aProb = bProb = 0.03
[Figure: p (mean ± 2 SD) by generation (0 to 500) for groups A and B; columns bWeight = 0.2, 0.4, 0.5, 0.6, 0.8; rows aWeight = 0.2, 0.4, 0.5, 0.6, 0.8]
Model 3: Discussion
• All three outcomes possible
• Stability can be preserved even when the group with high coarticulation is socially valued
• But: even a small preference for tokens from the coarticulating group can be enough to induce change to full coarticulation in the whole population.
Interim summary: Models 1–3
• Question 1: does driving force = production bias give unique dynamics?
• No: very similar dynamics when driving force is
  1. extent of contact
  2. social weighting of variants
  3. social weighting by group
• Question 2: will any kind of driving force produce the same behavior?
Model 4: social weighting by individual
• Setup: every teacher in generation t has
  - a social weight
  - a value of p
• If these happen to be positively correlated (i.e., data from teachers who coarticulate more is more highly valued):
  - more coarticulation in generation t + 1
  - could accumulate and lead to change
  (cf. Baker, Archangeli & Mielke 2011)
• Parameters:
  - w_max: maximum social weight
  - ρ: correlation between teacher’s prestige and degree of coarticulation
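One way to picture the two parameters is to give each teacher a prestige score that is correlated with their p and then rescale the scores onto [1, w_max]. The construction below (standardised p mixed with Gaussian noise, then min-max rescaling) is an assumption made for illustration; the name `correlated_weights` is hypothetical and the talk does not specify how the weights were generated.

```python
import math
import random
import statistics

def correlated_weights(p_values, rho, w_max, rng):
    """Assign each teacher a social weight in [1, w_max].

    Assumption: score = rho * z(p) + sqrt(1 - rho^2) * noise, so that the
    scores correlate with p at roughly rho (for rho in [-1, 1]); scores
    are then rescaled linearly so the lowest maps to 1 and the highest
    to w_max.
    """
    mean_p = statistics.fmean(p_values)
    sd_p = statistics.pstdev(p_values) or 1.0  # guard against zero spread
    scores = []
    for p in p_values:
        z = (p - mean_p) / sd_p
        noise = rng.gauss(0.0, 1.0)
        scores.append(rho * z + math.sqrt(1.0 - rho ** 2) * noise)
    lo, hi = min(scores), max(scores)
    span = (hi - lo) or 1.0
    return [1.0 + (w_max - 1.0) * (s - lo) / span for s in scores]

rng = random.Random(0)
# With rho = 1 the weights are a monotone function of p.
print(correlated_weights([10.0, 20.0, 30.0], rho=1.0, w_max=10.0, rng=rng))
```

For ρ well below 1 the noise term dominates, so a teacher's weight says little about how much they coarticulate.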
Model 4: Results
[Figure: p (mean, 5%, 95%) by generation (0 to 500); columns rho = 0.2, 0.6, 0.8, 1; rows wMax = 2, 10, 100, 1000]
Model 4: Discussion
• Stability: default
• Change: not really
• Driving force is much weaker than in Models 1–3! “Change”:
  1. requires a near-perfect correlation between coarticulation and social weight, with individuals who coarticulate weighted 100 to 1000x higher than individuals who don’t;
  2. is very slow (1000s of generations)
• Compare: change in < 200 generations for small increases in driving force in Models 1–3
Model 4: Discussion
• Models 2–4: all implementations of social weight. Why are dynamics of Model 4 different?
• Social weight on individuals (M4):
  - correlation between w and observations: weak
• Social weight on groups (M3):
  - correlation between w and observations: stronger
• Social weight on variants (M2):
  - correlation between w and observations: perfect
• Question 2: will any kind of driving force produce the same behavior?
  - No
Conclusions
• Different external forces + categoricity bias = similar population dynamics
  - Implication: a similar dynamics may underlie actuation of changes initiated from different sources
  - Good: sound change can have different sources, and doesn’t show radically different dynamics by source (?)
• But not all external forces give both stability and change
  - Some intuitively plausible mechanisms “too noisy” to have an effect iterated over time in a speech community.
  - Population dynamics as partial solution to the “non-phonologization problem”
Thanks!
• Ideas/comments:
  - Participants in “Computational Models of Sound Change” at 2013 LSA Institute
  - Audiences at Ohio State, McGill
• Funding: FRQSC #183356, CFI #32451