Adaptive Sampling Steve Thompson
[email protected]
Simon Fraser University 16-17 June 2011 BANOCOSS 2011
Adaptive Sampling – p. 1/??
Adaptive sampling: An adaptive sampling design is one in which the selection of units to include in the sample depends on values of the variable of interest observed during the survey.
Sketch
1. Adaptive sampling ideas and examples
2. Design and inference considerations
3. Spatial, network, temporal settings
Rare, clustered population
Random sample of 40 units
Same population
Initial sample of 20 units
Adaptive cluster sample
Changed population!
Initial sample of 20 units
Adaptive cluster sample
Types of sampling designs
The design is the procedure by which we select the sample.
Conventional design: p(s)
The procedure for selecting the sample does not depend on values of the variables of interest observed during the survey.
Adaptive design: p(s | y)
The procedure for selecting the sample can depend on values of the variables of interest. (The design can also depend on auxiliary variables x.)
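The contrast between p(s) and p(s | y) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the grid values, the threshold rule "add the four adjacent cells of any sampled cell with y at or above the threshold", and all function names are hypothetical and not taken from the slides.

```python
import random

def neighbors(cell, rows, cols):
    """Yield the up/down/left/right neighbors of a grid cell (assumed neighborhood)."""
    r, c = cell
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < rows and 0 <= cc < cols:
            yield (rr, cc)

def conventional_sample(grid, n, rng):
    """p(s): simple random sample of n cells, chosen without looking at y."""
    cells = [(r, c) for r in range(len(grid)) for c in range(len(grid[0]))]
    return set(rng.sample(cells, n))

def adaptive_cluster_sample(grid, n, rng, threshold=1):
    """p(s | y): start from a simple random sample, then keep adding the
    neighbors of every sampled cell whose observed value meets the threshold."""
    rows, cols = len(grid), len(grid[0])
    sample = conventional_sample(grid, n, rng)
    frontier = list(sample)
    while frontier:
        r, c = frontier.pop()
        if grid[r][c] >= threshold:
            for nb in neighbors((r, c), rows, cols):
                if nb not in sample:
                    sample.add(nb)
                    frontier.append(nb)
    return sample

# Hypothetical rare, clustered population on a 4 x 5 grid.
grid = [
    [0, 0, 0, 0, 0],
    [0, 5, 3, 0, 0],
    [0, 2, 0, 0, 0],
    [0, 0, 0, 0, 4],
]
rng = random.Random(1)
srs = conventional_sample(grid, 4, rng)
acs = adaptive_cluster_sample(grid, 4, rng)
print(len(srs), len(acs))
```

The adaptive sample is never smaller than its initial sample, and its final size is itself random because it depends on the y values encountered.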
Approaches to inference from samples
Design based approach: The values of the variables of interest in the population are fixed, unknown constants, $y = (y_1, \dots, y_N)$.
Model based approach: The population values are random variables, which we try to model. $Y_1, \dots, Y_N$ have some joint probability distribution $f(y_1, \dots, y_N \mid \theta)$.
Trawl survey, Kodiak Island
Optimal sampling strategies
Find the design $p(s \mid y)$ and estimator $\hat Z$ of the population quantity $Z$ to minimize the mean square error $E(\hat Z - Z)^2$ subject to unbiasedness, $E(\hat Z) = E(Z)$.
The optimal strategy is in most cases an adaptive one.
Reasoning:
1. Stop part way through the survey and look at what has been observed so far: the initial sample and its values $(s_1, y_{s_1})$.
2. Choose the rest of the sample $s_2$ to minimize the mean square error of the estimate given what has been observed so far:
$$\min E\left[(\hat Z - Z)^2 \mid s_1, y_{s_1}\right]$$
(Zacks 1969, Thompson and Seber 1996, Chao and Thompson 2000)
The idea:
Given what you’ve observed so far, choose subsequent sample units with a good design conditional on that.
Say the sample is in two phases, $s = (s_0, s_1)$. The data are $d = (s, y_s)$.
$$E\Big[\underbrace{\min_{s_1} E\big[(\hat\tau - \tau)^2 \mid s_0, y_{s_0}\big]}_{\text{using current data}}\Big] \le \underbrace{\min_{s} E(\hat\tau - \tau)^2}_{\text{not using}}$$
Practical, efficient designs
Theoretically optimal designs are hard to implement, computationally complex, and overly dependent on model based assumptions.
We seek instead practical, efficient, robust designs.
Adaptive cluster sampling estimation
$\bar y$ is not unbiased for $\mu$.
The unbiased estimate has the form $\sum_i y_i / \alpha_i$, where $\alpha_i$ = network intersection probability, or the Rao-Blackwell form.
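A minimal sketch of an estimator of this form follows. The slides do not give the formula for the intersection probability; this sketch assumes the initial sample is a simple random sample of n units without replacement, in which case a network of m units is intersected with probability 1 - C(N-m, n)/C(N, n). The toy population, the partition into networks, and the function names are hypothetical.

```python
from itertools import combinations
from math import comb

def intersection_probability(N, n, m):
    """Probability that an initial simple random sample of n units out of N
    contains at least one unit of a network of size m (assumed design)."""
    return 1.0 - comb(N - m, n) / comb(N, n)

def acs_mean_estimate(initial_sample, networks, y, n):
    """Sum network totals over the distinct networks hit by the initial
    sample, each divided by its intersection probability, then divide by N."""
    N = len(y)
    hit = [net for net in networks if any(u in net for u in initial_sample)]
    total = sum(
        sum(y[u] for u in net) / intersection_probability(N, n, len(net))
        for net in hit
    )
    return total / N

# Hypothetical population of 5 units; units 1 and 2 form one network.
y = [0, 3, 5, 0, 1]
networks = [{0}, {1, 2}, {3}, {4}]
n = 2

# Average the estimate over every possible initial sample of size n.
estimates = [acs_mean_estimate(s, networks, y, n)
             for s in combinations(range(len(y)), n)]
design_expectation = sum(estimates) / len(estimates)
print(design_expectation, sum(y) / len(y))
```

Averaging over all equally likely initial samples reproduces the true mean in this toy case, which is what design-unbiasedness means.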
Sufficiency, completeness, Rao-Blackwell
Sampling data = $(s, y_s)$
Sufficient statistic = set of distinct units and their associated y values
Rao-Blackwell estimate = E[simple estimator | sufficient statistic]
The minimal sufficient statistic is not complete, so there is more than one possible estimator.
Bering Sea king crab survey
Johnson, Chao, Thompson and Stevens - Draft 10/28/2001
Likelihood function
Prob(data | parameters) = $P(s, y_s \mid \theta)$
$$L(\theta; s, y_s) = \int p(s \mid y; \theta)\, f(y; \theta)\, dy_{\bar s} = \int (\text{design})(\text{model})\, d(\text{unobserved})$$
In general a likelihood function involves both the selection mechanism (design) and the model, and effective inference should take both into account.
“Ignorable” design
If the design depends only on values that are observed and recorded in the data, then the design disappears from likelihood-based estimates.
$$L(\theta; s, y_s) = p(s \mid y_s; \theta_1) \int f(y; \theta_2)\, dy_{\bar s}$$
But to be ignorable for frequentist model-based inference, the design must be a conventional one p(s), depending on no y-values at all.
It can be argued that in most real situations, the design is ignorable for data analysis only if the study used a known probability design.
Adaptive sampling in networks
Studies of hidden populations
HIV/AIDS at-risk study M. Miller
Sampling in networks
Population of units or nodes: $1, 2, \dots, N$
Node variables of interest: $y_1, y_2, \dots, y_N$
Link indicators or weights: $w_{ij}$, $i, j = 1, \dots, N$ (variables of interest associated with pairs of nodes)
Sample: a subset or sequence $s$ of units and pairs of units from the population, $s = (s^{(1)}, s^{(2)})$. $y$ is observed in $s^{(1)}$; $w$ is observed in $s^{(2)}$.
Approaches to inference in network sampling
Design based approach: The values of the variables of interest in the population are fixed, unknown constants, $y = (y_1, \dots, y_N)$ and $w = \{w_{ij}\}$, $i, j \in \{1, \dots, N\}$. Probability enters only through the design.
Model based approach: The population values are random variables, which we try to model. $Y_1, \dots, Y_N, W_{11}, \dots, W_{NN}$ have some joint probability distribution, described by a stochastic graph model.
Snowball and Random Walk Designs
1. Snowball designs and inference
2. Random walk designs and inference
Example network population (figure: population graph)
Random sample (figure: sample)
Snowball sample (figure: sample, shown in two frames)
One-wave snowball selection probabilities (figure: one-wave selection probabilities; both axes run from 0.0 to 1.0)
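The slides show these probabilities only as a figure. As a hedged sketch, suppose the initial sample is a simple random sample of n0 nodes without replacement (an assumption, not stated in the slides). A node is then in the one-wave snowball sample if it, or any node with a link to it, is in the initial sample. The adjacency matrix and function name below are hypothetical.

```python
from math import comb

def one_wave_inclusion_probabilities(adj, n0):
    """adj[j][i] = 1 means a link from node j to node i.
    Returns P(node i is selected initially or reached in one wave),
    assuming an initial simple random sample of n0 nodes."""
    N = len(adj)
    probs = []
    for i in range(N):
        # Nodes whose initial selection brings node i into the sample.
        sources = {i} | {j for j in range(N) if adj[j][i]}
        probs.append(1.0 - comb(N - len(sources), n0) / comb(N, n0))
    return probs

# Hypothetical symmetric network: a triangle 0-1-2 with node 3 hanging off node 2.
adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
probs = one_wave_inclusion_probabilities(adj, 1)
print(probs)
```

Well-connected nodes have higher selection probabilities, which is why a plain sample mean from a snowball sample tends to over-represent them.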
The population again (figure: population graph)
Random walk sample (figure: walk, built up over ten frames)
Random walk limit selection probabilities (figure: limit random walk probabilities)
Random walk as Markov chain
$W_k$ is the node of the graph selected at the $k$th wave.
$a_{ij} = 1$ indicates a link from node $i$ to node $j$.
$\{W_0, W_1, W_2, \dots\}$ is a Markov chain with $P(W_{k+1} = j \mid W_k = i) = a_{ij}/a_{i\cdot}$.
$Q$ is the transition matrix of the chain, $q_{ij} = P(W_{k+1} = j \mid W_k = i)$.
The stationary probabilities $(\pi_1, \dots, \pi_N)$ satisfy $\pi_j = \sum_i \pi_i q_{ij}$ for $j = 1, \dots, N$.
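The stationarity condition can be checked numerically on a small example. The sketch below builds Q from a hypothetical adjacency matrix and approximates the stationary probabilities by repeatedly applying the chain (power iteration), which assumes the graph is connected and aperiodic. The graph and function names are illustrative only.

```python
def transition_matrix(adj):
    """q_ij = a_ij / a_i. : move to a uniformly chosen out-neighbor."""
    return [[a / sum(row) for a in row] for row in adj]

def stationary(Q, iterations=2000):
    """Approximate the stationary probabilities by power iteration."""
    n = len(Q)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * Q[i][j] for i in range(n)) for j in range(n)]
    return pi

# Hypothetical symmetric network: a triangle 0-1-2 with node 3 hanging off node 2.
adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
Q = transition_matrix(adj)
pi = stationary(Q)
print([round(p, 4) for p in pi])
```

For this symmetric toy graph the node degrees are 2, 2, 3 and 1, and the computed probabilities come out proportional to them.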
Approach using limiting distribution of random walk
For a with-replacement random walk design in a single-component network with symmetric links, the limiting selection probability is proportional to the person’s degree ($d_i$).
Generalized ratio estimator of the mean of a behavioral characteristic $y$:
$$\hat\mu = \frac{\sum_{i \in s} y_i / d_i}{\sum_{i \in s} 1 / d_i}$$
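As a small worked instance of the generalized ratio estimator, the sketch below weights each sampled value by the reciprocal of its degree. The data values and the function name are hypothetical.

```python
def generalized_ratio_mean(y, degree):
    """Ratio of sum(y_i / d_i) to sum(1 / d_i) over the sampled nodes."""
    numerator = sum(yi / di for yi, di in zip(y, degree))
    denominator = sum(1.0 / di for di in degree)
    return numerator / denominator

# Three sampled people: indicator values y and reported degrees d.
estimate = generalized_ratio_mean([1, 0, 1], [1, 2, 4])
print(estimate)  # (1 + 0 + 0.25) / (1 + 0.5 + 0.25) = 5/7
```

The unweighted sample mean here would be 2/3; the estimator shifts weight toward the low-degree person, who is less likely to be reached by the walk.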
Targeted random walk designs
1. Uniform random walk
2. More general targeting
Targeted walk designs
Let $\pi_i(y)$ denote the desired stationary selection probability for the $i$th node as a function of its value or degree. The transition probabilities for the targeted walk are
$$P_{ij} = q_{ij}\,\alpha_{ij} \quad \text{for } i \ne j, \qquad P_{ii} = 1 - \sum_{j \ne i} P_{ij}$$
where
$$\alpha_{ij} = \min\left\{\frac{\pi_j q_{ji}}{\pi_i q_{ij}},\ 1\right\}$$
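The transition rule above can be sketched directly. The example below uses a hypothetical symmetric network and a uniform target, then checks that the uniform distribution is stationary for the resulting matrix. Function names and the graph are assumptions made for illustration.

```python
def targeted_walk_matrix(adj, target):
    """Build P_ij = q_ij * alpha_ij for i != j and P_ii = 1 - sum_j P_ij,
    with alpha_ij = min(target_j q_ji / (target_i q_ij), 1)."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    q = [[adj[i][j] / deg[i] for j in range(n)] for i in range(n)]
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and q[i][j] > 0:
                alpha = min(target[j] * q[j][i] / (target[i] * q[i][j]), 1.0)
                P[i][j] = q[i][j] * alpha
        # Rejected moves keep the walk at the current node.
        P[i][i] = 1.0 - sum(P[i][j] for j in range(n) if j != i)
    return P

# Hypothetical symmetric network: a triangle 0-1-2 with node 3 hanging off node 2.
adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
uniform = [0.25] * 4
P = targeted_walk_matrix(adj, uniform)
after_one_step = [sum(uniform[i] * P[i][j] for i in range(4)) for j in range(4)]
print(after_one_step)
```

Changing `target` to any other positive vector gives a walk whose stationary probabilities are proportional to that vector, provided the links are symmetric so that every proposed move can be reversed.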
TARGETED RANDOM WALK DESIGNS
1. Random walk as a Markov chain
2. Random, uniform, and targeted walks
Uniform targeted walk design (figure panels: population; limit selection probabilities; uniform walk sample; ‘random walk’ sample)
Random, uniform, and targeted walks (figure: expected node value against wave, waves 0 to 50, in six panels: random walk, no jumps; random walk (with jumps); uniform walk; value 2/1 walk; degree walk; degree+1 walk)
Adaptive web sampling
1. How they work
2. Variations
3. Inference methods
Adaptive web sampling
At any point in the sampling,
• the next unit or set of units is selected from a distribution that depends on the values of variables of interest in an active set of units already selected (follow a link).
• With some probability, however, the selection may be made from a distribution not dependent on those values (random jump).
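One simple variant of the two-part selection rule can be sketched as follows. The choices here are assumptions for illustration, not the slides' specification: the active set is the whole current sample, links out of it are followed uniformly at random, jumps pick uniformly among unsampled nodes, and sampling is without replacement. The network and function name are hypothetical.

```python
import random

def adaptive_web_sample(adj, n0, n, jump_prob, rng):
    """adj maps each node to a list of linked nodes.
    Returns the sample as a list in order of selection."""
    nodes = list(adj)
    sample = rng.sample(nodes, n0)          # initial sample, chosen at random
    while len(sample) < n:
        in_sample = set(sample)
        # Links from the active set (here: all sampled nodes) to unsampled nodes.
        out_links = [(i, j) for i in sample for j in adj[i] if j not in in_sample]
        if out_links and rng.random() >= jump_prob:
            _, nxt = rng.choice(out_links)  # follow a link
        else:
            nxt = rng.choice([u for u in nodes if u not in in_sample])  # random jump
        sample.append(nxt)
    return sample

# Hypothetical network with two components.
adj = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2],
    4: [5], 5: [4, 6], 6: [5], 7: [],
}
sample = adaptive_web_sample(adj, n0=2, n=6, jump_prob=0.1, rng=random.Random(0))
print(sample)
```

The random jump is what lets the sample leave a component once its links are exhausted, and it keeps every unsampled node reachable at each step.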
Population graph (figure: population graph)
Adaptive web design (figure: weighted links, built up over thirteen frames)
Inference
Estimation of a population characteristic such as a population mean, degree distribution, or other quantity, based on the sample data.
• Design-based: start with a simple preliminary estimator; improve it with Rao-Blackwell or resampling.
• Model-based: assume a stochastic graph model; produce realizations from the predictive posterior.
Design-unbiased estimators
• Start with some preliminary unbiased estimator $\hat\mu_0$, such as the initial sample mean, an unequal probability estimator, or a conditional probability estimator.
• Improve it using the Rao-Blackwell method:
$$\hat\mu = E(\hat\mu_0 \mid d) = \sum_{\text{paths}} \hat\mu_0(s)\, p(s \mid d)$$
where $d$ is the minimal sufficient statistic.
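For a very small sample the Rao-Blackwell sum can be computed by brute force. The sketch below enumerates every ordering of the sampled units, weights the preliminary estimate by the probability of that ordering, and normalizes. The path probability function is a stand-in: a real design would supply its own, and the toy values are hypothetical.

```python
from itertools import permutations

def rao_blackwell(sample, preliminary, path_prob):
    """E[preliminary | set of sampled units], by enumerating all orderings.
    path_prob(path) is the (unnormalized) probability of selecting the
    units in that order under the design."""
    numerator = 0.0
    denominator = 0.0
    for path in permutations(sample):
        p = path_prob(path)
        numerator += preliminary(path) * p
        denominator += p
    return numerator / denominator

# Hypothetical data: three sampled units and their y values.
y = {"a": 2.0, "b": 5.0, "c": 11.0}

def first_unit_value(path):
    """Preliminary estimator: mean of an initial sample of size one."""
    return y[path[0]]

def equal_path_prob(path):
    """Stand-in design under which every ordering is equally likely."""
    return 1.0

improved = rao_blackwell(list(y), first_unit_value, equal_path_prob)
print(improved)
```

With equally likely orderings the improved estimator is just the mean of the three values; with an adaptive design the orderings have unequal probabilities and the weights differ.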
Estimator based on initial sample mean
• Start with an unbiased estimator of $\mu$ based on the initial sample $s_0$. For example, $\hat\mu_{01} = \bar y_0$ or
$$\hat\mu_{01} = \frac{1}{N} \sum_{i \in s_0} \frac{y_i}{\pi_i}$$
• Improve it using the Rao-Blackwell method:
$$\hat\mu_1 = E(\hat\mu_{01} \mid d_r) = \sum_{\{s:\, r(s) = s\}} \hat\mu_{01}(s)\, p(s \mid d_r)$$
Estimators based on conditional probabilities
$\hat\tau_{s_0}$ is an unbiased estimator of the population total based on the initial sample $s_0$.
For the $k$th selection after the initial sample,
$$z_k = \sum_{i \in s_{ck}} y_i + \frac{y_k}{q_{a_k i}}$$
where $q_{a_k i}$ is the conditional probability of selecting person $i$ given the current active set $a_k$.
An unbiased estimator of the population mean is
$$\hat\mu_{02} = \frac{1}{Nn}\left[ n_0 \hat\tau_{s_0} + \sum_{i = n_0 + 1}^{n} z_i \right]$$
The improved estimator is
$$\hat\mu_2 = E(\hat\mu_{02} \mid d_r) = \sum_{\{s:\, r(s) = s\}} \hat\mu_{02}(s)\, p(s \mid d_r)$$
Composite conditional generalized ratio
$\hat N_0$ is an estimator of the population size $N$ based on the initial sample, for example $\hat N_0 = \sum_{k \in s_0} (1/\pi_k)$.
After the initial sample, $\hat N_k = n_{ck} + 1/q_{a_k i}$, where $n_{ck}$ is the size of the current sample. A composite estimator of $N$ is
$$\hat N = \frac{1}{n}\left[ n_0 \hat N_0 + \sum_{i = n_0 + 1}^{n} \hat N_i \right]$$
A generalized ratio estimator is then formed as the ratio of the two unbiased estimators: $\hat\mu_{03} = N \hat\mu_{02} / \hat N$.
The improved version of this estimator is
$$\hat\mu_3 = E(\hat\mu_{03} \mid d_r) = \sum_{\{s:\, r(s) = s\}} \hat\mu_{03}(s)\, p(s \mid d_r)$$
Composite conditional mean of ratios
An alternate way to use the ratios of unbiased estimators in a composite estimator is
$$\hat\mu_{04} = \frac{1}{n}\left[ n_0 \frac{z_0}{\hat N_0} + \sum_{i = n_0 + 1}^{n} \frac{z_i}{\hat N_i} \right]$$
The improved version of this estimator is
$$\hat\mu_4 = E(\hat\mu_{04} \mid d_r) = \sum_{\{s:\, r(s) = s\}} \hat\mu_{04}(s)\, p(s \mid d_r)$$
Computational issue
$$\hat\mu = E(\hat\mu_0 \mid d_r) = \sum_{\{s:\, r(s) = s\}} \hat\mu_{01}(s)\, p(s)$$
The sum is over all possible sample paths giving $d_r$.
A sample sequence $s = (s_0, i_{n_0+1}, \dots, i_n)$ has selection probability $p(s) = p_0\, q_{a_{n_0}, i_1} \cdots q_{a_{n-1} i_n}$.
For a nonreplacement design, there are $n!$ reorderings of the sample.
• n = 9 has 362,880 reorderings.
• n = 10 has 3.6 million.
• n = 20 has 2.4 quintillion ($10^{18}$), as in “million, billion, trillion, quadrillion, quintillion, ...”
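The counts quoted above are plain factorials and can be reproduced with the standard library:

```python
from math import factorial

# Number of reorderings of a without-replacement sample of size n.
reorderings = {n: factorial(n) for n in (9, 10, 20)}
for n, count in reorderings.items():
    print(n, f"{count:,}")
```

Exact enumeration is therefore practical only for very small samples, which motivates replacing the full sum with a sample of reorderings.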
Markov chain resampling estimators
Let $x$ be a permutation of the sample $s$. The object is to obtain a Markov chain $x_0, x_1, x_2, \dots$ having stationary distribution $p(x \mid d_r)$.
1. A tentative permutation $t_k$ is produced by applying the original sampling design to the data as if the sample were the whole population.
2. With probability $\alpha$, $t_k$ is accepted and $x_k = t_k$, while with probability $1 - \alpha$, $t_k$ is rejected and $x_k = x_{k-1}$, where
$$\alpha = \min\left\{ \frac{p(t_k)}{p(x_{k-1})} \frac{p_t(x_{k-1})}{p_t(t_k)},\ 1 \right\}$$
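The accept/reject step can be sketched with a toy target. Note the simplification: the slides propose permutations by re-running the original design on the sample, whereas this sketch swaps in a uniform random permutation as the proposal, so $p_t$ is constant. The path probability (proportional to a weight on the first unit) and all names are hypothetical.

```python
import random

def resampling_chain(x0, path_prob, propose, proposal_prob, steps, rng):
    """Chain over permutations with the acceptance rule
    alpha = min{ p(t) p_t(x) / (p(x) p_t(t)), 1 }."""
    x = tuple(x0)
    chain = []
    for _ in range(steps):
        t = propose(rng)
        ratio = (path_prob(t) * proposal_prob(x)) / (path_prob(x) * proposal_prob(t))
        if rng.random() < min(ratio, 1.0):
            x = t                      # accept the tentative permutation
        chain.append(x)                # otherwise keep the previous one
    return chain

units = ("a", "b", "c")
weight = {"a": 3.0, "b": 2.0, "c": 1.0}    # hypothetical design weights

def path_prob(x):
    """Toy path probability: proportional to the weight of the first unit."""
    return weight[x[0]]

def propose(rng):
    """Swapped-in proposal: a uniformly random permutation of the sample."""
    return tuple(rng.sample(units, len(units)))

def proposal_prob(x):
    """Each of the 3! = 6 permutations is proposed with equal probability."""
    return 1.0 / 6.0

chain = resampling_chain(units, path_prob, propose, proposal_prob,
                         steps=20000, rng=random.Random(0))
share_a_first = sum(x[0] == "a" for x in chain) / len(chain)
print(round(share_a_first, 3))
```

Under the toy target, orderings that start with "a" carry half of the total probability, so the printed share should be close to 0.5. Averaging a preliminary estimator over the chain then approximates the Rao-Blackwell sum without enumerating all reorderings.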
Spatial adaptive web sampling (figure: spatial population)
Network structure of spatial population (figure: population graph)
Adaptive web sample (figure: sample, built up over nine frames)
The resulting spatial sample (figure: spatial population)
Active set design variations (figure panels: spatial population; population graph; two active set samples)
Blue-winged teal population (figure panels: spatial population; population graph)
Counts in the spatial population, read row by row as a grid of 5 rows and 10 columns:
0 0 3 5 0 0 0 0 0 0
0 0 0 24 14 0 0 10 103 0
0 0 0 0 2 3 2 0 13639 1
0 0 0 0 0 0 0 0 14 122
0 0 0 0 0 0 2 0 0 177
Two samples, n = 20. Top: n0 = 13. Bottom: n0 = 1. (figure: two sample panels)
MSE of estimators depending on n0, with n = 20 (figure: mse against n0 for the estimators $\hat\mu_1$, $\hat\mu_2$, $\hat\mu_3$, $\hat\mu_4$)
Model-based inference with network designs (Work with Ove Frank, Mosuk Chow, Mike Kwanisai, and others).
Bayes predictive inference
Inference about population characteristics is based on the Bayes predictive posterior distribution given the data $d = (s, y_s, w_s)$:
$$P(y_{\bar s}, w_{\bar s} \mid d) = \int P(y_{\bar s}, w_{\bar s}, \theta, \beta \mid d)\, d\theta\, d\beta$$
This is based on an assumed stochastic graph model $f(y, w; \theta, \beta)$, with $\theta$ = node parameters and $\beta$ = link parameters.
Sampling from predictive posterior
The object is to produce many realizations of the entire population from the posterior distribution given the sample data.
This is the data augmentation step of a Markov chain Monte Carlo procedure.
Within Bayes: The likelihood function
The likelihood function depends on both the design used in obtaining the data and the model describing the population.
Prob(data | parameters) = $P(s, y_s, w_s \mid \theta, \beta)$
$$L(\theta, \beta; s, y_s, w_s) = \int p(s \mid y, w; \theta, \beta)\, f(y, w; \theta, \beta)\, dy_{\bar s}\, dw_{\bar s} = \int (\text{design})(\text{model})\, d(\text{unobserved})$$
MCMC for network Bayes inference
1. Using current values of $\theta$ and $\beta$, select a realization of $(y_{\bar s}, w_{\bar s})$ from $P(y_{\bar s}, w_{\bar s} \mid \theta, \beta, s, y_s, w_s)$.
2. Using the values $(y_{\bar s}, w_{\bar s})$ obtained in step (1) to augment the data values $(y_s, w_s)$, select new parameter values $(\theta, \beta)$ from the posterior distribution of the parameters given the whole graph realization, $\pi(\theta, \beta \mid y_s, y_{\bar s}, w_s, w_{\bar s})$.
Repeat.
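The two alternating steps can be illustrated on a deliberately stripped-down model. The sketch below is not the network model of the slides: it drops the links entirely, assumes independent Bernoulli(θ) node values with a uniform Beta(1, 1) prior, and assumes the design is ignorable so that it does not enter the conditional distributions. The data and function name are hypothetical.

```python
import random

def gibbs_population_mean(y_obs, N, iterations, rng, burn_in=1000):
    """Data augmentation for a toy model: y_i ~ Bernoulli(theta), theta ~ Beta(1, 1).
    Returns draws of the population mean from its predictive posterior."""
    n_obs, ones_obs = len(y_obs), sum(y_obs)
    theta = 0.5
    draws = []
    for it in range(iterations):
        # Step 1: given theta, draw a realization of the unobserved values.
        ones_unobs = sum(rng.random() < theta for _ in range(N - n_obs))
        # Step 2: given the completed population, draw theta from its posterior.
        ones = ones_obs + ones_unobs
        theta = rng.betavariate(1 + ones, 1 + N - ones)
        if it >= burn_in:
            draws.append(ones / N)
    return draws

# Hypothetical sample: 3 of 10 observed units have y = 1; population size 50.
y_obs = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
draws = gibbs_population_mean(y_obs, N=50, iterations=6000, rng=random.Random(0))
posterior_mean = sum(draws) / len(draws)
print(round(posterior_mean, 3))
```

Each retained draw is one complete population realization summarized by its mean, so a histogram of `draws` plays the role of the predictive posterior for that quantity. In the network setting step 1 would also impute unobserved links and step 2 would update the link parameters.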
Bayes predictive inference; Actual pattern (figure: actual pattern in region; axes coord1 and coord2, each from 0.0 to 1.0)
Sample and observed values (figure: sample and observed values; same axes)
Realization from posterior distribution (figure: inferred possible pattern, given data; five different realizations on successive slides; same axes)
Median of posterior distribution (figure: median of possible patterns, given data; same axes)
Systematic sample, 16 sites (figure panels: actual pattern in region; sample data; spatial covariance, exp(-(x/delta)^2) against x; five conditional realizations; conditional median; spatial panels use axes coord1 and coord2)
Systematic sample, 4 sites (figure panels: actual pattern in region; sample data; spatial covariance, exp(-(x/delta)^2) against x; five conditional realizations; conditional median; spatial panels use axes coord1 and coord2)
Random sample, 16 sites (figure shown on two successive slides with the same title; panels: actual pattern in region; sample data; spatial covariance, exp(-(x/delta)^2) against x; five conditional realizations; conditional median; spatial panels use axes coord1 and coord2)
MCMC data augmentation steps (figure panels: actual pattern in region; sample data; mcmc augmented data 1 through 7; axes coord1 and coord2)
Design and estimation comparisons (figure: population graph)
Random walk n=20, initial pp-degree (figure: density histograms; panel titles: sample mean, random walk; mean of draws, random walk; gen ratio est, random walk; gen ratio of draws, random walk; horizontal-axis variables: rwmean, rwnommean, rwnaive, rwnaivenom)
5 random walks, n=4 each, pp-deg starts (figure: density histograms; panel titles: sample mean, random walk; mean of draws, random walk; gen ratio est, random walk; gen ratio of draws, random walk; horizontal-axis variables: rwmean, rwnommean, rwnaive, rwnaivenom)
Random walk, n=20, equal probability start (figure: density histograms; panel titles: sample mean, random walk; mean of draws, random walk; gen ratio est, random walk; gen ratio of draws, random walk; horizontal-axis variables: rwmean, rwnommean, rwnaive, rwnaivenom)
AWS, n0=1, n=20, random links, jump=.1 (figure: density histograms; panel titles: generalized ratio estimate 1; generalized ratio estimate 2; horizontal-axis variables: dgre1, dgre2)
AWS, n0=10, n=20, random links, jump=.1 (figure: density histograms; panel titles: generalized ratio estimate 1; generalized ratio estimate 2; horizontal-axis variables: dgre1, dgre2)
Design and model based estimators, AWS n0=10, n=20 (figure: density histograms; panel titles: sample mean; rb initial mean; rb norepl est; rb nr alt; rb nr alt0; bayes predictor; horizontal-axis variables: ybar, rbw0vec, rbnoreplvec, rbnraltvec, rbnraltvec0, bayes.predtvec)
Designs and Estimators (figure: vertical axis from 0.00 to 0.14; legend DESIGN: Random Walk; AWS n0=1, n=20; AWS n0=10, n=20; estimator labels along the horizontal axis: grhh, grht, gre, est1, est3, gre, est1, est3, bayes)
Empirical Example
HIV/AIDS at-risk hidden population: Colorado Springs Study on the heterosexual transmission of HIV/AIDS (Potterat et al. 1993, Rothenberg et al. 1995, Darrow et al. 1999)
Colorado Springs study population (figure: population graph)
Sample of 80 individuals (figure: sample)
Initial n0 = 10, final n = 20, m = 4 independent selections.
Design and Model based inferences (figure: density histograms; panel titles: sample mean; mle; bayes estimator; bayes predictor; design unbiased; design consistent; horizontal-axis variables: s.mean, mle.theta, bayes.t, pred.t, dunbiased, dconst)
(Sex worker data)
HIV Behavioral Monitoring Design Study (figure: population graph)
Design-based and Bayes estimators (figure: density histograms; panel titles: sample mean; rb initial mean; rb norepl est; rb nr alt; rb nr alt0; bayes predictor; horizontal-axis variables: ybar, rbw0vec, rbnoreplvec, rbnraltvec, rbnraltvec0, bayes.predtvec)
Estimating IDU use, random links design (figure: density histograms; panel titles: initial mean; rb initial mean; norepl est; rb norepl est; norepl alt; rb nr alt; norepl alt0; rb nr alt0; horizontal-axis variables: y0, rbw0vec, norepl.est, rbnoreplvec, noreplalt, rbnraltvec, noreplalt0, rbnraltvec0)
Estimating IDU use, weighted links design (figure: density histograms; panel titles: initial mean; rb initial mean; norepl est; rb norepl est; norepl alt; rb nr alt; norepl alt0; rb nr alt0; horizontal-axis variables: y0, rbw0vec, norepl.est, rbnoreplvec, noreplalt, rbnraltvec, noreplalt0, rbnraltvec0)
Degree distribution, HIV/AIDS study (figure panels: degree distribution; sample degree distribution; each shown as frequency against degree and as log(freq) against log(degree))
Estimating mean degree (figure: density histograms; panel titles: average degree in sample; estimate of mean degree; horizontal-axis variables: degree, rbw0vec)
Spatial-Temporal designs
For detecting releases of biological pathogens and other airborne health hazards, is it better to set out sensors in fixed positions or to have them move in some pattern?
The more basic sampling question:
What is the best design for sampling a population that is changing, when the sampling units themselves may move as observations are collected?
Background
Early health warning of exposure to airborne biological pathogens
Bionet Project places fixed sensor units in selected cities
Builds on the Environmental Protection Agency’s air quality monitoring program
Purpose:
• Rapid health response: earlier diagnosis and treatment
• Environmental remediation
“Streets and avenues design”
Array of rectangular (square) paths
Sample size, moving units (figure: prob detect against n, for n from 4 to 16; curves labeled one line, streets, lines, squares)
Sample size, fixed units (figure: prob detect against n, for n from 10 to 60; curves labeled fixed, moving)
Adaptive designs in space-time-network settings