Non-linear panel data modeling

Laura Magazzini
University of Verona
http://dse.univr.it/magazzini

May 2010

Binary models...

In many economic studies, the dependent variable is discrete
- car purchase, labor force participation, default on a loan, ...

Binary choice modeling: $y_{it} = 1$ if the event happens for individual (household, firm, ...) $i$ at time $t$, 0 otherwise
$$p_{it} = \Pr(y_{it} = 1) = E(y_{it} \mid x_{it}) = F(x_{it}'\beta)$$
For estimation: LPM, logit, probit

... in panel data

The presence of individual effects complicates matters significantly
- LPM also implies $-x_{it}'\beta \le c_i \le 1 - x_{it}'\beta$

In a latent variable framework ($i = 1, \ldots, N$; $t = 1, \ldots, T$)
$$y_{it}^* = x_{it}'\beta + c_i + u_{it}$$
with $y_{it} = 1$ if $y_{it}^* > 0$, $y_{it} = 0$ otherwise
Therefore:
$$\Pr(y_{it} = 1) = \Pr(y_{it}^* > 0) = \Pr(u_{it} > -x_{it}'\beta - c_i) = F(x_{it}'\beta + c_i)$$
where the last equality is due to the symmetry of the logit and probit distributions
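A minimal simulation sketch of the latent variable framework, under the probit link. The scalar regressor, the normal distribution of $c_i$, and the names `simulate_panel` and `response_prob` are illustrative assumptions, not part of the slides.

```python
import random
from statistics import NormalDist

def simulate_panel(n_units, n_periods, beta, sigma_c, seed=0):
    """Draw rows (i, t, x_it, y_it) from y*_it = x_it*beta + c_i + u_it."""
    rng = random.Random(seed)
    rows = []
    for i in range(n_units):
        c_i = rng.gauss(0.0, sigma_c)          # individual effect, constant over t
        for t in range(n_periods):
            x_it = rng.gauss(0.0, 1.0)         # scalar regressor (assumption)
            u_it = rng.gauss(0.0, 1.0)         # standard normal error: probit
            y_star = x_it * beta + c_i + u_it  # latent variable
            rows.append((i, t, x_it, 1 if y_star > 0 else 0))
    return rows

def response_prob(x_it, c_i, beta):
    """Pr(y_it = 1 | x_it, c_i) = Phi(x_it*beta + c_i)."""
    return NormalDist().cdf(x_it * beta + c_i)

rows = simulate_panel(n_units=500, n_periods=4, beta=1.0, sigma_c=1.0)
share_ones = sum(y for _, _, _, y in rows) / len(rows)
```

The symmetry used in the derivation shows up as `response_prob(x, c, b) + response_prob(-x, -c, b) == 1`.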

FE and RE approach

$$\Pr(y_{it} = 1) = F(x_{it}'\beta + c_i)$$

RE approach: $c_i$ is assumed to be unrelated to $x_{it}$
- Stronger assumption than in the linear case: it also places restrictions on the form of the heterogeneity

FE approach: no assumption about the relationship between $c_i$ and $x_{it}$
Modeling framework fraught with difficulties and unconventional estimation problems

The incidental parameter problem Neyman and Scott (1948)

$$\Pr(y_{it} = 1) = F(x_{it}'\beta + c_i)$$
If you want to treat $c_i$ as a fixed parameter, then as $N \to \infty$, for fixed $T$, the number of parameters $c_i$ increases with $N$
This means that $c_i$ cannot be consistently estimated for fixed $T$
In the linear case the problem is solved using the within-transformation
- In the linear case the MLEs of $\beta$ and $c_i$ are asymptotically independent (Hsiao, 2003)

This is not possible in the non-linear case!
The inconsistency of $\hat{c}_i$ is transmitted to $\hat{\beta}$ within a FE framework

The incidental parameter problem A simple example

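Suppose $y_{it} \sim N(c_i, \sigma^2)$
MLE yields: $\hat{c}_i = \bar{y}_i$ and
$$\hat{\sigma}^2 = \frac{\sum_{i=1}^{N} \sum_{t=1}^{T} (y_{it} - \bar{y}_i)^2}{NT}$$
$E[\hat{\sigma}^2] = \sigma^2 (T - 1)/T$, so $\hat{\sigma}^2$ is inconsistent as $N \to \infty$ for fixed $T$
- With $T = 2$, $\hat{\sigma}^2 \to 0.5\,\sigma^2$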
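The bias factor $(T-1)/T$ can be checked with a small Monte Carlo sketch. The distribution of the fixed effects (normal with standard deviation 3) and the function name `sigma2_mle` are arbitrary assumptions made only for this illustration.

```python
import random

def sigma2_mle(n_units, n_periods, sigma, seed=0):
    """MLE of sigma^2 when each unit has its own mean c_i estimated by y_bar_i."""
    rng = random.Random(seed)
    ss = 0.0
    for _ in range(n_units):
        c_i = rng.gauss(0.0, 3.0)  # arbitrary fixed effect (assumption)
        y = [c_i + rng.gauss(0.0, sigma) for _ in range(n_periods)]
        y_bar = sum(y) / n_periods             # MLE of c_i
        ss += sum((v - y_bar) ** 2 for v in y)
    return ss / (n_units * n_periods)          # divides by NT, not N(T - 1)

est_T2 = sigma2_mle(n_units=20000, n_periods=2, sigma=1.0)    # close to 0.5
est_T10 = sigma2_mle(n_units=20000, n_periods=10, sigma=1.0)  # close to 0.9
```

Increasing `n_units` does not move `est_T2` toward 1; only a larger `n_periods` shrinks the bias.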

Road map

Pooled probit
Random effect approach
Fixed effect approach
- How serious is the bias?

Alternative approach: max score estimator

Pooled probit or logit (1) Partial likelihood methods

Max Lik estimation: we assume that the parametric model for the density of $y$ given $x$ is correctly specified
Inference is made under the assumption that observations are i.i.d., i.e. in case of a panel dataset, the likelihood should be written as
$$L(\theta \mid y, x) = \prod_{i=1}^{N} \prod_{t=1}^{T} f(y_{it} \mid x_{it}; \theta)$$

This is not suited to the panel data case!

Pooled probit or logit (2) Partial likelihood methods

Suppose we have correctly specified the density of $y_t$ given $x_t$: $f_t(y_t \mid x_t; \theta)$
Define the partial log likelihood of each observation as
$$l_i(\theta) = \sum_{t=1}^{T} \ln f_t(y_{it} \mid x_{it}; \theta)$$
The partial maximum likelihood estimator (PMLE) solves
$$\max_{\theta \in \Theta} \sum_{i=1}^{N} l_i(\theta) = \max_{\theta \in \Theta} \sum_{i=1}^{N} \sum_{t=1}^{T} \ln f_t(y_{it} \mid x_{it}; \theta)$$
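As a rough sketch, the partial log likelihood for a pooled probit with a single scalar regressor can be written as a double sum over units and periods. The toy data, the grid search standing in for a proper optimizer, and the function names are assumptions made for illustration.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf

def pooled_probit_loglik(beta, panel):
    """Sum over i and t of ln f_t(y_it | x_it; beta) for a probit.
    panel: list of units, each a list of (x_it, y_it) with scalar x_it."""
    total = 0.0
    for unit in panel:
        for x_it, y_it in unit:
            q = 2 * y_it - 1                     # +1 if y_it = 1, -1 if y_it = 0
            total += math.log(PHI(q * x_it * beta))
    return total

def pooled_probit_fit(panel, grid):
    """Crude PMLE: pick the grid value with the highest partial log likelihood."""
    return max(grid, key=lambda b: pooled_probit_loglik(b, panel))

# Toy panel with N = 4 units and T = 2 periods (made-up numbers).
panel = [[(1.0, 1), (-0.5, 0)], [(0.2, 0), (1.5, 1)],
         [(-1.0, 0), (0.8, 1)], [(-0.3, 1), (0.4, 0)]]
grid = [k / 10 for k in range(-30, 31)]
b_hat = pooled_probit_fit(panel, grid)
```

The point estimate ignores the within-unit dependence; it is the standard errors that need care unless the model is dynamically complete.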

Dynamic completeness

A model is dynamically complete if, once $x_t$ is conditioned on, neither past lags of $y_t$ nor elements of $x$ from any other time period (past or future) appear in the conditional density of $y_t$ given $x_t$
Quite a strong assumption: strict exogeneity + absence of dynamics
$$\Pr(y_{it} = 1 \mid x_{it}, y_{i,t-1}, x_{i,t-1}, \ldots) = \Pr(y_{it} = 1 \mid x_{it})$$
Inference is considerably easier: all the usual statistics from a probit or logit that pools observations and treats the sample as a long independent cross section of size $NT$ are valid, including likelihood ratio statistics
We are not assuming independence across $t$
- For example, $x_{it}$ can contain lagged dependent variables

DC implies that the scores are serially uncorrelated across $t$ (the key condition for the standard inference procedures to be valid)

Testing dynamic completeness

(1) Add lagged dependent variable and possibly lagged explanatory variables
(2) Chi-square statistic:
Define $u_{it} = y_{it} - F(x_{it}'\beta)$
Under DC, for each $t$: $E[u_{it} \mid x_{it}, y_{i,t-1}, x_{i,t-1}, \ldots] = 0$, i.e. $u_{it}$ is uncorrelated with any function of the variables $(x_{it}, y_{i,t-1}, x_{i,t-1}, \ldots)$ including $u_{i,t-1}$
Let $\hat{u}_{it} = y_{it} - F(x_{it}'\hat{\beta})$. A simple test is available by using pooled data to estimate the artificial model
$$\Pr(y_{it} = 1 \mid x_{it}, \hat{u}_{i,t-1}) = F(x_{it}'\beta + \gamma \hat{u}_{i,t-1})$$
The null hypothesis is $H_0: \gamma = 0$

Random effect probit approach (1)

$y_{it}^* = x_{it}'\beta + \varepsilon_{it}$ with $y_{it} = 1[y_{it}^* > 0]$

We let $\varepsilon_{it} = c_i + u_{it}$ and assume:
- Strict exogeneity assumption: $\Pr(y_{it} = 1 \mid x_i, c_i) = \Pr(y_{it} = 1 \mid x_{it}, c_i)$
- Independence between $c_i$ and $x_{it}$
- Normally distributed error components: $c_i \sim N(0, \sigma_c^2)$ and $u_{it} \sim N(0, \sigma_u^2)$

Since $E(\varepsilon_{it}\varepsilon_{is}) = \sigma_c^2$ for $t \ne s$, the joint likelihood of $(y_{i1}, \ldots, y_{iT})$ cannot be written as the product of the marginal likelihoods of the $y_{it}$
This complicates derivation of the likelihood, which now involves $T$-dimensional integrals
$$L_i = \Pr(y_{i1}, \ldots, y_{iT} \mid x) = \int \cdots \int f(\varepsilon_{i1}, \ldots, \varepsilon_{iT})\, d\varepsilon_{i1} \cdots d\varepsilon_{iT}$$
- The range of integration for $\varepsilon_{it}$ is $(-\infty, -x_{it}'\beta)$ if $y_{it} = 0$ and $(-x_{it}'\beta, +\infty)$ if $y_{it} = 1$

Random effect probit approach (2)

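Computation of the likelihood function is simplified if we consider the joint density of $\varepsilon_i$ and $c_i$ and then obtain the marginal density of $\varepsilon_i$ by integrating out the individual effect:
$$f(\varepsilon_{i1}, \ldots, \varepsilon_{iT}, c_i) = f(\varepsilon_{i1}, \ldots, \varepsilon_{iT} \mid c_i)\, f(c_i)$$
Therefore:
$$f(\varepsilon_{i1}, \ldots, \varepsilon_{iT}) = \int f(\varepsilon_{i1}, \ldots, \varepsilon_{iT} \mid c_i)\, f(c_i)\, dc_i$$
Conditional on $c_i$, the $\varepsilon_{it}$ are independent
$$f(\varepsilon_{i1}, \ldots, \varepsilon_{iT}) = \int \prod_{t=1}^{T} f(\varepsilon_{it} \mid c_i)\, f(c_i)\, dc_i$$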

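Random effect probit approach (3)

This simplifies the computation of the likelihood
- Key: lack of autocorrelation over time in $u_{it}$
- Allowing autocorrelation in $u_{it}$: hallmark of simulation methods (Hajivassiliou, 1984)

$$
\begin{aligned}
L_i &= \Pr(y_{i1}, \ldots, y_{iT} \mid x) = \int \cdots \int f(\varepsilon_{i1}, \ldots, \varepsilon_{iT})\, d\varepsilon_{i1} \cdots d\varepsilon_{iT} \\
&= \int \cdots \int \left[ \int_{-\infty}^{+\infty} \prod_{t=1}^{T} f(\varepsilon_{it} \mid c_i)\, f(c_i)\, dc_i \right] d\varepsilon_{i1} \cdots d\varepsilon_{iT}
\end{aligned}
$$
Ranges of integration are independent: exchange the order of integration
$$= \int_{-\infty}^{+\infty} \left[ \int \cdots \int \prod_{t=1}^{T} f(\varepsilon_{it} \mid c_i)\, d\varepsilon_{i1} \cdots d\varepsilon_{iT} \right] f(c_i)\, dc_i$$
Conditioned on $c_i$, the error terms are independent
$$= \int_{-\infty}^{+\infty} \left[ \prod_{t=1}^{T} \int f(\varepsilon_{it} \mid c_i)\, d\varepsilon_{it} \right] f(c_i)\, dc_i$$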

Random effect probit approach (4)

$$
\begin{aligned}
L_i &= \int_{-\infty}^{+\infty} \left[ \prod_{t=1}^{T} \int f(\varepsilon_{it} \mid c_i)\, d\varepsilon_{it} \right] f(c_i)\, dc_i \\
&= \int_{-\infty}^{+\infty} \left[ \prod_{t=1}^{T} \Pr(Y_{it} = y_{it} \mid x_{it}'\beta + c_i) \right] f(c_i)\, dc_i
\end{aligned}
$$
We are left with a one-dimensional integral!
$\Pr(Y_{it} = y_{it} \mid x_{it}'\beta + c_i) = \Phi(q_{it}(x_{it}'\beta + c_i))$ with $q_{it} = 2y_{it} - 1$
Butler and Moffitt (1982) propose a procedure to approximate the integral under normality of $c_i$ (Gaussian quadrature)
Alternatively, simulated maximum likelihood methods
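To make the one-dimensional integral concrete, the sketch below evaluates $L_i$ numerically. Note the substitution: it uses a plain trapezoid rule on a fixed grid rather than the Gauss-Hermite quadrature of Butler and Moffitt, and it takes $\sigma_u = 1$; the function name and the grid settings are assumptions.

```python
import math
from itertools import product
from statistics import NormalDist

STD = NormalDist()

def re_probit_contribution(y, xb, sigma_c, n_grid=2001, width=8.0):
    """L_i = integral over c of prod_t Phi(q_it * (xb_it + c)) * f(c).
    y: list of 0/1 outcomes; xb: list of index values x_it'beta."""
    dens = NormalDist(0.0, sigma_c)
    lo = -width * sigma_c
    step = 2.0 * width * sigma_c / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        c = lo + k * step
        inner = 1.0
        for y_it, xb_it in zip(y, xb):
            q = 2 * y_it - 1
            inner *= STD.cdf(q * (xb_it + c))   # Pr(Y_it = y_it | xb_it + c)
        w = 0.5 if k in (0, n_grid - 1) else 1.0  # trapezoid end weights
        total += w * inner * dens.pdf(c) * step
    return total

xb = [0.3, -0.4, 1.1]
# The contributions of all 2^T outcome sequences should sum to one.
total_prob = sum(re_probit_contribution(list(y), xb, 1.0)
                 for y in product([0, 1], repeat=3))
```

With $T = 1$ the integral has the closed form $\Phi(x'\beta / \sqrt{1 + \sigma_c^2})$, which gives a convenient accuracy check.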

RE – What are we estimating?

$$\Pr(y_{it} = 1 \mid x_i, c_i) = \Pr(y_{it} = 1 \mid x_{it}, c_i) = \Phi(x_{it}'\beta + c_i)$$
The interest is in the average partial effect
$$E\left[ \frac{\partial \Pr(y_{it} = 1 \mid x_i, c_i)}{\partial x_{it(j)}} \right] = \frac{\beta_j}{\sqrt{1 + \sigma_c^2}}\, \phi\!\left( \frac{x_{it}'\beta}{\sqrt{1 + \sigma_c^2}} \right)$$

The traditional random effect probit model assumes $c_i \mid x_i \sim N(0, \sigma_c^2)$
- As a result the composite error term of the latent equation has variance $1 + \sigma_c^2$
- Recall APE in the case of neglected heterogeneity (for a continuous $x_{it(j)}$)
- Therefore by pooled probit we can estimate $\beta_c = \beta/(1 + \sigma_c^2)^{1/2}$ and the APE

If we further assume independence of $(y_{i1}, \ldots, y_{iT})$ conditional on $(x_i, c_i)$ we can separately estimate $\beta$ and $\sigma_c^2$
- $\rho = \sigma_c^2/(1 + \sigma_c^2)$: relative importance of the unobserved effect
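The APE formula can be checked numerically: average $\Phi(x'\beta + c)$ over $c \sim N(0, \sigma_c^2)$ and differentiate with respect to $x$. This is only a sketch with a scalar regressor; the parameter values, the trapezoid integration, and the function names are illustrative assumptions.

```python
import math
from statistics import NormalDist

STD = NormalDist()

def ape_formula(beta_j, xb, sigma_c):
    """beta_j / sqrt(1 + sigma_c^2) * phi(xb / sqrt(1 + sigma_c^2))."""
    s = math.sqrt(1.0 + sigma_c ** 2)
    return beta_j / s * STD.pdf(xb / s)

def averaged_prob(xb, sigma_c, n_grid=2001, width=8.0):
    """E_c[Phi(xb + c)] for c ~ N(0, sigma_c^2), by the trapezoid rule."""
    dens = NormalDist(0.0, sigma_c)
    lo = -width * sigma_c
    step = 2.0 * width * sigma_c / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        c = lo + k * step
        w = 0.5 if k in (0, n_grid - 1) else 1.0
        total += w * STD.cdf(xb + c) * dens.pdf(c) * step
    return total

beta, sigma_c, x, h = 1.2, 0.8, 0.5, 1e-4
# Central finite difference of the averaged response probability in x.
numeric = (averaged_prob((x + h) * beta, sigma_c)
           - averaged_prob((x - h) * beta, sigma_c)) / (2 * h)
closed = ape_formula(beta, x * beta, sigma_c)
rho = sigma_c ** 2 / (1 + sigma_c ** 2)   # share of variance due to c_i
```

The two numbers `numeric` and `closed` agree to several decimals, which is the sense in which pooled probit recovers the scaled coefficient $\beta_c$ and the APE.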

RE – allowing for correlation between ci and xi Chamberlain (1980)

Chamberlain (1980) allowed for correlation between $c_i$ and $x_i$ under the assumption of a conditional normal distribution with linear expectation and constant variance: $c_i \mid x_i \sim N(\psi + \bar{x}_i'\xi, \sigma_\alpha^2)$
- The approach allows some dependence of $c_i$ on $x_i$
- In its original formulation, all elements of $x_i$ are included in the conditional distribution
- The proposed formulation is more conservative on parameters
- Known as Chamberlain's random effect probit model

Chamberlain's random effect probit model

$$c_i \mid x_i \sim N(\psi + \bar{x}_i'\xi, \sigma_\alpha^2)$$

We can write
$$y_{it}^* = x_{it}'\beta + c_i + u_{it} = x_{it}'\beta + \psi + \bar{x}_i'\xi + a_i + u_{it}$$
where $a_i = c_i - \psi - \bar{x}_i'\xi \sim N(0, \sigma_\alpha^2)$ is the part of $c_i$ not explained by $\bar{x}_i$
Estimation is straightforward: we include $\bar{x}_i$ among the regressors of a RE probit model
As in the linear case, it is not possible to estimate the effect of time-invariant variables
Intuitively, we are adding $\bar{x}_i$ as a control for unobserved heterogeneity
A test of the usual RE probit model is easily obtained as a test of $H_0: \xi = 0$
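The data step behind this approach is simply to append the unit-level time averages $\bar{x}_i$ to each row before fitting the RE probit. A minimal sketch, assuming the panel is stored as a dictionary of rows per unit (the layout and the name `add_unit_means` are hypothetical):

```python
def add_unit_means(panel):
    """panel: dict unit_id -> list of regressor rows x_it (lists of floats).
    Returns dict unit_id -> list of rows x_it followed by x_bar_i."""
    out = {}
    for unit, rows in panel.items():
        k = len(rows[0])
        # Time average of each regressor within the unit.
        x_bar = [sum(r[j] for r in rows) / len(rows) for j in range(k)]
        out[unit] = [list(r) + x_bar for r in rows]
    return out

panel = {"a": [[1.0, 2.0], [3.0, 6.0]], "b": [[0.0, 1.0]]}
augmented = add_unit_means(panel)
```

A time-invariant regressor equals its own unit mean, so its column would be duplicated exactly, which is why its effect cannot be estimated separately.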

RE & the strict exogeneity assumption Wooldridge, 2003 (pag. 490)

RE estimation relies on the strict exogeneity assumption
Correcting for an explanatory variable that is not strictly exogenous is quite difficult in nonlinear models (see Wooldridge, 2000)
It is however possible to test for strict exogeneity:
- Let $w_{it}$ denote a variable suspected of failing the strict exogeneity requirement (subset of $x_{it}$)
- A simple test adds $w_{i,t+1}$ as an additional set of covariates
- If strict exogeneity holds, $w_{i,t+1}$ should be insignificant
- If the test does not reject, it provides at least some justification for the strict exogeneity assumption
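Constructing the lead $w_{i,t+1}$ is the only non-standard step of this test. The sketch below assumes the suspect variable is stored per unit as a mapping from period to value; the layout and the name `add_lead` are hypothetical. The last observed period of each unit (and any period before a gap) is dropped because its lead is not available.

```python
def add_lead(series):
    """series: dict unit -> {t: w_it}.
    Returns rows (unit, t, w_it, w_i,t+1) for periods whose successor exists."""
    rows = []
    for unit, by_period in series.items():
        for t in sorted(by_period):
            if t + 1 in by_period:               # lead must be observed
                rows.append((unit, t, by_period[t], by_period[t + 1]))
    return rows

series = {"a": {1: 0.5, 2: 0.7, 3: 0.1}, "b": {1: 2.0, 3: 4.0}}
rows = add_lead(series)
```

The fourth element of each row would then enter the model as an extra covariate whose coefficient is tested against zero.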

Fixed Effect approach

FE in non-linear models: unsolved problem in econometrics
Incidental parameter problem
If you force estimation (by including dummies), how serious is the bias?
- Consider a logit model with $T = 2$ and one regressor with $x_{i1} = 0$ and $x_{i2} = 1$: $\text{plim}\, \hat{\beta}_{MLE} = 2\beta$ (Hsiao, 2003)
- Simulation experiment by Greene (2004): the MLE is biased even for large $T$, although the bias shrinks as $T$ increases. The bias is 100% with $T = 2$, 16% with $T = 10$ and 6.9% with $T = 20$ ($N = 1000$)
- Simulation experiment by Heckman and MaCurdy (1981): the bias is about 10% ($N = 100$; $T = 8$)
- Trade-off between the virtue of FE and the incidental parameter problem (Arellano, 2001)

The problem can be solved in the logit (and Poisson) models

Conditional maximum likelihood estimation (1)

For the logit model, Chamberlain (1980) finds that $\sum_{t=1}^{T} y_{it}$ is a minimal sufficient statistic for $c_i$
- Put differently, conditioned on $n_i = \sum_{t=1}^{T} y_{it}$, the log-likelihood does not contain $c_i$, solving the incidental parameter problem

Consider the case $T = 2$
The conditional likelihood can be computed by looking at
$$L_c = \prod_{i=1}^{N} \Pr\left( y_{i1}, y_{i2} \,\Big|\, \sum_{t=1}^{2} y_{it} \right)$$
The sum $y_{i1} + y_{i2}$ can be 0, 1, 2
- If $y_{i1} + y_{i2} = 0$, then $y_{i1} = y_{i2} = 0$: $\Pr(0, 0 \mid \text{sum} = 0) = 1$
- If $y_{i1} + y_{i2} = 2$, then $y_{i1} = y_{i2} = 1$: $\Pr(1, 1 \mid \text{sum} = 2) = 1$
- Only units where $y_{i1} + y_{i2} = 1$ will contribute to the log-likelihood

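Conditional maximum likelihood estimation (2)

$$\Pr(0, 1 \mid \text{sum} = 1) = \frac{\Pr(0, 1, \text{sum} = 1)}{\Pr(\text{sum} = 1)} = \frac{\Pr(0, 1)}{\Pr(0, 1) + \Pr(1, 0)}$$

Therefore the conditional probability can be written in a form that does not contain $c_i$:

$$
\begin{aligned}
\Pr(0, 1 \mid 1) &= \frac{\Pr_1(0) \times \Pr_2(1)}{\Pr_1(0) \times \Pr_2(1) + \Pr_1(1) \times \Pr_2(0)} \\
&= \frac{\dfrac{1}{1 + \exp(x_{i1}'\beta + c_i)} \dfrac{\exp(x_{i2}'\beta + c_i)}{1 + \exp(x_{i2}'\beta + c_i)}}{\dfrac{1}{1 + \exp(x_{i1}'\beta + c_i)} \dfrac{\exp(x_{i2}'\beta + c_i)}{1 + \exp(x_{i2}'\beta + c_i)} + \dfrac{\exp(x_{i1}'\beta + c_i)}{1 + \exp(x_{i1}'\beta + c_i)} \dfrac{1}{1 + \exp(x_{i2}'\beta + c_i)}} \\
&= \frac{\exp(x_{i2}'\beta)}{\exp(x_{i1}'\beta) + \exp(x_{i2}'\beta)} = \frac{\exp[(x_{i2} - x_{i1})'\beta]}{1 + \exp[(x_{i2} - x_{i1})'\beta]}
\end{aligned}
$$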

Conditional maximum likelihood estimation (3)

Analogously:
$$\Pr(1, 0 \mid 1) = \frac{\exp(x_{i1}'\beta)}{\exp(x_{i1}'\beta) + \exp(x_{i2}'\beta)} = \frac{1}{1 + \exp[(x_{i2} - x_{i1})'\beta]}$$

A standard logit package can be used for estimation
Only observations where $y_{i1} + y_{i2} = 1$ contribute to the likelihood
Easily generalized to $T > 2$
Test for individual heterogeneity by Hausman's test comparing the conditional MLE and the usual MLE ignoring the effects
- Conditional likelihood approach not available with probit
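A small sketch of the $T = 2$ case with a scalar regressor: the first part confirms numerically that the conditional probability does not depend on $c_i$, the second fits the conditional logit by Newton-Raphson on the switchers only. The function names and the made-up counts (30 units switching 0 to 1, 10 switching 1 to 0, all with $x_{i1} = 0$, $x_{i2} = 1$) are assumptions.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def joint_prob(y1, y2, x1, x2, beta, c):
    """Pr(y_i1, y_i2 | x_i, c_i) under the logit model."""
    p1, p2 = logistic(x1 * beta + c), logistic(x2 * beta + c)
    return (p1 if y1 else 1 - p1) * (p2 if y2 else 1 - p2)

def cond_prob_01(x1, x2, beta, c):
    """Pr(0, 1 | y_i1 + y_i2 = 1), computed from the joint probabilities."""
    num = joint_prob(0, 1, x1, x2, beta, c)
    return num / (num + joint_prob(1, 0, x1, x2, beta, c))

def cond_logit_fit(switchers, n_iter=50):
    """switchers: list of (dx, d), dx = x_i2 - x_i1, d = 1 for (0,1), 0 for (1,0).
    Newton-Raphson for a logit of d on dx without intercept."""
    beta = 0.0
    for _ in range(n_iter):
        p = [logistic(dx * beta) for dx, _ in switchers]
        score = sum((d - pi) * dx for (dx, d), pi in zip(switchers, p))
        hess = -sum(pi * (1 - pi) * dx * dx for (dx, _), pi in zip(switchers, p))
        beta -= score / hess
    return beta

switchers = [(1.0, 1)] * 30 + [(1.0, 0)] * 10
beta_hat = cond_logit_fit(switchers)
```

In this design the estimate reduces to the log of the ratio of the two switcher counts, here $\ln 3$.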

Max score estimator (MSE) Manski (1975, 1987)

It is possible to relax the logit assumption by generalizing the MSE to panel data
In cross-section let $q_i = 2y_i - 1$ and $\alpha$ a preset quantile
MSE is based on the fitting rule
$$\max_{\beta} S(\beta) = \frac{1}{N} \sum_{i=1}^{N} [q_i - (1 - 2\alpha)]\, \text{sgn}(x_i'\beta)$$
If $\alpha = 1/2$ then $(1 - 2\alpha) = 0$ and the MSE is computed as
$$\max_{\beta} S(\beta) = \frac{1}{N} \sum_{i=1}^{N} q_i\, \text{sgn}(x_i'\beta)$$
It maximizes the number of times the predictor $x_i'\beta$ has the same sign as $q_i$ (i.e. it maximizes the number of correct predictions)
Identification condition: $\beta'\beta = 1$
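The score function is a step function of $\beta$, so gradient methods do not apply. The sketch below evaluates $S(\beta)$ and, for two regressors only, searches over directions on the unit circle so that $\beta'\beta = 1$ holds by construction. The angle grid, the toy data, and the function names are assumptions; practical implementations use more refined search algorithms.

```python
import math

def sgn(z):
    return (z > 0) - (z < 0)

def score(beta, data, alpha=0.5):
    """S(beta) = (1/N) * sum_i [q_i - (1 - 2*alpha)] * sgn(x_i'beta).
    data: list of (x_i, y_i) with x_i a tuple of regressors."""
    total = 0.0
    for x, y in data:
        q = 2 * y - 1
        index = sum(xj * bj for xj, bj in zip(x, beta))
        total += (q - (1 - 2 * alpha)) * sgn(index)
    return total / len(data)

def max_score_fit(data, n_angles=360):
    """Grid search over beta = (cos a, sin a), which imposes beta'beta = 1."""
    best_s, best_b = None, None
    for k in range(n_angles):
        a = 2 * math.pi * k / n_angles
        b = (math.cos(a), math.sin(a))
        s = score(b, data)
        if best_s is None or s > best_s:
            best_s, best_b = s, b
    return best_s, best_b

# Toy data in which the sign of x1 + x2 predicts y perfectly.
data = [((1, 1), 1), ((2, 0.5), 1), ((-1, -1), 0),
        ((-0.5, -2), 0), ((1, -0.5), 1), ((-1, 0.5), 0)]
best_s, best_b = max_score_fit(data)
```

Several directions typically attain the same maximal score in a finite sample, so the maximizer is a set rather than a point.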

Manski estimator with panel data

Manski allows for a strictly increasing distribution function which differs across individuals, but not over time for the same individual
Strict exogeneity is still needed (lagged dependent variables are ruled out)
For $T = 2$, the identification of $\beta$ is based on the fact that (under regularity conditions on the distribution of exogenous variables)
$$\text{sgn}[\Pr(y_{i2} = 1 \mid x_i, c_i) - \Pr(y_{i1} = 1 \mid x_i, c_i)] = \text{sgn}[(x_{i2} - x_{i1})'\beta]$$
For panel data, the MSE can be applied to the differences $\Delta y_{it}$ on $\Delta x_{it}$
Exploit only the observations where $y_{i1} \ne y_{i2}$
Note that there is no likelihood, no information matrix, no s.e.: the bootstrap can be employed for computing s.e.
No functional form for $\Pr(y_{it} = 1)$, therefore no marginal effects

Overview of STATA commands

For probit, xtprobit only allows the re approach
- There is no command for a conditional FE model, as there does not exist a sufficient statistic allowing the fixed effects to be conditioned out of the likelihood
- Estimation is slow because the likelihood function is calculated by adaptive Gauss-Hermite quadrature
- Computation time is roughly proportional to the number of points used for the quadrature; the default is intpoints(12)
- Use quadchk to check sensitivity of quadrature approximation

In the case of xtlogit, both re and fe options can be considered
- fe is conditional fixed-effect (also obtained by clogit)
- re estimates are obtained under the assumption of normality of $c_i$

MSE not implemented (feasible up to 15 coeffs and 1,500-2,000 observations)

Censored regression model Unobserved effect Tobit model

$$
\begin{aligned}
y_{it}^* &= x_{it}'\beta + c_i + u_{it} \\
y_{it} &= \max(0, y_{it}^*) \\
u_{it} \mid x_i, c_i &\sim N(0, \sigma_u^2)
\end{aligned}
$$

Analogous treatment to the probit case
FE approach provides inconsistent estimates
RE – Need to specify the distribution of $c_i \mid x_i$: $c_i \mid x_i \sim N(0, \sigma_c^2)$
Approximation is needed for solving the integral in the "probit" part
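Conditional on $c_i$, the Tobit likelihood of a unit is a product of normal densities for the uncensored periods and normal probabilities for the censored ones. The sketch below computes its logarithm for a given value of $c_i$; the function name is an assumption, and the RE likelihood would still require integrating the exponential of this quantity against the density of $c_i$.

```python
import math
from statistics import NormalDist

STD = NormalDist()

def tobit_loglik_given_c(y, xb, c, sigma_u):
    """Sum over t of ln f(y_it | x_it, c_i = c) with y_it = max(0, y*_it).
    y: observed outcomes; xb: index values x_it'beta."""
    total = 0.0
    for y_it, xb_it in zip(y, xb):
        mean = xb_it + c
        if y_it > 0:
            # Uncensored: normal density of y_it around x_it'beta + c.
            total += math.log(STD.pdf((y_it - mean) / sigma_u) / sigma_u)
        else:
            # Censored at zero: Pr(y*_it <= 0), the "probit" part.
            total += math.log(STD.cdf(-mean / sigma_u))
    return total

value = tobit_loglik_given_c(y=[1.0, 0.0], xb=[0.4, -0.6], c=0.6, sigma_u=1.0)
```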

Count data and panel data models

Leading ref: Hausman, Hall, Griliches (1984) – developed fixed and random effect models under full distributional assumptions
Pooled Poisson QMLE
Conditional estimation of fixed effect models
- Sufficient statistic: $n_i = \sum_{t=1}^{T} y_{it}$

Random effect approach (Gamma distribution assumed for $c_i$)
- Recent advances in simulation methods allow normally distributed $c_i$
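The role of the sufficient statistic can be illustrated numerically. The sketch assumes the usual multiplicative specification with mean $c_i \exp(x_{it}'\beta)$, which the slide does not spell out; under it, the distribution of $(y_{i1}, \ldots, y_{iT})$ given $n_i$ is multinomial and free of $c_i$. Function names are hypothetical.

```python
import math

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)

def cond_prob(y, xb, c):
    """Pr(y_i1, ..., y_iT | n_i) from Poisson pmfs with means c * exp(xb_t).
    Uses the fact that a sum of independent Poissons is Poisson."""
    lam = [c * math.exp(v) for v in xb]
    joint = math.prod(poisson_pmf(k, l) for k, l in zip(y, lam))
    return joint / poisson_pmf(sum(y), sum(lam))

def multinomial_prob(y, xb):
    """Multinomial probability with shares exp(xb_t) / sum_s exp(xb_s)."""
    w = [math.exp(v) for v in xb]
    coef = math.factorial(sum(y))
    for k in y:
        coef //= math.factorial(k)
    return coef * math.prod((wi / sum(w)) ** k for wi, k in zip(w, y))

y, xb = [2, 0, 1], [0.1, -0.3, 0.5]
p_small_c = cond_prob(y, xb, c=0.5)
p_large_c = cond_prob(y, xb, c=4.0)
p_multi = multinomial_prob(y, xb)
```

The three probabilities coincide up to rounding, whatever value of `c` is used.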

Main references

Arellano M, Honoré B (2001): "Panel Data Models: Some Recent Developments", Handbook of Econometrics
Chamberlain G (1980): "Analysis of Covariance with Qualitative Data", Review of Economic Studies 47, 225–238
Chamberlain G (1984): "Panel Data", in Griliches, Intriligator (eds) Handbook of Econometrics, 1247–1318
Greene WH (2003): Econometric Analysis, ch.21
Baltagi BH (2008): Econometric Analysis of Panel Data (4th ed.), ch.11
Hajivassiliou VA (1984): "Estimation by Simulation of External Debt Repayment Problems", Journal of Applied Econometrics 9, 109–132
Hsiao C (2003): Analysis of Panel Data
Wooldridge JM (2002): Econometric Analysis of Cross Section and Panel Data, ch.15
