Package 'mhurdle' reference manual

Title:	Multiple Hurdle Tobit Models
Description:	Estimation of models with dependent variable left-censored at zero. Null values may be caused by a selection process Cragg (1971) <doi:10.2307/1909582>, insufficient resources Tobin (1958) <doi:10.2307/1907382>, or infrequency of purchase Deaton and Irish (1984) <doi:10.1016/0047-2727(84)90067-7>.
Authors:	Yves Croissant [aut, cre], Fabrizio Carlevaro [aut], Stephane Hoareau [aut]
Maintainer:	Yves Croissant <[email protected]>
License:	GPL (>=2)
Version:	1.3-2
Built:	2025-02-13 03:12:25 UTC
Source:	https://github.com/ycroissant/mhurdle

Interview

Description

A cross-section from 2014.

Format

A data frame containing:

month: the month of the interview,
size: the number of persons in the household,
cu: the number of consumption units in the household,
income: the income of the household for the 12 months before the interview,
linc: the logarithm of the net income per consumption unit divided by its mean,
linc2: the square of linc,
smsa: does the household live in an SMSA (yes or no),
sex: the sex of the reference person of the household (male or female),
race: the race of the head of the household, one of white, black, indian, asian, pacific and multirace,
hispanic: is the reference person of the household Hispanic (no or yes),
educ: the number of years of education of the reference person of the household,
age: the age of the reference person of the household minus 50,
age2: the square of age,
car: cars in the household,
food: food,
alcool: ,
housing: ,
apparel: ,
transport: ,
health: ,
entertainment: ,
perscare: ,
reading: ,
education: ,
tobacco: ,
miscexp: ,
cashcont: ,
insurance: ,
shows: ,
foodaway: ,
vacations: .
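
The derived covariates linc, linc2, age and age2 follow from the definitions above. The sketch below rebuilds them on a small made-up data frame. The toy values, the `age_raw` column and the exact order of operations (income per consumption unit, divided by its sample mean, then logged) are assumptions made for illustration, not the code used to build the Interview data.

```r
# Toy households (made-up values, not taken from the Interview data)
toy <- data.frame(
  income  = c(32000, 58000, 21000, 96000),  # household income
  cu      = c(1.0, 1.8, 1.5, 2.1),          # consumption units
  age_raw = c(34, 50, 67, 45)               # age in years (hypothetical column)
)

# Income per consumption unit, divided by its sample mean, then logged
inc_cu    <- toy$income / toy$cu
toy$linc  <- log(inc_cu / mean(inc_cu))
toy$linc2 <- toy$linc^2

# Age centred on 50, and its square
toy$age  <- toy$age_raw - 50
toy$age2 <- toy$age^2

toy[, c("linc", "linc2", "age", "age2")]
```

Centring age on 50 and dividing income by its mean keep the squared terms on a moderate scale, which is a common way to ease numerical optimisation.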

Details

number of observations: 1000

observation: households

country: United States

Source

Consumer Expenditure Survey (CE), program of the US Bureau of Labor Statistics https://www.bls.gov/cex/, interview survey.

Estimation of limited dependent variable models

Description

mhurdle fits a large set of models relevant when the dependent variable is 0 for a part of the sample.

Usage

mhurdle(
  formula,
  data,
  subset,
  weights,
  na.action,
  start = NULL,
  dist = c("ln", "n", "bc", "ihs"),
  h2 = FALSE,
  scaled = TRUE,
  corr = FALSE,
  robust = TRUE,
  check_gradient = FALSE,
  ...
)

Arguments

`formula`	a symbolic description of the model to be fitted,
`data`	a `data.frame`,
`subset`	see `stats::lm()`,
`weights`	see `stats::lm()`,
`na.action`	see `stats::lm()`,
`start`	starting values,
`dist`	the distribution of the error of the consumption equation: one of `"n"` (normal), `"ln"` (log-normal), `"bc"` (Box-Cox normal) and `"ihs"` (inverse hyperbolic sine transformation),
`h2`	if `TRUE` the second hurdle is effective, it is not otherwise,
`scaled`	if `TRUE`, the dependent variable is divided by its geometric mean,
`corr`	a boolean indicating whether the errors of the different equations are correlated or not,
`robust`	if `TRUE`, the structural parameters are transformed in order to avoid numerical problems,
`check_gradient`	if `TRUE`, a matrix containing the analytical and the numerical gradient for the starting values is returned,
`...`	further arguments.
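
To illustrate what `scaled` and two of the `dist` choices refer to, here is a minimal base R sketch on a made-up expenditure vector. It assumes that the geometric mean is taken over the positive values only and uses the unparameterised inverse hyperbolic sine; the package's internal computations may differ.

```r
# Made-up expenditure vector with zeros (illustration only)
y   <- c(0, 0, 120, 45, 300, 0, 80)
pos <- y > 0

# Geometric mean of the positive values (assumption: zeros are left out,
# since including them would make the geometric mean zero)
geo_mean <- exp(mean(log(y[pos])))
y_scaled <- y / geo_mean

# Transformations related to two of the `dist` choices
log_y <- log(y_scaled[pos])  # log, defined for positive values only ("ln")
ihs_y <- asinh(y_scaled)     # inverse hyperbolic sine, defined at zero ("ihs")
```

After scaling, the positive values have a geometric mean of one, so their logarithms are centred on zero.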

Details

mhurdle fits models for which the dependent variable is zero for a part of the sample. Null values of the dependent variable may occur because of one or several mechanisms: good rejection, lack of resources and purchase infrequency. The model is described using a three-part formula: the first part describes the selection process, if any, the second part the regression equation and the third part the purchase infrequency process. y ~ 0 | x1 + x2 | z1 + z2 means that there is no selection process. y ~ w1 + w2 | x1 + x2 | 0 and y ~ w1 + w2 | x1 + x2 describe the same model with no purchase infrequency process. The second part is mandatory; it explains the positive values of the dependent variable. The dist argument indicates the distribution of the error term. If dist = "n", the error term is normal and (at least part of) the zero observations are also explained by the second part as the result of a corner solution. Several models described in the literature are obtained as special cases:

A model with a formula like y~0|x1+x2 and dist="n" is the Tobit model proposed by Tobin (1958).

y~w1+w2|x1+x2 and dist="l" or dist="t" is the single hurdle model proposed by Cragg (1971). With dist="n", the double hurdle model also proposed by Cragg (1971) is obtained. With corr="h1" we get the correlated version of this model described by Blundell and Meghir (1987).

y~0|x1+x2|z1+z2 is the P-Tobit model of Deaton and Irish (1984), which can be a single hurdle model if dist="t" or dist="l" or a double hurdle model if dist="n".
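
To make the three-part convention concrete, the sketch below splits a formula on `|` using base R only. `split_parts` is a hypothetical helper written for this illustration; it is not part of mhurdle, which stores the model formula as an object of class Formula.

```r
# Hypothetical helper: split the right-hand side of a multi-part formula on `|`
split_parts <- function(f) {
  rhs   <- f[[3]]
  parts <- list()
  # `a | b | c` parses as `(a | b) | c`, so peel parts off from the right
  while (is.call(rhs) && identical(rhs[[1]], as.name("|"))) {
    parts <- c(list(rhs[[3]]), parts)
    rhs   <- rhs[[2]]
  }
  parts <- c(list(rhs), parts)
  out <- vapply(parts, function(p) paste(deparse(p), collapse = " "),
                character(1))
  stopifnot(length(out) %in% 2:3)
  # An omitted third part is treated like an explicit 0
  if (length(out) == 2) out <- c(out, "0")
  setNames(out, c("selection", "regression", "infrequency"))
}

split_parts(y ~ 0 | x1 + x2 | z1 + z2)   # no selection process
split_parts(y ~ w1 + w2 | x1 + x2 | 0)   # no purchase infrequency process
split_parts(y ~ w1 + w2 | x1 + x2)       # same model, third part omitted
```

The last two calls return the same named vector, which mirrors the statement that both formulas describe the same model.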

Value

an object of class c("mhurdle", "maxLik").

A mhurdle object has the following elements:

coefficients: the vector of coefficients,
vcov: the covariance matrix of the coefficients,
fitted.values: a matrix of fitted values, the first column being the probability of 0 and the second one the mean values for the positive observations,
logLik: the log-likelihood,
gradient: the gradient at convergence,
model: a data.frame containing the variables used for the estimation,
coef.names: a list containing the names of the coefficients in the selection equation, the regression equation, the infrequency of purchase equation and the other coefficients (the standard deviation of the error term and the coefficient of correlation if corr = TRUE),
formula: the model formula, an object of class Formula,
call: the call,
rho: the Lagrange multiplier test of no correlation.

References

Blundell R, Meghir C (1987). “Bivariate Alternatives to the Tobit Model.” Journal of Econometrics, 34, 179-200.

Cragg JG (1971). “Some Statistical Models for Limited Dependent Variables with Applications for the Demand for Durable Goods.” Econometrica, 39(5), 829-44.

Deaton AS, Irish M (1984). “A Statistical Model for Zero Expenditures in Household Budgets.” Journal of Public Economics, 23, 59-80.

Tobin J (1958). “Estimation of Relationships for Limited Dependent Variables.” Econometrica, 26(1), 24-36.

Examples


data("Interview", package = "mhurdle")

# independent double hurdle model
idhm <- mhurdle(vacations ~ car + size | linc + linc2 | 0, Interview,
              dist = "ln", h2 = TRUE, method = "bfgs")

# dependent double hurdle model
ddhm <- mhurdle(vacations ~ car + size | linc + linc2  | 0, Interview,
              dist = "ln", h2 = TRUE, method = "bfgs", corr = TRUE)

# a double hurdle p-tobit model
ptm <- mhurdle(vacations ~ 0 | linc + linc2 | car + size, Interview,
              dist = "ln", h2 = TRUE, method = "bfgs", corr = TRUE)

Methods for mhurdle fitted objects

Description

Specific predict, fitted, coef, vcov, summary, ... methods for mhurdle objects. In particular, these methods make it possible to extract the different parts of the model.

Usage

## S3 method for class 'mhurdle'
coef(
  object,
  which = c("all", "h1", "h2", "h3", "h4", "sd", "corr", "tr", "pos"),
  ...
)

## S3 method for class 'mhurdle'
vcov(
  object,
  which = c("all", "h1", "h2", "h3", "h4", "sd", "corr", "tr", "pos"),
  ...
)

## S3 method for class 'mhurdle'
logLik(object, naive = FALSE, ...)

## S3 method for class 'mhurdle'
print(
  x,
  digits = max(3, getOption("digits") - 2),
  width = getOption("width"),
  ...
)

## S3 method for class 'mhurdle'
summary(object, ...)

## S3 method for class 'summary.mhurdle'
coef(
  object,
  which = c("all", "h1", "h2", "h3", "sd", "corr", "tr", "pos"),
  ...
)

## S3 method for class 'summary.mhurdle'
print(
  x,
  digits = max(3, getOption("digits") - 2),
  width = getOption("width"),
  ...
)

## S3 method for class 'mhurdle'
fitted(object, which = c("all", "zero", "positive"), mean = FALSE, ...)

## S3 method for class 'mhurdle'
predict(object, newdata = NULL, what = c("E", "Ep", "p"), ...)

## S3 method for class 'mhurdle'
update(object, new, ...)

## S3 method for class 'mhurdle'
nobs(object, which = c("all", "null", "positive"), ...)

## S3 method for class 'mhurdle'
effects(
  object,
  covariate = NULL,
  data = NULL,
  what = c("E", "Ep", "p"),
  reflevel = NULL,
  mean = FALSE,
  ...
)

Arguments

`object`, `x`	an object of class `"mhurdle"`,
`which`	which coefficients or covariances should be extracted? Those of the selection (`"h1"`), consumption (`"h2"`) or purchase (`"h3"`) equation, the other coefficients `"other"` (the standard error and the coefficient of correlation), the standard error (`"sigma"`) or the coefficient of correlation (`"rho"`),
`...`	further arguments,
`naive`	a boolean, if `TRUE`, the likelihood of the naive model is returned,
`digits`	see `print`,
`width`	see `print`,
`mean`	if `TRUE`, the mean of the effects is returned,
`newdata`, `data`	a `data.frame` for which the predictions or the effects should be computed,
`what`	for the `predict` and the `effects` method, the kind of prediction, one of `E`, `Ep` and `p` (respectively for expected values in the censored sample, expected values in the truncated sample and probability of positive values),
`new`	an updated formula for the `update` method,
`covariate`	the covariate for which the effect has to be computed,
`reflevel`	for the computation of effects for a factor, the reference level.
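
The calls below sketch how these methods might be combined on a fitted model. The model is the independent double hurdle model from the package examples, and only arguments listed in the Usage section above are used. Outputs are not shown because they depend on the fit.

```r
library(mhurdle)
data("Interview", package = "mhurdle")

# Independent double hurdle model, as in the package examples
idhm <- mhurdle(vacations ~ car + size | linc + linc2 | 0, Interview,
                dist = "ln", h2 = TRUE, method = "bfgs")

coef(idhm, which = "h1")         # selection equation only
vcov(idhm, which = "h2")         # covariance block of the consumption equation
nobs(idhm, which = "positive")   # number of positive observations
logLik(idhm, naive = TRUE)       # log-likelihood of the naive model

# Predictions for the first few households
new_obs <- head(Interview)
predict(idhm, newdata = new_obs, what = "p")  # probability of a positive value
predict(idhm, newdata = new_obs, what = "E")  # expected value, censored sample
```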

R squared and pseudo R squared

Description

This function computes the R squared for multiple hurdle models. The measure is a pseudo coefficient of determination or may be based on the likelihood.

Usage

rsq(
  object,
  type = c("coefdet", "lratio"),
  adj = FALSE,
  r2pos = c("rss", "ess", "cor")
)

Arguments

`object`	an object of class `"mhurdle"`,
`type`	one of `"coefdet"` or `"lratio"` to select a pseudo coefficient of determination or a McFadden-like measure based on the likelihood function,
`adj`	if `TRUE` a correction for the degrees of freedom is performed,
`r2pos`	only for pseudo coefficient of determination, should the positive part of the R squared be computed using the residual sum of squares (`"rss"`), the explained sum of squares (`"ess"`) or the coefficient of correlation between the fitted values and the response (`"cor"`).
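
As a reminder of what a McFadden-like likelihood measure looks like, the sketch below computes the textbook likelihood ratio index on a logit model fitted with `glm` to a built-in data set. It is a stand-in illustration, not the computation performed by `rsq`; the adjusted version shown is one common degrees-of-freedom correction and is an assumption here.

```r
# Binary outcome from a built-in data set (illustration only)
fit  <- glm(am ~ wt, family = binomial, data = mtcars)
null <- glm(am ~ 1,  family = binomial, data = mtcars)  # intercept-only model

ll_fit  <- as.numeric(logLik(fit))
ll_null <- as.numeric(logLik(null))
k       <- length(coef(fit))          # number of estimated coefficients

# Likelihood ratio index: 1 - lnL(model) / lnL(null)
r2_lratio <- 1 - ll_fit / ll_null

# One common adjustment: penalise the log-likelihood by k
r2_lratio_adj <- 1 - (ll_fit - k) / ll_null
```

The index is zero when the covariates do not improve on the intercept-only model and approaches one as the fitted log-likelihood approaches zero.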

Value

a numerical value

References

McFadden D (1974). The Measurement of Urban Travel Demand. Journal of Public Economics, 3, 303-328.

Examples


data("Interview", package = "mhurdle")
# independent double hurdle model
idhm <- mhurdle(vacations ~ car + size | linc + linc2 | 0, Interview,
              dist = "ln", h2 = TRUE, method = "bfgs")
rsq(idhm, type = "lratio")
rsq(idhm, type = "coefdet", r2pos = "rss")

Vuong test for non-nested models

Description

The Vuong test is suitable to discriminate between two non-nested models.

Usage

vuongtest(
  x,
  y,
  type = c("non-nested", "nested", "overlapping"),
  true_model = FALSE,
  variance = c("centered", "uncentered"),
  matrix = c("large", "reduced")
)

Arguments

`x`	a first fitted model of class `"mhurdle"`,
`y`	a second fitted model of class `"mhurdle"`,
`type`	the kind of test to be computed,
`true_model`	a boolean, `TRUE` if one of the models is assumed to be the true model,
`variance`	the variance is estimated using the `centered` or `uncentered` expression,
`matrix`	the W matrix can be computed using the general expression `large` or the reduced matrix `reduced` (only relevant for the nested case).
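
For intuition about the non-nested case and the `variance` choice, here is a minimal base R sketch of the textbook Vuong statistic for two Gaussian linear models. It is an illustration under stated assumptions (normal errors, maximum likelihood variance estimate, models fitted with `lm`), not the implementation used by `vuongtest`.

```r
# Pointwise Gaussian log-likelihood contributions of a fitted lm
pointwise_ll <- function(fit) {
  res      <- residuals(fit)
  sigma_ml <- sqrt(mean(res^2))      # maximum likelihood estimate of sigma
  dnorm(res, mean = 0, sd = sigma_ml, log = TRUE)
}

# Two non-nested models for the same response
fit_a <- lm(mpg ~ wt, data = mtcars)
fit_b <- lm(mpg ~ hp, data = mtcars)

d <- pointwise_ll(fit_a) - pointwise_ll(fit_b)  # log-likelihood ratio terms
n <- length(d)

# Two estimators of the variance of the pointwise log-likelihood ratio
omega2_centered   <- mean(d^2) - mean(d)^2
omega2_uncentered <- mean(d^2)

# Statistic, asymptotically standard normal when the models are equivalent
z       <- sqrt(n) * mean(d) / sqrt(omega2_centered)
p_value <- 2 * pnorm(-abs(z))
```

A large positive `z` favours the first model and a large negative `z` favours the second. The uncentered variance is never smaller than the centered one, so it gives a statistic that is no larger in absolute value.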

Value

an object of class "htest"

References

Vuong Q.H. (1989) Likelihood ratio tests for model selection and non-nested hypothesis, Econometrica, vol.57(2), pp.307-33.

Examples


data("Interview", package = "mhurdle")
# dependent double hurdle model
dhm <- mhurdle(vacations ~ car + size | linc + linc2 | 0, Interview,
              dist = "ln", h2 = TRUE, method = "bhhh", corr = TRUE)

# a double hurdle p-tobit model
ptm <- mhurdle(vacations ~ 0 | linc + linc2 | car + size, Interview,
              dist = "ln", h2 = TRUE, method = "bhhh", corr = TRUE)
vuongtest(dhm, ptm)
