Performs a logistic (binomial) or auto-logistic (spatially lagged binomial) regression using maximum likelihood or penalized maximum likelihood estimation.
It should be noted that the auto-logistic model (Besag 1972) is intended for exploratory analysis of spatial effects. Auto-logistic are know to underestimate the effect of environmental variables and tend to be unreliable (Dormann 2007). Wij matrix options under style argument - B is the basic binary coding, W is row standardized (sums over all links to n), C is globally standardized (sums over all links to n), U is equal to C divided by the number of neighbours (sums over all links to unity) and S is variance-stabilizing. Spatially lagged y defined as: W(y)ij=sumj_(Wij yj)/ sumj_(Wij) where; Wij=1/Euclidean(i,j) If the object passed to the function is an sp class there is no need to call the data slot directly via "object@data", just pass the object name.
logistic.regression( ldata, y, x, penalty = TRUE, autologistic = FALSE, coords = NULL, bw = NULL, type = "inverse", style = "W", longlat = FALSE, ... )
ldata | data.frame object containing variables |
---|---|
y | Dependent variable (y) in ldata |
x | Independent variable(s) (x) in ldata |
penalty | Apply regression penalty (TRUE/FALSE) |
autologistic | Add auto-logistic term (TRUE/FALSE) |
coords | Geographic coordinates for auto-logistic model matrix or sp object. |
bw | Distance bandwidth to calculate spatial lags (if empty neighbors result, need to increase bandwidth). If not provided it will be calculated automatically based on the minimum distance that includes at least one neighbor. |
type | Neighbor weighting scheme (see autocov_dist) |
style | Type of neighbor matrix (Wij), default is mean of neighbors |
longlat | Are coordinates (coords) in geographic, lat/long (TRUE/FALSE) |
... | Additional arguments passed to lrm |
A list class object with the following components:
model - lrm model object (rms class)
bandwidth - If AutoCov = TRUE returns the distance bandwidth used for the auto-covariance function
diagTable - data.frame of regression diagnostics
coefTable - data.frame of regression coefficients
Residuals - data.frame of residuals and standardized residuals
AutoCov - If an auto-logistic model, AutoCov represents lagged auto-covariance term
Besag, J.E., (1972) Nearest-neighbour systems and the auto-logistic model for binary data. Journal of the Royal Statistical Society, Series B Methodological 34:75-83
Dormann, C.F., (2007) Assessing the validity of autologistic regression. Ecological Modelling 207:234-242
Le Cessie, S., Van Houwelingen, J.C., (1992) Ridge estimators in logistic regression. Applied Statistics 41:191-201
Shao, J., (1993) Linear model selection by cross-validation. JASA 88:486-494
Jeffrey S. Evans jeffrey_evans@tnc.org
#>#>#>#>#> #>#>#> #>#>#>#> #>#>#> #>#>#> #>#>#> #>#>#> #>#>#> #>#> Warning: undefined subclass "numericVector" of class "Mnumeric"; definition not updateddata(meuse) coordinates(meuse) <- ~x+y meuse@data <- data.frame(DepVar=rbinom(dim(meuse)[1], 1, 0.5), meuse@data) #### Logistic model lmodel <- logistic.regression(meuse, y='DepVar', x=c('dist','cadmium','copper')) lmodel$model#> Logistic Regression Model #> #> rms::lrm(formula = form, data = ldata, x = TRUE, y = TRUE, penalty = bf$penalty) #> #> #> Penalty factors #> #> simple nonlinear interaction nonlinear.interaction #> 1 1 1 1 #> #> Model Likelihood Discrimination Rank Discrim. #> Ratio Test Indexes Indexes #> Obs 155 LR chi2 7.39 R2 0.061 C 0.616 #> 0 88 d.f. 2.623 g 0.483 Dxy 0.232 #> 1 67 Pr(> chi2)0.0447 gr 1.621 gamma 0.232 #> max |deriv| 2e-09 Penalty 0.24 gp 0.110 tau-a 0.115 #> Brier 0.235 #> #> Coef S.E. Wald Z Pr(>|Z|) Penalty Scale #> Intercept 0.7961 0.5882 1.35 0.1759 0.0000 #> dist -1.0084 1.0439 -0.97 0.3340 0.1977 #> cadmium -0.0440 0.1173 -0.38 0.7076 3.5237 #> copper -0.0175 0.0171 -1.03 0.3037 23.6804 #>lmodel$diagTable#> Names Value #> 1 Obs 1.550000e+02 #> 2 Max Deriv 2.107191e-09 #> 3 Model L.R. 7.393192e+00 #> 4 d.f. 2.623476e+00 #> 5 P 4.466970e-02 #> 6 C 6.159261e-01 #> 7 Dxy 2.318521e-01 #> 8 Gamma 2.320489e-01 #> 9 Tau-a 1.145371e-01 #> 10 R2 6.054035e-02 #> 11 Brier 2.346796e-01 #> 12 g 4.832799e-01 #> 13 gr 1.621384e+00 #> 14 gp 1.098474e-01 #> 15 PEN 1.000000e+00 #> 16 AIC 2.146239e+00lmodel$coefTable#> Variable Coef StdError Wald Prob #> 1 Intercept 0.79612771 0.58816259 1.3535844 0.1758690 #> 2 dist -1.00843098 1.04392027 -0.9660038 0.3340423 #> 3 cadmium -0.04401098 0.11731756 -0.3751440 0.7075534 #> 4 copper -0.01753671 0.01705085 -1.0284945 0.3037173#### Logistic model with factorial variable lmodel <- logistic.regression(meuse, y='DepVar', x=c('dist','cadmium','copper', 'soil')) lmodel$model#> Logistic Regression Model #> #> rms::lrm(formula = form, data = ldata, x = TRUE, y = TRUE, penalty = bf$penalty) #> #> #> Penalty factors #> #> simple nonlinear interaction nonlinear.interaction #> 1 1 1 1 #> #> Model Likelihood Discrimination Rank Discrim. #> Ratio Test Indexes Indexes #> Obs 155 LR chi2 7.65 R2 0.062 C 0.612 #> 0 88 d.f. 4.278 g 0.495 Dxy 0.224 #> 1 67 Pr(> chi2)0.1231 gr 1.640 gamma 0.225 #> max |deriv| 2e-09 Penalty 0.27 gp 0.113 tau-a 0.111 #> Brier 0.234 #> #> Coef S.E. Wald Z Pr(>|Z|) Penalty Scale #> Intercept 0.8395 0.5969 1.41 0.1596 0.0000 #> dist -1.0111 1.2272 -0.82 0.4100 0.1977 #> cadmium -0.0443 0.1176 -0.38 0.7064 3.5237 #> copper -0.0179 0.0171 -1.04 0.2965 23.6804 #> soil=2 -0.1259 0.4050 -0.31 0.7559 0.8165 #> soil=3 0.1227 0.6643 0.18 0.8535 0.8165 #>lmodel$diagTable#> Names Value #> 1 Obs 1.550000e+02 #> 2 Max Deriv 1.865718e-09 #> 3 Model L.R. 7.652394e+00 #> 4 d.f. 4.278462e+00 #> 5 P 1.230936e-01 #> 6 C 6.121947e-01 #> 7 Dxy 2.243894e-01 #> 8 Gamma 2.245036e-01 #> 9 Tau-a 1.108504e-01 #> 10 R2 6.236510e-02 #> 11 Brier 2.342899e-01 #> 12 g 4.945176e-01 #> 13 gr 1.639707e+00 #> 14 gp 1.126479e-01 #> 15 PEN 1.000000e+00 #> 16 AIC -9.045299e-01lmodel$coefTable#> Variable Coef StdError Wald Prob #> 1 Intercept 0.83948551 0.59687893 1.4064586 0.1595880 #> 2 dist -1.01106194 1.22716994 -0.8238973 0.4099979 #> 3 cadmium -0.04428282 0.11755836 -0.3766880 0.7064055 #> 4 copper -0.01788235 0.01712859 -1.0440063 0.2964825 #> 5 soil=2 -0.12589581 0.40497832 -0.3108705 0.7558991 #> 6 soil=3 0.12268901 0.66429948 0.1846893 0.8534727### Auto-logistic model using 'autocov_dist' in 'spdep' package lmodel <- logistic.regression(meuse, y='DepVar', x=c('dist','cadmium','copper'), autologistic=TRUE, coords=coordinates(meuse), bw=5000) lmodel$model#> Logistic Regression Model #> #> rms::lrm(formula = form, data = ldata, x = TRUE, y = TRUE, penalty = bf$penalty) #> #> #> Penalty factors #> #> simple nonlinear interaction nonlinear.interaction #> 1 1 1 1 #> #> Model Likelihood Discrimination Rank Discrim. #> Ratio Test Indexes Indexes #> Obs 155 LR chi2 10.49 R2 0.085 C 0.628 #> 0 88 d.f. 3.577 g 0.602 Dxy 0.256 #> 1 67 Pr(> chi2)0.0240 gr 1.825 gamma 0.256 #> max |deriv| 1e-08 Penalty 0.41 gp 0.136 tau-a 0.126 #> Brier 0.230 #> #> Coef S.E. Wald Z Pr(>|Z|) Penalty Scale #> Intercept 3.7502 1.8505 2.03 0.0427 0.0000 #> dist -1.0663 1.0486 -1.02 0.3092 0.1977 #> cadmium -0.0286 0.1195 -0.24 0.8112 3.5237 #> copper -0.0220 0.0176 -1.25 0.2122 23.6804 #> AutoCov -6.5606 3.8810 -1.69 0.0909 0.0434 #>lmodel$diagTable#> Names Value #> 1 Obs 1.550000e+02 #> 2 Max Deriv 1.480466e-08 #> 3 Model L.R. 1.049036e+01 #> 4 d.f. 3.577436e+00 #> 5 P 2.403887e-02 #> 6 C 6.279681e-01 #> 7 Dxy 2.559362e-01 #> 8 Gamma 2.560665e-01 #> 9 Tau-a 1.264349e-01 #> 10 R2 8.450814e-02 #> 11 Brier 2.303498e-01 #> 12 g 6.015511e-01 #> 13 gr 1.824947e+00 #> 14 gp 1.356461e-01 #> 15 PEN 1.000000e+00 #> 16 AIC 3.335485e+00lmodel$coefTable#> Variable Coef StdError Wald Prob #> 1 Intercept 3.75019690 1.85052025 2.0265636 0.04270707 #> 2 dist -1.06628332 1.04859510 -1.0168685 0.30921599 #> 3 cadmium -0.02855437 0.11952431 -0.2389001 0.81118303 #> 4 copper -0.02196396 0.01760738 -1.2474288 0.21224033 #> 5 AutoCov -6.56060992 3.88103656 -1.6904272 0.09094625est <- predict(lmodel$model, type='fitted.ind') #### Add residuals, standardized residuals and estimated probabilities VarNames <- rownames(lmodel$model$var)[-1] meuse@data$AutoCov <- lmodel$AutoCov meuse@data <- data.frame(meuse@data, Residual=lmodel$Residuals[,1], StdResid=lmodel$Residuals[,2], Probs=predict(lmodel$model, meuse@data[,VarNames],type='fitted') ) #### Plot fit and probabilities resid(lmodel$model, "partial", pl="loess")#> Sum of squared errors Expected value|H0 SD #> 35.7042169 35.7891682 0.1244555 #> Z P #> -0.6825841 0.4948697# Approx. leave-out linear predictors lp1 <- resid(lmodel$model, "lp1") # Approx leave-out-1 deviance -2 * sum(meuse@data$DepVar * lp1 + log(1-plogis(lp1)))#> [1] 211.513