Michael Minn - 23 March 2015
ZNDVIDIFF: Parcel pre-/post-foreclosure difference in NDVI deviation from tract median
ZASSESVALUE: Assessed value at time of foreclosure
ZHOMESQFT: Home square footage
ZHOMEAGE: Age of home at time of foreclosure
ZMEDHHINC: 2012 ACS median household income in census tract containing parcel
ZMEDIANAGE: 2012 ACS median age of residents in census tract containing parcel
ZPMEDPRE: Median parcel-level NDVI estimate one year prior to foreclosure
All variables were normalized to z-scores before processing.
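The normalization step can be sketched in R as follows. This is a minimal illustration, not the author's script: the raw column names and values are hypothetical, and it assumes the sample standard deviation was used, which the text does not state.

```r
# Hypothetical raw values; names and numbers are illustrative only.
raw <- data.frame(
  HOMESQFT = c(1200, 1850, 2400, 950, 3100),
  HOMEAGE  = c(12, 45, 8, 60, NA)
)

# z-score: subtract the column mean and divide by the sample standard
# deviation. Missing values are ignored in the statistics and stay NA.
zscore <- function(x) (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)

z <- as.data.frame(lapply(raw, zscore))
names(z) <- paste0("Z", names(raw))   # mirror the Z prefix used above
print(round(z, 3))
```

Base R's scale() performs the same centering and scaling and could be used instead of the hand-written helper.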
The data is available for download as a zipped CSV HERE...
The following is a summary of a linear model with ZNDVIDIFF (NDVI difference in deviation) as the dependent variable, fitted by least squares using QR factorization.
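While most of the variables are flagged as significant, the model fit is extremely poor (R-squared = 0.014), as shown in the graph below. With roughly 160,000 observations, even very small effects produce small p-values, so significance here does not imply explanatory power.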
Call:
lm(formula = ZNDVIDIFF ~ ZASSESVALUE + ZHOMESQFT + ZHOMEAGE +
    ZMEDHHINC + ZMEDIANAGE + ZPMEDPRE, data = foreclosures)

Residuals:
     Min       1Q   Median       3Q      Max
-13.0331  -0.5687  -0.0142   0.5613  23.8114

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.035769   0.006667  -5.365 8.10e-08 ***
ZASSESVALUE  0.071671   0.013239   5.414 6.19e-08 ***
ZHOMESQFT    0.001032   0.031124   0.033    0.974
ZHOMEAGE    -0.223564   0.035440  -6.308 2.83e-10 ***
ZMEDHHINC   -0.026457   0.003363  -7.866 3.68e-15 ***
ZMEDIANAGE   0.003477   0.002924   1.189    0.234
ZPMEDPRE    -0.115643   0.002566 -45.063  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.9926 on 159503 degrees of freedom
  (83128 observations deleted due to missingness)
Multiple R-squared:  0.01424,  Adjusted R-squared:  0.0142
F-statistic: 384 on 6 and 159503 DF,  p-value: < 2.2e-16
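The combination of highly significant coefficients and a very low R-squared can be reproduced on synthetic data. The sketch below does not use the foreclosure data: the variables x and y, the sample size, and the effect size of -0.1 are all assumptions chosen only to mimic the pattern in the summary above.

```r
set.seed(1)                  # reproducible synthetic data
n <- 100000                  # large sample, same order of magnitude as above
x <- rnorm(n)                # stand-in predictor, z-scored by construction
y <- -0.1 * x + rnorm(n)     # weak true effect buried in noise (assumed)

# lm() solves the least-squares problem through a QR decomposition
# by default (method = "qr").
fit <- lm(y ~ x)
s <- summary(fit)

p_value   <- s$coefficients["x", "Pr(>|t|)"]
r_squared <- s$r.squared
cat("p-value for x:", format.pval(p_value), "\n")
cat("R-squared:    ", round(r_squared, 4), "\n")
```

The slope is estimated very precisely because the standard error shrinks with the square root of the sample size, yet the predictor still accounts for only about one percent of the variance in y.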
Probit models were built using the sampled Google Earth observations of vegetation change (INCREASE and DECREASE) as the dependent variables.
In both models, none of the provided variables was found to be significant, as shown in the summaries below. Only the intercept of the INCREASE model reaches the 0.05 level.
Call:
glm(formula = DECREASE ~ ZASSESVALUE + ZHOMESQFT + ZHOMEAGE +
    ZMEDHHINC + ZMEDIANAGE + ZPMEDPRE, family = binomial(link = "probit"),
    data = sample)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.3964  -0.6537  -0.5333  -0.3612   2.3093

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.24507    0.63362  -0.387    0.699
ZASSESVALUE  0.08499    0.76696   0.111    0.912
ZHOMESQFT    1.71773    1.76614   0.973    0.331
ZHOMEAGE     5.30920    3.40561   1.559    0.119
ZMEDHHINC    0.02243    0.20007   0.112    0.911
ZMEDIANAGE  -0.04649    0.24648  -0.189    0.850
ZPMEDPRE     0.12789    0.10096   1.267    0.205

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 103.260  on 100  degrees of freedom
Residual deviance:  91.798  on  94  degrees of freedom
AIC: 105.8

Number of Fisher Scoring iterations: 5

===================================================

Call:
glm(formula = INCREASE ~ ZASSESVALUE + ZHOMESQFT + ZHOMEAGE +
    ZMEDHHINC + ZMEDIANAGE + ZPMEDPRE, family = binomial(link = "probit"),
    data = sample)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.3303  -0.4810  -0.3094  -0.1364   2.5999

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.12159    1.06155  -1.999   0.0457 *
ZASSESVALUE  0.71895    0.93249   0.771   0.4407
ZHOMESQFT    0.82453    2.26813   0.364   0.7162
ZHOMEAGE    -4.63441    5.37619  -0.862   0.3887
ZMEDHHINC    0.05727    0.22800   0.251   0.8017
ZMEDIANAGE   0.06152    0.35986   0.171   0.8642
ZPMEDPRE    -0.20513    0.14984  -1.369   0.1710
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 65.226  on 100  degrees of freedom
Residual deviance: 51.898  on  94  degrees of freedom
AIC: 65.898

Number of Fisher Scoring iterations: 7
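For readers who want to experiment with the same model form, a probit fit can be sketched on simulated data. Everything below is an assumption made for illustration: the data frame sim, the simulated predictor values, and the coefficients -0.8 and 0.3 are hypothetical, and only two of the six predictors are included to keep the example short. The sample size of 101 is chosen to match the 100 null degrees of freedom reported above.

```r
set.seed(2)
n <- 101                                   # gives 100 null degrees of freedom
sim <- data.frame(ZHOMEAGE = rnorm(n),     # simulated, not the real sample
                  ZPMEDPRE = rnorm(n))

# Probit model: P(DECREASE = 1) = pnorm(linear predictor)
eta <- -0.8 + 0.3 * sim$ZPMEDPRE           # hypothetical coefficients
sim$DECREASE <- rbinom(n, size = 1, prob = pnorm(eta))

fit <- glm(DECREASE ~ ZHOMEAGE + ZPMEDPRE,
           family = binomial(link = "probit"), data = sim)
print(summary(fit)$coefficients)

# Fitted probabilities of a vegetation decrease for the first few rows
p_hat <- predict(fit, type = "response")
print(head(round(p_hat, 3)))
```

With a sample this small the standard errors are wide, which is consistent with the lack of significant predictors in the summaries above.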