Michael Minn - 23 March 2015
ZNDVIDIFF: Parcel pre-/post-foreclosure difference in NDVI deviation from tract median
ZASSESVALUE: Assessed value at time of foreclosure
ZHOMESQFT: Home square footage
ZHOMEAGE: Age of home at time of foreclosure
ZMEDHHINC: 2012 ACS median household income in census tract containing parcel
ZMEDIANAGE: 2012 ACS median age of residents in census tract containing parcel
ZPMEDPRE: Median parcel-level NDVI estimate one year prior to foreclosure
All variables were normalized to z-scores before processing.
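The normalization step can be sketched as follows. This is a minimal illustration on made-up values, not the author's script: the `raw` data frame, its numbers, and the `zscore` helper are hypothetical, and the use of the sample standard deviation with missing values ignored is an assumption.

```r
# Hypothetical toy data. Column names echo the variable list above,
# but the values are invented for illustration only.
raw <- data.frame(
  ASSESVALUE = c(85000, 120000, 64000, NA, 150000),
  HOMESQFT   = c(1100, 1850, 900, 1400, 2300)
)

# z-score: subtract the column mean and divide by the sample standard
# deviation, ignoring missing values (an assumed convention).
zscore <- function(x) (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)

z <- as.data.frame(lapply(raw, zscore))
names(z) <- paste0("Z", names(raw))   # e.g. ASSESVALUE -> ZASSESVALUE

# Each normalized column should have mean 0 and standard deviation 1.
print(round(colMeans(z, na.rm = TRUE), 10))
print(sapply(z, sd, na.rm = TRUE))
```

Base R's `scale()` applies the same centering and scaling column by column, so it could be used in place of the hand-written helper.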
The data is available for download as a zipped CSV HERE...
The following is a summary of a linear model with ZNDVIDIFF (the NDVI difference in deviation) as the dependent variable, fitted by least squares using QR factorization.
While most of the variables are flagged as significant, the model fit is extremely poor (R-squared = 0.014), as shown in the graph below. With 159,510 observations in the fit, even very small effects yield small p-values, so the significance flags say little about explanatory power.
Call:
lm(formula = ZNDVIDIFF ~ ZASSESVALUE + ZHOMESQFT + ZHOMEAGE +
ZMEDHHINC + ZMEDIANAGE + ZPMEDPRE, data = foreclosures)
Residuals:
     Min       1Q   Median       3Q      Max
-13.0331  -0.5687  -0.0142   0.5613  23.8114
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.035769   0.006667  -5.365 8.10e-08 ***
ZASSESVALUE  0.071671   0.013239   5.414 6.19e-08 ***
ZHOMESQFT    0.001032   0.031124   0.033    0.974
ZHOMEAGE    -0.223564   0.035440  -6.308 2.83e-10 ***
ZMEDHHINC   -0.026457   0.003363  -7.866 3.68e-15 ***
ZMEDIANAGE   0.003477   0.002924   1.189    0.234
ZPMEDPRE    -0.115643   0.002566 -45.063  < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.9926 on 159503 degrees of freedom
(83128 observations deleted due to missingness)
Multiple R-squared: 0.01424, Adjusted R-squared: 0.0142
F-statistic: 384 on 6 and 159503 DF, p-value: < 2.2e-16
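As a rough sketch of the two points above, the block below first fits a least squares model by QR factorization on simulated data and confirms it agrees with `lm()`, then recomputes the reported F-statistic from the reported R-squared and degrees of freedom. The simulated variables (`x1`, `x2`, `y`) and their coefficients are invented for illustration. Only the three constants in the second part come from the summary above.

```r
# Part 1: least squares via QR factorization on simulated (made-up) data.
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 0.5 - 0.2 * x1 + 0.1 * x2 + rnorm(n)

X <- cbind("(Intercept)" = 1, x1 = x1, x2 = x2)   # design matrix
beta_qr <- qr.coef(qr(X), y)        # minimize ||y - X b|| using the QR factors
beta_lm <- coef(lm(y ~ x1 + x2))    # lm() fits by QR by default

print(all.equal(unname(beta_qr), unname(beta_lm)))

# Part 2: recompute the reported F-statistic from the reported R-squared.
r2       <- 0.01424   # Multiple R-squared from the summary
df_model <- 6         # number of predictors
df_resid <- 159503    # residual degrees of freedom

f_stat <- (r2 / df_model) / ((1 - r2) / df_resid)
n_used <- df_resid + df_model + 1   # observations actually used in the fit

print(f_stat)
print(n_used)
```

The second part shows why a very low R-squared can still come with a large F-statistic: the ratio is multiplied by the residual degrees of freedom, which here exceed 159,000.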
Probit models were built with the sampled Google Earth observations of vegetation change (INCREASE and DECREASE) as the dependent variables.
In both cases, none of the provided variables was found to be significant at the 0.05 level, as shown in the summaries below. Only the intercept of the INCREASE model is flagged (p = 0.0457).
Call:
glm(formula = DECREASE ~ ZASSESVALUE + ZHOMESQFT + ZHOMEAGE +
ZMEDHHINC + ZMEDIANAGE + ZPMEDPRE, family = binomial(link = "probit"),
data = sample)
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.3964  -0.6537  -0.5333  -0.3612   2.3093
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.24507    0.63362  -0.387    0.699
ZASSESVALUE  0.08499    0.76696   0.111    0.912
ZHOMESQFT    1.71773    1.76614   0.973    0.331
ZHOMEAGE     5.30920    3.40561   1.559    0.119
ZMEDHHINC    0.02243    0.20007   0.112    0.911
ZMEDIANAGE  -0.04649    0.24648  -0.189    0.850
ZPMEDPRE     0.12789    0.10096   1.267    0.205
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 103.260 on 100 degrees of freedom
Residual deviance: 91.798 on 94 degrees of freedom
AIC: 105.8
Number of Fisher Scoring iterations: 5
===================================================
Call:
glm(formula = INCREASE ~ ZASSESVALUE + ZHOMESQFT + ZHOMEAGE +
ZMEDHHINC + ZMEDIANAGE + ZPMEDPRE, family = binomial(link = "probit"),
data = sample)
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.3303  -0.4810  -0.3094  -0.1364   2.5999
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.12159    1.06155  -1.999   0.0457 *
ZASSESVALUE  0.71895    0.93249   0.771   0.4407
ZHOMESQFT    0.82453    2.26813   0.364   0.7162
ZHOMEAGE    -4.63441    5.37619  -0.862   0.3887
ZMEDHHINC    0.05727    0.22800   0.251   0.8017
ZMEDIANAGE   0.06152    0.35986   0.171   0.8642
ZPMEDPRE    -0.20513    0.14984  -1.369   0.1710
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 65.226 on 100 degrees of freedom
Residual deviance: 51.898 on 94 degrees of freedom
AIC: 65.898
Number of Fisher Scoring iterations: 7
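The deviance figures in the two probit summaries can be cross-checked with a short calculation. The sketch below is not part of the original analysis: it takes the reported null and residual deviances, forms a likelihood-ratio statistic for each model against its intercept-only version, and recomputes the AIC. The `models` data frame and its column names are hypothetical, and the AIC identity assumes a binary (0/1) response.

```r
# Deviances and degrees of freedom as reported in the two summaries above.
models <- data.frame(
  model     = c("DECREASE", "INCREASE"),
  null_dev  = c(103.260, 65.226),
  resid_dev = c(91.798, 51.898),
  null_df   = c(100, 100),
  resid_df  = c(94, 94)
)

# Likelihood-ratio statistic: drop in deviance relative to the
# intercept-only model, with df equal to the number of predictors.
models$lr_stat <- models$null_dev - models$resid_dev
models$lr_df   <- models$null_df - models$resid_df

# Upper-tail chi-squared probability for each statistic.
models$p_value <- pchisq(models$lr_stat, df = models$lr_df, lower.tail = FALSE)

# For a binary response, AIC = residual deviance + 2 * (number of coefficients).
models$aic <- models$resid_dev + 2 * (models$lr_df + 1)

print(models)
```

The recomputed AIC values match the reported 105.8 and 65.898. The DECREASE statistic (11.462) falls below the 5% chi-squared critical value for 6 degrees of freedom (about 12.59) and the INCREASE statistic (13.328) falls slightly above it. With only 101 sampled observations this is weak evidence at best, and it does not change the finding that no individual predictor is significant.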