Code
library(tidyverse)
library(sf)
library(stars)
library(terra)
library(ggspatial)
library(plotly)
library(leaflet)
library(htmltools)
November 26, 2023
In February 2021, the state of Texas suffered a major energy crisis as a result of three severe winter storms and left many people without power. Historically, power outages may disproportionately affect marginalized groups and lower-income areas can be of lower priority in recovery. For more background: engineering and political perspectives.
In this post, I am interested in estimating the number of homes in Houston that lost power as a result of the first two storm and investigating if there is a correlation between loss of power and socioeconomic status.
To estimate homes that lost power I used remotely-sensed night lights data from the Visible Infrared Imaging Radiometer Suite (VIIRS) onboard the Suomi satellite (specifically VNP46A1). To look at specific homes and factor in roads, I also used data from OpenStreetMap. And finally, socioeconomic data was obtained from the US Census Bureau.
Since these data files are quite large they have been omitted from the repository. Data can be downloaded as a .zip file here.
Since these night light files are quite large, I am using stars::read_stars()
to work with the raster data. Here I have two tiles for 2021-02-07 and two tiles for 2021-02-16.
# read in night light files using read_stars()
feb7_1 <- read_stars("data/VNP46A1/VNP46A1.A2021038.h08v05.001.2021039064328.tif")
feb7_2 <- read_stars("data/VNP46A1/VNP46A1.A2021038.h08v06.001.2021039064329.tif")
feb16_1 <- read_stars("data/VNP46A1/VNP46A1.A2021047.h08v05.001.2021048091106.tif")
feb16_2 <- read_stars("data/VNP46A1/VNP46A1.A2021047.h08v06.001.2021048091105.tif")
The tile pairs now need to be combined to create a full image of out area of interest.
Now that we have the full area we can begin finding the difference light intensity.
To find if there is a different in the night lights from before and after the storm, I found the different between the light intensity, assuming that any location that experienced a drop of more than 200 nW cm-2sr-1 experienced a power outage.
# finding the difference in night lights
night_diff <- (feb7 - feb16)
#plot(night_diff, main = "Difference in night light")
# create a mask, sectioning in to 200 to infinity; the rest become NA
night_mask <- cut(night_diff, c(200, Inf), labels = "outage")
# the night mask should have one level "outage" and NA for the rest; check using:
# unique(night_mask$VNP46A1.A2021038.h08v05.001.2021039064328.tif)
# vectorizing night_mask using st_as_sf
night_mask_vec <- st_as_sf(night_mask) %>%
st_make_valid() # fix invalid geometries
# checking that it is now sf data.frame
# class(night_mask_vec)
In this project I am wanting to specifically look at Houston, and right now, the map covers more area than just Houston. To focus in on Houston, I first need to define the coordinates of the city boundary and then turn them into a polygon and simple features collection.
# creating Houston border with the given coordinates using st_polygon()
houston_border <- st_polygon(list(rbind(c(-96.5,29), c(-96.5,30.5), c(-94.5, 30.5), c(-94.5,29), c(-96.5,29)))) # rbind combines objects
# convert to sfc and define crs
houston_border_sf <- st_sfc(houston_border, crs = 'EPSG:4326')
Now that the boarder is defined, I can use it to crop the blackout mask. Additionally, I need to make sure that both of the objects have the same coordinate reference system (CRS), and in this case, I want to use EPSG:3083 (NAD83 / Texas Centric Albers Equal Area).
Highways often make up a large portion of observable night light and this is not something I want to capture. So, using the roads data, I will exclude areas with roads, as well as include a 200m buffer around the roads.
This data includes more than just highways, so it is important to select just the highways (motorways).
## highway data ##
# define sql query
query_highways <- "SELECT * FROM gis_osm_roads_free_1 WHERE fclass='motorway'"
# load highway data using the query and reproject to EPSG:3083
highways <- st_read("data/gis_osm_roads_free_1.gpkg", query = query_highways) %>%
st_transform(highways, crs = 'EPSG:3083')
# highway buffer setting distance to 200m
highway_buffer <- st_buffer(x = highways, dist = 200) %>%
st_union() # dissolve undissolved buffers
Using the newly prepared highway data, I can now exclude areas within 200m of a highway.
# combine the geometries and exclude those within 200m of a highway
mask_houston_highway <- st_difference(outage_mask, highway_buffer)
## check that areas have been excluded ##
# nrow(mask_houston_highway) > nrow(outage_mask) # output should be FALSE
# plot(mask_houston_highway) # check out what it looks like
There were 7247 areas that experienced blackouts that are further than 200m from a highway.
Now that we have accounted for the highways it’s time to find the number of homes that experienced a blackout. Similarly to the road data, the data set that contains information on homes also contains other building types, so it’s important to select only residential buildings (and to make sure the CRS is EPSG:3083!).
## houses data ##
# define sql query
query_houses <- "SELECT * FROM gis_osm_buildings_a_free_1 WHERE (type IS NULL AND name IS NULL) OR type in ('residential', 'apartments', 'house', 'static_caravan', 'detached')"
# read in the houses data using query
houses <- st_read("data/gis_osm_buildings_a_free_1.gpkg", query = query_houses)
# transform crs
houses <- st_transform(houses, crs = 'EPSG:3083')
#plot(houses['type'])
# filter houses data using the blackout mask
outage_houses <- houses[mask_houston_highway, drop = FALSE]
# check and see that filtered data has fewer houses than the original
# nrow(houses) > nrow(outage_houses)
There were 157411 homes in Houston affected by the power outage.
The last data that needs to be prepared is the income data. From the geodatabase, we want the census tract geometries and the median income.
# Texas geometries data and transform crs
census_geom <- st_read("data/ACS_2019_5YR_TRACT_48_TEXAS.gdb", layer = "ACS_2019_5YR_TRACT_48_TEXAS") %>%
st_transform(census_geom, crs = 'EPSG:3083')
#st_crs(census_geom) # check the crs is correct
# income data
income <- st_read("data/ACS_2019_5YR_TRACT_48_TEXAS.gdb", layer = "X19_INCOME")
# select variables; rename median income and GEOID to match census_geom
income_median <- income %>%
select(GEOID, B19013e1) %>%
rename(median_income = B19013e1, GEOID_Data = GEOID)
With all of this data, I can now determine which census tracts experiences power outages and look at them in relation to median income for the census tract.
# joining census geometries and median income data
census_data <- left_join(census_geom,
income_median, by = "GEOID_Data")
# filtering the census data based on buildings to add a column indicating that these census tracts were part of a blackout
census_outage <- st_filter(census_data, outage_houses) %>%
mutate(blackout = 'had a blackout')
#head(census_outage)
## preparing data for mapping ##
houston_border_sf <- st_transform(houston_border_sf, crs = "EPSG:3083")
#creating a full df of Houston geometries
census_houston <- census_data[houston_border_sf, ,]
census_houston2 <- st_transform(census_houston, "+init=epsg:4326")
# making centroids to indicate which tracts had a blackout
census_outage_map <- census_outage %>%
st_centroid() %>%
st_transform("+init=epsg:4326")
# color paletted
pal <- colorBin("viridis", domain = census_houston2$median_income, reverse = TRUE) # define color palette
pal2 <- colorFactor("black", domain = census_outage_map$blackout, reverse = TRUE) # define color palette
# map labels
labels <- sprintf("$%g",
round(census_houston2$median_income, 2)) %>%
lapply(htmltools::HTML)
# mapping
leaflet() %>%
addPolygons(data = census_houston2,
fillColor = ~pal(census_houston2$median_income), # color region by median income
weight = 1,
opacity = 1,
color = "black",
dashArray = "3",
fillOpacity = 1,
# popup = p_popup,
highlightOptions = highlightOptions( # when you hover over there will be pop up text
weight = 5,
color = "#666",
dashArray = "",
fillOpacity = 1,
bringToFront = TRUE),
label = labels, # label by previously defined
labelOptions = labelOptions(
style = list("font-weight" = "normal", padding = "3px 8px"),
textsize = "15px",
direction = "auto")) %>%
addLegend(pal = pal, # legend based on color palette
values = census_houston2$median_income,
opacity = 0.7,
labFormat = labelFormat(prefix = "$"), # add units
title = paste("Median Income ($)"), # updates with species input
position = "bottomright") %>%
addCircleMarkers(data = census_outage_map, # add centroids to map
color = "black",
radius = 0.25,
opacity = 1) %>%
addLegend(pal = pal2, # legend for centroids
values = census_outage_map$blackout,
opacity = 1,
position = "bottomright") %>%
addScaleBar() %>%
addTiles() # basemap