ggsomewhere template + sf2stat for writing sf layers from flat files. #10
-
A new protocol for creating reference data for the geo geom extensions, free of the {sf2stat} dependency! Hooray! 😭
usmapdata::us_map("states") %>% ...
usmapcrsinfo <- sf::st_crs(usmapdata::us_map("states")) # get crs
(I almost think you shouldn't need to do this step either and should be able to use a default_crs = T condition, but I just haven't figured that out yet.)
-
If we have functions like qstat() and prep_geo_reference() to work with ...
library(tidyverse)
## Prep reference data for Stat (NEW!)
qstat <- function(compute_group = Stat$compute_group, ...){
  ggproto("StatTemp", Stat, compute_group = compute_group, ...)
}
prep_geo_reference <- function(ref_data){
  ref_data |>
    ggplot2::StatSf$compute_panel(coord = ggplot2::CoordSf) |>
    ggplot2::StatSfCoordinates$compute_group(coord = ggplot2::CoordSf)
}
... we can see how we might get to dedicated layers like geom_nc_county and geom_nc_county_text pretty easily. Exciting. We probably want to do panel-wise compute here because of the join... At the end you can see geom_nc_county2, which uses group-wise compute for comparison.
# Flip the script... prepare compute (join) to happen in layer (NEW!)
compute_panel_region <- function(data, scales, ref_data){
  ref_data %>%
    prep_geo_reference() %>%
    inner_join(data)
}
geom_nc_county <- function(...){
  geom_sf(stat = qstat(compute_panel = compute_panel_region),
          ref_data = nc_ref, ...)
}
geom_nc_county_text <- function(...){
  geom_sf_text(stat = qstat(compute_panel = compute_panel_region,
                            default_aes = aes(label = after_stat(county_name))),
               ref_data = nc_ref, ...)
}
Then the new mapping interface:
nc |>
  ggplot() +
  aes(fips = FIPS) +
  geom_nc_county() +
  aes(fill = BIR74) +
  geom_nc_county_text(color = "white",
                      check_overlap = TRUE)
#> Warning in layer_sf(data = data, mapping = mapping, stat = stat, geom = GeomText, : Ignoring unknown parameters: `fun.geometry`
#> st_point_on_surface may not give correct results for longitude/latitude data
#> Joining with `by = join_by(fips, geometry)`
#> Warning in st_point_on_surface.sfc(sf::st_zm(x)): st_point_on_surface may not
#> give correct results for longitude/latitude data
#> Joining with `by = join_by(fips, geometry)`
Group-wise for comparison...
geom_nc_county2 <- function(...){
  geom_sf(stat = qstat(compute_panel_region), # w/ compute_group
          ref_data = nc_ref, ...)
}
last_plot() +
  geom_nc_county2()
#> Warning in st_point_on_surface.sfc(sf::st_zm(x)): st_point_on_surface may not
#> give correct results for longitude/latitude data
#> Joining with `by = join_by(fips, geometry)`
#> Warning in st_point_on_surface.sfc(sf::st_zm(x)): st_point_on_surface may not
#> give correct results for longitude/latitude data
#> Joining with `by = join_by(fips, geometry)`
#> Warning in st_point_on_surface.sfc(sf::st_zm(x)): st_point_on_surface may not
#> give correct results for longitude/latitude data
#> Joining with `by = join_by(fips, geometry)`
#> Warning in st_point_on_surface.sfc(sf::st_zm(x)): st_point_on_surface may not
#> give correct results for longitude/latitude data
#> Joining with `by = join_by(fips, geometry)`
#> Warning in st_point_on_surface.sfc(sf::st_zm(x)): st_point_on_surface may not
---redacted
Created on 2024-09-25 with reprex v2.1.0
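For anyone who wants to see what the qstat() helper does without the sf pieces, here is a minimal sketch using only ggplot2. The compute_group_means() function and the mtcars data are illustrative assumptions on my part, not part of the reprex above; the point is only that qstat() wraps a compute function into a throwaway Stat that any geom can use.

```r
library(ggplot2)

# Same helper as in the post: build a temporary Stat from a compute function
qstat <- function(compute_group = ggplot2::Stat$compute_group, ...) {
  ggplot2::ggproto("StatTemp", ggplot2::Stat, compute_group = compute_group, ...)
}

# Hypothetical compute: collapse each group to its mean x and mean y
compute_group_means <- function(data, scales) {
  data.frame(x = mean(data$x), y = mean(data$y))
}

p <- ggplot(mtcars) +
  aes(x = wt, y = mpg) +
  geom_point() +
  geom_point(stat = qstat(compute_group_means), size = 8)

# Inspect what the second layer computed: one row per group
means_layer <- layer_data(p, 2)
```

Because the function is passed as compute_group, it runs once per group; passing it as compute_panel instead (as the region layers do) runs it once per facet panel, which is why the join messages above appear fewer times in the panel-wise version.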
-
In sf2stat, now exporting the stat_region() generic stat, w/ proposed usage. Thanks @emilyriederer for the data example and data questions. And some geography data:
nc <- sf::st_read(system.file("shape/nc.shp", package="sf"))
# prep data to id cols and geometry column..
nc_ref <- nc |>
  select(county_name = NAME, fips = FIPS)
# then create a stat_* from stat_region(), (here we create stat_county, but your stat could be stat_state, stat_province, etc.)
stat_county <- function(...){stat_region(ref_data = nc_ref, required_aes = "county_name|fips", id_index = 1, ...)} # uses GeomSf as default
And (recommended) convenience layer functions. Follow this routine, swapping in your own ref data, required_aes, and the appropriate name in place of 'county' in the convenience function names, e.g.
GeomOutline <- ggproto("GeomOutline", GeomSf,
                       default_aes = aes(!!!modifyList(GeomSf$default_aes,
                                                       aes(fill = NA,
                                                           color = "black"))))
geom_county_sf <- function(...){stat_county(geom = GeomSf,...)}
geom_county <- geom_county_sf # convenience short name
geom_county_outline <- function(...){stat_county(geom = GeomOutline, ...)}
geom_county_label <- function(...){stat_county(geom = GeomLabel,...)}
geom_county_text <- function(...){stat_county(geom = GeomText, ...)}
stamp_county_sf <- function(...){geom_county_sf(stamp = T, ...)}
stamp_county <- stamp_county_sf
stamp_county_outline <- function(...){geom_county_outline(stamp = T, ...)}
stamp_county_label <- function(...){geom_county_label(stamp = T, ...)}
stamp_county_text <- function(...){geom_county_text(stamp = T, ...)}
Then get mapping! Given a North Carolina flat file w/ counties and characteristics:
# given some flatfile data of interest
read.csv("nc-midterms.csv") |> head()
#>   desc_county     n  cd_party  ind_vote
#> 1      ONSLOW 24406 0.2059283 0.3862985
#> 2     ROBESON 36367 0.5061306 0.4066599
#> 3    RANDOLPH 15867 0.1651505 0.4230793
#> 4       ANSON  9028 0.5674062 0.4267833
#> 5     HALIFAX 21875 0.5865712 0.4337829
#> 6       ROWAN 23667 0.2424922 0.4338108
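To make the flat-file idea concrete, here is a rough sketch of the join done by hand, outside any layer. It is not a claim about how stat_region() works internally. The three rows are copied from the output above; the toupper() step is my own assumption, added because the county names in the flat file are upper case while NAME in nc.shp is title case.

```r
library(sf)
library(dplyr)

# A few rows of the flat file shown above (ids and a characteristic, no geometry)
flat <- data.frame(
  desc_county = c("ONSLOW", "ROBESON", "RANDOLPH"),
  ind_vote = c(0.3862985, 0.4066599, 0.4230793)
)

# Reference data: id columns plus geometry, with names upper-cased to match
nc_ref <- sf::st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE) |>
  dplyr::select(county_name = NAME, fips = FIPS) |>
  dplyr::mutate(county_name = toupper(county_name))

# Reference data on the left so the result stays an sf object
joined <- dplyr::inner_join(nc_ref, flat, by = c("county_name" = "desc_county"))
```

Keeping the reference data on the left of the join is what preserves the sf class, so the result can go straight to geom_sf().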
-
I keep thinking that one of the reasons I'm working on the sf alternative workflow might be that joins just feel a little awkward to me. In a pipeline, the information about the data frame you want to join in (i.e. its variable names) isn't displayed. Just to explore that a little, see https://github.com/EvaMaeRey/visualjoin. sf2stat alleviates this (?) a bit because you look at your flat data, and then the plot poem requires an aes (i.e. state_name).
-
'ggsomewhere' is a template repo showing how to build ggplace using the new helper package sf2stat. sf2stat helps you prepare a reference dataset for use in creating a Stat proto to be used in a geom layer:
https://github.com/EvaMaeRey/ggsomewhere
https://github.com/EvaMaeRey/sf2stat
... you make a data frame with id columns, geometries, and x and y centers for annotation purposes.
Then your required aesthetic can be something like state_name or state_code. So your input data, ggplot(data = data), can be a flat file with ids and characteristics that you'd like to visualize (think choropleth). Geometry is joined to the data when you use the layer(GeomSf, StatNorthCarolina, ...) user-facing function.
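As a rough illustration of what such a reference dataset looks like, here is a sketch built with plain sf calls. This is not sf2stat's API, just the general shape I have in mind; the nc.shp file that ships with sf stands in for whatever geography you are packaging, and the x and y column names are my own choice.

```r
library(sf)

nc <- sf::st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# id columns plus geometry (sf keeps the geometry column when subsetting columns)
ref <- nc[, c("NAME", "FIPS")]

# A point that falls inside each polygon, to use as a label position.
# sf warns that this is approximate for longitude/latitude data.
centers <- suppressWarnings(sf::st_point_on_surface(sf::st_geometry(ref)))
xy <- sf::st_coordinates(centers)

# Store the label positions alongside the ids and geometry
ref$x <- xy[, "X"]
ref$y <- xy[, "Y"]
```

The result has one row per region with ids, a geometry column, and x/y centers, which is what a text or label layer needs in addition to the polygons.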
Template tested w/ Brazilian and Chilean data so far: https://github.com/EvaMaeRey/ggbrasil2 https://github.com/EvaMaeRey/ggchile2
I built ggbrasil and ggchile previously. Round 2 w/ the template and sf2stat was a better experience - I actually used some natively flat files for Chile and explored demographics: the percent of the population aged 0 to 5 in each of the Chilean regions!
chilemapas::censo_2017_zonas |>
  count(region = stringr::str_extract(geocodigo, ".."),
        edad,
        wt = poblacion) %>%
  group_by(region) %>%
  mutate(prop = 100*n/sum(n)) ->
chile_regiones_edades
chile_regiones_edades |>
  pivot_wider(names_from = edad, values_from = prop, id_cols = region) |>
  janitor::clean_names() |>
  ggplot() +
  aes(region_codigo = region) +
  stat_region(linewidth = .02) +
  aes(fill = x0_a_5) +
  scale_fill_viridis_c()
A lot of the time in mapping, Chile's Isla de Pascua is dropped or moved in, and the North and South parts of the country are split. None of this is attempted here.
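If you don't have chilemapas installed, the data prep above can be tried on a toy table. The rows below are made up for illustration (they are not census values), and substr() stands in for the stringr::str_extract(geocodigo, "..") call, since both take the first two characters of the code.

```r
library(dplyr)
library(tidyr)

# Hypothetical zone-level rows: code, age band, population
toy <- data.frame(
  geocodigo = c("01101011001", "01101011002", "02101011001"),
  edad = c("0 a 5", "6 a 14", "0 a 5"),
  poblacion = c(10, 30, 20)
)

# Weighted count per region and age band, then share within region
toy_props <- toy |>
  count(region = substr(geocodigo, 1, 2), edad, wt = poblacion) |>
  group_by(region) |>
  mutate(prop = 100 * n / sum(n)) |>
  ungroup()

# One row per region, one column per age band, ready to map to fill
toy_wide <- toy_props |>
  pivot_wider(names_from = edad, values_from = prop, id_cols = region)
```

After this step there is one row per region, which is the shape the region layer expects: an id column to match on and a characteristic to map to fill.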