
I am trying to merge two data frames (main and sub). I want the 'Variable' column from 'sub' added to 'main' based on distance; ideally, each 'main' row/site should get the value from whichever 'sub' row/site is closest to it.

library(sf)

a <- structure(list(`Site#` = c("Site1", "Site2", "Site3", "Site4", "Site5", "Site6"), Longitude = c(-94.609, -98.1391, -99.033, -98.49, -96.4309, -95.99), `Latitude` = c(38.922, 37.486111, 37.811, 38.364, 39.4402, 39.901)), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))

main <- st_as_sf(a, coords = c("Longitude", "Latitude"), crs = 4326)

b <- structure(list(Longitude = c(-98.49567, -96.22451, -98.49567, -98.941391, -95.91411, -99.031113), `Latitude` = c(38.31264,39.97692, 38.31264, 37.486111, 39.92143, 37.814171), Variable = c(400, 50, 100, 201, 99, 700)), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))

sub <- st_as_sf(b, coords = c("Longitude", "Latitude"), crs = 4326)

c <- st_intersection(main,sub)
c <- st_is_within_distance(main,sub,dist=0.001)

I believed st_intersection() was what I wanted, but the points do not overlap exactly, so I need a one-to-one match based on distance instead. Does anyone know what would give me the result I am looking for?

2 Answers


st_join() allows for joining in a single step:

st_join(main, sub, join = st_nearest_feature, left = TRUE)


#> although coordinates are longitude/latitude, st_nearest_feature assumes that they are planar
#> Simple feature collection with 6 features and 2 fields
#> geometry type:  POINT
#> dimension:      XY
#> bbox:           xmin: -99.033 ymin: 37.48611 xmax: -94.609 ymax: 39.901
#> epsg (SRID):    4326
#> proj4string:    +proj=longlat +datum=WGS84 +no_defs
#> # A tibble: 6 x 3
#>   `Site#`            geometry Variable
#>   <chr>           <POINT [°]>    <dbl>
#> 1 Site1      (-94.609 38.922)       99
#> 2 Site2   (-98.1391 37.48611)      201
#> 3 Site3      (-99.033 37.811)      700
#> 4 Site4       (-98.49 38.364)      400
#> 5 Site5    (-96.4309 39.4402)       50
#> 6 Site6       (-95.99 39.901)       99

Created on 2020-01-19 by the reprex package (v0.3.0)
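
Since the question asks for a match "based on distance", it may also be useful to record how far away each matched 'sub' point actually is. The following is only a sketch, not part of the original answer: it rebuilds the question's data (with the site column renamed to `Site` for convenience), and the column name `dist_to_sub` and the 50 km cutoff are made up for illustration. For longitude/latitude data, st_distance() should return metres.

```r
library(sf)

# Rebuild the question's data so the sketch is self-contained
main <- st_as_sf(
  data.frame(
    Site = paste0("Site", 1:6),
    Longitude = c(-94.609, -98.1391, -99.033, -98.49, -96.4309, -95.99),
    Latitude = c(38.922, 37.486111, 37.811, 38.364, 39.4402, 39.901)
  ),
  coords = c("Longitude", "Latitude"), crs = 4326
)
sub <- st_as_sf(
  data.frame(
    Longitude = c(-98.49567, -96.22451, -98.49567, -98.941391, -95.91411, -99.031113),
    Latitude = c(38.31264, 39.97692, 38.31264, 37.486111, 39.92143, 37.814171),
    Variable = c(400, 50, 100, 201, 99, 700)
  ),
  coords = c("Longitude", "Latitude"), crs = 4326
)

# Nearest-feature join, as in the answer above
joined <- st_join(main, sub, join = st_nearest_feature, left = TRUE)

# Row index of the nearest sub feature for each main feature
nearest <- st_nearest_feature(main, sub)

# Element-wise distance between each main point and its matched sub point
joined$dist_to_sub <- st_distance(main, sub[nearest, ], by_element = TRUE)

# Hypothetical cutoff: discard matches that are more than 50 km away
too_far <- as.numeric(joined$dist_to_sub) > 50000
joined$Variable[too_far] <- NA
```

With a cutoff like this, a site such as Site1, whose closest 'sub' point is well over 100 km away, ends up with NA rather than a value borrowed from a distant location.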


1 Comment

This is more concise so I will give this the answer. Thanks!

This is what I tried. It seems that you need st_nearest_feature(), which returns, for each feature in main, the row index of the nearest feature in sub. Once you have the indices, add them to main as a new column. Then add row numbers (the matching index) to b, and finally join the two on that index column.

library(dplyr)
library(sf)

# Which feature in y is closest to each feature in x?
# You get row index
st_nearest_feature(x = main, y = sub)

# Add the index number to main.
main <- mutate(main, ind = st_nearest_feature(x = main, y = sub))

# Add row numbers (index) to b
b <- mutate(b, ind = row_number())

left_join(main, b, by = "ind")

#  `Site#`            geometry   ind Longitude Latitude Variable
#  <chr>           <POINT [°]> <int>     <dbl>    <dbl>    <dbl>
#1 Site1      (-94.609 38.922)     5     -95.9     39.9       99
#2 Site2   (-98.1391 37.48611)     4     -98.9     37.5      201
#3 Site3      (-99.033 37.811)     6     -99.0     37.8      700
#4 Site4       (-98.49 38.364)     1     -98.5     38.3      400
#5 Site5    (-96.4309 39.4402)     2     -96.2     40.0       50
#6 Site6       (-95.99 39.901)     5     -95.9     39.9       99
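
Because `ind` is just a vector of row positions in sub, the same lookup can also be done by plain positional indexing, without a join. This is a small base R sketch, not part of the original answer: the indices are copied from the output above, and the names `variable` and `matched` are made up for illustration.

```r
# Indices returned by st_nearest_feature(main, sub), as shown in the output above
ind <- c(5L, 4L, 6L, 1L, 2L, 5L)

# The Variable column of sub, in row order
variable <- c(400, 50, 100, 201, 99, 700)

# Positional lookup: one value per row of main
matched <- variable[ind]
matched
```

On the real objects, the equivalent would be something like `main$Variable <- sub$Variable[main$ind]`, which avoids carrying the extra Longitude and Latitude columns into the result.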

2 Comments

Great answer! Works perfectly.
@FISHnR I am glad to hear that. I hope you can keep rolling in your task. :)
