This is the second post in my series on getting started visualising spatial data. Having introduced the basics of the sf package and some important concepts in my last post, I’m now going to start working through a real example.
The dataset I’ll be using to try out sf and mapdeck is from movebank.org, a website for sharing animal tracking data. It contains tracking data for seven Western marsh harriers between 2013 and 2018 as they migrate annually between Western Europe and Africa. The data have been released under a Creative Commons Zero Waiver. You can find this dataset here. Because it includes both altitude and timestamp data, it is a great candidate for playing around with plotting sf objects with four dimensions.
The first step is to drop some unnecessary columns from the data and reference files and set the timestamp correctly. I’m also filtering out some points that have recorded a height above sea level of more than 4 km and are likely outliers. Finally, I’m joining the data.table with the GPS data to the reference file containing important information such as the names of the birds.
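library(sf)
library(sfheaders)
library(mapdeck)
library(data.table)

# dt_harriers (GPS fixes) and dt_harriers_ref (reference file) are assumed to
# have been read in already, e.g. with data.table::fread()

# Drop columns that aren't needed
dt_harriers[, c(2, 6:16, 18:23, 25:30) := NULL]
dt_harriers_ref[, c(1, 3:8, 10:24) := NULL]

# Parse the timestamp as UTC and add a numeric time column
dt_harriers[, timestamp := as.POSIXct(timestamp, tz = "UTC")]
dt_harriers[, time := as.numeric(timestamp) / 1000]

# Remove altitude outliers, then join to the reference file
dt_harriers <- dt_harriers[`height-above-msl` < 4000][
  dt_harriers_ref, on = .(`individual-local-identifier` = `animal-id`)]

head(dt_harriers)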
## event-id timestamp location-long location-lat
## 1: 6.044709e-314 2013-05-16 20:01:27 3.595640 51.27886
## 2: 6.044709e-314 2013-05-16 21:02:08 3.595643 51.27890
## 3: 6.044709e-314 2013-05-16 22:02:32 3.595579 51.27889
## 4: 6.044709e-314 2013-05-16 23:02:31 3.595659 51.27892
## 5: 6.044709e-314 2013-05-17 00:03:09 3.595616 51.27886
## 6: 6.044709e-314 2013-05-17 01:03:03 3.595660 51.27885
## height-above-msl individual-local-identifier time animal-nickname
## 1: 0 H173481 1368734 Mia
## 2: 2 H173481 1368738 Mia
## 3: 2 H173481 1368742 Mia
## 4: -6 H173481 1368745 Mia
## 5: -2 H173481 1368749 Mia
## 6: 3 H173481 1368753 Mia
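To make the join step a little more concrete, here is a minimal sketch of the same pattern on made-up data. The tables dt_gps and dt_ref and all of their values are hypothetical stand-ins, not part of the harrier dataset; only the column names mirror the real ones.

```r
library(data.table)

# Toy stand-ins for the GPS data and the reference file (values are made up)
dt_gps <- data.table(
  `individual-local-identifier` = c("A1", "A1", "B2"),
  timestamp = c("2013-05-16 20:01:27", "2013-05-16 21:02:08", "2013-05-17 00:03:09"),
  `height-above-msl` = c(0, 5200, 3)
)
dt_ref <- data.table(
  `animal-id` = c("A1", "B2"),
  `animal-nickname` = c("Mia", "Ben")
)

# Parse the timestamp as UTC, then store seconds since 1970 divided by 1000
dt_gps[, timestamp := as.POSIXct(timestamp, tz = "UTC")]
dt_gps[, time := as.numeric(timestamp) / 1000]

# Drop altitude outliers, then join to the reference table.
# In X[Y, on = .(x_col = y_col)] the left name belongs to X and the right to Y;
# the result keeps every row of Y and names the key column after X.
dt_joined <- dt_gps[`height-above-msl` < 4000][
  dt_ref, on = .(`individual-local-identifier` = `animal-id`)]

print(dt_joined)
```

The 5200 m row is filtered out before the join, and the joined table carries the animal-nickname column from the reference table while the key keeps the name individual-local-identifier.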
The most relevant columns for our purposes will be animal-nickname, location-long, location-lat, height-above-msl (height above mean sea level), and timestamp. The first column gives the name of each bird, and the last four give all the information needed to plot the positions of the birds over time.
The easiest (though not necessarily the most informative) thing to do with all this data is to show all the points individually. There are 377846 points, so I’ll plot a small subset of these to show how they look. I’ll pull out just the data for the bird named Ben.
To plot the data, we need to create an sf (simple features) object. One way to do this is to use the function sf::st_as_sf to convert the dt_harriers data.table. We specify the columns containing the coordinates (being careful to put the longitude first and the latitude second) and get back an sf object. This is essentially a data frame that contains a geometry column.
Note that although we started with a data.table, the sf object won’t really behave like a data.table anymore. If you’re using data.table, it’s best to do any initial analysis and manipulation of the data before creating the sf object, and then make the sf object only when you’re ready to plot.
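As a rough illustration of what the conversion does, the sketch below runs st_as_sf on a two-row toy table. The object names dt_toy and sf_toy are hypothetical, and the coordinates are just borrowed from the first rows of the output above.

```r
library(sf)
library(data.table)

# Two made-up rows with the same column names as the harrier data
dt_toy <- data.table(
  `location-long` = c(3.595640, 3.595643),
  `location-lat` = c(51.27886, 51.27890),
  `height-above-msl` = c(0, 2)
)

# Longitude first, latitude second; EPSG:4326 is WGS84 longitude/latitude
sf_toy <- st_as_sf(dt_toy, coords = c("location-long", "location-lat"), crs = 4326)

class(sf_toy)           # includes "sf"
names(sf_toy)           # the two coordinate columns are replaced by "geometry"
st_coordinates(sf_toy)  # X holds the longitude, Y the latitude
```

By default the coordinate columns are removed once they have been folded into the geometry column, which is one more reason to finish any data.table manipulation before converting.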
To use mapdeck, you’ll need to set up your own Mapbox account and get a Mapbox access token. The free tier will allow you to make plenty of maps. Then, you can set your token for the global environment like so: set_token("YOUR TOKEN HERE")
Once you’ve done this, converting your data.table to an sf object and plotting it is straightforward. There are a lot of different arguments to mapdeck() and add_scatterplot() to customise how you want your map to look. For instance, here I’m using mapdeck_style() to show the standard “light” map, but there are also other options (“dark”, “satellite”, “streets”). You can also change things like the colours of lines/points, the level of zoom, the location the map is focussed on, and whether to include a legend. Here I’ve adjusted the zoom to focus on the area of interest. By default each layer will refocus the map, so I also used update_view = FALSE in add_scatterplot() to prevent this.
dt_harriers_ben <- dt_harriers[`animal-nickname` == "Ben"]
sf_harriers_ben <- st_as_sf(dt_harriers_ben, coords = c("location-long", "location-lat"), crs = 4326)

mapdeck(
  style = mapdeck_style("light"),
  location = c(0, 35),
  zoom = 2
) %>%
  add_scatterplot(
    data = sf_harriers_ben,
    update_view = FALSE
  )
Note that because this is an interactive map, you can zoom in and move the map around to see particular regions better. The responsiveness of the map will depend on your browser and hardware. If you’re working in RStudio on Windows, the map will not display in the viewer window and you’ll need to press the ‘show in a new window’ button to open it in a browser.
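As a sketch of the kind of customisation mentioned above, the example below colours points by altitude and adds a legend. It uses a small made-up data frame (df_toy, with hypothetical lon, lat and height columns) rather than the harrier data, and it assumes a Mapbox token has already been set with set_token(); without one I’d expect the points to draw without a base map.

```r
library(mapdeck)

# Made-up points standing in for one bird's track
df_toy <- data.frame(
  lon = c(3.6, -1.5, -5.9, -9.2),
  lat = c(51.3, 43.2, 35.8, 28.1),
  height = c(0, 850, 1900, 420)
)

m <- mapdeck(
  style = mapdeck_style("dark"),
  location = c(0, 35),
  zoom = 2
) %>%
  add_scatterplot(
    data = df_toy,
    lon = "lon",
    lat = "lat",
    fill_colour = "height",  # colour each point by its altitude value
    radius = 20000,          # point radius in metres
    legend = TRUE,
    update_view = FALSE
  )

m  # printing the widget renders the map
```

The large radius is only there so that four points are visible at this zoom level; with thousands of real GPS fixes a much smaller value would be more sensible.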
Alternatively, since these are points, you can skip converting the data.table to an sf object and simply tell add_scatterplot() which columns contain the longitude and latitude:
mapdeck(
  style = mapdeck_style("light"),
  location = c(0, 35),
  zoom = 2
) %>%
  add_scatterplot(
    data = dt_harriers_ben,
    lon = "location-long",
    lat = "location-lat",
    update_view = FALSE
  )
This will work for any mapdeck function that works on points, such as add_heatmap() (which I’ll show in the next post in this series).
Even from this very simple visualisation using points we can get some sense of where Ben has been. However, there are much better ways to visualise this. You can also probably see that plotting similar data for Ben and the other six birds is going to lead to a rather messy and confusing map.
In the next few posts, I’ll explore some better ways that we can visualise the harrier migration data using mapdeck.