| Type: | Package |
| Title: | Estimate and Simulate from Location Dependent Marked Point Processes |
| Version: | 1.1.1 |
| Maintainer: | Lane Drew <lanetdrew@gmail.com> |
| Description: | A suite of tools for estimating, assessing model fit, simulating from, and visualizing location dependent marked point processes characterized by regularity in the pattern. You provide a reference marked point process, a set of raster images containing location specific covariates, and select the estimation algorithm and type of mark model. 'ldmppr' estimates the process and mark models and allows you to check the appropriateness of the model using a variety of diagnostic tools. Once a satisfactory model fit is obtained, you can simulate from the model and visualize the results. Documentation for the package 'ldmppr' is available in the form of a vignette. |
| License: | GPL (≥ 3) |
| Encoding: | UTF-8 |
| LazyData: | true |
| Imports: | stats, bundle, Rcpp (≥ 1.0.12), terra, doParallel, xgboost, ranger, parsnip (≥ 1.4.0), dials, recipes, rsample, tune, workflows, magrittr, hardhat, ggplot2, spatstat.geom, spatstat.explore, nloptr, GET, progress, future, furrr, foreach, yardstick |
| LinkingTo: | Rcpp, RcppArmadillo |
| URL: | https://github.com/lanedrew/ldmppr |
| BugReports: | https://github.com/lanedrew/ldmppr/issues |
| RoxygenNote: | 7.3.3 |
| Suggests: | knitr, rmarkdown, testthat (≥ 3.0.0), dplyr |
| VignetteBuilder: | knitr |
| Depends: | R (≥ 3.5.0) |
| Config/testthat/edition: | 3 |
| NeedsCompilation: | yes |
| Packaged: | 2026-01-12 04:04:45 UTC; lanedrew |
| Author: | Lane Drew |
| Repository: | CRAN |
| Date/Publication: | 2026-01-13 08:30:07 UTC |
ldmppr: Estimate and Simulate from Location Dependent Marked Point Processes
Description
A suite of tools for estimating, assessing model fit, simulating from, and visualizing location dependent marked point processes characterized by regularity in the pattern. You provide a reference marked point process, a set of raster images containing location specific covariates, and select the estimation algorithm and type of mark model. 'ldmppr' estimates the process and mark models and allows you to check the appropriateness of the model using a variety of diagnostic tools. Once a satisfactory model fit is obtained, you can simulate from the model and visualize the results. Documentation for the package 'ldmppr' is available in the form of a vignette.
Author(s)
Maintainer: Lane Drew lanetdrew@gmail.com (ORCID) [copyright holder]
Authors:
Andee Kaplan (ORCID)
See Also
Useful links:
Pipe operator
Description
See magrittr::%>% for details.
Usage
lhs %>% rhs
Arguments
lhs |
A value or the magrittr placeholder. |
rhs |
A function call using the magrittr semantics. |
Value
The result of calling 'rhs(lhs)'.
calculates c_theta
Description
calculates c_theta
Usage
C_theta2_i(xgrid, ygrid, tgrid, data, params, bounds)
Arguments
xgrid |
a vector of grid values for x. |
ygrid |
a vector of grid values for y. |
tgrid |
a t value. |
data |
a matrix of data. |
params |
a vector of parameters. |
bounds |
a vector of bounds for time, x, and y. |
Value
returns the product.
Check the fit of an estimated model using global envelope tests
Description
Performs global envelope tests for nonparametric L, F, G, J, E, and V summary
functions (spatstat/GET).
These tests assess goodness-of-fit of the estimated model relative to a reference marked point pattern.
The reference marked point pattern can be supplied directly via reference_data (a marked ppp object),
or derived internally from a ldmppr_fit object.
Usage
check_model_fit(
reference_data = NULL,
t_min = 0,
t_max = 1,
process = c("self_correcting"),
process_fit = NULL,
anchor_point = NULL,
raster_list = NULL,
scaled_rasters = FALSE,
mark_model = NULL,
xy_bounds = NULL,
include_comp_inds = FALSE,
thinning = TRUE,
edge_correction = "none",
competition_radius = 15,
n_sim = 2500,
save_sims = TRUE,
verbose = TRUE,
seed = 0
)
Arguments
reference_data |
(optional) a marked |
t_min |
minimum value for time. |
t_max |
maximum value for time. |
process |
type of process used (currently supports |
process_fit |
either an |
anchor_point |
(optional) vector of (x,y) coordinates of the point to condition on.
If |
raster_list |
a list of raster objects used for predicting marks. |
scaled_rasters |
|
mark_model |
a mark model object. May be a |
xy_bounds |
(optional) vector of bounds as |
include_comp_inds |
|
thinning |
|
edge_correction |
type of edge correction to apply ( |
competition_radius |
distance for competition radius if |
n_sim |
number of simulated datasets to generate. |
save_sims |
|
verbose |
|
seed |
integer seed for reproducibility. |
Details
This function relies on the spatstat package for the calculation of the point pattern metrics
and the GET package for the global envelope tests. The L, F, G, J, E, and V functions are a collection of
non-parametric summary statistics that describe the spatial distribution of points and marks in a point pattern.
See the documentation for Lest(), Fest(), Gest(),
Jest(), Emark(), and Vmark() for more information.
Also, see the global_envelope_test() function for more information on the global envelope tests.
Value
an object of class "ldmppr_model_check".
References
Baddeley, A., Rubak, E., & Turner, R. (2015). *Spatial Point Patterns: Methodology and Applications with R*. Chapman and Hall/CRC Press, London. ISBN 9781482210200. Available at: https://www.routledge.com/Spatial-Point-Patterns-Methodology-and-Applications-with-R/Baddeley-Rubak-Turner/p/book/9781482210200.
Myllymäki, M., & Mrkvička, T. (2023). GET: Global envelopes in R. arXiv:1911.06583 [stat.ME]. doi:10.48550/arXiv.1911.06583.
Examples
# Note: The example below is provided for illustrative purposes and may take some time to run.
# Load the small example data
data(small_example_data)
# Load the example mark model that previously was trained on the small example data
file_path <- system.file("extdata", "example_mark_model.rds", package = "ldmppr")
mark_model <- load_mark_model(file_path)
# Load the raster files
raster_paths <- list.files(system.file("extdata", package = "ldmppr"),
pattern = "\\.tif$", full.names = TRUE)
raster_paths <- raster_paths[!grepl("_med\\.tif$", raster_paths)]
rasters <- lapply(raster_paths, terra::rast)
# Scale the rasters
scaled_raster_list <- scale_rasters(rasters)
# Generate the reference pattern
reference_data <- generate_mpp(
locations = small_example_data[, c("x", "y")],
marks = small_example_data$size,
xy_bounds = c(0, 25, 0, 25)
)
# Specify the estimated parameters of the self-correcting process
# Note: These would generally be estimated using estimate_process_parameters.
# These values are estimates from the small_example_data generating script.
estimated_parameters <- c(
0.05167978, 8.20702166, 0.02199940, 2.63236890,
1.82729512, 0.65330061, 0.86666748, 0.04681878
)
# Check the model fit
example_model_fit <- check_model_fit(
reference_data = reference_data,
t_min = 0,
t_max = 1,
process = "self_correcting",
process_fit = estimated_parameters,
raster_list = scaled_raster_list,
scaled_rasters = TRUE,
mark_model = mark_model,
xy_bounds = c(0, 25, 0, 25),
include_comp_inds = TRUE,
thinning = TRUE,
edge_correction = "none",
competition_radius = 10,
n_sim = 100,
save_sims = FALSE,
verbose = TRUE,
seed = 90210
)
plot(example_model_fit, which = 'combined')
calculates sum of values < t
Description
calculates sum of values < t
Usage
conditional_sum(obs_t, eval_t, y)
Arguments
obs_t |
a vector of observed t values. |
eval_t |
a t value. |
y |
a vector of values. |
Value
the conditional sum.
calculates sum of values < t
Description
calculates sum of values < t
Usage
conditional_sum_logical(obs_t, eval_t, y)
Arguments
obs_t |
a vector of observed t values. |
eval_t |
a t value. |
y |
a vector of values. |
Value
the conditional sum.
calculates distance in one dim
Description
calculates distance in one dim
Usage
dist_one_dim(eval_t, obs_t)
Arguments
eval_t |
a t value. |
obs_t |
a vector of t values. |
Value
distance between a single t and the vector of all t values.
Estimate point process parameters using log-likelihood maximization
Description
Estimate spatio-temporal point process parameters by maximizing the (approximate) full log-likelihood
using nloptr. For the self-correcting process, the arrival times must be on (0,1)
and can either be supplied directly in data as time, or constructed from size via the
gentle-decay (power-law) mapping power_law_mapping using delta (single fit) or
delta_values (delta search).
Usage
estimate_process_parameters(
data,
process = c("self_correcting"),
x_grid = NULL,
y_grid = NULL,
t_grid = NULL,
upper_bounds = NULL,
parameter_inits = NULL,
delta = NULL,
delta_values = NULL,
parallel = FALSE,
num_cores = max(1L, parallel::detectCores() - 1L),
set_future_plan = FALSE,
strategy = c("local", "global_local", "multires_global_local"),
grid_levels = NULL,
refine_best_delta = TRUE,
global_algorithm = "NLOPT_GN_CRS2_LM",
local_algorithm = "NLOPT_LN_BOBYQA",
global_options = list(maxeval = 150),
local_options = list(maxeval = 300, xtol_rel = 1e-05, maxtime = NULL),
n_starts = 1L,
jitter_sd = 0.35,
seed = 1L,
finite_bounds = NULL,
verbose = TRUE
)
Arguments
data |
a data.frame or matrix. Must contain either columns |
process |
character string specifying the process model. Currently supports |
x_grid, y_grid, t_grid |
numeric vectors defining the integration grid for |
upper_bounds |
numeric vector of length 3 giving |
parameter_inits |
numeric vector of length 8 giving initialization values for the model parameters. |
delta |
(optional) numeric scalar used only when |
delta_values |
(optional) numeric vector. If supplied, the function fits the model for each value of |
parallel |
logical. If |
num_cores |
Integer number of workers to use when |
set_future_plan |
|
strategy |
Character string specifying the estimation strategy:
- |
grid_levels |
(optional) list defining the multi-resolution grid schedule when |
refine_best_delta |
|
global_algorithm, local_algorithm |
character strings specifying the NLopt algorithms to use for the global and local optimization stages, respectively. |
global_options, local_options |
named lists of options to pass to |
n_starts |
integer number of multi-start initializations to use for the local optimization stage. |
jitter_sd |
numeric standard deviation used to jitter the multi-start initializations. |
seed |
integer random seed used for multi-start initialization jittering. |
finite_bounds |
(optional) list with components |
verbose |
|
Details
For the self-correcting process, the log-likelihood integral is approximated using the supplied grid
(x_grid, y_grid, t_grid) over the bounded domain upper_bounds.
When delta_values is supplied, this function performs a grid search over delta values, fitting the model
separately for each mapped dataset and selecting the best objective value.
Value
an object of class "ldmppr_fit" containing the best nloptr fit and (optionally) all fits from a delta search.
References
Møller, J., Ghorbani, M., & Rubak, E. (2016). Mechanistic spatio-temporal point process models for marked point processes, with a view to forest stand data. Biometrics, 72(3), 687–696. doi:10.1111/biom.12466.
Examples
data(small_example_data)
x_grid <- seq(0, 25, length.out = 5)
y_grid <- seq(0, 25, length.out = 5)
t_grid <- seq(0, 1, length.out = 5)
parameter_inits <- c(1.5, 8.5, .015, 1.5, 3.2, .75, 3, .08)
upper_bounds <- c(1, 25, 25)
fit <- estimate_process_parameters(
data = small_example_data,
process = "self_correcting",
x_grid = x_grid,
y_grid = y_grid,
t_grid = t_grid,
upper_bounds = upper_bounds,
parameter_inits = parameter_inits,
delta = 1,
strategy = "global_local",
global_algorithm = "NLOPT_GN_CRS2_LM",
local_algorithm = "NLOPT_LN_BOBYQA",
global_options = list(maxeval = 150),
local_options = list(maxeval = 25, xtol_rel = 1e-2),
verbose = TRUE
)
coef(fit)
logLik(fit)
# Delta-search example (data has x,y,size; time is derived internally for each delta)
fit_delta <- estimate_process_parameters(
data = small_example_data, # x,y,size
process = "self_correcting",
x_grid = x_grid,
y_grid = y_grid,
t_grid = t_grid,
upper_bounds = upper_bounds,
parameter_inits = parameter_inits,
delta_values = c(0.35, 0.5, 0.65, 0.9, 1.0),
parallel = TRUE,
set_future_plan = TRUE,
num_cores = 2,
strategy = "multires_global_local",
grid_levels = list(
list(nx = 5, ny = 5, nt = 5),
list(nx = 8, ny = 8, nt = 8),
list(nx = 10, ny = 10, nt = 10)
),
global_options = list(maxeval = 100),
local_options = list(maxeval = 100, xtol_rel = 1e-3),
n_starts = 3,
refine_best_delta = TRUE,
verbose = TRUE
)
plot(fit_delta)
Extract covariate values from a set of rasters
Description
Extract covariate values from a set of rasters
Usage
extract_covars(locations, raster_list)
Arguments
locations |
a matrix/data.frame of (x,y) locations. |
raster_list |
a list of SpatRaster objects. |
Value
a data.frame of covariates (no ID column; unique names).
Examples
# Load example raster data
raster_paths <- list.files(system.file("extdata", package = "ldmppr"),
pattern = "\\.tif$", full.names = TRUE
)
raster_paths <- raster_paths[!grepl("_med\\.tif$", raster_paths)]
rasters <- lapply(raster_paths, terra::rast)
# Scale the rasters
scaled_raster_list <- scale_rasters(rasters)
# Load example locations
locations <- small_example_data %>%
dplyr::select(x, y) %>%
as.matrix()
# Extract covariates
example_covars <- extract_covars(locations, scaled_raster_list)
head(example_covars)
calculates full product for one grid point
Description
calculates full product for one grid point
Usage
full_product(xgrid, ygrid, tgrid, data, params)
Arguments
xgrid |
a vector of grid values for x. |
ygrid |
a vector of grid values for y. |
tgrid |
a t value. |
data |
a matrix of data. |
params |
a vector of parameters. |
Value
returns the product.
calculates full self-correcting log-likelihood
Description
calculates full self-correcting log-likelihood
Usage
full_sc_lhood(xgrid, ygrid, tgrid, tobs, data, params, bounds)
Arguments
xgrid |
a vector of grid values for x. |
ygrid |
a vector of grid values for y. |
tgrid |
a vector of grid values for t. |
tobs |
a vector of observed values for t. |
data |
a matrix of times and locations. |
params |
a vector of parameters. |
bounds |
a vector of bounds for time, x, and y. |
Value
evaluation of full log-likelihood.
calculates fast full self-correcting log-likelihood
Description
calculates fast full self-correcting log-likelihood
Usage
full_sc_lhood_fast(xgrid, ygrid, tgrid, tobs, data, params, bounds)
Arguments
xgrid |
a vector of grid values for x. |
ygrid |
a vector of grid values for y. |
tgrid |
a vector of grid values for t. |
tobs |
a vector of observed values for t. |
data |
a matrix of times and locations. |
params |
a vector of parameters. |
bounds |
a vector of bounds for time, x, and y. |
Value
evaluation of full log-likelihood.
Generate a marked process given locations and marks
Description
Creates an object of class "ppp" that represents a marked point pattern in the two-dimensional plane.
Usage
generate_mpp(locations, marks = NULL, xy_bounds = NULL)
Arguments
locations |
a data.frame of (x,y) locations with names "x" and "y". |
marks |
a vector of marks. |
xy_bounds |
a vector of domain bounds (2 for x, 2 for y). |
Value
a ppp object with marks.
Examples
# Load example data
data(small_example_data)
# Generate a marked point process
generate_mpp(
locations = small_example_data %>% dplyr::select(x, y),
marks = small_example_data$size,
xy_bounds = c(0, 25, 0, 25)
)
calculates spatio-temporal interaction
Description
calculates spatio-temporal interaction
Usage
interaction_st(data, params)
Arguments
data |
a matrix of times and locations. |
params |
a vector of parameters. |
Value
a vector of interaction probabilities for every point.
Internal helpers (not part of the public API)
Description
These functions are used internally by ldmppr and are not intended to be called directly by users.
Usage
new_ldmppr_model_check(
combined_env,
envs,
curve_sets,
sim_metrics = NULL,
settings = list(),
call = NULL
)
new_ldmppr_sim(
process,
mpp,
realization,
params,
bounds,
anchor_point,
thinning,
edge_correction,
include_comp_inds,
competition_radius,
call = NULL,
meta = list()
)
new_ldmppr_mark_model(
engine,
fit_engine = NULL,
xgb_raw = NULL,
recipe = NULL,
outcome = "size",
feature_names = NULL,
info = list()
)
new_ldmppr_fit(
process,
fit,
fits = NULL,
mapping = NULL,
grid = NULL,
data_summary = NULL,
data = NULL,
data_original = NULL,
engine = "nloptr",
call = NULL,
timing = NULL
)
preprocess_new_data(object, new_data)
rehydrate_xgb(object)
as_mark_model(mark_model)
.build_sc_matrix(data, delta = NULL)
.default_sc_param_bounds(txy, upper_bounds)
a %||% b
.require_pkgs(pkgs)
.coerce_training_df(x, delta = NULL, xy_bounds = NULL)
infer_xy_bounds_from_ppp(ppp)
infer_anchor_from_ppp(ppp)
infer_anchor_from_df(df)
resolve_sc_params(process_fit)
resolve_reference_ppp(reference_data, process_fit, xy_bounds)
.as_sc_params(process_fit)
.infer_xy_bounds(process_fit)
.infer_anchor_point(process_fit)
Fitted point-process model object
Description
Objects of class ldmppr_fit are returned by estimate_process_parameters.
They contain the best-fitting optimization result (and optionally multiple fits,
e.g. from a delta search) along with metadata used to reproduce the fit.
Usage
## S3 method for class 'ldmppr_fit'
print(x, ...)
## S3 method for class 'ldmppr_fit'
coef(object, ...)
## S3 method for class 'ldmppr_fit'
logLik(object, ...)
## S3 method for class 'ldmppr_fit'
summary(object, ...)
## S3 method for class 'summary.ldmppr_fit'
print(x, ...)
## S3 method for class 'ldmppr_fit'
plot(x, ...)
as_nloptr(x, ...)
## S3 method for class 'ldmppr_fit'
as_nloptr(x, ...)
Arguments
x |
an object of class |
... |
additional arguments (not used). |
object |
an object of class |
Details
A ldmppr_fit is a list with (at minimum):
-
process: process name (e.g."self_correcting") -
fit: best optimization result (currently annloptrobject) -
mapping: mapping information (e.g. chosendelta, objectives) -
grid: grid definitions used by likelihood approximation
Value
print()prints a brief summary of the fit.
coef()returns the estimated parameter vector.
logLik()returns the log-likelihood at the optimum.
summary()returns a
summary.ldmppr_fit.plot()plots diagnostics for multi-fit runs, if available.
Methods (by generic)
-
print(ldmppr_fit): Print a brief summary of a fitted model. -
coef(ldmppr_fit): Extract the estimated parameter vector. -
logLik(ldmppr_fit): Log-likelihood at the optimum. -
summary(ldmppr_fit): Summarize a fitted model. -
plot(ldmppr_fit): Plot diagnostics for a fitted model. -
as_nloptr(ldmppr_fit): Extract the underlyingnloptrresult.
Functions
-
print(summary.ldmppr_fit): Print a summary produced bysummary.ldmppr_fit. -
as_nloptr(): Extract the underlyingnloptrresult.
Mark model object
Description
ldmppr_mark_model objects store a fitted mark model and preprocessing
information used to predict marks at new locations and times.
These objects are typically returned by train_mark_model and can be
saved/loaded with save_mark_model and load_mark_model.
Usage
ldmppr_mark_model(
engine,
fit_engine = NULL,
xgb_raw = NULL,
recipe = NULL,
outcome = "size",
feature_names = NULL,
info = list()
)
## S3 method for class 'ldmppr_mark_model'
print(x, ...)
predict.ldmppr_mark_model(object, new_data, ...)
save_mark_model(object, path, ...)
## S3 method for class 'ldmppr_mark_model'
save_mark_model(object, path, ...)
load_mark_model(path)
Arguments
engine |
character string (currently |
fit_engine |
fitted engine object (e.g. |
xgb_raw |
raw xgboost payload (e.g. UBJ) used for rehydration. |
recipe |
a prepped recipes object used for preprocessing new data. |
outcome |
outcome column name (default |
feature_names |
(optional) vector of predictor names required at prediction time. |
info |
(optional) list of metadata. |
x |
a |
... |
passed to methods. |
object |
a |
new_data |
a data frame of predictors (and possibly outcome columns). |
path |
path to an |
Details
The model may be backed by different engines (currently "xgboost" and
"ranger"). For "xgboost", the object can store a serialized booster payload
to make saving/loading robust across R sessions.
Value
print()prints a brief summary.
predict()returns numeric predictions for new data.
an object of class "ldmppr_mark_model".
Methods (by generic)
-
print(ldmppr_mark_model): Print a brief summary of the mark model. -
save_mark_model(ldmppr_mark_model): Save method forldmppr_mark_model.
Functions
-
ldmppr_mark_model(): Create a mark model container. -
predict.ldmppr_mark_model(): Predict marks for new data. -
save_mark_model(): Save a mark model to disk. -
load_mark_model(): Load a saved mark model from disk.
Model fit diagnostic object
Description
Objects of class ldmppr_model_check are returned by check_model_fit.
They contain global envelope test results and curve sets for multiple summary
functions/statistics.
Usage
## S3 method for class 'ldmppr_model_check'
print(x, ...)
## S3 method for class 'ldmppr_model_check'
summary(object, ...)
## S3 method for class 'summary.ldmppr_model_check'
print(x, ...)
## S3 method for class 'ldmppr_model_check'
plot(x, which = c("combined", "L", "F", "G", "J", "E", "V"), ...)
Arguments
x |
an object of class |
... |
additional arguments passed to the underlying |
object |
an object of class |
which |
which envelope to plot. |
Details
An ldmppr_model_check is a list with components such as:
-
combined_env: a global envelope test object (typically from **GET**) -
envs: named list of envelope test objects (e.g.,L,F,G,J,E,V) -
curve_sets: named list of curve set objects -
settings: list of settings used when generating envelopes (e.g.,n_sim,thinning)
Value
print()prints a brief summary of the diagnostic object.
summary()returns a
summary.ldmppr_model_checkobject.plot()plots the combined envelope or a selected statistic envelope.
Methods (by generic)
-
print(ldmppr_model_check): Print a brief summary of the diagnostic results. -
summary(ldmppr_model_check): Summarize p-values from the combined and individual envelopes. -
plot(ldmppr_model_check): Plot the combined envelope or a selected statistic.
Functions
-
print(summary.ldmppr_model_check): Print a summary produced bysummary.ldmppr_model_check.
Simulated marked point process object
Description
ldmppr_sim objects are returned by simulate_mpp. They contain the simulated
realization, an associated marked point pattern object, and metadata used to
reproduce or inspect the simulation.
Usage
## S3 method for class 'ldmppr_sim'
print(x, ...)
## S3 method for class 'ldmppr_sim'
as.data.frame(x, ...)
## S3 method for class 'ldmppr_sim'
nobs(object, ...)
## S3 method for class 'ldmppr_sim'
plot(x, pattern_type = "simulated", ...)
mpp.ldmppr_sim(x, ...)
Arguments
x |
a |
... |
additional arguments (not used). |
object |
a |
pattern_type |
type of pattern to plot |
Details
An ldmppr_sim is a list with at least:
-
process: process name (e.g."self_correcting") -
mpp: a marked point pattern object -
realization: data.frame with columnstime,x,y,marks -
params,bounds, and other metadata
Value
For methods:
print()prints a summary of the simulation.
plot()returns a ggplot visualization of the marked point pattern.
as.data.frame()returns the simulated realization as a data.frame.
nobs()returns the number of points in the realization.
mpp()returns the marked point pattern object.
Methods (by generic)
-
print(ldmppr_sim): Print a brief summary of the simulation. -
as.data.frame(ldmppr_sim): Coerce to a data.frame of the simulated realization. -
nobs(ldmppr_sim): Number of simulated points. -
plot(ldmppr_sim): Plot the simulated marked point pattern.
Functions
-
mpp.ldmppr_sim(): Extract the underlying marked point pattern object.
Medium Example Data
Description
A medium sized example dataset consisting of 111 observations in a (50m x 50m) square domain.
Usage
data("medium_example_data")
Format
## medium_example_data
A data frame with 111 rows and 3 columns:
- x
x coordinate
- y
y coordinate
- size
Size
...
Details
The dataset was generated using the Snodgrass dataset available at https://data.ess-dive.lbl.gov/view/doi:10.15485/2476543.
The full code to generate this dataset is available in the package's data_raw directory.
Source
Real example dataset. Code to generate it can be found in data_raw/medium_example_data.R.
calculates part 1-1 full
Description
calculates part 1-1 full
Usage
part_1_1_full(data, params)
Arguments
data |
a matrix of locations and times. |
params |
a vector of parameters. |
Value
full likelihood evaluation for part 1.
calculates part 1-2 full
Description
calculates part 1-2 full
Usage
part_1_2_full(data, params)
Arguments
data |
a matrix of locations and times. |
params |
a vector of parameters. |
Value
full likelihood evaluation for part 2.
calculates part 1-3
Description
calculates part 1-3
Usage
part_1_3_full(xgrid, ygrid, tgrid, data, params, bounds)
Arguments
xgrid |
a vector of grid values for x. |
ygrid |
a vector of grid values for y. |
tgrid |
a t value. |
data |
a matrix of times and locations. |
params |
a vector of parameters. |
bounds |
a vector of time, x, and y bounds. |
Value
full likelihood evaluation for part 3.
calculates part 1-4
Description
calculates part 1-4
Usage
part_1_4_full(data, params)
Arguments
data |
a matrix of times and locations. |
params |
a vector of parameters. |
Value
full likelihood evaluation for part 4.
calculates part 1 of the likelihood
Description
calculates part 1 of the likelihood
Usage
part_1_full(xgrid, ygrid, tgrid, data, params, bounds)
Arguments
xgrid |
a vector of grid values for x. |
ygrid |
a vector of grid values for y. |
tgrid |
a t value. |
data |
a matrix of times and locations. |
params |
a vector of parameters. |
bounds |
a vector of bounds for time, x, and y. |
Value
full evaluation of first part of likelihood.
calculates part 2 of the likelihood
Description
calculates part 2 of the likelihood
Usage
part_2_full(xgrid, ygrid, tgrid, data, params, bounds)
Arguments
xgrid |
a vector of grid values for x. |
ygrid |
a vector of grid values for y. |
tgrid |
a vector of grid values for t. |
data |
a matrix of times and locations. |
params |
a vector of parameters. |
bounds |
a vector of bounds for time, x, and y. |
Value
full evaluation of second part of likelihood.
Plot a marked point process
Description
Plot a marked point process
Usage
plot_mpp(mpp_data, pattern_type = c("reference", "simulated"))
Arguments
mpp_data |
|
pattern_type |
type of pattern to plot ("reference" or "simulated"). |
Value
a ggplot object of the marked point process.
Examples
# Load example data
data(small_example_data)
mpp_data <- generate_mpp(
locations = small_example_data %>% dplyr::select(x, y),
marks = small_example_data$size,
xy_bounds = c(0, 25, 0, 25)
)
# Plot the marked point process
plot_mpp(mpp_data, pattern_type = "reference")
Gentle decay (power-law) mapping function from sizes to arrival times
Description
Gentle decay (power-law) mapping function from sizes to arrival times
Usage
power_law_mapping(sizes, delta)
Arguments
sizes |
vector of sizes to be mapped to arrival times. |
delta |
numeric value (greater than 0) for the exponent in the mapping function. |
Value
vector of arrival times.
Examples
# Generate a vector of sizes
sizes <- runif(100, 0, 100)
# Map the sizes to arrival times using a power-law mapping with delta = .5
power_law_mapping(sizes, .5)
Predict values from the mark distribution
Description
Predict values from the mark distribution
Usage
predict_marks(
sim_realization,
raster_list = NULL,
scaled_rasters = FALSE,
mark_model = NULL,
xy_bounds = NULL,
include_comp_inds = FALSE,
competition_radius = 15,
edge_correction = "none"
)
Arguments
sim_realization |
a data.frame containing a thinned or unthinned realization from |
raster_list |
a list of raster objects. |
scaled_rasters |
|
mark_model |
a mark model object. May be a |
xy_bounds |
a vector of domain bounds (2 for x, 2 for y). |
include_comp_inds |
|
competition_radius |
distance for competition radius if |
edge_correction |
type of edge correction to apply ( |
Value
a vector of predicted mark values.
Examples
# Simulate a realization
generating_parameters <- c(2, 8, .02, 2.5, 3, 1, 2.5, .2)
M_n <- c(10, 14)
generated_locs <- simulate_sc(
t_min = 0,
t_max = 1,
sc_params = generating_parameters,
anchor_point = M_n,
xy_bounds = c(0, 25, 0, 25)
)
# Load the raster files
raster_paths <- list.files(system.file("extdata", package = "ldmppr"),
pattern = "\\.tif$", full.names = TRUE
)
raster_paths <- raster_paths[!grepl("_med\\.tif$", raster_paths)]
rasters <- lapply(raster_paths, terra::rast)
# Scale the rasters
scaled_raster_list <- scale_rasters(rasters)
# Load the example mark model
file_path <- system.file("extdata", "example_mark_model.rds", package = "ldmppr")
mark_model <- load_mark_model(file_path)
# Predict the mark values
predict_marks(
sim_realization = generated_locs$thinned,
raster_list = scaled_raster_list,
scaled_rasters = TRUE,
mark_model = mark_model,
xy_bounds = c(0, 25, 0, 25),
include_comp_inds = TRUE,
competition_radius = 10,
edge_correction = "none"
)
Scale a set of rasters
Description
Scale a set of rasters
Usage
scale_rasters(raster_list, reference_resolution = NULL)
Arguments
raster_list |
a list of raster objects. |
reference_resolution |
the resolution to resample the rasters to. |
Value
a list of scaled raster objects.
Examples
# Create two example rasters
rast_a <- terra::rast(
ncol = 10, nrow = 10,
xmin = 0, xmax = 10,
ymin = 0, ymax = 10,
vals = runif(100)
)
rast_b <- terra::rast(
ncol = 10, nrow = 10,
xmin = 0, xmax = 10,
ymin = 0, ymax = 10,
vals = runif(100)
)
# Scale example rasters in a list
rast_list <- list(rast_a, rast_b)
scale_rasters(rast_list)
Simulate the spatial component of the self-correcting model
Description
Simulate the spatial component of the self-correcting model
Usage
sim_spatial_sc(M_n, params, nsim_t, xy_bounds)
Arguments
M_n |
a vector of (x,y)-coordinates for largest point. |
params |
a vector of parameters (alpha_2, beta_2). |
nsim_t |
number of points to simulate. |
xy_bounds |
vector of lower and upper bounds for the domain (2 for x, 2 for y). |
Value
a matrix of point locations in the (x,y)-plane.
Simulate the temporal component of the self-correcting model
Description
Simulate the temporal component of the self-correcting model
Usage
sim_temporal_sc(Tmin = 0, Tmax = 1, params = as.numeric(c(0, 0, 0)))
Arguments
Tmin |
minimum time value. |
Tmax |
maximum time value. |
params |
a vector of parameters (alpha_1, beta_1, gamma_1). |
Value
a vector of thinned and unthinned temporal samples.
Simulate a realization of a location dependent marked point process
Description
Simulate a realization of a location dependent marked point process
Usage
simulate_mpp(
process = c("self_correcting"),
process_fit = NULL,
t_min = 0,
t_max = 1,
anchor_point = NULL,
raster_list = NULL,
scaled_rasters = FALSE,
mark_model = NULL,
xy_bounds = NULL,
include_comp_inds = FALSE,
competition_radius = 15,
edge_correction = "none",
thinning = TRUE,
seed = NULL
)
Arguments
process |
type of process used (currently supports |
process_fit |
either (1) a |
t_min |
minimum value for time. |
t_max |
maximum value for time. |
anchor_point |
(optional) vector of (x,y) coordinates of the point to condition on.
If |
raster_list |
a list of raster objects used for predicting marks. |
scaled_rasters |
|
mark_model |
a mark model object. May be a |
xy_bounds |
(optional) vector of bounds as |
include_comp_inds |
|
competition_radius |
distance for competition radius if |
edge_correction |
type of edge correction to apply ( |
thinning |
|
seed |
integer seed for reproducibility. |
Value
an object of class "ldmppr_sim".
Examples
# Specify the generating parameters of the self-correcting process
generating_parameters <- c(2, 8, .02, 2.5, 3, 1, 2.5, .2)
# Specify an anchor point
M_n <- c(10, 14)
# Load the raster files
raster_paths <- list.files(system.file("extdata", package = "ldmppr"),
pattern = "\\.tif$", full.names = TRUE
)
raster_paths <- raster_paths[!grepl("_med\\.tif$", raster_paths)]
rasters <- lapply(raster_paths, terra::rast)
# Scale the rasters
scaled_raster_list <- scale_rasters(rasters)
# Load the example mark model
file_path <- system.file("extdata", "example_mark_model.rds", package = "ldmppr")
mark_model <- load_mark_model(file_path)
# Simulate a realization
example_mpp <- simulate_mpp(
process = "self_correcting",
process_fit = generating_parameters,
t_min = 0,
t_max = 1,
anchor_point = M_n,
raster_list = scaled_raster_list,
scaled_rasters = TRUE,
mark_model = mark_model,
xy_bounds = c(0, 25, 0, 25),
include_comp_inds = TRUE,
competition_radius = 10,
edge_correction = "none",
thinning = TRUE
)
# Plot the realization and provide a summary
plot(example_mpp, pattern_type = "simulated")
summary(example_mpp)
Simulate from the self-correcting model
Description
Allows the user to simulate a realization from the self-correcting model given a set of parameters and a point to condition on.
Usage
simulate_sc(
t_min = 0,
t_max = 1,
sc_params = NULL,
anchor_point = NULL,
xy_bounds = NULL
)
Arguments
t_min |
minimum value for time. |
t_max |
maximum value for time. |
sc_params |
Vector of parameter values corresponding to
|
anchor_point |
vector of (x,y) coordinates of point to condition on. |
xy_bounds |
a vector of domain bounds (2 for x, 2 for y). |
Value
a list containing the thinned and unthinned simulation realizations.
Examples
# Specify the generating parameters of the self-correcting process
generating_parameters <- c(2, 8, .02, 2.5, 3, 1, 2.5, .2)
# Specify an anchor point
M_n <- c(10, 14)
# Simulate the self-correcting process
generated_locs <- simulate_sc(
t_min = 0,
t_max = 1,
sc_params = generating_parameters,
anchor_point = M_n,
xy_bounds = c(0, 25, 0, 25)
)
Small Example Data
Description
A small example dataset for testing and examples consisting of 121 observations in a (25m x 25m) square domain.
Usage
data("small_example_data")
Format
## small_example_data
A data frame with 121 rows and 3 columns:
- x
x coordinate
- y
y coordinate
- size
Size
...
Details
The dataset was generated using the example raster data and an exponential decay size function.
The full code to generate this dataset is available in the package's data_raw directory.
Source
Simulated dataset. Code to generate it can be found in data_raw/small_example_data.R.
calculates spatial interaction
Description
calculates spatial interaction
Usage
spat_interaction(Hist, newp, params)
Arguments
Hist |
a matrix of points. |
newp |
a new point vector. |
params |
a vector of parameters. |
Value
calculated probability of new point.
calculates temporal likelihood
Description
calculates temporal likelihood
Usage
temporal_sc(params, eval_t, obs_t)
Arguments
params |
a vector of parameters (alpha_1, beta_1, gamma_1). |
eval_t |
a t value. |
obs_t |
a vector of t values. |
Value
evaluation of full temporal likelihood.
Optimized function to compute toroidal distance matrix over a rectangular domain
Description
Optimized function to compute toroidal distance matrix over a rectangular domain
Usage
toroidal_dist_matrix_optimized(location_matrix, x_bound, y_bound)
Arguments
location_matrix |
a 2 column matrix of (x,y) coordinates. |
x_bound |
the upper bound for the x dimension. |
y_bound |
the upper bound for the y dimension. |
Value
a matrix of toroidal distances.
Examples
# Generate a matrix of locations
location_matrix <- matrix(c(1, 2, 3, 4, 5, 6), ncol = 2)
x_bound <- 10
y_bound <- 10
# Compute the toroidal distance matrix
toroidal_dist_matrix_optimized(location_matrix, x_bound, y_bound)
Train a flexible model for the mark distribution
Description
Trains a predictive model for the mark distribution of a spatio-temporal process.
data may be either (1) a data.frame containing columns x, y, size and time,
(2) a data.frame containing x, y, size (time will be derived via delta),
or (3) a ldmppr_fit object returned by estimate_process_parameters.
Allows the user to incorporate location specific information and competition indices as covariates in the mark model.
Usage
train_mark_model(
data,
raster_list = NULL,
scaled_rasters = FALSE,
model_type = "xgboost",
xy_bounds = NULL,
delta = NULL,
save_model = FALSE,
save_path = NULL,
parallel = TRUE,
n_cores = NULL,
include_comp_inds = FALSE,
competition_radius = 15,
edge_correction = "none",
selection_metric = "rmse",
cv_folds = 5,
tuning_grid_size = 200,
verbose = TRUE
)
Arguments
data |
a data.frame or a |
raster_list |
a list of raster objects. |
scaled_rasters |
|
model_type |
the machine learning model type ( |
xy_bounds |
a vector of domain bounds (2 for x, 2 for y). If |
delta |
(optional) numeric scalar used only when |
save_model |
|
save_path |
path for saving the generated model. |
parallel |
|
n_cores |
number of cores to use in parallel model training (if |
include_comp_inds |
|
competition_radius |
distance for competition radius if |
edge_correction |
type of edge correction to apply ( |
selection_metric |
metric to use for identifying the optimal model ( |
cv_folds |
number of cross-validation folds to use in model training.
If |
tuning_grid_size |
size of the tuning grid for hyperparameter tuning. |
verbose |
|
Value
an object of class "ldmppr_mark_model" containing the trained mark model.
Examples
# Load the small example data
data(small_example_data)
# Load example raster data
raster_paths <- list.files(system.file("extdata", package = "ldmppr"),
pattern = "\\.tif$", full.names = TRUE
)
raster_paths <- raster_paths[!grepl("_med\\.tif$", raster_paths)]
rasters <- lapply(raster_paths, terra::rast)
# Scale the rasters
scaled_raster_list <- scale_rasters(rasters)
# Train the model
mark_model <- train_mark_model(
data = small_example_data,
raster_list = scaled_raster_list,
scaled_rasters = TRUE,
model_type = "xgboost",
xy_bounds = c(0, 25, 0, 25),
delta = 1,
parallel = FALSE,
include_comp_inds = FALSE,
competition_radius = 10,
edge_correction = "none",
selection_metric = "rmse",
cv_folds = 3,
tuning_grid_size = 2,
verbose = TRUE
)
print(mark_model)
calculates euclidean distance
Description
calculates euclidean distance
Usage
vec_dist(x, y)
Arguments
x |
a vector of x values. |
y |
a vector of y values. |
Value
the distance between the two vectors.
calculates euclidean distance between a vector and a matrix
Description
calculates euclidean distance between a vector and a matrix
Usage
vec_to_mat_dist(eval_u, x_col, y_col)
Arguments
eval_u |
a vector of x and y coordinates. |
x_col |
a vector of x coordinates. |
y_col |
a vector of y coordinates. |
Value
a vector of distances between a vector and each row of a matrix.