| Title: | Geometric Spatial Point Analysis |
|---|---|
| Description: | The implementation to perform the geometric spatial point analysis developed in Hernández & Solís (2022) <doi:10.1007/s00180-022-01244-1>. It estimates the geometric goodness-of-fit index for a set of variables against a response one based on the 'sf' package. The package has methods to print and plot the results. |
| Authors: | Maikol Solís [aut, cre], Alberto Hernández [ctb], Carlos Pasquier [ctb] |
| Maintainer: | Maikol Solís <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.3.0 |
| Built: | 2026-06-02 11:05:41 UTC |
| Source: | https://github.com/maikol-solis/spatgeom |
Performs a global envelope test for Complete Spatial Randomness (CSR) on a
spatgeom object that was computed with envelope = TRUE. Two
test statistics are supported: Maximum Absolute Deviation (MAD) and
Diggle-Cressie-Loosmore-Ford (DCLF).

    csr_test(spatgeom_obj, significance_level = 0.05, r = 0.5, method = c("MAD", "DCLF"))


| Argument | Description |
|---|---|
| spatgeom_obj | an object of class spatgeom. |
| significance_level | a numeric value for the significance level of the test. Default 0.05. |
| r | a numeric scaling parameter used in the theoretical CSR curve. Default 0.5. |
| method | a character string, one of "MAD" or "DCLF". |

A named list with two elements:

- hypothesis_testing_df: a data frame (one row per alpha value per variable) with columns x (alpha grid), mean (mean of the simulated CSR curves), alpha, geom_survival (observed), theor (theoretical CSR curve), upper_mean and lower_mean (confidence band around the mean curve), upper_theor and lower_theor (confidence band around the theoretical curve), and variable.
- details: a named list, one entry per variable. Each entry is a data frame summarising the test: null hypothesis, variable name, test type, number of Monte Carlo simulations, observed and maximum test statistics, and p-values against both the mean and the theoretical reference curves.

    xy <- donut_data(n = 30, a = -1, b = 1, theta = 2 * pi)

    # Compute a spatgeom object with the Monte Carlo envelope:
    est <- spatgeom(y = xy[, 1], x = xy[, -1], envelope = TRUE)

    # Test with the MAD (Maximum Absolute Deviation) statistic:
    result_mad <- csr_test(est, method = "MAD")
    result_mad$details

    # Test with the DCLF statistic:
    result_dclf <- csr_test(est, method = "DCLF")
    result_dclf$details

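
The two test statistics can be illustrated without the package. The following is a minimal base-R sketch, assuming the usual textbook definitions: MAD as the largest absolute gap between a curve and a reference curve, DCLF as the integrated squared gap, and a Monte Carlo p-value based on the rank of the observed statistic among the simulated ones. The curves and the helper names `mad_stat`, `dclf_stat`, and `mc_p_value` are hypothetical and are not part of spatgeom; csr_test may compute these quantities differently.

```r
set.seed(42)

# Hypothetical alpha grid and reference curve (stand-ins for real output).
grid <- seq(0, 1, length.out = 50)
reference <- exp(-3 * grid)

# Simulated curves under the null: reference plus small noise, one per row.
n_sim <- 39
sims <- t(replicate(n_sim, reference + rnorm(length(grid), sd = 0.02)))

# "Observed" curve with a systematic departure from the reference.
observed <- reference + 0.1 * sin(pi * grid)

# MAD: maximum absolute deviation from the reference curve.
mad_stat <- function(curve, ref) max(abs(curve - ref))

# DCLF: squared deviation integrated over an equally spaced grid.
dclf_stat <- function(curve, ref, x) sum((curve - ref)^2) * (x[2] - x[1])

# Monte Carlo p-value: share of statistics at least as extreme as observed.
mc_p_value <- function(t_obs, t_sim) (1 + sum(t_sim >= t_obs)) / (length(t_sim) + 1)

mad_obs <- mad_stat(observed, reference)
mad_sim <- apply(sims, 1, mad_stat, ref = reference)
p_mad <- mc_p_value(mad_obs, mad_sim)

dclf_obs <- dclf_stat(observed, reference, grid)
dclf_sim <- apply(sims, 1, dclf_stat, ref = reference, x = grid)
p_dclf <- mc_p_value(dclf_obs, dclf_sim)

c(MAD = p_mad, DCLF = p_dclf)
```

MAD reacts to a single large local departure, while DCLF accumulates deviations over the whole grid, so the two can disagree on borderline cases.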
Generate data points with the shape of a donut.

    donut_data(n, a, b, theta)


| Argument | Description |
|---|---|
| n | Number of data points. |
| a | Lower bound of the second variable. |
| b | Upper bound of the second variable. |
| theta | Angle of the donut. |

A data frame with three variables. Variable 'y' is the response, variable 'x1' makes the donut shape with 'y', and 'x2' is a uniform random variable between a and b.

    xy <- donut_data(n = 30, a = -1, b = 1, theta = 2 * pi)

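
To make the shape of the returned data concrete, here is a hypothetical generator written in base R. It is a sketch only, assuming the ring is produced by sampling an angle in [0, theta] and a radius in a narrow band; the function name `donut_sketch` and the parameters `r_min` and `r_max` are inventions for illustration, and the package's own donut_data() may sample differently.

```r
# Hypothetical generator; NOT the package's donut_data() implementation.
donut_sketch <- function(n, a, b, theta, r_min = 0.8, r_max = 1) {
  angle <- runif(n, min = 0, max = theta)       # position along the ring
  radius <- runif(n, min = r_min, max = r_max)  # thickness of the ring
  data.frame(
    y = radius * sin(angle),        # response
    x1 = radius * cos(angle),       # forms the ring together with y
    x2 = runif(n, min = a, max = b) # unrelated uniform noise variable
  )
}

set.seed(1)
xy_sketch <- donut_sketch(n = 30, a = -1, b = 1, theta = 2 * pi)

# Every (x1, y) pair lies inside the annulus between r_min and r_max.
ring_radius <- sqrt(xy_sketch$x1^2 + xy_sketch$y^2)
range(ring_radius)
```

With theta smaller than 2 * pi the same sketch produces an open arc instead of a closed ring.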
Generate data points with a linear relationship.

    linear_data(n = 100, a = -3, b = 3)


| Argument | Description |
|---|---|
| n | Number of data points. |
| a, b | Lower and upper bound of the uniform distribution. |

A data frame with four variables. Variable y = 0.6 * x1 + 0.3
* x2 + 0.1 * x3 is the response, and x1, x2 and x3
are independent uniform random variables between a and b.

    xy <- linear_data(n = 30, a = -1, b = 1)

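
The description above translates almost directly into base R. The sketch below uses the hypothetical name `linear_sketch` and assumes the response is exactly the stated linear combination with no added noise; the package function may differ in that detail.

```r
# Hypothetical re-implementation of the description; not the package code.
linear_sketch <- function(n = 100, a = -3, b = 3) {
  x1 <- runif(n, min = a, max = b)
  x2 <- runif(n, min = a, max = b)
  x3 <- runif(n, min = a, max = b)
  # Response as stated in the description (assumed noise-free here).
  y <- 0.6 * x1 + 0.3 * x2 + 0.1 * x3
  data.frame(y = y, x1 = x1, x2 = x2, x3 = x3)
}

set.seed(1)
lin_sketch <- linear_sketch(n = 30, a = -1, b = 1)
head(lin_sketch)
```

Because the weights decrease from x1 to x3, the dependence of y on each covariate becomes progressively weaker across the three columns.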
Plot alpha-shape for spatgeom objects.

    plot_alpha_shape(x, alpha, font_size = 12)

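
| Argument | Description |
|---|---|
| x | an object of class spatgeom. |
| alpha | value of alpha, the resolution at which the alpha-shape is drawn. |
| font_size | an integer that sets the font size in the plot. |

A ggplot object with the raw alpha-shape for the original data at resolution alpha.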

    xy <- donut_data(n = 30, a = -1, b = 1, theta = 2 * pi)
    estimation <- spatgeom(y = xy[, 1], x = xy[, -1])
    plot_alpha_shape(estimation, alpha = c(0.9, 1.2))

S3 generic for plotting spatial geometry analysis results. Dispatches to
plot_curve.spatgeom for spatgeom objects (producing a
ggplot of survival curves or their derivatives), and to
plot_curve.spatgeom_group for spatgeom_group objects
(producing a side-by-side cowplot grid, one panel per group).

    plot_curve(x, ...)

    ## S3 method for class 'spatgeom'
    plot_curve(x, type = "curve", font_size = 12, ...)

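
| Argument | Description |
|---|---|
| x | an object of class spatgeom or spatgeom_group. |
| ... | further arguments passed to the appropriate method. |
| type | a string: either "curve" or "deriv". Default "curve". |
| font_size | an integer controlling the font size in the plot. Default 12. |
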
a ggplot object for spatgeom inputs, or a
cowplot grid for spatgeom_group inputs.

    xy <- donut_data(n = 30, a = -1, b = 1, theta = 2 * pi)

    # Basic plots: no envelope, no hypothesis testing.
    estimation <- spatgeom(y = xy[, 1], x = xy[, -1])
    plot_curve(estimation, type = "curve")
    plot_curve(estimation, type = "deriv")

    # Curve with Monte Carlo envelope ribbon (grey) and theoretical CSR curve
    # (red dashed):
    est_env <- spatgeom(y = xy[, 1], x = xy[, -1], envelope = TRUE)
    plot_curve(est_env, type = "curve")

    # Curve with hypothesis testing results: additionally shows the CSR mean
    # (blue dotted) and its confidence band (blue ribbon):
    est_ht <- spatgeom(
      y = xy[, 1], x = xy[, -1],
      hypothesis_testing = TRUE, method = "MAD"
    )
    plot_curve(est_ht, type = "curve")

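
For readers unfamiliar with S3, the dispatch pattern described for plot_curve can be sketched in a few lines of base R. The generic `describe_fit` and the classes `single` and `grouped` are hypothetical stand-ins, chosen so the example runs without spatgeom or any plotting package; only the dispatch mechanism (UseMethod choosing a method from the class of x) is the point.

```r
# Hypothetical generic, used only to illustrate S3 dispatch.
describe_fit <- function(x, ...) UseMethod("describe_fit")

# Method for a single result object.
describe_fit.single <- function(x, ...) {
  paste("single fit with", length(x$values), "values")
}

# Method for a grouped object: apply the generic to each member in turn.
describe_fit.grouped <- function(x, ...) {
  vapply(x$fits, describe_fit, character(1), ...)
}

fit_a <- structure(list(values = 1:3), class = "single")
fit_b <- structure(list(values = 1:5), class = "single")
grouped <- structure(list(fits = list(A = fit_a, B = fit_b)), class = "grouped")

single_out <- describe_fit(fit_a)     # dispatches to describe_fit.single
grouped_out <- describe_fit(grouped)  # dispatches to describe_fit.grouped
grouped_out
```

The grouped method reuses the generic for each member, which mirrors how a group-level plot can be assembled from one panel per group.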
Plot method for objects of class spatgeom_group.
Produces a side-by-side grid of survival curve (or derivative) panels,
one column per group.

    ## S3 method for class 'spatgeom_group'
    plot_curve(x, type = "curve", font_size = 12, ...)

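
| Argument | Description |
|---|---|
| x | an object of class spatgeom_group. |
| type | a string: either "curve" or "deriv". Default "curve". |
| font_size | an integer controlling the font size. Default 12. |
| ... | further arguments passed to the underlying plotting method. |
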
A cowplot grid object (produced by
cowplot::plot_grid).

    set.seed(1)
    xy <- donut_data(n = 60, a = -1, b = 1, theta = 2 * pi)
    grp <- sample(c("A", "B", "C"), nrow(xy), replace = TRUE)

    sg <- spatgeom_group(x = xy[, -1], by = grp, y = xy[, 1])
    plot_curve(sg)
    plot_curve(sg, type = "deriv")

Print method for objects of class spatgeom.

    ## S3 method for class 'spatgeom'
    print(x, return_table = FALSE, ...)

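
| Argument | Description |
|---|---|
| x | an object of class spatgeom. |
| return_table | if TRUE, return the underlying data frame instead of printing. Default FALSE. |
| ... | further arguments passed to the print function. |
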
Print the estimate given by spatgeom.

    xy <- donut_data(n = 30, a = -1, b = 1, theta = 2 * pi)

    # Basic print: shows alpha and geom_survival ranges per variable.
    estimation <- spatgeom(y = xy[, 1], x = xy[, -1])
    print(estimation)

    # Print a spatgeom object that includes hypothesis testing results:
    est_ht <- spatgeom(
      y = xy[, 1], x = xy[, -1],
      hypothesis_testing = TRUE, method = "MAD"
    )
    print(est_ht)

    # Return the underlying data frame instead of printing:
    tbl <- print(est_ht, return_table = TRUE)
    head(tbl)

Print method for objects of class spatgeom_group.
Displays a summary for each group in turn.

    ## S3 method for class 'spatgeom_group'
    print(x, ...)


| Argument | Description |
|---|---|
| x | an object of class spatgeom_group. |
| ... | further arguments passed to print.spatgeom. |

Returns x invisibly.

See also: spatgeom_group, print.spatgeom

    set.seed(1)
    xy <- donut_data(n = 60, a = -1, b = 1, theta = 2 * pi)
    grp <- sample(c("A", "B", "C"), nrow(xy), replace = TRUE)

    sg <- spatgeom_group(x = xy[, -1], by = grp, y = xy[, 1])
    print(sg)

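
Returning the object invisibly is the usual convention for print methods in R: the summary is written to the console, yet the object can still be captured or passed on. A minimal sketch with a hypothetical class `demo_group` (not a spatgeom class) shows the pattern:

```r
# Hypothetical print method that summarises each group, then returns x invisibly.
print.demo_group <- function(x, ...) {
  for (g in names(x$fits)) {
    cat("Group:", g, "\n")
    cat("  n =", length(x$fits[[g]]), "\n")
  }
  invisible(x)
}

demo <- structure(list(fits = list(A = 1:3, B = 1:5)), class = "demo_group")

# The summary is printed, and the unchanged object is still available.
returned <- print(demo)
identical(returned, demo)
```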
Reduce an n × p matrix to n × d
(d = n_components, typically 2) using a choice of static
embedding methods. The result can be passed directly to
spatgeom or spatgeom_group when the original
feature matrix has more than 2 columns.

    reduce_dim(x, method = c("pca", "umap", "tsne"), n_components = 2L, ...)

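
| Argument | Description |
|---|---|
| x | a numeric matrix or data frame with n rows (observations) and p columns (features). |
| method | character; the reduction method. One of "pca", "umap" (requires the 'uwot' package), or "tsne" (requires the 'Rtsne' package). |
| n_components | integer; number of output dimensions. Default 2L. |
| ... | additional arguments forwarded to the underlying reduction function. |
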
A numeric matrix with nrow(x) rows and n_components
columns. Column names are Dim_1, Dim_2, etc.

    set.seed(1)
    xy <- donut_data(n = 50, a = -1, b = 1, theta = 2 * pi)

    # PCA: no extra packages needed
    emb <- reduce_dim(xy[, -1], method = "pca")
    dim(emb) # 50 x 2

    # UMAP: requires uwot
    if (requireNamespace("uwot", quietly = TRUE)) {
      emb_umap <- reduce_dim(xy[, -1], method = "umap")
      dim(emb_umap)
    }

    # t-SNE: requires Rtsne; perplexity must be < n/3
    if (requireNamespace("Rtsne", quietly = TRUE)) {
      emb_tsne <- reduce_dim(xy[, -1], method = "tsne", perplexity = 10)
      dim(emb_tsne)
    }

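
As a point of comparison, a PCA reduction with the same output layout can be written with stats::prcomp alone. This is a sketch under assumptions: it centres but does not scale the columns, and whether reduce_dim makes the same choices is not stated in this manual. The variable names are illustrative.

```r
set.seed(1)

# Hypothetical wide feature matrix: 50 observations, 5 features.
features <- matrix(rnorm(50 * 5), nrow = 50, ncol = 5)

# Principal component analysis (centred, unscaled: an assumption of this sketch).
pca_fit <- prcomp(features, center = TRUE, scale. = FALSE)

# Keep the leading components and name them as in the return value above.
n_components <- 2L
embedding <- pca_fit$x[, seq_len(n_components), drop = FALSE]
colnames(embedding) <- paste0("Dim_", seq_len(n_components))

dim(embedding)
```

The result has one row per observation, so it lines up with a response vector or a grouping vector of the original length.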
Function to estimate the geometric correlation between variables.

    spatgeom(
      x,
      y,
      scale_pts = FALSE,
      nalphas = 100,
      envelope = FALSE,
      domain_type = c("bounding-box", "convex-hull"),
      hypothesis_testing = FALSE,
      significance_level = 0.05,
      mc_cores = 1,
      r = 0.5,
      method = c("MAD", "DCLF"),
      reduce = c("none", "pca", "umap", "tsne"),
      reduce_args = list()
    )


| Argument | Description |
|---|---|
| x | numeric matrix or data.frame. Either a matrix of covariables (paired with y) or, when y is omitted, a point cloud. |
| y | numeric vector of responses. Optional: when omitted, x is treated as a 2-D point cloud (or reduced according to reduce). |
| scale_pts | logical; whether to make the estimations with scaled variables. Default FALSE. |
| nalphas | number of alphas generated for creating the geometric measure of fit index. Default 100. |
| envelope | logical; whether the Monte Carlo envelope should be estimated. Default FALSE. |
| domain_type | character with the type of domain to use. It can be either "bounding-box" or "convex-hull". Default "bounding-box". |
| hypothesis_testing | logical. If TRUE, the CSR test from csr_test is run and its results are stored in the returned object. Default FALSE. |
| significance_level | a numeric significance level passed to csr_test. Default 0.05. |
| mc_cores | integer with the number of parallel processes to run (if available). Default 1. |
| r | numeric scaling parameter for the theoretical CSR curve used in csr_test. Default 0.5. |
| method | character, one of "MAD" or "DCLF". |
| reduce | character; dimensionality reduction method to apply when x has more than 2 columns. One of "none", "pca", "umap", or "tsne". |
| reduce_args | a named list of additional arguments forwarded to reduce_dim. Default list(). |

A list of class spatgeom with the following elements:

- The function call.
- The x input.
- The y output.
- A list of size ncol(x), with one element per column of x. Each element has:
  - a data frame of class sfc (see sf::st_sf()) with columns geometry, segments, max_length and alpha. The data frame contains the whole Delaunay triangulation for the corresponding column of x and y. The segments column holds the segments of each individual triangle, and max_length is the maximum length among them.
  - a data frame with columns alpha and geom_survival. The alpha column is a numeric vector of size nalphas from the minimum to the maximum distance between points estimated in the data. The geom_survival column is the value 1 - (alpha shape Area)/(containing box Area).
  - the intensity estimated for the corresponding column of x and y.
  - mean_n, the mean number of points in the point process.
  - if envelope = TRUE, a data frame in tidy format with 40 runs of a CSR process. The CSR is created by generating n uniform points in the plane, where n is drawn from a Poisson distribution with parameter mean_n.
- hypothesis_testing_results: only present when hypothesis_testing = TRUE. A list returned by csr_test with elements hypothesis_testing_df (a data frame of test statistics and confidence bands for all variables) and details (per-variable summary tables of the test results).
- Only present when reduce != "none" and x had more than 2 columns: the numeric matrix produced by reduce_dim, whose columns were used as the 2D point cloud.

Hernández, A.J., Solís, M. Geometric goodness of fit measure to detect patterns in data point clouds. Comput Stat (2022). https://doi.org/10.1007/s00180-022-01244-1

    xy <- donut_data(n = 30, a = -1, b = 1, theta = 2 * pi)

    # Basic usage: estimate the geometric survival curves only.
    estimation <- spatgeom(y = xy[, 1], x = xy[, -1])
    print(estimation)

    # With Monte Carlo envelope (takes a few seconds):
    est_env <- spatgeom(y = xy[, 1], x = xy[, -1], envelope = TRUE)
    plot_curve(est_env, type = "curve")

    # With integrated CSR hypothesis testing using the MAD statistic:
    est_ht <- spatgeom(
      y = xy[, 1], x = xy[, -1],
      hypothesis_testing = TRUE, method = "MAD"
    )
    print(est_ht)
    plot_curve(est_ht, type = "curve")

    # Inspect the per-variable test results:
    est_ht$hypothesis_testing_results$details

    # Wide matrix: reduce to 2D with PCA before running the analysis:
    est_pca <- spatgeom(y = xy[, 1], x = xy[, -1], reduce = "pca")
    plot_curve(est_pca, type = "curve")

    # Wide matrix: reduce with UMAP (requires the 'uwot' package):
    if (requireNamespace("uwot", quietly = TRUE)) {
      est_umap <- spatgeom(y = xy[, 1], x = xy[, -1], reduce = "umap")
      plot_curve(est_umap, type = "curve")
    }

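
Two ideas from the return value can be illustrated in base R: a single CSR run (a Poisson number of uniform points in a window) and the quantity 1 - (shape area)/(box area). The sketch below substitutes the convex hull for the alpha-shape, which corresponds only to the limiting case of a very large alpha, and uses a fixed window [-1, 1] x [-1, 1]. Both choices, along with the helper names `simulate_csr` and `polygon_area`, are assumptions of this illustration and not the package's implementation.

```r
set.seed(123)

# Hypothetical observation window and its area.
x_range <- c(-1, 1)
y_range <- c(-1, 1)
box_area <- diff(x_range) * diff(y_range)

# One CSR run: n ~ Poisson(mean_n), then n uniform points in the window.
simulate_csr <- function(mean_n, x_range, y_range) {
  n <- rpois(1, lambda = mean_n)
  cbind(
    x = runif(n, x_range[1], x_range[2]),
    y = runif(n, y_range[1], y_range[2])
  )
}

# Shoelace formula for the area of a polygon given its ordered vertices.
polygon_area <- function(vertices) {
  x <- vertices[, 1]
  y <- vertices[, 2]
  nxt <- c(seq_along(x)[-1], 1)  # index of the next vertex, wrapping around
  abs(sum(x * y[nxt] - x[nxt] * y)) / 2
}

pts <- simulate_csr(mean_n = 30, x_range, y_range)

# Convex hull as a stand-in for the alpha-shape at very large alpha.
hull <- pts[grDevices::chull(pts), , drop = FALSE]
survival_limit <- 1 - polygon_area(hull) / box_area
survival_limit
```

For smaller alpha the alpha-shape covers less area than the hull, so the corresponding geom_survival value would be larger than this limiting figure.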
Apply spatgeom independently to each level of a
grouping variable. The result is an object of class spatgeom_group
that bundles the per-group spatgeom objects and supports
print and plot_curve methods for side-by-side
comparison.

    spatgeom_group(x, by, ...)


| Argument | Description |
|---|---|
| x | a numeric matrix or data frame of covariates (passed as the x argument of spatgeom). |
| by | a vector of group labels with the same length as nrow(x). |
| ... | additional arguments forwarded verbatim to spatgeom. |

When y is supplied via ..., the corresponding rows of y
are subsetted to match each group. When y is absent, each group's
subset of x is treated as a 2-D point cloud (or reduced according to
reduce with options supplied via reduce_args).
Groups are processed in the order returned by unique(by), which
preserves the order of first appearance in by.
An object of class spatgeom_group, which is a list with:

- The matched function call.
- groups: character vector of unique group labels in appearance order.
- Named list of spatgeom objects, one per group. Names match groups.

See also: spatgeom, plot_curve, print.spatgeom_group

    set.seed(1)
    xy <- donut_data(n = 60, a = -1, b = 1, theta = 2 * pi)
    grp <- sample(c("A", "B", "C"), nrow(xy), replace = TRUE)

    sg <- spatgeom_group(x = xy[, -1], by = grp, y = xy[, 1])
    print(sg)
    plot_curve(sg)

    # With hypothesis testing per group (slow: runs Monte Carlo per group):
    sg_ht <- spatgeom_group(
      x = xy[, -1], by = grp, y = xy[, 1],
      hypothesis_testing = TRUE, method = "MAD"
    )
    plot_curve(sg_ht, type = "curve")

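
The per-group subsetting and ordering described above can be sketched in base R. This is an illustration of the stated behaviour (groups in order of first appearance, with the rows of x and y subset together), using made-up data and hypothetical variable names; it does not call spatgeom.

```r
set.seed(1)

# Hypothetical inputs: two covariates, a response, and unsorted group labels.
x <- data.frame(x1 = runif(8), x2 = runif(8))
y <- runif(8)
by <- c("B", "A", "B", "C", "A", "C", "B", "A")

# unique() keeps the order of first appearance: "B", "A", "C".
groups <- unique(by)

# Subset the rows of x and the matching entries of y for each group.
per_group <- lapply(groups, function(g) {
  keep <- by == g
  list(x = x[keep, , drop = FALSE], y = y[keep])
})
names(per_group) <- groups

sizes <- vapply(per_group, function(p) nrow(p$x), integer(1))
sizes
```

Note that base R's split() would order the groups by sorted factor levels ("A", "B", "C") instead, which is why the appearance-order rule is worth stating explicitly.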