AodhaFFNSpatialRelationLocationEncoder¶
Overview¶
The AodhaFFNSpatialRelationLocationEncoder is designed to process spatial relations between locations using Fourier Feature Transform (FFT) adapted for spatial encoding. It utilizes the AodhaFFTSpatialRelationPositionEncoder to transform spatial coordinates into a high-dimensional space, enhancing the model’s ability to capture and interpret spatial relationships across various scales.
Features¶
Position Encoding (
self.position_encoder): UtilizesAodhaFFTSpatialRelationPositionEncoderfor transforming spatial differences into frequency-based representations using Fourier Feature Transform.Feed-Forward Neural Network (
self.ffn): Processes the FFT-based data through a multi-layer neural network to generate final spatial embeddings.
Configuration Parameters¶
spa_embed_dim: Dimensionality of the output spatial embeddings.extent: The extent of the coordinate space (x_min, x_max, y_min, y_max).coord_dim: Dimensionality of the coordinate space (typically 2D).do_pos_enc: Whether to perform position encoding.do_global_pos_enc: Whether to normalize coordinates globally.device: Computation device used (e.g., ‘cuda’ for GPU acceleration).ffn_act: Activation function for the neural network layers.ffn_num_hidden_layers: Number of hidden layers in the neural network.ffn_dropout_rate: Dropout rate to prevent overfitting during training.ffn_hidden_dim: Dimension of each hidden layer within the network.ffn_use_layernormalize: Whether to use layer normalization.ffn_skip_connection: Whether to include skip connections within the network layers.ffn_context_str: Context string for debugging and detailed logging within the network.
Methods¶
forward(coords)¶
Purpose: Processes input coordinates through the encoder to produce spatial embeddings.
Parameters:
coords(List or np.ndarray): Coordinates to process, formatted as(batch_size, num_context_pt, coord_dim).
Returns:
sprenc(Tensor): The final spatial relation embeddings, shaped(batch_size, num_context_pt, spa_embed_dim).
AodhaFFTSpatialRelationPositionEncoder¶
Overview¶
The AodhaFFTSpatialRelationPositionEncoder leverages Fourier Feature Transform (FFT) to encode spatial coordinates into high-dimensional representations. This method divides the space into grids and uses grid embeddings to represent the spatial relations.
Features¶
Fourier Feature Transform Encoding: Transforms spatial data into a frequency-based representation, capturing inherent spatial frequencies and patterns effectively.
Adaptable to Different Spatial Extents: Can normalize input coordinates based on the provided spatial extent.
Theory¶
Fourier Feature Transform (FFT) Encoding¶
Fourier Feature Transform provides an approximation to continuous signals by mapping the input data into a randomized low-dimensional feature space using sine and cosine functions.
Fourier Transform Basis Functions¶
The Fourier Transform uses sine and cosine functions as basis functions to represent the original signal. In spatial encoding, this can be represented as: $\text{sin}(\pi \cdot x), \text{cos}(\pi \cdot x), \text{sin}(\pi \cdot y), \text{cos}(\pi \cdot y)$
Formulas¶
Normalization:
Normalize coordinates based on the extent of the coordinate space.
$x_{\text{norm}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$
$y_{\text{norm}} = \frac{y - y_{\min}}{y_{\max} - y_{\min}}$
Fourier Feature Transform:
Apply sine and cosine functions to the normalized coordinates: $[\text{sin}(\pi \cdot x_{\text{norm}}), \text{cos}(\pi \cdot x_{\text{norm}}),$ $\text{sin}(\pi \cdot y_{\text{norm}}), \text{cos}(\pi \cdot y_{\text{norm}})]$
Implementation¶
make_output_embeds(coords)¶
Purpose: Converts input coordinates into FFT-based high-dimensional embeddings.
Parameters:
coords: Input coordinates.
Returns:
High-dimensional embeddings representing the input data in the FFT feature space.
Usage Example¶
# Initialize the encoder
encoder = AodhaFFNSpatialRelationLocationEncoder(
spa_embed_dim=64,
extent=(0, 100, 0, 100), # Example extent
coord_dim=2,
do_pos_enc=True,
do_global_pos_enc=True,
device="cuda",
ffn_act="relu",
ffn_num_hidden_layers=1,
ffn_dropout_rate=0.5,
ffn_hidden_dim=256,
ffn_use_layernormalize=True,
ffn_skip_connection=True,
ffn_context_str="AodhaFFTSpatialRelationEncoder"
)
coords = np.array([[34.0522, -118.2437], [40.7128, -74.0060]]) # Example coordinate data
embeddings = encoder.forward(coords)