Components of UDFs

User-Defined Functions (UDFs) are essential for extending and customizing the behavior of the indexing process.

Below are the core components of a UDF and the folder structure typically used for organizing these components.

Models (Database Tables)

  • Purpose: Models define the structure of the data that will be stored in the database; each model corresponds to the schema of a database table.

  • Usage: You create models for storing processed data from UDFs.

  • Location: modules/custom/{feature_name}/models
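
A minimal sketch of such a model, assuming a SQLAlchemy-style declarative mapping; the table name and columns here are hypothetical and should match your feature's actual schema:

# modules/custom/{feature_name}/models/feature_records.py (hypothetical)
from sqlalchemy import BigInteger, Column, Integer, PrimaryKeyConstraint, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class FeatureRecord(Base):
    """Database table that stores the processed output of a UDF."""
    __tablename__ = "feature_records"

    transaction_hash = Column(String, nullable=False)
    log_index = Column(Integer, nullable=False)
    block_number = Column(BigInteger, nullable=False)
    value = Column(BigInteger)

    # Each record is uniquely identified by (transaction_hash, log_index).
    __table_args__ = (PrimaryKeyConstraint("transaction_hash", "log_index"),)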

Data Classes

  • Purpose: Data Classes provide an easily accessible in-memory data structure within your code. They are similar to models but are used in memory during data processing and transformation.

  • Usage: Data Classes are used to represent the processed data before storing it in the database.

  • Difference from Models: While models represent the structure of data in the database, Data Classes represent data structures used in code.

  • Location: modules/custom/{feature_name}/domains
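
A matching Data Class, using Python's standard dataclasses module; the field names are hypothetical and mirror the model sketched above:

# modules/custom/{feature_name}/domains/feature_record.py (hypothetical)
from dataclasses import dataclass

@dataclass
class FeatureRecordDomain:
    """In-memory representation of a processed record before it is persisted."""
    transaction_hash: str
    log_index: int
    block_number: int
    value: int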

Filter Configuration

  • Purpose: Filters enable developers to specify conditions for the data a UDF will capture, ensuring that only relevant data is processed and stored.

  • Customizable Filter Methods:

    • Each UDF can define its own method for filtering data by implementing a get_filter function in its job. This lets developers set custom criteria for data selection, focusing on specific transaction types, events, or data points essential to the UDF’s purpose.

    • Example: A filter might select only transactions involving specific addresses, token types, or contract interactions.
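
A sketch of what a get_filter function might return; the dictionary shape and constant names are illustrative placeholders rather than confirmed Hemera APIs (the topic hash is the standard ERC-20 Transfer event signature), so consult the framework's filter utilities for the actual structures it expects:

# Hypothetical filter sketch.
MONITORED_CONTRACT = "0x0000000000000000000000000000000000000000"  # example address
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"

def get_filter():
    # Capture only Transfer logs emitted by the monitored contract;
    # everything else is skipped before the job logic runs.
    return {
        "addresses": [MONITORED_CONTRACT],
        "topics": [TRANSFER_TOPIC],
    }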

Job Logic

  • Purpose: Job Logic encapsulates the core functionality of the UDF, handling the process of transforming the data and persisting the results into the database.

  • Key Functions:

    • collapse: The core logic of the UDF, responsible for processing and transforming input data.

    • collect_domain: Used to save the processed data into the database.
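
A minimal skeleton showing where the two functions sit; the BaseJob stand-in and the method signatures are assumptions, not confirmed framework APIs:

# modules/custom/{feature_name}/job.py (illustrative skeleton)
class BaseJob:
    """Minimal stand-in for the framework's job base class."""

    def collect_domain(self, domain):
        # In the real framework this persists the domain object to the
        # database; here it is a placeholder so the sketch is self-contained.
        raise NotImplementedError

class FeatureJob(BaseJob):
    def collapse(self, data_batch):
        # Core UDF logic: transform each raw input item, then hand the
        # result to collect_domain for persistence.
        for item in data_batch:
            processed = {"key": item.get("hash"), "value": item.get("value")}
            self.collect_domain(processed)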

Folder Structure

UDF components are typically organized in the following folder structure:

project_root/
β”œβ”€β”€ modules/
β”‚   └── custom/
β”‚       └── {feature_name}/
β”‚           β”œβ”€β”€ models/
β”‚           β”‚   └── (model files)
β”‚           β”œβ”€β”€ domains/
β”‚           β”‚   └── (dataclass files)
β”‚           β”œβ”€β”€ abi.py
β”‚           └── job.py
└── config/
    └── indexer-config.yaml

  • models/: Contains the database models for the feature.

  • domains/: Contains the domain-specific Data Classes.

  • abi.py: Contains the contract function and event definitions used by the UDF.

  • job.py: Contains the filter configuration and job logic.


UDF Logic Flow

  1. Extract Data: Retrieve relevant data from blocks, transactions, or logs as defined by the UDF configuration and ABI.

  2. Process the Data: Apply transformations and logic to the extracted data in collapse within job.py.

  3. Create Data Instances: Instantiate Data Classes with the processed information, representing the data in memory before it is stored.

  4. Save to Database: Use collect_domain to save the transformed data to the database tables as defined by your model.
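
Condensed into one hypothetical collapse method (step numbers refer to the list above; FeatureRecordDomain reuses the Data Class sketched earlier):

def collapse(self, logs):
    for log in logs:                          # 1. Extract the filtered log entries
        value = int(log["data"], 16)          # 2. Process / transform the raw fields
        domain = FeatureRecordDomain(         # 3. Create a Data Class instance
            transaction_hash=log["transactionHash"],
            log_index=log["logIndex"],
            block_number=log["blockNumber"],
            value=value,
        )
        self.collect_domain(domain)           # 4. Save via collect_domain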

Types of Data Models

  1. Address-Level Data:

    • Requires periodic updates as new transactions occur.

    • For consistency, the indexer should be run in a specific order to avoid overwriting data inconsistently.

    • Use Case: Useful for tracking data directly tied to specific wallet addresses, such as balances and transaction counts.

  2. Event-Level Data:

    • Event data is easier to handle as it typically has unique keys for each event.

    • These can be processed multiple times without issues, as the data is inherently immutable.

    • Use Case: Ideal for capturing unique, discrete events that do not require historical consistency, such as contract-specific events (e.g., transfers).
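
To illustrate the difference in key design, two hypothetical SQLAlchemy-style models (reusing the Base from the model sketch above):

# Hypothetical key designs for the two data-model types.
from sqlalchemy import BigInteger, Column, Integer, PrimaryKeyConstraint, String

class AddressBalance(Base):
    # Address-level: one mutable row per address, updated as new
    # transactions arrive, so processing order matters.
    __tablename__ = "address_balances"
    address = Column(String, primary_key=True)
    balance = Column(BigInteger)
    transaction_count = Column(BigInteger)

class TransferEvent(Base):
    # Event-level: one immutable row per event, uniquely keyed by
    # (transaction_hash, log_index), so reprocessing is harmless.
    __tablename__ = "transfer_events"
    transaction_hash = Column(String, nullable=False)
    log_index = Column(Integer, nullable=False)
    value = Column(BigInteger)
    __table_args__ = (PrimaryKeyConstraint("transaction_hash", "log_index"),)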

Indexing Historical Data

To index historical data, the indexer allows you to specify a starting block number in the environment configuration. The indexer will process the blockchain data from the specified block onward.

  • Address-Level Data: Running the indexer in a specific order is critical to maintain consistency when dealing with past transactions. Use the scheduler with precise block ranges (e.g., --block=1,2,10,50-100) to ensure sequential processing.

  • Event-Level Data: This type of data can be processed on any block range since events are usually unique and do not require strict ordering.

Configuration Changes in UDF

Configuration changes in the UDF, such as trigger conditions, directly impact database updates. These settings determine the frequency and type of data capture, affecting how and when new information is processed and stored. This ensures that developers have control over data update intervals based on their UDF requirements.

Utility Methods

The UDF ecosystem in Hemera relies on a set of utility methods for interacting with blockchain data, facilitating data transformation, encoding, and decoding.

  • ABI Utilities:

    • Use web3’s contract interface (e.g., Web3().eth.contract) to define contract objects and establish the functions and events relevant to your UDF.

    • Define related functions and events in modules/custom/{feature_name}/abi.py and import these definitions within job.py to ensure modular consistency (see the sketch after this list).

  • Encoding and Decoding:

    • Use encoding utilities to convert data into formats suitable for storage.

    • Decode transaction inputs, logs, and other data using the decoding methods provided in Hemera’s utility package.

  • String and Hex Conversions:

    • For consistent handling of blockchain addresses and transaction data, use utility functions for hex-string conversions.
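
A sketch tying these utilities together with plain web3.py; the event ABI shown is the standard ERC-20 Transfer, and the file layout and helper name are illustrative:

# modules/custom/{feature_name}/abi.py (illustrative)
from web3 import Web3

TRANSFER_EVENT_ABI = {
    "anonymous": False,
    "inputs": [
        {"indexed": True, "name": "from", "type": "address"},
        {"indexed": True, "name": "to", "type": "address"},
        {"indexed": False, "name": "value", "type": "uint256"},
    ],
    "name": "Transfer",
    "type": "event",
}

# A contract object built from the ABI alone; no provider is needed just to decode.
_contract = Web3().eth.contract(abi=[TRANSFER_EVENT_ABI])

def decode_transfer(log_receipt):
    """Decode a raw Transfer log into a dict of event arguments."""
    event = _contract.events.Transfer().process_log(log_receipt)
    return {
        # Checksummed hex strings for consistent address handling.
        "from": Web3.to_checksum_address(event["args"]["from"]),
        "to": Web3.to_checksum_address(event["args"]["to"]),
        "value": event["args"]["value"],
    }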
