Calculate and return logsBloom on eth_getBlock* API calls #13546

@rvagg

Problem

Lotus returns a block-level logsBloom with all 2048 bits set to 1 for every block returned by eth_getBlockByNumber and eth_getBlockByHash. This makes the bloom filter a universal match, completely defeating its purpose as a client-side optimisation for skipping blocks without relevant events.

EVM indexing frameworks (Ponder, The Graph, etc.) use the bloom field to decide whether to call eth_getLogs for a given block. When every block's bloom matches everything, every block triggers an eth_getLogs call regardless of whether it contains matching events.

Transaction receipt blooms are correct. eth_getTransactionReceipt returns properly computed per-receipt bloom filters. The issue is only at the block level.
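
To illustrate why an all-ones bloom defeats client-side filtering, here is a minimal sketch of the membership check a client might run before deciding to call eth_getLogs. It assumes the caller supplies the Keccak-256 digest of the address or topic (the Go standard library has no Keccak-256, so hashing is left out). bloomMayContain is a hypothetical name, not an API from Lotus or any indexing framework.

```go
package main

import "encoding/binary"

// bloomSize is the length in bytes of an Ethereum logs bloom (2048 bits).
const bloomSize = 256

// bloomMayContain reports whether all three bits selected by digest are set
// in bloom. digest is assumed to be the Keccak-256 hash of an address or
// topic, computed by the caller. A false result means the item is definitely
// absent; a true result only means it may be present.
func bloomMayContain(bloom [bloomSize]byte, digest [32]byte) bool {
	for i := 0; i < 3; i++ {
		// Each of the first three byte pairs yields an 11-bit index in [0, 2047].
		n := binary.BigEndian.Uint16(digest[i*2:]) % 2048
		// Bit 0 lives in the last byte, so the byte offset counts from the end.
		if bloom[bloomSize-1-int(n/8)]&(1<<(n%8)) == 0 {
			return false
		}
	}
	return true
}
```

With every byte set to 0xff the check returns true for any digest, so a client can never skip a block; with an all-zeros bloom it returns false for every digest.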

Root cause

NewEthBlock() initialises LogsBloom to NewFullEthBloom() (all 0xff bytes):

func NewEthBlock(hasTransactions bool, tipsetLen int) EthBlock {
    b := EthBlock{
        // ...
        LogsBloom: NewFullEthBloom(),
        // ...
    }
    return b
}

The block construction function newEthBlockFromFilecoinTipSet() iterates messages and receipts to build the transaction list and sum gas, but never aggregates bloom filters. The all-1s default is never overwritten.

Proposed fix: precomputed bloom in the chain index

Computing the bloom at query time (fetching all events just to build the filter) would add significant overhead to every block query. Instead, compute the bloom once during event indexing and store it in the chain index.

The chain indexer (chain/index/events.go, indexEvents()) already processes every event for every tipset during indexing. At that point it has:

  • Emitter addresses: already resolved to delegated addresses for storage
  • Event entries: the t1-t4 keys contain the 32-byte topic hashes that go into the bloom

Schema

New table via the existing migration system (lib/sqlite/sqlite.go):

CREATE TABLE IF NOT EXISTS tipset_bloom (
    tipset_key_cid BLOB PRIMARY KEY,
    height INTEGER NOT NULL,
    bloom BLOB NOT NULL  -- 256 bytes
);
CREATE INDEX IF NOT EXISTS idx_tipset_bloom_height ON tipset_bloom (height);

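Storage cost: 256 bytes of bloom data per tipset. With at most one tipset per 30-second epoch (2,880 epochs per day), that is about 720 KiB/day before row and index overhead.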

Write path

In indexEvents(), after processing all events for a tipset:

  1. Initialise an empty bloom (all zeros)
  2. For each event with a delegated emitter address:
    • Convert to EthAddress, call EthBloomSet(bloom, addr[:])
    • For each entry with Codec == cid.Raw and key matching t1-t4: call EthBloomSet(bloom, entry.Value)
  3. Store the bloom in tipset_bloom

Tipsets with no EVM events get an all-zeros bloom stored. This is the most valuable case, as it tells clients "definitely nothing here."

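The bloom computation should apply the same filtering as ethLogFromEvent() (skip non-raw codec entries, skip entries with keys other than t1-t4 and d), so the bloom accurately reflects what eth_getLogs would return. Only the emitter address and the t1-t4 topics are added to the bloom; the d (data) entry is part of the log but not of the bloom.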
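
The write-path steps above can be sketched as follows. This is an illustration under stated assumptions, not the indexer's code: event and entry are simplified, hypothetical stand-ins for the real types, the input is assumed to contain only events whose emitter already resolved to an Ethereum address, and the hash is passed in as a function because Keccak-256 is not in the Go standard library. The bit layout in bloomSet is the standard Ethereum one that EthBloomSet is expected to implement.

```go
package main

import "encoding/binary"

const (
	bloomSize = 256  // bytes in an Ethereum logs bloom
	codecRaw  = 0x55 // multicodec code for raw bytes (cid.Raw in go-cid)
)

// entry and event are hypothetical, simplified stand-ins for indexer types.
type entry struct {
	Codec uint64
	Key   string
	Value []byte
}

type event struct {
	Emitter [20]byte // emitter, assumed already resolved to an Ethereum address
	Entries []entry
}

// hashFunc stands in for Keccak-256.
type hashFunc func(data []byte) [32]byte

// bloomSet sets the three bits selected by hash(data) in bloom.
func bloomSet(bloom []byte, hash hashFunc, data []byte) {
	digest := hash(data)
	for i := 0; i < 3; i++ {
		n := binary.BigEndian.Uint16(digest[i*2:]) % 2048
		bloom[bloomSize-1-int(n/8)] |= 1 << (n % 8)
	}
}

// isTopicKey reports whether key names one of the four topic slots.
func isTopicKey(key string) bool {
	return key == "t1" || key == "t2" || key == "t3" || key == "t4"
}

// tipsetBloom starts from an all-zeros bloom, then adds each emitter address
// and each raw-codec t1-t4 value. Entries with other keys (including d) or
// other codecs are not added.
func tipsetBloom(events []event, hash hashFunc) []byte {
	bloom := make([]byte, bloomSize)
	for _, ev := range events {
		bloomSet(bloom, hash, ev.Emitter[:])
		for _, e := range ev.Entries {
			if e.Codec == codecRaw && isTopicKey(e.Key) {
				bloomSet(bloom, hash, e.Value)
			}
		}
	}
	return bloom
}
```

Calling tipsetBloom with no events returns the all-zeros bloom, which is the "definitely nothing here" case.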

Read path

In newEthBlockFromFilecoinTipSet():

  1. Query tipset_bloom by tipset key CID
  2. If found: set block.LogsBloom to the stored bloom
  3. If not found (pre-migration data): return NewFullEthBloom() (current behavior, no regression)

This requires:

  • A new method on the Indexer interface: GetTipsetBloom(ctx context.Context, tipsetKeyCid cid.Cid) ([]byte, bool, error)
  • Passing the chain indexer into newEthBlockFromFilecoinTipSet() (callers already have access to it)
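
A minimal sketch of the read-path fallback, assuming a simplified lookup interface. bloomGetter mirrors the proposed GetTipsetBloom method but takes a string key instead of a cid.Cid so the example needs only the standard library. mapIndex is a hypothetical in-memory stand-in for the chain index. Returning the error to the caller and treating a wrong-length row as missing are assumptions of this sketch; the proposal does not specify either.

```go
package main

import (
	"bytes"
	"context"
)

const bloomSize = 256

// bloomGetter is a hypothetical, simplified form of the proposed Indexer method.
type bloomGetter interface {
	GetTipsetBloom(ctx context.Context, tipsetKey string) ([]byte, bool, error)
}

// mapIndex is an in-memory stand-in for the chain index.
type mapIndex map[string][]byte

func (m mapIndex) GetTipsetBloom(_ context.Context, tipsetKey string) ([]byte, bool, error) {
	b, ok := m[tipsetKey]
	return b, ok, nil
}

// fullBloom returns an all-ones bloom, the value the issue describes
// NewFullEthBloom() as producing.
func fullBloom() []byte {
	return bytes.Repeat([]byte{0xff}, bloomSize)
}

// blockBloom returns the stored bloom when one exists and falls back to the
// full bloom otherwise, so unindexed tipsets keep today's behaviour.
func blockBloom(ctx context.Context, idx bloomGetter, tipsetKey string) ([]byte, error) {
	bloom, found, err := idx.GetTipsetBloom(ctx, tipsetKey)
	if err != nil {
		return nil, err // assumption: surface index errors to the caller
	}
	if !found || len(bloom) != bloomSize {
		return fullBloom(), nil
	}
	return bloom, nil
}
```

Falling back to the full bloom is always safe for clients, because a full bloom can only cause extra eth_getLogs calls and never a missed event.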

Migration

Add a migration function to the currently empty []sqlite.MigrationFunc{}:

  1. Create the tipset_bloom table
  2. Do not backfill existing data during migration

Pre-migration tipsets return the full bloom (no regression). New tipsets get correct blooms going forward. A separate backfill mechanism could be added later (similar to ChainValidateIndex with backfill=true) to recompute blooms for historical data from existing indexed events.

Revert handling

The existing indexer uses a soft-delete pattern: tipset_message rows get reverted = 1, and event rows are similarly flagged. For the bloom, the simplest approach is to delete the row on revert: when Revert() marks events as reverted for a tipset, also run DELETE FROM tipset_bloom WHERE tipset_key_cid = ?. On re-application, indexEvents() recomputes and re-inserts it. The read path already falls back to the full bloom when no row exists, so the window between revert and re-application is safe.

GC / compaction

The chain indexer runs a GC loop every 4 hours (gc.go) that deletes tipset_message rows older than gcRetentionEpochs (configurable, default is typically a few days worth). The event and event_entry tables cascade-delete via foreign keys on tipset_message.id.

tipset_bloom is keyed by tipset_key_cid rather than by a tipset_message foreign key, so it won't cascade automatically. A trigger could mirror the cascade, but the easiest option is an explicit GC statement: DELETE FROM tipset_bloom WHERE height < ?, using the same removalEpoch as the existing removeTipsetsBeforeHeightStmt. This is consistent with how eth_tx_hash cleanup is handled (a separate DELETE in the same gc() function, with the same retention window).
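
The two deletions described above (on revert and during GC) could look roughly like this. It is a sketch, not the indexer's code: the SQL text comes from this proposal, while execer, revertBloom, gcBlooms and recordingDB are hypothetical names. execer is the subset of *sql.DB and *sql.Tx that the functions need; in Lotus these would presumably be prepared statements run inside the existing transactions.

```go
package main

import (
	"context"
	"database/sql"
)

const (
	deleteBloomForTipset    = `DELETE FROM tipset_bloom WHERE tipset_key_cid = ?`
	deleteBloomBeforeHeight = `DELETE FROM tipset_bloom WHERE height < ?`
)

// execer is satisfied by *sql.DB and *sql.Tx.
type execer interface {
	ExecContext(ctx context.Context, query string, args ...any) (sql.Result, error)
}

// revertBloom drops the bloom row for a reverted tipset. The row is expected
// to be recomputed and re-inserted if the tipset is applied again.
func revertBloom(ctx context.Context, db execer, tipsetKeyCid []byte) error {
	_, err := db.ExecContext(ctx, deleteBloomForTipset, tipsetKeyCid)
	return err
}

// gcBlooms drops bloom rows below removalEpoch, the same cutoff the proposal
// reuses from the existing tipset GC.
func gcBlooms(ctx context.Context, db execer, removalEpoch int64) error {
	_, err := db.ExecContext(ctx, deleteBloomBeforeHeight, removalEpoch)
	return err
}

// recordingDB is a stand-in that records statements instead of running them,
// so the sketch can be exercised without a SQLite driver.
type recordingDB struct {
	queries []string
}

func (r *recordingDB) ExecContext(_ context.Context, query string, _ ...any) (sql.Result, error) {
	r.queries = append(r.queries, query)
	return nil, nil
}
```

Keeping both deletions as plain statements follows the explicit-DELETE approach proposed here rather than relying on a trigger or a foreign-key cascade.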

Testing

Unit tests for EthBloomSet correctness

Lotus has no unit tests for EthBloomSet itself, only the integration test in itests/eth_filter_test.go:TestTxReceiptBloom which validates a receipt bloom against a hardcoded expected value from Remix.

The bloom algorithm (Yellow Paper Section 4.3.1) is identical to go-ethereum's implementation in core/types/bloom9.go. We should add unit tests in chain/types/ethtypes/ that validate:

  1. Known-input vectors. Compute EthBloomSet for a known address and topic, verify the resulting bytes match go-ethereum's output. The log1_correct.json test from ethereum/tests provides a concrete vector: address 0x095e7baea6a6c7c4c2dfeb977efac326af552d87 with topic 0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff produces a known 256-byte bloom.
  2. Bloom aggregation (OR). Compute individual blooms for two different addresses, OR them together, verify the result has the correct bits set from both. This validates the block-level aggregation logic.
  3. Empty bloom. A tipset with no EVM events should produce an all-zeros bloom.
  4. Cross-validation with go-ethereum. go-ethereum's TestBloomExtensively adds 100 formatted strings and checks that keccak256(resulting_bloom) equals 0xc8d3ca65cdb4874300a9e39475508f23ed6da09fdbc487f89a2dcf50b09eb263. We can do the same to confirm algorithmic equivalence.

Integration test for block-level bloom

For integration, deploy a contract, emit events, then verify that eth_getBlockByNumber returns a block whose logsBloom is the OR of all receipt blooms in that block.

The existing TestTxReceiptBloom test deploys EventMatrix.hex and validates the receipt bloom. We can extend this pattern:

  1. Deploy contract, invoke a function that emits events
  2. Fetch the block via eth_getBlockByNumber
  3. Fetch all receipts for transactions in that block
  4. Compute the expected block bloom by OR-ing all receipt blooms
  5. Assert block.LogsBloom == expectedBloom
  6. Also verify that the bloom is NOT the full bloom (i.e., the fix is active)

This validates the full pipeline: event indexing -> bloom computation -> storage -> retrieval -> block construction.
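
Steps 4-6 could be expressed with a helper along these lines. This is a sketch that assumes the block and receipt blooms are available as 0x-prefixed hex strings, as they appear in JSON-RPC responses. checkBlockBloom and the other names are hypothetical and not part of the existing itests helpers.

```go
package main

import (
	"bytes"
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

const bloomSize = 256

// decodeBloom parses a 0x-prefixed hex logsBloom string into 256 bytes.
func decodeBloom(s string) ([]byte, error) {
	b, err := hex.DecodeString(strings.TrimPrefix(s, "0x"))
	if err != nil {
		return nil, err
	}
	if len(b) != bloomSize {
		return nil, fmt.Errorf("bloom is %d bytes, want %d", len(b), bloomSize)
	}
	return b, nil
}

// expectedBlockBloom ORs all receipt blooms together.
func expectedBlockBloom(receiptBlooms []string) ([]byte, error) {
	out := make([]byte, bloomSize)
	for _, s := range receiptBlooms {
		b, err := decodeBloom(s)
		if err != nil {
			return nil, err
		}
		for i := range out {
			out[i] |= b[i]
		}
	}
	return out, nil
}

// checkBlockBloom returns an error if the block bloom is not the OR of the
// receipt blooms, or if it is still the all-ones placeholder.
func checkBlockBloom(blockBloom string, receiptBlooms []string) error {
	got, err := decodeBloom(blockBloom)
	if err != nil {
		return err
	}
	want, err := expectedBlockBloom(receiptBlooms)
	if err != nil {
		return err
	}
	if !bytes.Equal(got, want) {
		return errors.New("block bloom does not equal the OR of receipt blooms")
	}
	if bytes.Equal(got, bytes.Repeat([]byte{0xff}, bloomSize)) {
		return errors.New("block bloom is still the all-ones placeholder")
	}
	return nil
}
```

In an integration test the inputs would come from eth_getBlockByNumber and eth_getTransactionReceipt, and a non-nil error would fail the test.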
