Problem
Lotus returns a block-level logsBloom with all 2048 bits set to 1 for every block returned by eth_getBlockByNumber and eth_getBlockByHash. This makes the bloom filter a universal match, completely defeating its purpose as a client-side optimisation for skipping blocks without relevant events.
EVM indexing frameworks (Ponder, The Graph, etc.) use the bloom field to decide whether to call eth_getLogs for a given block. When every block's bloom matches everything, every block triggers an eth_getLogs call regardless of whether it contains matching events.
Transaction receipt blooms are correct. eth_getTransactionReceipt returns properly computed per-receipt bloom filters. The issue is only at the block level.
Root cause
NewEthBlock() initialises LogsBloom to NewFullEthBloom() (all 0xff bytes):
func NewEthBlock(hasTransactions bool, tipsetLen int) EthBlock {
    b := EthBlock{
        // ...
        LogsBloom: NewFullEthBloom(),
        // ...
    }
    return b
}
The block construction function newEthBlockFromFilecoinTipSet() iterates messages and receipts to build the transaction list and sum gas, but never aggregates bloom filters. The all-1s default is never overwritten.
Proposed fix: precomputed bloom in the chain index
Computing the bloom at query time (fetching all events just to build the filter) would add significant overhead to every block query. Instead, compute the bloom once during event indexing and store it in the chain index.
The chain indexer (chain/index/events.go, indexEvents()) already processes every event for every tipset during indexing. At that point it has:
- Emitter addresses: already resolved to delegated addresses for storage
- Event entries: the t1-t4 keys contain the 32-byte topic hashes that go into the bloom
Schema
New table via the existing migration system (lib/sqlite/sqlite.go):
CREATE TABLE IF NOT EXISTS tipset_bloom (
    tipset_key_cid BLOB PRIMARY KEY,
    height INTEGER NOT NULL,
    bloom BLOB NOT NULL -- 256 bytes
);

CREATE INDEX IF NOT EXISTS idx_tipset_bloom_height ON tipset_bloom (height);
Storage cost: 256 bytes per tipset; at one tipset per 30-second epoch that is 2,880 tipsets and ~720 KB per day.
Write path
In indexEvents(), after processing all events for a tipset:
- Initialise an empty bloom (all zeros)
- For each event with a delegated emitter address: convert to EthAddress, call EthBloomSet(bloom, addr[:])
- For each entry with Codec == cid.Raw and key matching t1-t4: call EthBloomSet(bloom, entry.Value)
- Store the bloom in tipset_bloom
Tipsets with no EVM events get an all-zeros bloom stored. This is the most valuable case, as it tells clients "definitely nothing here."
The bloom computation should match ethLogFromEvent() filtering (skip non-raw codec entries, skip entries with keys other than t1-t4 and d), so the bloom accurately reflects what eth_getLogs would return.
Read path
In newEthBlockFromFilecoinTipSet():
- Query tipset_bloom by tipset key CID
- If found: set block.LogsBloom to the stored bloom
- If not found (pre-migration data): return NewFullEthBloom() (current behavior, no regression)
This requires:
- A new method on the Indexer interface: GetTipsetBloom(ctx context.Context, tipsetKeyCid cid.Cid) ([]byte, bool, error)
- Passing the chain indexer into newEthBlockFromFilecoinTipSet() (callers already have access to it)
Migration
Add a migration function to the currently empty []sqlite.MigrationFunc{}:
- Create the tipset_bloom table
- Do not backfill existing data during migration
Pre-migration tipsets return the full bloom (no regression). New tipsets get correct blooms going forward. A separate backfill mechanism could be added later (similar to ChainValidateIndex with backfill=true) to recompute blooms for historical data from existing indexed events.
Revert handling
The existing indexer uses a soft-delete pattern: tipset_message rows get reverted = 1, and event rows are similarly flagged. The simplest approach for us is to delete the bloom row on revert. When Revert() marks events as reverted for a tipset, also DELETE FROM tipset_bloom WHERE tipset_key_cid = ?. On re-application, indexEvents() recomputes and re-inserts it. Simple and correct. The read path already falls back to full bloom when no row exists, so the window between revert and re-application is safe.
GC / compaction
The chain indexer runs a GC loop every 4 hours (gc.go) that deletes tipset_message rows older than gcRetentionEpochs (configurable; the default is typically a few days' worth). The event and event_entry tables cascade-delete via foreign keys on tipset_message.id.
tipset_bloom is keyed by tipset_key_cid rather than by a tipset_message foreign key, so it won't cascade automatically. A trigger could mirror the cascade, but the simplest option is an explicit GC statement: add DELETE FROM tipset_bloom WHERE height < ? using the same removalEpoch as the existing removeTipsetsBeforeHeightStmt. This is consistent with how eth_tx_hash cleanup is handled (a separate DELETE in the same gc() function, same retention window).
Testing
Unit tests for EthBloomSet correctness
Lotus has no unit tests for EthBloomSet itself; the only coverage is the integration test itests/eth_filter_test.go:TestTxReceiptBloom, which validates a receipt bloom against a hardcoded expected value from Remix.
The bloom algorithm (Yellow Paper Section 4.3.1) is identical to go-ethereum's implementation in core/types/bloom9.go. We should add unit tests in chain/types/ethtypes/ that validate:
- Known-input vectors. Compute EthBloomSet for a known address and topic, verify the resulting bytes match go-ethereum's output. The log1_correct.json test from ethereum/tests provides a concrete vector: address 0x095e7baea6a6c7c4c2dfeb977efac326af552d87 with topic 0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff produces a known 256-byte bloom.
- Bloom aggregation (OR). Compute individual blooms for two different addresses, OR them together, verify the result has the correct bits set from both. This validates the block-level aggregation logic.
- Empty bloom. A tipset with no EVM events should produce an all-zeros bloom.
- Cross-validation with go-ethereum. go-ethereum's TestBloomExtensively adds 100 formatted strings and checks that keccak256(resulting_bloom) equals 0xc8d3ca65cdb4874300a9e39475508f23ed6da09fdbc487f89a2dcf50b09eb263. We can do the same to confirm algorithmic equivalence.
Integration test for block-level bloom
For integration, deploy a contract, emit events, then verify that eth_getBlockByNumber returns a block whose logsBloom is the OR of all receipt blooms in that block.
The existing TestTxReceiptBloom test deploys EventMatrix.hex and validates the receipt bloom. We can extend this pattern:
- Deploy contract, invoke a function that emits events
- Fetch the block via eth_getBlockByNumber
- Fetch all receipts for transactions in that block
- Compute the expected block bloom by OR-ing all receipt blooms
- Assert block.LogsBloom == expectedBloom
- Also verify that the bloom is NOT the full bloom (i.e., the fix is active)
This validates the full pipeline: event indexing -> bloom computation -> storage -> retrieval -> block construction.