Enable parquet prefetch #522
I will take a look at it tomorrow. It seems like it would need changes on the thrift side as well.
Ah right. I made the changes on the thrift side as well, but it didn't help on the Let me look into recreating it.
302d241 to 64f42ff
The code had:
That computes the intersection-ish end, not the union end.

Example:
existing: [1000, 2000)
wrong merged range: [1000, 2000)

With the old code, the new range could be partially dropped from the read-ahead buffer. Correctness of query results usually survived because a later read that missed the prefetch buffer fell back to readFromFile, but the prefetch registration no longer meant "this full range is buffered." For remote files, that causes extra range reads and defeats part of prefetch.
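To make the intersection-versus-union distinction concrete, here is a minimal Python sketch. It is not the PR's code: `ByteRange`, `merge_intersection_end`, `merge_union_end`, and `covers` are hypothetical names, the use of `min` for the buggy end is an assumption about what "intersection-ish" means, and the new range `[1500, 3000)` is an illustrative value that does not come from the thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ByteRange:
    """Half-open byte range [start, end). Hypothetical type for illustration."""
    start: int
    end: int


def merge_intersection_end(existing: ByteRange, new: ByteRange) -> ByteRange:
    # Assumed bug pattern: taking the smaller end yields the end of the
    # intersection, so the tail of the longer range is silently dropped.
    return ByteRange(min(existing.start, new.start), min(existing.end, new.end))


def merge_union_end(existing: ByteRange, new: ByteRange) -> ByteRange:
    # Taking the larger end yields the end of the union, so the merged
    # range covers both inputs (assuming they overlap or touch).
    return ByteRange(min(existing.start, new.start), max(existing.end, new.end))


def covers(outer: ByteRange, inner: ByteRange) -> bool:
    # True when every byte of `inner` lies inside `outer`.
    return outer.start <= inner.start and inner.end <= outer.end


existing = ByteRange(1000, 2000)
new = ByteRange(1500, 3000)  # illustrative value, not taken from the thread

wrong = merge_intersection_end(existing, new)
right = merge_union_end(existing, new)
```

Under these assumptions, `wrong` stays at `[1000, 2000)` and no longer covers the new range, so a read of bytes 2000 to 3000 would miss the buffer and fall back to a file read. `right` is `[1000, 3000)` and covers both inputs.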
The thrift change didn't change query performance. Improving further requires either changes to the query plan or more aggressive local caching. That is probably post-0.17.0 release work.
Summary
Validation