Optimize LiveJournal degree count queries #512
How does it work?
nodeKeyVector->state->getSelVectorUnsafe().setToUnfiltered(entries.size());
}
for (sel_t i = 0; i < entries.size(); ++i) {
    writeNodeKey(entries[i].first, i);
Shouldn't we sum the degrees for a node that appears in multiple tables?
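To make the concern concrete, here is a minimal sketch of summing per-table degrees before selecting the top k. It is not the PR's code: `mergeTopKAcrossTables`, `DegreeEntry`, and the type aliases are hypothetical stand-ins, and it assumes each table contributes its full degree list (merging per-table top-k lists would not be exact in general).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the engine's types; the names are assumptions.
using offset_t = std::uint64_t;
using row_idx_t = std::uint64_t;
using DegreeEntry = std::pair<offset_t, row_idx_t>; // (node offset, degree)

// Sum degrees per node offset across all tables, then keep the k largest.
// Ties are broken by the smaller node offset so the result is deterministic.
std::vector<DegreeEntry> mergeTopKAcrossTables(
    const std::vector<std::vector<DegreeEntry>>& perTableDegrees, std::size_t k) {
    std::unordered_map<offset_t, row_idx_t> total;
    for (const auto& table : perTableDegrees) {
        for (const auto& entry : table) {
            total[entry.first] += entry.second; // same node in two tables: add
        }
    }
    std::vector<DegreeEntry> merged(total.begin(), total.end());
    const std::size_t keep = std::min(k, merged.size());
    std::partial_sort(merged.begin(),
        merged.begin() + static_cast<std::ptrdiff_t>(keep), merged.end(),
        [](const DegreeEntry& a, const DegreeEntry& b) {
            return a.second != b.second ? a.second > b.second : a.first < b.first;
        });
    merged.resize(keep);
    assert(merged.size() <= k); // postcondition: never more than k entries
    return merged;
}
```

With this shape, a node that has 3 edges in one table and 4 in another is reported once with degree 7 rather than as two separate entries.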
if (auto* localTable = transaction->getLocalStorage()->getLocalTable(tableID)) {
    auto& localRelTable = localTable->cast<LocalRelTable>();
    for (const auto& [nodeOffset, rowIndices] : localRelTable.getCSRIndex(direction)) {
        pushDegree(nodeOffset, rowIndices.size());
What if a nodeOffset appears in both local and persistent storage?
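One way to handle the overlap is to fold local rows into the persistent degrees before pushing anything, so each node is emitted once. The sketch below illustrates that idea only. `PersistentDegrees`, `LocalCSRIndex`, `combineDegrees`, and `countActiveBoundNodes` are hypothetical names, and it assumes local rows are pure insertions (deletions and updates are out of scope here).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <unordered_map>
#include <vector>

using offset_t = std::uint64_t;
using row_idx_t = std::uint64_t;

// Hypothetical shapes: persistent degrees keyed by node offset, and a local
// (uncommitted) CSR index mapping node offset -> inserted row indices.
using PersistentDegrees = std::unordered_map<offset_t, row_idx_t>;
using LocalCSRIndex = std::map<offset_t, std::vector<row_idx_t>>;

// Combine both sources so a node present in both is reported once, with the
// persistent and local counts added together.
PersistentDegrees combineDegrees(const PersistentDegrees& persistent,
    const LocalCSRIndex& local) {
    PersistentDegrees combined = persistent;
    for (const auto& entry : local) {
        if (!entry.second.empty()) {
            combined[entry.first] += entry.second.size();
        }
    }
    assert(combined.size() >= persistent.size()); // entries are only added
    return combined;
}

// Count nodes with at least one edge; overlap cannot be double counted
// because each node offset appears once in the combined map.
std::size_t countActiveBoundNodes(const PersistentDegrees& combined) {
    std::size_t result = 0;
    for (const auto& entry : combined) {
        result += entry.second > 0 ? 1 : 0;
    }
    return result;
}
```

The same combined map would also guard an active-node count against counting a node twice when it has both committed and uncommitted edges.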
if (auto* localTable = transaction->getLocalStorage()->getLocalTable(tableID)) {
    auto& localRelTable = localTable->cast<LocalRelTable>();
    for (const auto& [_, rowIndices] : localRelTable.getCSRIndex(direction)) {
        result += !rowIndices.empty();
std::vector<std::pair<offset_t, row_idx_t>> ArrowRelTable::getTopKDegreeEntries(
    const transaction::Transaction* transaction, RelDataDirection direction, idx_t k) const {
    if (layout != ArrowRelTableLayout::CSR || direction == RelDataDirection::BWD || k == 0) {
        return const_cast<ArrowRelTable*>(this)->RelTable::getTopKDegrees(transaction, direction,
This would result in a crash, right? Same for IceDiskRel tables.
Oops, tests are failing.
What Changed Summary
- `REL_DEGREE_TABLE` logical/physical source operator.
- `MATCH (u)-[:follows]->(v) RETURN count(DISTINCT u.id)` now becomes `REL_DEGREE_TABLE` in `ACTIVE_BOUND_COUNT` mode.
- `MATCH (u)-[:follows]->(v) RETURN u.id, count(v) AS deg ORDER BY deg DESC LIMIT 10` now becomes `REL_DEGREE_TABLE` in `TOP_K_DEGREES` mode after top-k optimization.
- `COUNT_REL_TABLE` rewrite intact.

Why
LiveJournal benchmark q06 and q07 were still executing scan-based count plans. These rewrites let the planner answer the unfiltered degree/count shapes from CSR metadata instead of scanning all `follows` edges.

Validation
- `cmake --build build/release --target lbug_shell`
- `build/release/tools/shell/lbug :memory: -b -i ../live-journal-benchmark/icebug_disk/schema.cypher`
- 4004103 in ~0.94ms executing.
- 69362378 in ~0.09ms executing.
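
As an illustration of why CSR metadata is enough for these query shapes, the following sketch derives both results from an offsets array alone, without touching any edge data. It is not the PR's implementation: the layout (`numNodes + 1` offsets, node `i` owning `[offsets[i], offsets[i + 1])`) and the names `degreeOf`, `activeBoundCount`, and `topKDegrees` are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Assumed CSR layout (not taken from the PR): offsets has numNodes + 1
// entries and node i owns the edge range [offsets[i], offsets[i + 1]).
using CSROffsets = std::vector<std::uint64_t>;
using DegreeOfNode = std::pair<std::uint64_t, std::size_t>; // (degree, node)

std::uint64_t degreeOf(const CSROffsets& offsets, std::size_t node) {
    assert(node + 1 < offsets.size());
    return offsets[node + 1] - offsets[node];
}

// Nodes with at least one edge: the count(DISTINCT u.id) shape.
std::size_t activeBoundCount(const CSROffsets& offsets) {
    std::size_t count = 0;
    for (std::size_t node = 0; node + 1 < offsets.size(); ++node) {
        count += degreeOf(offsets, node) > 0 ? 1 : 0;
    }
    return count;
}

// The k highest-degree nodes in descending order, using a min-heap that
// never holds more than k entries. Zero-degree nodes are skipped because
// they match no edge pattern. Equal degrees favor the larger node index.
std::vector<DegreeOfNode> topKDegrees(const CSROffsets& offsets, std::size_t k) {
    std::priority_queue<DegreeOfNode, std::vector<DegreeOfNode>,
        std::greater<DegreeOfNode>> heap;
    for (std::size_t node = 0; node + 1 < offsets.size(); ++node) {
        const DegreeOfNode entry{degreeOf(offsets, node), node};
        if (entry.first == 0) {
            continue;
        }
        if (heap.size() < k) {
            heap.push(entry);
        } else if (k > 0 && entry > heap.top()) {
            heap.pop(); // evict the current smallest
            heap.push(entry);
        }
    }
    std::vector<DegreeOfNode> result;
    while (!heap.empty()) {
        result.push_back(heap.top());
        heap.pop();
    }
    std::reverse(result.begin(), result.end()); // largest degree first
    return result;
}
```

Both functions are linear in the number of nodes rather than the number of edges, which is the property the rewrites rely on.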