Commit a6379fe

piece-of-work
1 parent 4963566 commit a6379fe

1 file changed

Lines changed: 46 additions & 0 deletions

docs/data_flow.md

@@ -0,0 +1,46 @@
1+
2+
# Data flow analysis in stackql
3+
4+
Data flow analysis is implemented as multiple passes over:
5+
6+
- An initial abstract syntax tree (AST) from the parser.
7+
- Annotated derivatives of the AST.
8+
- `any-sdk` `{ provider, service, resource, method, schema... }` graphs.
9+
- `gonum` DAG adaptations with data flow dependencies representing edges.
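The last item above, a DAG whose edges are data flow dependencies, can be illustrated with a minimal sketch. This is not the `gonum`-based code in `stackql`; it uses Python's standard `graphlib` module as a stand-in, and the table names and their dependencies are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical table expressions in one query, each mapped to the set of
# expressions whose output it consumes (its data flow dependencies).
dependencies = {
    "vpcs": set(),
    "subnets": {"vpcs"},        # subnets needs a vpc id produced by vpcs
    "instances": {"subnets"},   # instances needs a subnet id from subnets
}


def execution_order(deps):
    """Return an order in which every node follows all of its dependencies."""
    return list(TopologicalSorter(deps).static_order())


def is_acyclic(deps):
    """Report whether the dependency graph is a valid DAG."""
    try:
        execution_order(deps)
        return True
    except CycleError:
        return False


order = execution_order(dependencies)
```

An ordering of this kind is what allows a value produced by one table expression to be supplied as a parameter to the next; a cycle means no such ordering exists.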
10+
11+
Some other aspects of data flow analysis:
12+
13+
- Relational algebra is implemented in a coupled RDBMS (embedded `sqlite` or `postgres` over TCP). There is a query rewriting process to stringify "containers" for this.
14+
- There are `transaction control counter` objects and corresponding RDBMS columns to bound relational algebra "containers" and to future-proof for garbage collection. Some mutex protection is in place.
15+
- Views in `stackql` permit clobbering of where clause arguments from outside the view. The canonical case is a document-based view in a provider document. A good example is in [test/registry/src/aws/v0.1.0/services/pseudo_s3.yaml](/test/registry/src/aws/v0.1.0/services/pseudo_s3.yaml) at `...s3_bucket_list_and_detail.config.views.select`; one can overwrite `region` here.
16+
- Views, subqueries, materialized views and user space tables are modelled as "indirections".
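The idea of bounding a "container" with a transaction control counter can be sketched as follows. This is an illustration only, assuming a single counter and a single generation column; the class `TxnCounter`, the table `buckets`, and the column `gen_id` are hypothetical names, not the ones `stackql` uses.

```python
import sqlite3
import threading


class TxnCounter:
    """Thread-safe, monotonically increasing counter (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def next(self):
        # The lock makes increment-and-read a single atomic step.
        with self._lock:
            self._value += 1
            return self._value


counter = TxnCounter()
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE buckets (name TEXT, region TEXT, gen_id INTEGER)")


def load(rows):
    """Insert one batch of rows tagged with a fresh generation id."""
    gen_id = counter.next()
    conn.executemany(
        "INSERT INTO buckets (name, region, gen_id) VALUES (?, ?, ?)",
        [(name, region, gen_id) for name, region in rows],
    )
    return gen_id


def select_names(gen_id):
    """Read only the rows that belong to the given generation."""
    cur = conn.execute(
        "SELECT name FROM buckets WHERE gen_id = ? ORDER BY name", (gen_id,)
    )
    return [row[0] for row in cur]


def collect_garbage(older_than):
    """Delete rows from generations below the given id."""
    conn.execute("DELETE FROM buckets WHERE gen_id < ?", (older_than,))


first = load([("logs", "us-east-1")])
second = load([("assets", "ap-southeast-2"), ("backups", "ap-southeast-2")])
```

Because every row carries the counter value under which it was written, a query can be restricted to one generation, and older generations can later be removed without touching current data.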
17+
18+
19+
## Open Issues
20+
21+
## Indirection Data Flow Analysis and Query Execution
22+
23+
Data flow analysis for indirections is not composable:
24+
25+
- It is impossible to join heterogeneous collections of these with each other or with conventional resources. There is no recursive and stable data flow analysis.
26+
- While `stackql` does have a `max depth` parameter, I do not believe it is reliably enforced eagerly; that is, queries that are too complex should fail at analysis time. I cannot recall the parameter name or its default.
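A minimal sketch of what eager enforcement at analysis time could look like, assuming indirections are recorded in a catalogue that maps each one to the relations it selects from. The catalogue contents, `AnalysisError`, and both function names are hypothetical and do not reflect the existing `stackql` implementation.

```python
class AnalysisError(Exception):
    """Raised during analysis, before any query is executed."""


# Hypothetical catalogue: each indirection lists the relations it selects
# from. Names absent from the catalogue are treated as plain resources.
INDIRECTIONS = {
    "v_outer": ["v_mid", "aws.s3.buckets"],
    "v_mid": ["v_inner"],
    "v_inner": ["aws.ec2.instances"],
}


def indirection_depth(name, catalogue, _seen=()):
    """Return how many levels of indirection sit beneath the given name."""
    if name not in catalogue:
        return 0  # a conventional resource adds no indirection
    if name in _seen:
        path = " -> ".join(_seen + (name,))
        raise AnalysisError(f"cyclic indirection: {path}")
    children = catalogue[name]
    return 1 + max(
        (indirection_depth(c, catalogue, _seen + (name,)) for c in children),
        default=0,
    )


def check_depth(name, catalogue, max_depth):
    """Fail eagerly if the indirection nests deeper than max_depth."""
    depth = indirection_depth(name, catalogue)
    if depth > max_depth:
        raise AnalysisError(
            f"indirection depth {depth} exceeds max depth {max_depth} "
            f"for '{name}'"
        )
    return depth
```

With this catalogue, `check_depth("v_outer", INDIRECTIONS, 3)` succeeds, while a limit of `2` raises `AnalysisError` before anything is sent to a provider.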
27+
28+
The expected fix for this issue:
29+
30+
- Joins, unions, etc. on indirections work to arbitrary and configurable depth. For depth violations, failure is eager in the analysis phase, and the error message is plain and emitted on the canonical error stream already widely used.
31+
- Data flow analysis includes assurance on required params and viability of projections, joins, etc.
32+
- Support for CTEs internal to these indirections is in place.
33+
- Mocked robot tests are added to the canonical test suite, covering this functionality.
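The assurance on required params could take roughly the following shape, assuming each underlying method declares the parameters it needs and that where clause arguments from outside a view overwrite the view's own values. The method names, parameter names, and both functions are hypothetical.

```python
# Hypothetical: parameters that each underlying method requires.
REQUIRED = {
    "list_buckets": {"region"},
    "get_bucket": {"region", "bucket_name"},
}


def effective_params(view_params, outer_where):
    """Merge parameters; values from outside the view overwrite its own."""
    return {**view_params, **outer_where}


def missing_params(method, view_params, outer_where):
    """List required parameters that neither the view nor the query supplies."""
    supplied = set(effective_params(view_params, outer_where))
    return sorted(REQUIRED[method] - supplied)
```

A non-empty result from `missing_params` is the kind of condition that could be reported during analysis instead of surfacing as a provider error at execution time.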
34+
35+
36+
## Glossary of terms
37+
38+
| Term | Expansion |
39+
|---|---|
40+
| AST | Abstract Syntax Tree |
41+
| CTE | Common Table Expression |
42+
| DAG | Directed Acyclic Graph |
43+
| GC | Garbage Collection |
44+
| RDBMS | Relational Database Management System |
45+
| TCP | Transmission Control Protocol |
46+
| | |
