Commit 6aed6ac

Add article about apriori algorithm with anonymized data
1 parent ab39960 commit 6aed6ac

9 files changed

Lines changed: 3295 additions & 1 deletion

deps.edn

Lines changed: 2 additions & 1 deletion
@@ -13,7 +13,8 @@
 clj-thamil/clj-thamil {:mvn/version "0.2.0"}
 org.scicloj/clay {#_#_:mvn/version "2-beta54"
                   :git/url "https://github.com/scicloj/clay.git"
-                  :git/sha "26fb97b5ea2f72a98f8a64d5b8b8abf705767a39"}
+                  :git/sha "c4fa35c11034614d42c48c1223d8767267cad385"}
+org.scicloj/kindly {:mvn/version "4-beta20"}
 thi.ng/geom {:mvn/version "1.0.1"}
 org.eclipse.elk/org.eclipse.elk.core {:mvn/version "0.10.0"}
 org.eclipse.elk/org.eclipse.elk.graph {:mvn/version "0.10.0"}

src/data_analysis/book_sales_analysis/about_apriori.clj

Lines changed: 325 additions & 0 deletions
Large diffs are not rendered by default.

src/data_analysis/book_sales_analysis/anonymized-all-sliced-sharing-V2.csv

Lines changed: 2001 additions & 0 deletions
Large diffs are not rendered by default.

src/data_analysis/book_sales_analysis/core_helpers_v2.clj

Lines changed: 412 additions & 0 deletions
Large diffs are not rendered by default.
203 KB
Binary file not shown.
Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
+(ns data-analysis.book-sales-analysis.data-sources-v2
+  (:require [tablecloth.api :as tc]
+            [data-analysis.book-sales-analysis.core-helpers-v2 :as helpers]))
+
+;; ## Central Data Loading
+;; All CSV datasets are loaded here with consistent names and simple tc/dataset calls
+
+;; ### Main Orders Data (WooCommerce exports)
+
+#_(def anonymized-presentation-ds
+    "Anonymized customers full dataset of all orders - for presentation purposes
+  ❗ NOT FOR SHARING"
+    (tc/dataset
+     (helpers/merge-csvs
+      ["data/anonymized-customers-only-presentation-v2.csv"]
+      {:header? true
+       :separator ","
+       :key-fn #(keyword (helpers/sanitize-str %))})))
+
+(def anonymized-shareable-ds
+  "Fully anonymized slice of db - for sharing purposes
+  ✅ SAFE TO SHARE"
+  (tc/dataset
+   (helpers/merge-csvs
+    ["src/data_analysis/book_sales_analysis/anonymized-all-sliced-sharing-V2.csv"]
+    {:header? true
+     :separator ","
+     :key-fn #(keyword (helpers/sanitize-str %))})))
+
+;; ### Book Metadata
+
+#_(def enriched-book-metadata-ds
+    (helpers/enrich-metadata-csv "data/summary-all-time-all-books-300725.csv"))
+
+;; ## Quick Access
+;; Most commonly used datasets with short aliases
+
+(def orders-share anonymized-shareable-ds)
+#_(def orders-slides anonymized-presentation-ds)
+
+(def corr-matrix-precalculated
+  (tc/dataset "src/data_analysis/book_sales_analysis/corr-matrix.nippy"))
+
+corr-matrix-precalculated
+
+(prn "Data sources loaded.")
Binary file not shown.

src/data_analysis/book_sales_analysis/market_basket_analysis_v2.clj

Lines changed: 509 additions & 0 deletions
Large diffs are not rendered by default.
Binary file not shown.
