Commit 32c8373

Made some edits
1 parent 6aed6ac commit 32c8373

1 file changed

Lines changed: 18 additions & 16 deletions

src/data_analysis/book_sales_analysis/about_apriori.clj

@@ -44,12 +44,14 @@
 
 ;; The transformation from raw orders to an analysis-ready format was crucial. Using Tablecloth, the transformation pipeline was surprisingly readable:
 
-^:kindly/hide-code ;; FIXME this is nonsense
+;; ❗ FIXME this is too simplified, I have to change this ❗
+
+^:kindly/hide-code
 (kind/code
 ";; From customer orders with book lists...
 (-> orders
 (tc/group-by :zakaznik) ;; Group by customer
-(tc/aggregate ;; Aggregate their purchases
+(tc/aggregate ;; Aggregate their purchases
 {:books #(distinct-books %)})
 ;; ...to binary matrix where each column is a book")
 
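The pipeline in this hunk stops at the aggregation step, and its own FIXME calls it too simplified. As a rough illustration of the remaining step (turning per-customer book lists into a binary matrix), here is a minimal sketch in plain `clojure.core` rather than Tablecloth. The `:book` key, the sample rows, and the `baskets` and `binary-matrix` helpers are assumptions made for this example; only `:zakaznik` comes from the snippet, and the author's `distinct-books` helper is not shown in the diff.

```clojure
;; Hypothetical sample data: one row per order line. Only the :zakaznik
;; (customer) key appears in the diff; :book and all values are made up.
(def orders
  [{:zakaznik 1 :book "A"}
   {:zakaznik 1 :book "B"}
   {:zakaznik 2 :book "A"}
   {:zakaznik 3 :book "B"}
   {:zakaznik 3 :book "C"}
   {:zakaznik 3 :book "B"}])   ; repeat purchase, collapsed by the set below

(defn baskets
  "Map from customer id to the set of distinct books that customer bought."
  [rows]
  (reduce (fn [acc {:keys [zakaznik book]}]
            (update acc zakaznik (fnil conj #{}) book))
          {}
          rows))

(defn binary-matrix
  "One row (map) per customer with a 0/1 entry for every book in `rows`."
  [rows]
  (let [bs    (baskets rows)
        books (sort (distinct (map :book rows)))]
    (for [[customer bought] (sort-by key bs)]
      (into {:zakaznik customer}
            (map (fn [b] [b (if (contains? bought b) 1 0)]))
            books))))

(binary-matrix orders)
```

Each resulting row has one entry per book title, which is the shape that correlation and association-rule code typically expects. With a real dataset the same idea would more likely be expressed with Tablecloth's own reshaping functions.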

@@ -85,12 +87,9 @@
 :xaxis {:tickangle 45}
 :margin {:l 200 :b 50}
 :width 800 :height 600
-:shapes [{:type "rect"
-:x0 -0.5 :y0 -0.5
-:x1 80 :y1 80
+:shapes [{:type "rect" :x0 -0.5 :y0 -0.5 :x1 80 :y1 80
 :line {:color "yellow" :width 3}}]
-:annotations [{:x 40 :y 80
-:text "Recently published books show <br>much stronger co-purchase patterns"
+:annotations [{:x 40 :y 80 :text "Recently published books show <br>much stronger co-purchase patterns"
 :showarrow true :arrowhead 2 :arrowsize 1 :arrowwidth 2 :arrowcolor "yellow"
 :ax 60 :ay -60
 :font {:size 12 :color "black"}
@@ -145,9 +144,9 @@
 
 scatter-plot
 
-;; Foreign bestsellers (marked in orange) showed consistently higher correlations with other books. They had broad appeal and were purchased alongside many different titles. Czech authors (in blue), however, showed lower correlations, suggesting their readers were more focused. Many customers would buy just one Czech title, often using it as a "gateway" into our catalog, while foreign bestsellers were part of larger, more diverse purchases.
+;; Foreign bestsellers (marked in orange) showed consistently higher correlations with other books. They had broad appeal and were purchased alongside many different titles. Czech authors (in blue), however, showed lower correlations, suggesting their readers were more focused. Many customers would buy just one Czech title, often using it as a "gateway" into our catalog (strongly supported by Czech authors' local campaigns), while foreign bestsellers were part of larger, more diverse purchases.
 
-;; This insight immediately changed our marketing approach. We stopped using generic cross-sell recommendations for Czech authors and instead focused on building author-specific communities through social media campaigns.
+;; This insight immediately changed our marketing approach. We stopped using generic cross-sell recommendations for Czech authors and instead focused on building author-specific communities and started to cooperate with easily reachable Czech authors on a cross-selling approach.
 
 ;; ## The Limitation: Correlations Weren't Enough
 
@@ -245,6 +244,8 @@ scatter-plot
 
 ;; This visualization reveals clusters of books that customers buy together, forming natural "reading paths" through our catalog. The thickness of edges represents lift (stronger associations), while node darkness indicates support (popularity).
 
+;; (Remember that this data comes from part of the dataset, with particular parameters and thresholds.)
+
 ;; ## From Analysis to Production
 
 ;; The final piece was building a prediction function that could recommend books based on a customer's purchase history:
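The diff does not show the author's prediction function, so the following is only a minimal sketch of the general idea, using plain `clojure.core` and `clojure.set`. It computes support, confidence and lift for a few hand-picked single-book rules and then ranks recommendations by lift. The sample baskets, the function names `support`, `rule-metrics` and `recommend`, and the argument order are all assumptions made for this illustration. A real Apriori run would enumerate frequent itemsets instead of hand-picking rules.

```clojure
(require '[clojure.set :as set])

;; Hypothetical baskets: one set of book ids per customer (made-up data).
(def baskets
  [#{"A" "B"} #{"A" "B" "C"} #{"A"} #{"B" "C"} #{"A" "B"}])

(defn support
  "Fraction of baskets that contain every item in `items`."
  [baskets items]
  (/ (count (filter #(set/subset? items %) baskets))
     (count baskets)))

(defn rule-metrics
  "Support, confidence and lift for the rule antecedent => consequent.
  Assumes both sides occur in at least one basket (no zero division guard)."
  [baskets antecedent consequent]
  (let [s-both (support baskets (set/union antecedent consequent))
        s-ante (support baskets antecedent)
        s-cons (support baskets consequent)
        conf   (/ s-both s-ante)]
    {:antecedent antecedent
     :consequent consequent
     :support    s-both
     :confidence conf
     :lift       (/ conf s-cons)}))

(defn recommend
  "Up to `n` books not yet in `owned`, taken from rules whose antecedent is
  covered by `owned` and whose confidence is at least `min-confidence`,
  strongest lift first."
  [rules owned min-confidence n]
  (->> rules
       (filter #(set/subset? (:antecedent %) owned))
       (filter #(>= (:confidence %) min-confidence))
       (sort-by :lift >)
       (mapcat :consequent)
       (remove owned)
       distinct
       (take n)))

;; Hand-picked candidate rules, for illustration only.
(def rules
  (for [[a c] [["A" "B"] ["B" "C"] ["C" "B"]]]
    (rule-metrics baskets #{a} #{c})))

(recommend rules #{"A" "B"} 0.1 8)
```

The threshold `0.1` and the limit `8` are illustrative values here. Lift above 1 means two books are bought together more often than their individual popularity would predict, which is why the sketch ranks by lift rather than by raw support.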
@@ -276,13 +277,13 @@ scatter-plot
 :min-confidence 0.1)
 8))
 
-;; These recommendations are now powering a new "Customers Also Bought" section on our website, complementing our existing "Topically Similar" recommendations with data-driven insights.
+;; These recommendations are now powering a new "Customers Also Bought" section on our website (though still in "manual" mode for now), complementing our existing "Topically Similar" recommendations with data-driven insights.
 
 ;; ## Why This Matters for the Clojure Community
 
 ;; This project demonstrates several strengths of Clojure and the SciCloj ecosystem for real-world data science:
 
-;; **1. Readable transformations:** The threading macro (`->`) made complex data pipelines read like narratives. Each step tells a story, making the code understandable to both technical and business stakeholders.
+;; **1. Readable transformations:** The threading macro (`->`) made complex data pipelines read like narratives. Each step tells a story, making the code understandable to technical and even business stakeholders.
 
 ^:kindly/hide-code
 (kind/code
@@ -296,7 +297,7 @@ scatter-plot
 
 ;; **3. A complete stack:** From data manipulation (Tablecloth) to visualization (Tableplot) to presentation (Clay and Kindly), the SciCloj ecosystem provided everything I needed without leaving Clojure.
 
-;; **4. Production-ready code:** The same code that powers my exploratory analysis can run in production, generating live recommendations for our website.
+;; **4. Production-ready code:** The same code that powers my exploratory analysis can be run in production later, generating live recommendations for our website (I hope!).
 
 ;; ## The Impact
 
@@ -305,21 +306,22 @@ scatter-plot
 ;; - We are already stopping less effective cross-selling campaigns and starting target author communities
 ;; - Our website now features more data-driven "Customers Also Bought" recommendations
 ;; - We use these insights to optimize B2B offers for corporate clients
-;; - Social media campaigns are more targeted based on purchase pattern clusters
+;; - Our social media campaigns are starting to be better targeted based on purchase pattern clusters
 
 ;; Most importantly, I learned that you don't need a data science team or expensive tools to extract value from your data. With curiosity, the right tools, and a supportive community (shout out to the SciCloj folks on Zulip!), even a beginner can turn raw data into actionable insights.
 
 ;; ---
 
 ;; ## About the Author
 
-;; **Tomáš Baránek** is a publisher at Jan Melvil Publishing and co-founder of Servantes, developing software for publishers worldwide. He's a computer science graduate exploring Clojure and data science, learning by doing on real publishing challenges.
+;; **Tomáš Baránek** is a publisher at [Jan Melvil Publishing](https://www.melvil.cz) and co-founder of [Servantes](https://www.servant.es), developing software for publishers worldwide. He's a computer science graduate and Clojure enthusiast exploring data science, learning by doing on real publishing challenges.
 
 ;; **Resources:**
+;; - Author: https://barys.me
 ;; - Full presentation code: [github.com/tombarys/??](https://github.com/tombarys/?)
 ;; - SciCloj community: [scicloj.github.io](https://scicloj.github.io)
-;; - Connect: tomas@barys.me
+;; - Connect: tom@barys.me
 
 ;; ---
 
-;; *This article is based on a presentation at Macroexpand conference, October 2025.*
+;; *This article is based on a presentation at Macroexpand conference, October 2025.*
