Update readme

jplitza · jplitza · commit dcacf813f5c4 · 2025-07-31T09:51:03.000+02:00
diff --git a/README.md b/README.md
@@ -6,7 +6,7 @@ It allows you to index your content into the usual Nextcloud database.
 
 ## Compatibility
 
-The extension requires your Nextcloud database to be MySQL (tested) or PostgreSQL (currently untested). SQLite might work as well, but isn't yet implemented.
+The extension requires your Nextcloud database to be MySQL or PostgreSQL.
 
 ## Status
 
@@ -17,21 +17,24 @@ What works:
 * Indexing of text in PDF documents
     * This is done by extracting the text via [Smalot/PdfParser].
     * This app itself does *NOT* do optical character recognition (OCR)! If your files don't already contain the extracted text, maybe the [files_fulltextsearch_tesseract] app is for you. I haven't tested it together with this app.
-* MySQL
+* MySQL (tested in the CI pipeline and in real-world usage)
+* PostgreSQL (tested in the CI pipeline)
+    * Simply assumes the "english" text search configuration (which influences stop words and normalization)
 * Basic searching
-    * If the database is MySQL, it uses [Boolean Full-Text Searches], so you can use operators like `+`  and `-`, as well as a trailing `*` wildcard
+    * If the database is MySQL, it uses [Boolean Full-Text Searches], so you can use operators like `+` and `-`, as well as a trailing `*` wildcard
+    * If the database is PostgreSQL, the query is converted using [`websearch_to_tsquery`], so you can use `-` for exclusions and quote text to match it as a phrase
 * Passing the `occ fulltextsearch:test` harness
 
 [Smalot/PdfParser]: https://github.com/Smalot/PdfParser
 [files_fulltextsearch_tesseract]: https://github.com/nextcloud/files_fulltextsearch_tesseract
+[Boolean Full-Text Searches]: https://dev.mysql.com/doc/refman/8.4/en/fulltext-boolean.html
+[`websearch_to_tsquery`]: https://www.postgresql.org/docs/current/textsearch-controls.html#TEXTSEARCH-PARSING-QUERIES
 
 What does *NOT* work:
 * Indexing of Office documents: The upstream [fulltextsearch_elasticsearch] app simply passes the files on to the [Elasticsearch Attachment processor], which in turn uses [Apache Tika] for processing. Since I want to keep this app lean, I don't want to pull in any Java dependencies.
-* "Advanced" features of the full text search framework. There are fields for tags, metatags, subtags, parts, excerpts and whatnot. I have no idea yet what they are used for. The app just stores them on indexing and returns them in search results, but doesn't search those fields.
-* PostgreSQL: Could work, but I haven't tested it. Might need small fixes, and plainly assumes "english" configuration (which influences stopwords and normalization).
+* "Advanced" features of the full text search framework. There are fields for tags, metatags, subtags, parts and whatnot. I have no idea yet what they are used for. The app just stores them on indexing and returns them in search results, but doesn't search those fields.
 * SQLite: Might be implementable, but I haven't spent more time than a quick search for "fulltext search sqlite"
 
 [fulltextsearch_elasticsearch]: https://github.com/nextcloud/fulltextsearch_elasticsearch
-[Boolean Full-Text Searches]: https://dev.mysql.com/doc/refman/8.4/en/fulltext-boolean.html
 [Elasticsearch Attachment processor]: https://www.elastic.co/docs/reference/enrich-processor/attachment
 [Apache Tika]: https://tika.apache.org/