feat: add JSON-LD for episodes, grouped episodes and calendar#1298
feat: add JSON-LD for episodes, grouped episodes and calendar#1298Ziedelth wants to merge 1 commit into
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly improves the application's search engine optimization by implementing JSON-LD structured data for episode, grouped episode, and calendar entries. It introduces a dedicated builder service to construct the JSON-LD objects, integrates this generation into existing data transformation processes, and ensures the structured data is rendered on relevant web pages. This change aims to provide richer search results and better visibility for episode content. Highlights
Changelog
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces JSON-LD for episodes, grouped episodes, and calendar entries, which is a valuable addition for SEO. However, it also introduces a Cross-Site Scripting (XSS) vulnerability because JSON strings are rendered directly into <script type="application/ld+json"> tags in several templates without proper HTML escaping by the Gson serializer. This could allow malicious content in fields like titles or descriptions to execute arbitrary JavaScript. Beyond this critical security concern, the implementation has a few areas for improvement, including a bug in JsonLdBuilder where requiresSubscription is always set to true, duplicated and fragile logic for determining subscription platforms, an opportunity to improve exception handling by being more specific, and a logic bug in StringUtils for generating weekly episode strings.
| </a> | ||
|
|
||
| <#if isReleased && release.jsonLd??> | ||
| <script type="application/ld+json">${release.jsonLd}</script> |
There was a problem hiding this comment.
The release.jsonLd string is rendered directly into a <script> tag without escaping. Since the JSON is generated using Gson with disableHtmlEscaping() (in ObjectParser.kt), it may contain raw </script> tags if the anime title or description contains such a sequence. This allows an attacker to terminate the JSON-LD block and inject a malicious script.
To remediate this, you should either remove .disableHtmlEscaping() from the Gson configuration in ObjectParser.kt or escape the output in the template using ${release.jsonLd?html}.
| </a> | ||
|
|
||
| <#if episodeMapping.jsonLd??> | ||
| <script type="application/ld+json">${episodeMapping.jsonLd}</script> |
There was a problem hiding this comment.
| </a> | ||
|
|
||
| <#if groupedEpisode.jsonLd??> | ||
| <script type="application/ld+json">${groupedEpisode.jsonLd}</script> |
There was a problem hiding this comment.
| </a> | ||
|
|
||
| <#if episodeMapping.jsonLd??> | ||
| <script type="application/ld+json">${episodeMapping.jsonLd}</script> |
There was a problem hiding this comment.
| </div> | ||
|
|
||
| <#if episodeMapping.jsonLd??> | ||
| <script type="application/ld+json">${episodeMapping.jsonLd}</script> |
There was a problem hiding this comment.
| category = if (source.platform.url.contains("netflix") || source.platform.url.contains("disneyplus") || source.platform.url.contains("primevideo")) "subscription" else "free", | ||
| availabilityStarts = episode.releaseDateTime, | ||
| availabilityEnds = null, | ||
| requiresSubscription = true, // Simplified |
There was a problem hiding this comment.
The requiresSubscription property is hardcoded to true, which is incorrect for platforms that offer free content. This will lead to inaccurate structured data for search engines.
Additionally, the logic to determine if a platform is subscription-based is duplicated in three different build functions (EpisodeMappingDto, GroupedEpisodeDto, WeeklyAnimeDto) and is fragile because it relies on string matching parts of URLs.
To fix the immediate bug and improve the code, you should determine the subscription status once and use it for both category and requiresSubscription. Consider refactoring this logic to a more robust and centralized solution in the long term, perhaps by adding a property to the platform DTO itself.
Here's a suggested fix for this block:
val isSubscription = listOf("netflix", "disneyplus", "primevideo").any { source.platform.url.contains(it) }
// ...
category = if (isSubscription) "subscription" else "free",
// ...
requiresSubscription = isSubscription,Please apply a similar fix to the other occurrences of this logic.
|
|
||
| fun toWeeklyEpisodeString(episode: WeeklyAnimeDto, showSeason: Boolean = true, separator: Boolean = true): String { | ||
| val season = episode.mappings?.firstOrNull()?.season?.toString() ?: "" | ||
| val number = if (episode.minNumber != null && episode.maxNumber != null && episode.minNumber != episode.maxNumber) "${episode.minNumber} - ${episode.maxNumber}" else episode.number?.toString() ?: "" |
There was a problem hiding this comment.
There is a logic bug here for determining the episode number string. When episode.minNumber and episode.maxNumber are equal, the else branch is executed, which falls back to episode.number. If episode.number is null, this results in an empty string instead of the correct episode number. For example, if minNumber=10, maxNumber=10, and number=null, the result is "" instead of "10".
Using a when statement would make this logic clearer and fix the bug:
val number = when {
episode.minNumber != null && episode.maxNumber != null -> {
if (episode.minNumber == episode.maxNumber) episode.minNumber.toString() else "${episode.minNumber} - ${episode.maxNumber}"
}
episode.number != null -> episode.number.toString()
else -> ""
}| startDate = episode.releaseDateTime, | ||
| dateModified = episode.lastUpdateDateTime | ||
| ) | ||
| } catch (_: Exception) { |
There was a problem hiding this comment.
Catching a generic Exception is too broad and can mask other unexpected errors. It's better practice to catch only the specific exceptions you anticipate. In this case, since you are calling toInt() on episode.season which can be a range string (e.g., "1 - 2"), the expected exception is NumberFormatException.
| } catch (_: Exception) { | |
| } catch (_: NumberFormatException) { |
Qodana for JVMIt seems all right 👌 No new problems were found according to the checks applied ☁️ View the detailed Qodana report Contact Qodana teamContact us at qodana-support@jetbrains.com
|
79ede52 to
53a4964
Compare
53a4964 to
c3158c3
Compare
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces JSON-LD structured data for episodes, grouped episodes, and calendar entries to improve SEO. However, a high-severity Cross-Site Scripting (XSS) vulnerability was identified in how the JSON-LD is generated and rendered, allowing for script injection via user-controlled fields. It is strongly recommended to enable HTML escaping in the JSON utility to ensure characters like < are safely encoded as \u003c. Additionally, a critical issue was found in the templates where the JSON-LD content is not correctly rendered, which would break the feature, and minor code improvements were suggested in the JsonLdBuilder.
| </#if> | ||
|
|
||
| <#if !isReleased && release.anime.jsonLd??> | ||
| <script type="application/ld+json">${release.anime.jsonLd}</script> |
There was a problem hiding this comment.
The JSON-LD content will be HTML-escaped by FreeMarker by default, which will break the JSON structure (e.g., " becomes "). You should disable escaping for this variable to ensure the JSON is rendered correctly.
| <script type="application/ld+json">${release.anime.jsonLd}</script> | |
| <script type="application/ld+json">${release.anime.jsonLd?no_esc}</script> |
| </a> | ||
|
|
||
| <#if episodeMapping.jsonLd??> | ||
| <script type="application/ld+json">${episodeMapping.jsonLd}</script> |
There was a problem hiding this comment.
The JSON-LD content will be HTML-escaped by FreeMarker by default, which will break the JSON structure (e.g., " becomes "). You should disable escaping for this variable to ensure the JSON is rendered correctly.
| <script type="application/ld+json">${episodeMapping.jsonLd}</script> | |
| <script type="application/ld+json">${episodeMapping.jsonLd?no_esc}</script> |
| </a> | ||
|
|
||
| <#if groupedEpisode.jsonLd??> | ||
| <script type="application/ld+json">${groupedEpisode.jsonLd}</script> |
There was a problem hiding this comment.
The JSON-LD content will be HTML-escaped by FreeMarker by default, which will break the JSON structure (e.g., " becomes "). You should disable escaping for this variable to ensure the JSON is rendered correctly.
| <script type="application/ld+json">${groupedEpisode.jsonLd}</script> | |
| <script type="application/ld+json">${groupedEpisode.jsonLd?no_esc}</script> |
| </a> | ||
|
|
||
| <#if episodeMapping.jsonLd??> | ||
| <script type="application/ld+json">${episodeMapping.jsonLd}</script> |
There was a problem hiding this comment.
The JSON-LD content will be HTML-escaped by FreeMarker by default, which will break the JSON structure (e.g., " becomes "). You should disable escaping for this variable to ensure the JSON is rendered correctly.
| <script type="application/ld+json">${episodeMapping.jsonLd}</script> | |
| <script type="application/ld+json">${episodeMapping.jsonLd?no_esc}</script> |
| </div> | ||
|
|
||
| <#if episodeMapping.jsonLd??> | ||
| <script type="application/ld+json">${episodeMapping.jsonLd}</script> |
There was a problem hiding this comment.
The JSON-LD content will be HTML-escaped by FreeMarker by default, which will break the JSON structure (e.g., " becomes "). You should disable escaping for this variable to ensure the JSON is rendered correctly.
| <script type="application/ld+json">${episodeMapping.jsonLd}</script> | |
| <script type="application/ld+json">${episodeMapping.jsonLd?no_esc}</script> |
| </a> | ||
|
|
||
| <#if isReleased && release.jsonLd??> | ||
| <script type="application/ld+json">${release.jsonLd}</script> |
There was a problem hiding this comment.
The code review agent suggests disabling HTML escaping (?no_esc) to prevent FreeMarker from breaking the JSON structure. However, the security review agent highlights that rendering release.jsonLd directly into a <script> tag is dangerous due to the JSON generator (ObjectParser.toJson) having HTML escaping disabled. This creates a high-severity Cross-Site Scripting (XSS) vulnerability via </script> injection if any field contains malicious content. While the primary fix should be in the JSON generator, the template must also handle this data safely.
| @@ -95,4 +140,127 @@ class JsonLdBuilder { | |||
|
|
|||
| return ObjectParser.toJson(jsonLd) | |||
There was a problem hiding this comment.
The JSON-LD strings generated here are rendered directly into HTML <script> tags in multiple templates (e.g., calendar.ftl, episodeDetails.ftl). Because ObjectParser.toJson() is configured with disableHtmlEscaping(), characters like < and > are not escaped in the resulting JSON. If an attacker can control fields such as anime names, episode titles, or descriptions, they can inject a </script> tag to break out of the JSON-LD block and execute arbitrary JavaScript (Stored XSS).
To remediate this, you should enable HTML escaping in the Gson configuration within ObjectParser.kt by removing the .disableHtmlEscaping() call. This will cause Gson to escape < as \u003c, which is safe for embedding in HTML script tags while remaining valid JSON.
| dateModified = episode.lastUpdateDateTime, | ||
| anime = anime, | ||
| duration = episode.duration, | ||
| potentialActions = episode.sources.map { buildPotentialAction(it.url, it.platform, episode.releaseDateTime, anime.countryCode?.name ?: "FR") } |
There was a problem hiding this comment.
The anime object is of type AnimeDto, which has a non-nullable countryCode property. Therefore, the safe call ?. and the elvis operator with a default value ?: "FR" are unnecessary. You can directly access anime.countryCode.name for cleaner code.
| potentialActions = episode.sources.map { buildPotentialAction(it.url, it.platform, episode.releaseDateTime, anime.countryCode?.name ?: "FR") } | |
| potentialActions = episode.sources.map { buildPotentialAction(it.url, it.platform, episode.releaseDateTime, anime.countryCode.name) } |
| dateModified = episode.lastUpdateDateTime, | ||
| anime = anime, | ||
| duration = episode.duration, | ||
| potentialActions = episode.sources.map { buildPotentialAction(it.url, it.platform, episode.releaseDateTime, anime.countryCode?.name ?: "FR") } |
There was a problem hiding this comment.
The anime object is of type AnimeDto, which has a non-nullable countryCode property. Therefore, the safe call ?. and the elvis operator with a default value ?: "FR" are unnecessary. You can directly access anime.countryCode.name for cleaner code.
| potentialActions = episode.sources.map { buildPotentialAction(it.url, it.platform, episode.releaseDateTime, anime.countryCode?.name ?: "FR") } | |
| potentialActions = episode.sources.map { buildPotentialAction(it.url, it.platform, episode.releaseDateTime, anime.countryCode.name) } |
| anime = anime, | ||
| duration = mappings.firstOrNull { it.duration > 0 }?.duration, | ||
| potentialActions = mappings.flatMap { mapping -> | ||
| mapping.sources.map { buildPotentialAction(it.url, it.platform, mapping.releaseDateTime, anime.countryCode?.name ?: "FR") } |
There was a problem hiding this comment.
The anime object is of type AnimeDto, which has a non-nullable countryCode property. Therefore, the safe call ?. and the elvis operator with a default value ?: "FR" are unnecessary. You can directly access anime.countryCode.name for cleaner code.
| mapping.sources.map { buildPotentialAction(it.url, it.platform, mapping.releaseDateTime, anime.countryCode?.name ?: "FR") } | |
| mapping.sources.map { buildPotentialAction(it.url, it.platform, mapping.releaseDateTime, anime.countryCode.name) } |
No description provided.