fix: audio api endpoint filetype check

RFC2046 allows the Content-Type field to carry additional parameters after the main type/subtype information (Section 1). Following RFC4281, many applications put codec information in those parameters. This is especially common for formats that support many codecs, such as Ogg (RFC5334, Section 4).

The `/api/audio/transcriptions` endpoint currently rejects, with a bad request error, any file whose Content-Type field contains parameters. This commit changes the check to accept any Content-Type field that begins with a supported type/subtype, as listed in the `supported_filetypes` tuple.

Since Content-Type here is provided by the user, I believe this check is meant to catch honest mistakes, like posting a PDF to an audio processing endpoint, not to serve as a security measure against possibly malicious use. Therefore, I think it's OK not to validate the rest of the field.
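
For illustration, here is a minimal standalone sketch of the difference between the old exact-match check and the new prefix check. The helper names `exact_match` and `prefix_match` are hypothetical and do not exist in the codebase; only the list of supported types is taken from the diff. It relies on `str.startswith` accepting a tuple of prefixes.

```python
# Supported type/subtype values, copied from the diff.
SUPPORTED_FILETYPES = ("audio/mpeg", "audio/wav", "audio/ogg", "audio/x-m4a")


def exact_match(content_type: str) -> bool:
    """Old behaviour (hypothetical helper): the whole field must match exactly."""
    return content_type in SUPPORTED_FILETYPES


def prefix_match(content_type: str) -> bool:
    """New behaviour (hypothetical helper): the field must begin with a supported type."""
    # str.startswith returns True if any prefix in the tuple matches.
    return content_type.startswith(SUPPORTED_FILETYPES)


if __name__ == "__main__":
    for value in ("audio/ogg", "audio/ogg; codecs=opus", "application/pdf"):
        print(f"{value!r}: exact={exact_match(value)} prefix={prefix_match(value)}")
```

With these inputs, `audio/ogg; codecs=opus` fails the exact match but passes the prefix match, while `application/pdf` fails both. As noted above, nothing after the matched prefix is validated.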
2025-03-27 02:02:31 +01:00 · 2025-03-08 17:29:59 -03:00 · 2025-03-08 17:29:59 -03:00 · e936d7b53d
commit e936d7b53d
parent 3b70cd64d7
1 changed file with 3 additions and 1 deletion
--- a/backend/open_webui/routers/audio.py
+++ b/backend/open_webui/routers/audio.py
@@ -625,7 +625,9 @@ def transcription(
 ):
    log.info(f"file.content_type: {file.content_type}")

-    if file.content_type not in ["audio/mpeg", "audio/wav", "audio/ogg", "audio/x-m4a"]:
+    supported_filetypes = ("audio/mpeg", "audio/wav", "audio/ogg", "audio/x-m4a")
+
+    if not file.content_type.startswith(supported_filetypes):
        raise HTTPException(
            status_code=status.HTTP_400_BAD_REQUEST,
            detail=ERROR_MESSAGES.FILE_NOT_SUPPORTED,