Feature #7241
Optimize PDF Processing Pipeline and Large File Uploads to AWS S3
Status: open
Description
Currently, a single API endpoint is responsible for handling multiple operations, including:
Downloading the PDF file from AWS S3 to temporary storage
Reading and extracting data from the PDF using a Python script
Processing extracted data (e.g., BOM extraction, matching, persistence)
Handling additional unrelated events within the same request lifecycle
This design has led to high API response times, increased memory usage, and poor scalability.
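One possible direction, sketched here only as an illustration and not as a decision recorded in this ticket, is to have the endpoint enqueue a job and return immediately while a background worker runs the download, extraction, and processing steps. Everything in the sketch is hypothetical: the names `submit`, `worker`, `download_pdf`, `extract_data`, and `process_bom` are placeholders, and the in-memory queue and status dictionary stand in for whatever durable broker and database the project would actually use.

```python
import queue
import threading
import uuid

# Hypothetical in-memory stand-ins for a durable job broker and a job table.
jobs = queue.Queue()
status = {}


def download_pdf(s3_key):
    """Placeholder for the S3 download step; returns fake PDF bytes."""
    return b"%PDF-placeholder:" + s3_key.encode()


def extract_data(pdf_bytes):
    """Placeholder for the existing Python extraction script."""
    return {"size": len(pdf_bytes)}


def process_bom(data):
    """Placeholder for BOM extraction, matching, and persistence."""
    return {"bom_rows": 0, **data}


def submit(s3_key):
    """What the API endpoint would do: record the job and return at once."""
    job_id = uuid.uuid4().hex
    status[job_id] = "queued"
    jobs.put((job_id, s3_key))
    return job_id


def worker():
    """Run queued jobs outside the request lifecycle until None is received."""
    while True:
        item = jobs.get()
        if item is None:  # shutdown signal
            jobs.task_done()
            break
        job_id, s3_key = item
        try:
            status[job_id] = "running"
            process_bom(extract_data(download_pdf(s3_key)))
            status[job_id] = "done"
        except Exception:
            status[job_id] = "failed"
        finally:
            jobs.task_done()


if __name__ == "__main__":
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    job_id = submit("uploads/example.pdf")  # hypothetical S3 key
    jobs.join()                             # wait for the worker to finish
    print(job_id, status[job_id])
    jobs.put(None)
    thread.join()
```

With this shape the request only pays for `submit`, and a client would poll the job status (or be notified) instead of holding the connection open for the whole pipeline.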
Additionally, large file uploads to AWS S3 are not optimized, resulting in slow uploads and potential failures under load.
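A common way to make large S3 uploads faster and more resilient is multipart upload, where the file is sent in independent parts that can be retried or parallelized. The ticket does not say which approach will be used, so the following is only a sketch of the planning step under that assumption. The function name `plan_parts` and the 16 MiB default part size are hypothetical; the limits in the constants (5 MiB minimum part size except for the last part, 5 GiB maximum part size, 10,000 parts per upload) are the documented S3 multipart constraints.

```python
MIB = 1024 ** 2

# Documented S3 multipart upload limits.
MIN_PART_SIZE = 5 * MIB          # every part except the last
MAX_PART_SIZE = 5 * 1024 * MIB   # 5 GiB
MAX_PARTS = 10_000


def plan_parts(file_size, preferred_part_size=16 * MIB):
    """Return a list of (part_number, offset, length) tuples for a file.

    The part size starts from the preferred value (an arbitrary default
    chosen for this sketch) and grows when needed so that the upload
    stays within MAX_PARTS.
    """
    if file_size <= 0:
        raise ValueError("file_size must be positive")

    # Smallest part size that still fits the file into MAX_PARTS parts
    # (integer ceiling division).
    size_for_part_limit = -(-file_size // MAX_PARTS)
    part_size = max(preferred_part_size, MIN_PART_SIZE, size_for_part_limit)
    if part_size > MAX_PART_SIZE:
        raise ValueError("file is too large for a single multipart upload")

    return [
        (number, offset, min(part_size, file_size - offset))
        for number, offset in enumerate(range(0, file_size, part_size), start=1)
    ]


if __name__ == "__main__":
    for part in plan_parts(40 * MIB):
        print(part)
```

Each planned range would then be read from disk and sent as one part, for example through the S3 `UploadPart` operation, so a failure under load costs one part rather than the whole file.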