Development of a deployment platform for ONNX models
Abstract
Artificial intelligence models, and machine learning models in particular, are seeing increasing adoption across many fields and domains. Consequently, the demand for efficient deployment solutions is becoming urgent. Ensuring seamless model management, reliable deployment, and fast inference remains a key challenge. This work addresses that challenge with a platform for ONNX model deployment that provides a streamlined approach to model versioning, metadata management, and inference execution. To store model files and their associated metadata efficiently, the platform uses MongoDB together with GridFS. The platform also manages model versioning: each model version is stored as a separate entry, so multiple versions of a model can coexist without deleting earlier ones. The prototype is developed quickly using rapid prototyping and evaluated iteratively following a design science approach. Deployment and inference are evaluated using performance metrics such as resource utilization and speed, while usability and robustness are evaluated through structured test cases and user feedback. Deployment results indicate efficient resource utilization and rapid inference, although scalability remains a challenge, especially for large models. Usability testing confirms an intuitive interface, ease of use, and general user satisfaction. Robustness testing shows that the platform handles unexpected scenarios without crashing and remains operable. Overall, the platform successfully addresses ONNX model deployment challenges while remaining easy to use, even for non-technical users. Future work could include more advanced model versioning, inference optimizations, and integration with external platforms.
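The versioning scheme described in the abstract, in which every upload becomes a separate entry and earlier versions are never deleted, can be illustrated with a minimal sketch. This is not the platform's implementation: the class `InMemoryModelStore`, the `ModelVersion` record, and all field names are hypothetical, and a plain Python list stands in for the MongoDB collection and GridFS bucket that the platform uses.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
import hashlib


@dataclass(frozen=True)
class ModelVersion:
    """One stored version of a model: the file bytes plus its metadata."""
    name: str
    version: int
    data: bytes
    sha256: str
    uploaded_at: datetime
    metadata: dict = field(default_factory=dict)


class InMemoryModelStore:
    """Hypothetical stand-in for a MongoDB/GridFS-backed model store."""

    def __init__(self):
        self._entries = []  # one ModelVersion per upload, never overwritten

    def versions(self, name):
        """Return all stored versions of a model, oldest first."""
        found = [e for e in self._entries if e.name == name]
        return sorted(found, key=lambda e: e.version)

    def latest(self, name):
        """Return the newest version of a model, or None if it is unknown."""
        found = self.versions(name)
        return found[-1] if found else None

    def upload(self, name, data, metadata=None):
        """Store the bytes as a new entry with the next version number."""
        previous = self.latest(name)
        entry = ModelVersion(
            name=name,
            version=1 if previous is None else previous.version + 1,
            data=data,
            sha256=hashlib.sha256(data).hexdigest(),
            uploaded_at=datetime.now(timezone.utc),
            metadata=dict(metadata or {}),
        )
        self._entries.append(entry)
        return entry

    def get(self, name, version):
        """Fetch one specific version; raise KeyError if it does not exist."""
        for entry in self._entries:
            if entry.name == name and entry.version == version:
                return entry
        raise KeyError(f"{name} v{version} not found")
```

Under this sketch, uploading the same model name twice yields versions 1 and 2, and both remain retrievable through `get`. In a database-backed variant, the file bytes would go to GridFS and the remaining fields to a metadata document that references the stored file.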