Industrialize LLMs at Scale with ProActive
Jul 18, 2023 from Activeeon
Large language models (LLMs) have emerged as powerful tools that can comprehend and generate human-like text. These models, such as OpenAI’s ChatGPT, have opened up new possibilities in natural language processing, enabling applications like chatbots, text generation, and language translation to reach remarkable levels of sophistication.
However, taking advantage of the full potential of LLMs requires more than just building the models themselves. The deployment and monitoring of these complex AI systems present unique challenges. Deploying LLMs involves integrating them into real-world applications and ensuring their seamless operation, which means addressing infrastructure requirements, resource management, and scalability. Monitoring LLMs is equally crucial, both to track their performance and to maintain data privacy and security. That’s where ProActive AI Orchestration and its MLOps Dashboard come into play.
ProActive AI Orchestration offers a comprehensive solution for deploying and managing LLMs at scale. This powerful platform provides an environment that streamlines the deployment process, automates resource allocation, and optimizes the utilization of computational resources. With ProActive AI Orchestration, organizations can efficiently deploy LLMs in production environments and leverage their capabilities to drive innovation and enhance customer experiences.
The MLOps Dashboard, an integral component of ProActive AI Orchestration, allows both small and large companies to monitor and manage LLMs effectively. It provides a centralized interface for tracking performance metrics, visualizing resource utilization, and detecting potential issues in real time. It also includes features specifically designed to manage and monitor the underlying model servers.
In this blog, we show how to deploy and monitor LLMs using ProActive AI Orchestration and its MLOps Dashboard. We cover best practices, discuss real-world use cases, and highlight the benefits of adopting this solution. Whether you are a data scientist, a software engineer, or an AI enthusiast, this blog will provide you with practical insights to unlock the full potential of LLMs in your AI applications.
Hugging Face is a key enabler for Generative AI use cases, offering pre-trained Large Language Models (LLMs) for a diverse range of applications. For this blog, we use the pre-trained GPT-2 model from Hugging Face. GPT-2 is a transformer-based language model trained on a large corpus of English text to predict the next word in a sentence given the preceding context. We integrate the code for downloading the GPT-2 model from Hugging Face into a workflow built with the ProActive Workflows Studio, a tool designed for creating robust and scalable workflows. With its visual interface, users can design complex workflows by dragging and dropping predefined components onto a canvas and connecting them together. This streamlines development and yields workflows that are ready for deployment in a production environment.
Figure 1. Model deployment from Hugging Face using the ProActive Workflows Studio.
Figure 2. Launching GPT-2 model deployment using ProActive Workflows Studio.
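The download step mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not the code of the actual ProActive workflow task: it assumes the `transformers` library is installed, and the function names `local_model_dir` and `download_model`, as well as the `models` folder, are hypothetical names chosen for this example. Only the Hub identifier `gpt2` and the `from_pretrained` / `save_pretrained` calls come from the Hugging Face library itself.

```python
from pathlib import Path


def local_model_dir(base_dir: str, model_id: str) -> Path:
    """Return the folder where the files of a given model will be stored."""
    # Hub identifiers may contain '/', e.g. 'org/name'; flatten it so the
    # result is a single folder name. (Hypothetical naming convention.)
    return Path(base_dir) / model_id.replace("/", "--")


def download_model(model_id: str = "gpt2", base_dir: str = "models") -> Path:
    """Download the tokenizer and weights from the Hugging Face Hub and save them locally."""
    # Imported lazily so that this module can be loaded even on a machine
    # where transformers is not installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    target = local_model_dir(base_dir, model_id)
    target.mkdir(parents=True, exist_ok=True)

    # Fetch the pre-trained artifacts (network access is required here).
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Persist both so that a model server can later load them from disk.
    tokenizer.save_pretrained(target)
    model.save_pretrained(target)
    return target
```

Calling `download_model()` with its defaults would place the GPT-2 files under `models/gpt2`. In a workflow, a step like this would typically run as one task, with the returned path handed to the task that starts the model server.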
Once deployed, the GPT-2 model can be monitored using the MLOps Dashboard, which provides three distinct tabs: Model Servers Monitoring, Models Resource Usage, and Dashboard Resource Usage.
The “Model Servers Monitoring” tab focuses on overseeing the health and performance of the model servers or serving infrastructure. It consists of two main components: widgets and a table listing the model servers and their characteristics.
Figure 3. Model Servers and Models Monitoring.
The first component of the tab includes six main widgets that offer insights into the overall performance and usage of the serving infrastructure. These widgets cover aspects such as the number of running model servers, GPU utilization, deployed models count, inference times, and inference rates. The second component of the tab features a detailed table that lists the model servers along with their specific characteristics. This table provides comprehensive information about each model server instance, including its ID, status, start time, node information, GPU allocation, model registry location, and more.
The “Models Resource Usage” tab in the MLOps monitoring dashboard provides users with insights into CPU and GPU resource utilization, enabling them to make informed decisions and optimize their system’s resource consumption.
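To make the inference time and inference rate widgets more concrete, here is a small sketch of how such figures can be derived from raw request durations. It is an illustration under stated assumptions, not the dashboard's actual implementation: the function name `summarize_inferences` and the fixed observation window are hypothetical.

```python
from statistics import mean


def summarize_inferences(durations: list[float], window_seconds: float) -> dict[str, float]:
    """Summarize the inference requests observed during one time window.

    durations: time spent serving each request, in seconds.
    window_seconds: length of the observation window, in seconds.
    """
    # An empty or invalid window has no meaningful statistics.
    if not durations or window_seconds <= 0:
        return {"mean_inference_time": 0.0, "inference_rate": 0.0}

    return {
        # Average time the server spent on a single request.
        "mean_inference_time": mean(durations),
        # Number of requests served per second over the window.
        "inference_rate": len(durations) / window_seconds,
    }


# Three requests observed over a one-minute window.
summary = summarize_inferences([0.2, 0.4, 0.6], window_seconds=60.0)
```

A rising mean inference time at a steady inference rate usually points to a resource bottleneck on the model server, which is the kind of trend these widgets are meant to surface.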
Figure 4. Model Resource Usage.
The first part of the tab features ten widgets that offer real-time metrics, such as average CPU and GPU utilization, memory consumption, and available memory. These widgets provide users with a comprehensive overview of resource usage, empowering them to monitor performance trends, identify bottlenecks, and optimize resource allocation effectively. The second part of the tab presents graphs that provide detailed time series data for each model server, focusing on CPU and GPU utilization, memory usage, and power consumption.
The “Dashboard Resource Usage” tab in the monitoring dashboard provides users with valuable insights into the resource consumption of the entire system. This tab offers a comprehensive overview of CPU utilization, memory consumption, disk memory usage, and network traffic, enabling users to effectively monitor and optimize resource allocation.
Figure 5. Dashboard Resource Usage.
The first part of the tab focuses on providing key metrics that reflect the overall system resource consumption. These metrics include CPU utilization, memory consumption, total available memory, used memory, and free memory. By monitoring these metrics, users can assess the workload of the system, ensure sufficient memory resources are available, and identify any performance issues or bottlenecks. The second part of the “Dashboard Resource Usage” tab features time series graphs that provide deeper insights into CPU utilization, memory usage, disk memory, and network traffic. These graphs offer a visual representation of the system’s resource consumption over time, facilitating trend analysis and identification of potential anomalies.
Once deployed, the GPT-2 model becomes accessible via REST APIs for consumption by any program or web application. This flexibility enables users to integrate the model into their existing systems and to make real-time predictions for various tasks and use cases.
In this blog, we demonstrate the consumption of the GPT-2 model with a Streamlit app that we developed. The app shows how to consume the GPT-2 model deployed in production, and it also serves as a template for building a client that can consume any LLM deployed through the MLOps Dashboard.
When consuming the model, there are two mandatory parameters that the application must consider: the Inference Endpoint and the Model Name. These parameters are crucial for establishing the connection and accessing the deployed model. The Inference Endpoint specifies the URL or endpoint where the model is hosted, while the Model Name identifies the specific model to be used for generating predictions.
Figure 6. Real-time GPT-2 predictions using Streamlit app.
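The two mandatory parameters, the Inference Endpoint and the Model Name, can be illustrated with a minimal REST client using only the Python standard library. This is a sketch under loudly stated assumptions: the URL layout (`/models/<name>/predict`), the JSON request body, and the function names are hypothetical placeholders, not the documented API of the model server. Check the ProActive AI Orchestration documentation for the real request format.

```python
import json
import urllib.request


def build_inference_request(inference_endpoint: str, model_name: str, prompt: str) -> urllib.request.Request:
    """Build an HTTP POST request for a deployed model."""
    # ASSUMPTION: the path and the payload shape below are hypothetical;
    # replace them with the format expected by your model server.
    url = f"{inference_endpoint.rstrip('/')}/models/{model_name}/predict"
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(inference_endpoint: str, model_name: str, prompt: str, timeout: float = 30.0) -> dict:
    """Send the prompt to the deployed model and return the decoded JSON response."""
    request = build_inference_request(inference_endpoint, model_name, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building a request does not contact the server, so it is safe to inspect offline.
example = build_inference_request("http://localhost:8000/", "gpt2", "Hello")
```

Keeping request construction separate from the network call makes the client easy to test, and lets a front end such as a Streamlit app expose the endpoint and model name as two input fields.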
The Streamlit app provides a user-friendly interface where users can input data and obtain predictions generated by the deployed GPT-2 model. With its intuitive design and seamless integration, the app enables users to interact with the model effortlessly, making it accessible to both technical and non-technical individuals.
Through the Streamlit app, users can conveniently enter their input, typically a text prompt. The app then sends this input to the deployed GPT-2 model and displays the generated predictions. GPT-2 itself is a text generation model; when the same client is pointed at another deployed LLM, the predictions can cover other tasks, such as sentiment analysis or language translation, depending on the capabilities of that model.
In summary, the Streamlit app simplifies the interaction with the model, enabling users to input data effortlessly and obtain accurate predictions in a user-friendly manner. With its intuitive interface, real-time feedback, and customization options, the app empowers users to unlock the full potential of the deployed GPT-2 model for a wide range of tasks and applications.
For more information and detailed guidance, please refer to the ProActive AI Orchestration documentation or contact us.