Badfish gets the asyncio treatment. When QUADS moved to provisioning the servers in our lab asynchronously, much of the Badfish functionality it relies on was converted to asyncio in the process. This is the first step on the road to unifying the standalone Badfish tool with the code used from within QUADS.
The standalone Badfish tool has most commonly been used against single servers, a use case that gains little from executing multiple actions asynchronously. Although we also managed to reduce the execution time of a single-server action, the greatest benefit of these modifications shows when running against multiple servers. This is the case for QUADS, which uses Badfish for most of its provisioning steps. Furthermore, with the introduction of the --host-list argument, which takes a plain text file with a list of server FQDNs, we can now see significant improvements in execution time for actions run against multiple servers at once.
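To illustrate why a host list benefits from asyncio, here is a minimal sketch of running one action per host concurrently. It is not the Badfish implementation: run_action, run_on_hosts and read_host_list are hypothetical names, and the sleep merely stands in for waiting on a server.

```python
import asyncio
from typing import Dict, List, Tuple


async def run_action(host: str) -> Tuple[str, bool]:
    """Hypothetical stand-in for one Badfish action against one host."""
    await asyncio.sleep(0.1)  # simulates waiting on the server's Redfish API
    return host, True


async def run_on_hosts(hosts: List[str]) -> Dict[str, bool]:
    # Schedule one coroutine per host and wait for all of them together,
    # so the waits overlap instead of adding up.
    results = await asyncio.gather(*(run_action(host) for host in hosts))
    return dict(results)


def read_host_list(path: str) -> List[str]:
    # Assumed file format: one FQDN per line; blank lines are skipped.
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]
```

Calling asyncio.run(run_on_hosts(read_host_list("hosts.txt"))) would then take roughly as long as the slowest host rather than the sum of all hosts.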
One of the greatest pain points we identified in the performance metrics of Badfish was the repeated HTTP GET requests to the Redfish API. Some methods kept requesting the same endpoint, which returned the same data on every call. async_lru to the rescue.
With a simple decorator on the base method for all our Redfish API GET requests, calls to this method with the same parameters (in our case, uri) are executed only the first time. The result is stored in the alru_cache, and any subsequent call picks up the response from the method's cache instead of sending yet another HTTP request to the server.
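The caching idea can be sketched with the standard library alone. The decorator below is a simplified stand-in for what alru_cache from async_lru provides (no size limit, no eviction), and cache_by_args, get_request, calls and the example URI are illustrative assumptions rather than Badfish code.

```python
import asyncio
import functools


def cache_by_args(func):
    """Simplified stand-in for an async LRU cache: no size limit, no eviction."""
    cache = {}

    @functools.wraps(func)
    async def wrapper(*args):
        if args not in cache:
            # Store the task itself, so concurrent callers with the same
            # arguments share a single in-flight request.
            cache[args] = asyncio.ensure_future(func(*args))
        return await cache[args]

    return wrapper


calls = []  # records every simulated HTTP request, for demonstration only


@cache_by_args
async def get_request(uri):
    calls.append(uri)
    await asyncio.sleep(0.01)  # stands in for the HTTP round trip
    return {"uri": uri}


async def main():
    first = await get_request("/redfish/v1/Systems")
    second = await get_request("/redfish/v1/Systems")  # served from the cache
    return first, second
```

After asyncio.run(main()), the calls list holds a single entry even though get_request was awaited twice. Note that this sketch also caches failures, which a real client would probably want to avoid.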
The logging conundrums
As we mentioned in our previous post, asyncio lets Python move on to other tasks while one of them waits on heavy I/O. The Python logging library, however, is not an asynchronous-friendly component: it performs I/O itself, so every log call can hold up the rest of the execution.
In comes the queue. Python 3.2 introduced a new handler in the logging library, the QueueHandler class, which comes along with the QueueListener. The QueueHandler pushes records into a queue, while the QueueListener starts its own thread to watch that queue and send records to the logging handlers it manages. This moves the logging I/O off the main execution path, so it no longer blocks.
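A minimal sketch of that wiring, using only the standard library, might look like the following. The function name setup_queue_logging and the logger name are assumptions for illustration, not the names used in Badfish.

```python
import logging
import logging.handlers
import queue


def setup_queue_logging(target_handler, name="badfish_sketch"):
    """Route a logger through a queue so emitting a record never waits on I/O."""
    log_queue = queue.Queue(-1)  # unbounded queue

    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate output via the root logger
    # The logger itself only ever enqueues records.
    logger.addHandler(logging.handlers.QueueHandler(log_queue))

    # The listener thread drains the queue and calls the real handler.
    listener = logging.handlers.QueueListener(log_queue, target_handler)
    listener.start()
    return logger, listener
```

A caller would create the pair with setup_queue_logging(logging.StreamHandler()), log as usual, and call listener.stop() before exiting so that any queued records are flushed.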
The next issue we had to address was making sense of the disordered output. With multiple actions now executing asynchronously on multiple servers, the output written to stdout comes out scrambled. To make it more readable, we added a prefix tag to the log messages that indicates the hostname for which the action is being executed. Additionally, once all tasks complete, we print a summary listing each hostname and the final status of its action, which reads SUCCESSFUL or FAILED accordingly. Note that the prefix hostname on the output logs will only be visible when running with
WARNING: The following graphs might be disturbing to some readers.
Taking into consideration the aforementioned improvements, the performance results we got are outstanding.
The results illustrated here show a magnificent improvement in our execution time against multiple servers. It is also worth mentioning the performance improvement for a single node, which mostly benefits from the alru_cache and some additional refactoring.
The future of Badfish
Along with these changes, we also adopted setuptools to make Badfish independent, so it can now be installed via the setup.py script. In the future, we would like the Badfish code to be completely removed from the QUADS code base and brought in as an external library dependency, installed as either an RPM or a PyPI package. We expect these changes to encourage other projects to integrate Badfish into their code base as a handy tool, verified and tested, for wrangling a heterogeneous set of servers.