A new product that automates the process of tuning up general-purpose servers and networking components has been launched by two-year-old New York startup Concertio. The Concertio Optimizer suite of tools was released in February, and can speed up a baseline system by up to eight times in some use cases, according to the company. It's powered by an artificial intelligence engine developed in-house. The initial product is a static optimization tool aimed at performance engineers, but Optimizer Runtime (under development) is a dynamic tool that could automatically adapt the infrastructure as workload requirements change. That's when things start to become interesting.
The 451 Take
Performance engineers aren't going anywhere – at least for a while. Many have specialized skills that have taken a lifetime to learn. However, most have a fairly narrow area of expertise (CPU, networking or databases, for instance), and when they tune up a system, that is where they concentrate their efforts. Many other tunable parameters – now available in the hundreds – are ignored and left on their default settings. Operational management tools are being transformed by the addition of automation and machine learning. Cloud datacenter operators, in particular, will require dynamic, automated tools if they are to repurpose their pools of infrastructure to fit the unpredictable demands of their customers. Application vendors could offer optimizations out of the box for their products. There's also a growing opportunity in commercial companies and HPC shops as scale-out clusters, traditionally hard to tune up, gain traction.
Founded in 2016 under the name DatArcs by Technion alums Tomer Morad (CEO) and Andrey Gelman (CTO), Concertio came out of the Jacobs Technion-Cornell Institute's Runway startup incubator program. The institute is part of a joint venture between Cornell Tech in New York and Technion (aka the Israel Institute of Technology) in Haifa, Israel. Morad is a technology veteran with experience in the Israel Defense Forces, followed by technical stints at Horizon Semiconductors (decoder chips) and tranSpot (digital advertising), the latter of which he cofounded and served as CTO and later CEO. Gelman, an expert in server and embedded Linux, also served as a technical officer in the Defense Forces and worked on firmware at Horizon. After that, he worked as a software engineer at LSI and at CompuLab.
Increasing hardware and software complexity has made it too complicated for performance engineers to cover everything. There are now hundreds of tunable parameters that can influence performance – from controlling the behavior of the CPU, peripheral devices and firmware to the operating system and user-level software. These vary depending on the exact hardware configuration and type of workload, and must be considered in context, since many settings are connected, making tuning a moving target. Today, baseline servers from the factory are sold as general-purpose systems and set up out of the box to match the widest range of workloads possible. In most cases, those tunable parameters remain on their default settings. Optimizer Studio takes advantage of machine learning technologies to tune systems for peak performance. The software suite monitors and learns from interactions between applications and systems through a built-in workload classification engine that can detect the different execution phases of a particular workload. It uses reinforcement learning to evaluate and optimize each phase, testing various configuration settings until the performance improves, and then produces a report that performance engineers and IT professionals can apply.
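The test-and-keep loop described above can be pictured with a minimal sketch. This is not Concertio's algorithm: it swaps reinforcement learning for plain random hill climbing, the knob names and values are illustrative only, and the synthetic_benchmark function is a made-up stand-in for running a real workload. It is meant only to show the shape of the search, including one interaction between knobs of the kind that makes settings context-dependent.

```python
import random

# Hypothetical knob space: names and values are illustrative only.
KNOBS = {
    "io_scheduler": ["noop", "deadline", "cfq"],
    "smt": ["on", "off"],
    "dvfs_policy": ["powersave", "ondemand", "performance"],
}


def synthetic_benchmark(config):
    """Stand-in for a real workload run; returns a made-up throughput score."""
    score = 100.0
    score += {"noop": 0, "deadline": 15, "cfq": 5}[config["io_scheduler"]]
    score += {"powersave": -10, "ondemand": 0, "performance": 20}[config["dvfs_policy"]]
    # Interaction between knobs: in this toy model SMT only helps
    # when the CPU is pinned to its performance policy.
    if config["smt"] == "on":
        score += 10 if config["dvfs_policy"] == "performance" else -5
    return score


def tune(knobs, benchmark, baseline, iterations=200, seed=0):
    """Random hill climbing: change one knob at a time, keep improvements."""
    rng = random.Random(seed)
    best = dict(baseline)
    best_score = benchmark(best)
    for _ in range(iterations):
        candidate = dict(best)
        name = rng.choice(sorted(knobs))
        candidate[name] = rng.choice(knobs[name])
        score = benchmark(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


# Factory-style defaults as the starting point.
baseline = {"io_scheduler": "cfq", "smt": "on", "dvfs_policy": "ondemand"}
best, best_score = tune(KNOBS, synthetic_benchmark, baseline)
print("baseline score:", synthetic_benchmark(baseline))
print("tuned config:", best, "score:", best_score)
```

In a real setting each benchmark call is an expensive workload run, which is why a learning-based search that needs fewer trials matters more than it does in this toy example.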
What sort of tunable parameters (or 'knobs' as the company calls them) are we talking about? Out-of-the-box support is provided for things such as task affinity, NUMA page migration, PCIe optimization, IO scheduler choice, task scheduling granularity, dynamic voltage and frequency scaling policy, symmetric multi-threading, and CPU last level cache prefetching. More can be added by editing configuration files to support additional components. Initially, Optimizer runs on x86 servers and can configure certain Intel CPU settings. Operating systems supported are CentOS and Debian Linux. Other platforms are under evaluation. However, Optimizer extends beyond the server itself to associated networking cards; storage appliances; ASICs; and software products, web-scale applications and mobile devices. It can also be applied to complete systems, including cloud and on-premises datacenters – in fact, anything that interacts with tunable parameters for optimal performance.
Optimizer Runtime, currently in beta testing, is the dynamic version of the product that works automatically and in real time. It also monitors – and learns from – interactions between applications and systems, but instead of producing a report, it can adapt and reconfigure systems dynamically as the requirements to support different workloads change over time. Using the Runtime framework – which can be embedded within appliances and applications, and used on bare metal, in VMs and in containers – tunable settings are discovered automatically, either to maximize application performance or to minimize cloud and datacenter costs by using fewer resources while keeping to the required SLAs. Optimizer Studio is priced via annual subscription on a per-user basis. Runtime is priced per device.
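The dynamic behavior attributed to Runtime, detecting a change in workload phase and reconfiguring in response, can be sketched roughly as follows. Everything here is an assumption made for illustration: the phase names, the two-metric signatures, the per-phase settings and the nearest-centroid classifier are hypothetical and do not describe Concertio's implementation.

```python
# Hypothetical per-phase settings, as if discovered earlier by a tuner.
PHASE_CONFIGS = {
    "cpu_bound": {"dvfs_policy": "performance", "smt": "off"},
    "io_bound": {"dvfs_policy": "ondemand", "io_scheduler": "deadline"},
}

# Hypothetical phase signatures: (cpu_utilization, io_wait), each in 0..1.
PHASE_CENTROIDS = {
    "cpu_bound": (0.9, 0.05),
    "io_bound": (0.3, 0.6),
}


def classify_phase(sample):
    """Return the phase whose signature is nearest to the metric sample."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(PHASE_CENTROIDS, key=lambda p: dist2(sample, PHASE_CENTROIDS[p]))


def run(samples, apply_config):
    """Classify each sample and reconfigure only when the phase changes."""
    current = None
    history = []
    for sample in samples:
        phase = classify_phase(sample)
        if phase != current:
            apply_config(PHASE_CONFIGS[phase])
            current = phase
        history.append(phase)
    return history


# Simulated metric stream; a real agent would sample live counters instead.
applied = []
samples = [(0.95, 0.02), (0.88, 0.04), (0.25, 0.7), (0.35, 0.55), (0.92, 0.03)]
phases = run(samples, applied.append)
print(phases)
print(len(applied), "reconfigurations")
```

Reconfiguring only on a phase change, rather than on every sample, keeps the overhead of applying settings low in this sketch. A production system would also need to guard against flapping between phases.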
It's still early for Concertio, so its go-to-market strategy is still coming together. A lot depends on proving out its story, which is why the company recently set up what it calls 'an experiment' with Mellanox where a specific networking use case involving Mellanox's ConnectX-3 Pro Ethernet cards was tuned by Optimizer Studio, as well as manually by Mellanox's own performance engineers. Optimizer achieved a performance improvement of 80%, compared with 62% from the manual tuning. Concertio has also worked with New York-based bare-metal cloud provider Packet, which runs Dell, SuperMicro, Quanta and Foxconn Technology servers in its datacenters. Packet customers use the infrastructure for a broad range of activities, with many running cloud-native workloads. Users take advantage of automation to swap out servers frequently, add new hardware, or scale resources up and down based on demand. Manual tuning to optimize performance becomes impractical in these circumstances. For end users, maximizing performance or lowering cloud and datacenter costs are the main drivers. Hardware or software vendors might use Optimizer to discover the best market-ready default configurations for their specific products and target markets.
Many factors influence performance, and if application code has been badly written or the infrastructure resources are underspecified, then optimization won't be much help. In some ways, Concertio is extending the well-established software performance testing sector with a dynamic, real-time alternative. Meanwhile, vendors are increasingly looking to add automation tools to their specific products. In servers, for instance, HPE has developed its Intelligent System Tuning suite in conjunction with Intel. In software, Microsoft has added automated optimization tools to SQL Server 2017, Oracle has Java performance-tuning tools and Cisco has recently introduced its Assurance Engine (initially more about compatibility and compliance than performance). Of course, in the client systems world, where things are much less complicated, performance-optimization tools are commodity utilities, but they typically achieve only small, incremental improvements that mostly go unnoticed.
However, there's a growing opportunity as the deployment of software-defined, converged infrastructure and hybrid systems gains momentum. There's an affinity here with declarative modeling techniques from companies such as Vnomic, which models workloads to better match the infrastructure, and with automation engines such as Ansible (now part of Red Hat). Application performance monitoring tools such as AppDynamics, Dynatrace and New Relic are expanding into adjacent territories, such as clouds, containers and serverless; turning their tools into more extensible platforms; and adding support for machine learning.