Training neural networks takes a long time, even with the fastest and most expensive accelerators on the market. It is no surprise, then, that many startups are looking for ways to speed up the process at the software level and remove some of the existing bottlenecks in training. For Strong Compute, a Sydney, Australia-based startup recently accepted into Y Combinator's Winter ’22 class, the entire effort is focused on removing these inefficiencies in the training process. By doing so, the team claims, it can speed up training by 100 times or more.
“PyTorch is great, and so is TensorFlow. These toolkits are great, but their simplicity and ease of implementation come at the cost of inefficiencies under the hood,” said Ben Sand, CEO and founder of Strong Compute, who previously co-founded the AR company Meta (before Facebook used the name).
While there are companies that optimize the models themselves, and Strong Compute will do so if its customers ask, Sand said that doing so can compromise the results. Instead, the team focuses on everything around the model. That could be a slow data pipeline, or values that can be precomputed before training starts. Sand also noted that the company has optimized some commonly used data augmentation libraries.
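To make the idea of precomputing values before training concrete, here is a minimal sketch in plain Python. It is not Strong Compute's method; `expensive_feature`, `CountingTransform`, `train_naive`, and `train_precomputed` are hypothetical names invented for this illustration, and the "training step" is reduced to a running sum so the example stays self-contained.

```python
from typing import Callable, List


def expensive_feature(x: float) -> float:
    """Hypothetical stand-in for a costly, deterministic preprocessing step."""
    total = 0.0
    for k in range(1, 200):
        total += (x ** 2) / k
    return total


class CountingTransform:
    """Wraps a transform and counts how many times it is invoked."""

    def __init__(self, fn: Callable[[float], float]) -> None:
        self.fn = fn
        self.calls = 0

    def __call__(self, x: float) -> float:
        self.calls += 1
        return self.fn(x)


def train_naive(data: List[float], transform: Callable[[float], float], epochs: int) -> float:
    """Recomputes the transform for every sample in every epoch."""
    checksum = 0.0
    for _ in range(epochs):
        for x in data:
            checksum += transform(x)  # placeholder for a real training step
    return checksum


def train_precomputed(data: List[float], transform: Callable[[float], float], epochs: int) -> float:
    """Computes the transform once per sample, then reuses the cached values."""
    cached = [transform(x) for x in data]  # done once, before the epoch loop
    checksum = 0.0
    for _ in range(epochs):
        for feature in cached:
            checksum += feature  # placeholder for a real training step
    return checksum
```

With four samples and five epochs, the naive loop invokes the transform once per sample per epoch, while the precomputed version invokes it once per sample in total, and both consume the same values. This kind of caching only applies to deterministic steps; randomized data augmentation has to be recomputed each time and needs other optimizations.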
The company recently hired Richard Prawse, a former Cisco chief engineer, to focus on eliminating network bottlenecks in the training pipeline, which can add significant latency. Hardware can of course make a big difference too, which is why Strong Compute works with its customers to make sure their models run on the right platforms.
“Strong Compute cut our core algorithm training from thirty hours to five minutes, processing hundreds of terabytes of data,” said Miles Penn, CEO of MTailor, which specializes in custom-made clothing for its online customers. “Deep learning engineers are probably the most valuable resource on the planet, and Strong Compute has made us 10 times more productive. Iteration time and experimentation are key levers for improving machine learning performance, and without Strong Compute, we were lost.”
Sand argues that the big cloud providers have no real incentive to do what his company does, because their business model depends on people using their machines for as long as possible, a view that Y Combinator CEO Michael Seibel shares. “Strong Compute is targeting a serious misalignment of incentives in cloud computing, where the faster results that customers value are less profitable for providers,” Seibel said.
For the time being, the team still offers a white-glove service to its customers, though developers shouldn’t see much of a difference, since the integration shouldn’t actually change their workflow. What Strong Compute promises here is that it can “speed up your development cycle by 10x.” Looking ahead, the idea is to automate the process as much as possible.
“AI companies can focus on their customers, data, and underlying algorithms, where their core intellectual property and value lies, while all the tuning and operations work is left to Strong Compute,” Sand said. “This not only gives them the rapid iteration they need to succeed, but also ensures that their developers focus only on work that adds value to the business. Today, they spend up to two-thirds of their time on complex system administration tasks known as ‘machine learning operations,’ which are largely generic across AI companies and often outside their area of expertise. It makes no sense to do this in-house.”
Bonus: Here’s a video of our very own Lucas Matney trying out the Meta 2 AR headset from Sand’s previous company in 2016.