May 26, 2022

You might think that synchronizing clocks across fleets of modern servers is a solved problem, but it is actually quite hard, especially if you want nanosecond accuracy. That difficulty is why it is something of an axiom in computer science that you should never build a system that depends on synchronized clocks. Clockwork.io, which today announced a $21 million Series A funding round, promises to change that with synchronization accuracy as tight as 5 nanoseconds with hardware timestamps and hundreds of nanoseconds with software timestamps.

Building on this, the company is also launching its first product today, Latency Sensei, which gives users highly granular latency data across their cloud, on-premises and hybrid environments that they can then use to identify and address bottlenecks in their networks. The company’s clients already include Nasdaq, Wells Fargo and RBC.

The startup was founded by Yilong Geng, Deepak Merugu and Balaji Prabhakar, Stanford’s “VMware Founders Professor of Computer Science.” VMware co-founder and Stanford computer science professor Mendel Rosenblum is a board member and chief scientist. Given the group’s origins, it is not surprising that the core of the Clockwork system is based on fundamental research conducted by the Stanford team.

The Network Time Protocol (NTP), the standard protocol most computers use today to synchronize their clocks, is widely deployed but not very accurate. There has been some work to improve on it, such as the hardware solution Facebook contributed to the Open Compute Project last year, but the Clockwork team promises far greater accuracy.

“Sometimes in data centers, clocks can’t even agree on the second. My phone and the room here probably agree on something else. Then you get finer and finer and finer: microseconds and nanoseconds. It is very difficult. It is very difficult for two clocks to tell which nanosecond they are in,” Prabhakar explains. He said that synchronizing these clocks once is not enough. You must also keep them in sync. You could put a high-precision clock in each server that is immune to temperature fluctuations and vibrations, but such a clock quickly becomes more expensive than the server itself.

To solve this problem, the team built a system and a machine learning model that can measure very precisely how long it takes to get a timestamp from a given server. This isn’t much different from how NTP works, but the team takes it a few steps further by looking at many different timestamps and deriving both the clock offset and the relative frequency difference between clocks. All of this is then fed into the machine learning model. In addition, the team has built systems that let different clocks communicate with each other and detect (and correct) when they are out of sync.
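To make the two quantities mentioned above concrete, here is a minimal Python sketch of the textbook approach: an NTP-style offset estimate from four timestamps, followed by a straight-line fit of offset against time, whose slope is the relative frequency difference. This is not Clockwork's method (the article says the company feeds such measurements into a machine learning model), and the names `Probe`, `fit_offset_and_drift` and `simulate_probe` are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Probe:
    """One request/response exchange with a reference server (times in seconds)."""
    t1: float  # request sent, read from the client clock
    t2: float  # request received, read from the server clock
    t3: float  # response sent, read from the server clock
    t4: float  # response received, read from the client clock


def offset_and_delay(p: Probe) -> tuple[float, float]:
    """NTP-style estimate of (server minus client offset, round-trip delay).

    The offset estimate is only exact when the two one-way delays are equal.
    """
    offset = ((p.t2 - p.t1) + (p.t3 - p.t4)) / 2.0
    delay = (p.t4 - p.t1) - (p.t3 - p.t2)
    return offset, delay


def fit_offset_and_drift(probes: list[Probe]) -> tuple[float, float]:
    """Least-squares line through (client time, offset) points.

    Returns (offset at the first probe, relative frequency difference).
    A drift of 20e-6 means the server clock gains 20 microseconds per second.
    """
    if len(probes) < 2:
        raise ValueError("need at least two probes to estimate drift")
    t0 = probes[0].t1
    xs = [p.t1 - t0 for p in probes]
    ys = [offset_and_delay(p)[0] for p in probes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("probes must be taken at different times")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    drift = sxy / sxx
    return mean_y - drift * mean_x, drift


def simulate_probe(t: float, true_offset: float, drift: float,
                   one_way: float, processing: float = 0.0) -> Probe:
    """Hypothetical helper: fabricate a probe for a server clock that is
    ahead by true_offset and runs fast by drift, over a symmetric link."""
    def server_clock(client_time: float) -> float:
        return client_time + true_offset + drift * client_time

    arrive = t + one_way
    depart = arrive + processing
    return Probe(t1=t, t2=server_clock(arrive),
                 t3=server_clock(depart), t4=depart + one_way)


if __name__ == "__main__":
    probes = [simulate_probe(float(t), true_offset=0.003, drift=20e-6,
                             one_way=50e-6) for t in range(10)]
    offset0, drift = fit_offset_and_drift(probes)
    print(f"offset at first probe: {offset0:.9f} s, drift: {drift:.3e}")
```

On a real network the two one-way delays are rarely equal and vary from probe to probe, which is where a plain fit like this loses accuracy and where filtering or a learned model would have to do the heavy lifting.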

Due to the lack of reliable timestamps, distributed systems have long relied on clockless designs, which adds a layer of complexity to building them. The Clockwork team hopes that its work will allow researchers to experiment with new time-based algorithms in several problem areas such as database consistency, event ordering, consensus protocols and ledgers.
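As a rough illustration of why tighter synchronization matters for event ordering, the Python sketch below treats each timestamp as an interval (reading plus or minus an uncertainty bound) and only declares an order when the intervals do not overlap. The `Stamp` type and the specific uncertainty values are assumptions made for this example, not anything described by Clockwork.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stamp:
    """A clock reading plus a bound on how wrong that clock may be (hypothetical type)."""
    time_ns: int
    uncertainty_ns: int

    @property
    def earliest(self) -> int:
        return self.time_ns - self.uncertainty_ns

    @property
    def latest(self) -> int:
        return self.time_ns + self.uncertainty_ns


def compare(a: Stamp, b: Stamp) -> str:
    """Order two events only when their uncertainty intervals do not overlap."""
    if a.latest < b.earliest:
        return "before"
    if b.latest < a.earliest:
        return "after"
    return "concurrent"  # clocks alone cannot decide; fall back to another mechanism


if __name__ == "__main__":
    # Two events whose readings are 20 microseconds apart on different machines.
    gap_ns = 20_000
    # Assumed error bounds: 1 ms for loosely synced clocks, 100 ns for tightly synced ones.
    for label, bound_ns in [("loose sync", 1_000_000), ("tight sync", 100)]:
        first = Stamp(time_ns=0, uncertainty_ns=bound_ns)
        second = Stamp(time_ns=gap_ns, uncertainty_ns=bound_ns)
        print(label, "->", compare(first, second))
```

The smaller the uncertainty bound, the fewer pairs of events fall into the "concurrent" case, so fewer decisions have to fall back on coordination between machines.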

The original research by Rosenblum and Prabhakar’s team was about what you can do if you can rely on a clock in a distributed system.

“Currently, no one uses time, except maybe Spanner at Google, CockroachDB, or someone who does databases,” Rosenblum said. “We believe that there are still many places where it could be used, especially as more and more time-sensitive problems appear. We can do time sync because we figured out how to do it. And so we asked: is this part of a trend where we’re going to program these systems differently? As [researchers], I’m a little excited about the prospect that we can do this.”

Now that it has solved the synchronization problem, the Clockwork team plans to build products on top of it, starting with Latency Sensei. But Prabhakar also revealed that the team is already working on another project that will make it easier to detect congestion in data centers. He said that TCP is great for wide area networks, but is practically useless inside a data center. With additional information about the network and its latency, however, TCP can get a better idea of how packets should be routed in the data center.

The Series A round was led by NEA and featured high-profile angel investors including MIPS co-founder John Hennessy, Google early investor Ram Shriram, and Yahoo co-founder Jerry Yang.
