News Title: “Netflix’s New Strategy: Virtual Threads Boost Performance but Reveal Vulnerabilities”
Keywords: Netflix, virtual threads, performance issues
News Content:
Netflix, the global streaming service, is continually looking for ways to improve the performance of its services. The company recently upgraded to Java 21 and began using its virtual threads feature to optimize its microservices architecture. Virtual threads are lightweight threads scheduled by the JVM rather than by the operating system, and they are designed to improve the performance of highly concurrent applications. After adopting them, however, Netflix ran into several problems.
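As a rough illustration of the model described above, here is a minimal sketch that runs many blocking tasks with one virtual thread per task. It assumes Java 21 or later; the class name `VirtualThreadSketch`, the method `runBlockingTasks`, and the 50 ms sleep standing in for blocking I/O are hypothetical and are not taken from Netflix's code.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class VirtualThreadSketch {

    // Runs taskCount blocking tasks, each on its own virtual thread, and
    // returns how many of them reported running on a virtual thread.
    public static int runBlockingTasks(int taskCount) throws Exception {
        // Requires Java 21+: the executor starts a new virtual thread per task.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Future<Boolean>> results = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                results.add(executor.submit(() -> {
                    // Hypothetical stand-in for a blocking call such as I/O.
                    Thread.sleep(Duration.ofMillis(50));
                    return Thread.currentThread().isVirtual();
                }));
            }
            int virtualCount = 0;
            for (Future<Boolean> result : results) {
                if (result.get()) {
                    virtualCount++;
                }
            }
            return virtualCount;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("tasks run on virtual threads: " + runBlockingTasks(1_000));
    }
}
```

While a task sleeps, its virtual thread is unmounted, so a small number of operating system threads can carry all of the tasks.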
Netflix's JVM Ecosystem team found that virtual threads interacted badly with blocking operations and with the limited pool of operating system threads that carry them, which degraded the performance of Spring Boot applications. In particular, Tomcat kept creating virtual threads to handle incoming requests, but those threads could not make progress because no operating system threads were available to run them. The result was intermittent timeouts, unresponsive instances, and a large number of sockets stuck in the “closeWait” (TCP CLOSE_WAIT) state.
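Netflix engineers inspected the state of the lock in a heap dump and found a problem resembling a deadlock. Virtual threads could not make progress because they were blocked inside synchronized blocks, and on Java 21 a virtual thread that blocks inside a synchronized block stays pinned to its operating system thread. With only a limited number of such threads available, none was left to run the remaining virtual threads, which caused severe performance problems.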
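The pinning behavior can be sketched as follows. This is a simplified illustration, not Netflix's reproduction: the class `PinningSketch` and its methods are hypothetical, and the 20 ms sleep stands in for any blocking call. On Java 21, running with the JVM option `-Djdk.tracePinnedThreads=full` prints a stack trace when a virtual thread blocks while pinned.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class PinningSketch {
    private final Object monitor = new Object();
    private final ReentrantLock lock = new ReentrantLock();
    private final AtomicInteger completed = new AtomicInteger();

    // On Java 21, blocking inside a synchronized block pins the virtual
    // thread, so its carrier (OS) thread is held for the whole sleep.
    void blockInsideSynchronized() throws InterruptedException {
        synchronized (monitor) {
            Thread.sleep(Duration.ofMillis(20));
            completed.incrementAndGet();
        }
    }

    // With a ReentrantLock the virtual thread can unmount while it sleeps
    // or waits for the lock, which frees the carrier thread for other work.
    void blockInsideReentrantLock() throws InterruptedException {
        lock.lock();
        try {
            Thread.sleep(Duration.ofMillis(20));
            completed.incrementAndGet();
        } finally {
            lock.unlock();
        }
    }

    // Runs the chosen variant on one virtual thread per task and returns
    // the number of tasks that completed.
    public static int runOnVirtualThreads(int tasks, boolean useReentrantLock) {
        PinningSketch sketch = new PinningSketch();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    if (useReentrantLock) {
                        sketch.blockInsideReentrantLock();
                    } else {
                        sketch.blockInsideSynchronized();
                    }
                    return null;
                });
            }
        } // close() waits for all submitted tasks to finish
        return sketch.completed.get();
    }

    public static void main(String[] args) {
        System.out.println("synchronized variant completed: " + runOnVirtualThreads(10, false));
        System.out.println("ReentrantLock variant completed: " + runOnVirtualThreads(10, true));
    }
}
```

Both variants finish in this small example. The difference is that the synchronized variant occupies an operating system thread for as long as each task blocks, which is the resource that ran out in the incident described above.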
To address the issue, the Netflix team built a reproducible test case and made corresponding optimizations. Although virtual threads showed promise in reducing thread overhead, Netflix's case study underscores how carefully virtual threads need to be evaluated and tested before they are integrated into production environments.
Netflix also adopted generational ZGC to reduce garbage collection overhead and improve application responsiveness. In addition, Netflix's alerting system helped the team identify and diagnose problems promptly, keeping the service stable.
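On Java 21, generational ZGC is enabled with the JVM options `-XX:+UseZGC -XX:+ZGenerational`. The sketch below is one way to check which collectors a running JVM is actually using and how much time they have accumulated; the class name `GcReport` and the output format are hypothetical, and the collector names reported depend on the JVM and its flags.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class GcReport {

    // Returns one line per garbage collector bean registered in this JVM,
    // with its collection count and accumulated collection time.
    public static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", totalTimeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        describeCollectors().forEach(System.out::println);
    }
}
```

Running it with and without the ZGC options gives a quick confirmation that the intended collector is active before comparing pause times.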
Despite these challenges, Netflix remains optimistic about the future of virtual threads and expects upcoming Java releases to further improve their performance and stability. The case study offers valuable lessons for other performance engineers and developers who want to explore virtual threads in their own applications.
Source: https://mp.weixin.qq.com/s/Cc2aMVJkWuPdvH1BuBFVlg