ElasticDL: An Open‑Source Distributed Deep Learning Framework with Elastic Scheduling
ElasticDL is an open‑source distributed deep learning framework built on TensorFlow 2.x and Kubernetes. It simplifies programming by letting users define models with the Keras API, while the framework itself provides elastic scheduling and fault tolerance. Extensive benchmarks demonstrate significant performance gains.
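
To make the idea of fault tolerance under elastic scheduling more concrete, here is a minimal, framework-free sketch in Python. It is not ElasticDL's actual code or API: the `ShardDispatcher` class, its method names, and the worker names are all hypothetical, and the sketch assumes a design in which a master hands out data shards as tasks and requeues the shard of any worker that disappears. Under that assumption, workers can leave or join mid-job and every shard still gets processed.

```python
from collections import deque


class ShardDispatcher:
    """Toy master (hypothetical) that hands out data shards and requeues them on worker loss."""

    def __init__(self, num_records, shard_size):
        # Split the record range [0, num_records) into (start, end) shards.
        self.pending = deque(
            (start, min(start + shard_size, num_records))
            for start in range(0, num_records, shard_size)
        )
        self.in_flight = {}  # worker_id -> shard currently assigned
        self.completed = []  # shards reported as finished

    def get_task(self, worker_id):
        """Assign the next pending shard to a worker, or None if none remain."""
        if not self.pending:
            return None
        shard = self.pending.popleft()
        self.in_flight[worker_id] = shard
        return shard

    def report_done(self, worker_id):
        """Mark the worker's current shard as completed."""
        self.completed.append(self.in_flight.pop(worker_id))

    def worker_lost(self, worker_id):
        """Requeue the shard held by a failed or preempted worker."""
        shard = self.in_flight.pop(worker_id, None)
        if shard is not None:
            self.pending.append(shard)

    def finished(self):
        """True once no shard is pending or still assigned."""
        return not self.pending and not self.in_flight


def simulate():
    dispatcher = ShardDispatcher(num_records=100, shard_size=25)
    # Two workers start; each takes one shard.
    dispatcher.get_task("worker-0")
    dispatcher.get_task("worker-1")
    # worker-1 is preempted before finishing; its shard goes back to the queue.
    dispatcher.worker_lost("worker-1")
    dispatcher.report_done("worker-0")
    # A replacement worker joins and the job continues without a restart.
    active = ["worker-0", "worker-2"]
    while not dispatcher.finished():
        for worker in active:
            if dispatcher.get_task(worker) is not None:
                dispatcher.report_done(worker)
    return sorted(dispatcher.completed)
```

Calling `simulate()` returns the full list of shards even though one worker was lost partway through. The point of the sketch is that the job's progress lives in the dispatcher rather than in any single worker, which is what lets the worker pool grow or shrink while training is running.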