Abstract: There is growing interest in applying distributed machine learning to edge computing, forming federated edge learning. Federated edge learning faces non-i.i.d. and heterogeneous data, and the communication between edge workers, possibly through distant locations and with unstable wireless networks, is more costly than their local computational overhead. In this work, we propose DONE, a distributed approximate Newton-type algorithm with fast convergence rate for communication-efficient federated edge learning. First, with strongly convex and smooth loss functions, DONE approximates the Newton direction in a distributed manner using the classical Richardson iteration on each edge worker. Second, we prove that DONE has linear-quadratic convergence and analyze its communication complexities. Finally, the experimental results with non-i.i.d. and heterogeneous data show that DONE attains a comparable performance to Newton’s method. Notably, DONE requires fewer communication iterations compared to distributed gradient descent and outperforms DANE, FEDL, and GIANT, state-of-the-art approaches, in the case of non-quadratic loss functions.