The increasing energy costs of large-scale server systems have led to a demand for innovative methods for optimizing resource utilization in these systems. Such methods aim to reduce server energy consumption, cooling requirements, and carbon footprint, thereby improving the overall sustainability of the server infrastructure. At the core of many of these methods lie reliable workload-prediction techniques, which help identify the servers, time intervals, and other parameters needed to build sustainability solutions based on techniques such as virtualization and server consolidation. Many workload-prediction methods have been proposed in the recent literature, but they do not deal adequately with two issues that arise specifically in large-scale server systems, namely the extensive non-stationarity of server workloads and massive online streaming data. In this paper, we fill this gap by proposing two online ensemble learning methods for workload prediction that address these issues. The proposed algorithms are motivated by the Weighted Majority and Simulatable Experts approaches, which we extend and adapt to the large-scale workload-prediction problem. We demonstrate the effectiveness of our algorithms using real and synthetic data sets, and show that with the proposed algorithms the workloads of approximately 91% of servers in a real data center can be predicted with accuracy greater than 89%, whereas with baseline approaches the workloads of only 13--24% of the servers can be predicted with similar accuracy.