5G is the upcoming evolution for the current cellular networks that aims at satisfying the future demand for data services. Heterogeneous cloud radio access networks (H-CRANs) are envisioned as a new trend of 5G that exploits the advantages of heterogeneous and cloud radio access networks to enhance spectral and energy efficiency. Remote radio heads (RRHs) are small cells utilized to provide high data rates for users with high quality of service (QoS) requirements, while high power macro base station (BS) is deployed for coverage maintenance and low QoS users service. Inter-tier interference between macro BSs and RRHs and energy efficiency are critical challenges that accompany resource allocation in H-CRANs. Therefore, we propose an efficient resource allocation scheme using online learning, which mitigates interference and maximizes energy efficiency while maintaining QoS requirements for all users. The resource allocation includes resource blocks (RBs) and power. The proposed scheme is implemented using two approaches: centralized, where the resource allocation is processed at a controller integrated with the baseband processing unit and decentralized, where macro BSs cooperate to achieve optimal resource allocation strategy. To foster the performance of such sophisticated scheme with a model free learning, we consider users' priority in RB allocation and compact state representation learning methodology to improve the speed of convergence and account for the curse of dimensionality during the learning process. The proposed scheme including both approaches is implemented using software defined radios testbed. The obtained results and simulation results confirm that the proposed resource allocation solution in H-CRANs increases the energy efficiency significantly and maintains users' QoS.