Q-LAtte: An Efficient and Versatile LSTM Model for Quantized Attention-Based Time Series Forecasting in Building Energy Applications
- Subject (Keywords): Computational modeling, Predictive models, Quantization (signal), Long short term memory, Real-time systems, Data models, Buildings, Artificial intelligence, Deep learning, Optimization methods, Artificial Intelligence, building energy, deep learning acceleration, optimization, quantization
- Subject (Other): Computer Science, Information Systems; Engineering, Electrical & Electronic; Telecommunications
- Description (General): [Kang, Jieui; Choi, Soeun] Ewha Womans Univ, Artificial Intelligence Convergence, Seoul 03760, South Korea; [Park, Jihye] Ewha Womans Univ, Dept Architectural & Urban Syst Engn, Seoul 03760, South Korea; [Sim, Jaehyeong] Ewha Womans Univ, Dept Comp Sci & Engn, Seoul 03760, South Korea
- Administrative Description: faculty
- Indexed In: SCIE, SCOPUS
- OA Type: Gold Open Access
- Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
- Publication Year: 2024
- Series Type: Journal
- URI: http://www.dcollection.net/handler/ewha/000000240900
- Language: English
- Published As: https://doi.org/10.1109/ACCESS.2024.3400588
Abstract
Long Short-Term Memory (LSTM) networks coupled with attention mechanisms have demonstrated their proficiency in handling time-series data, particularly in the building energy prediction domain. However, their high computational complexity and resource-intensive nature pose significant challenges for real-time applications and for deployment on edge devices. Traditional methods of mitigating these issues, such as quantization, often compromise model performance due to approximation errors introduced during the process. In this paper, we propose Q-LAtte, a novel, quantization-friendly attention-based LSTM model, as a solution to these challenges. Q-LAtte incorporates an innovative approach to quantization that preserves the efficiency benefits while significantly reducing the performance degradation typically associated with standard quantization techniques. The key to its superior performance lies in its distribution-aware quantization process. By preserving the output distribution of the model parameters before and after quantization, Q-LAtte retains the subtle but significant variations integral to decision-making processes such as prediction or classification. Compared to traditionally quantized models, Q-LAtte exhibits a notable improvement in performance. Specifically, our method reduces the Mean Absolute Percentage Error (MAPE) from 17.56 to 8.48 and the Mean Absolute Scaled Error (MASE) by 48%, while minimizing the time cost. These results highlight the efficacy of Q-LAtte in striking a balance between efficiency and accuracy, significantly enhancing the feasibility of deploying attention-LSTM networks on resource-constrained devices for real-time, on-site data analysis and decision-making.
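The abstract names the key mechanism, distribution-aware quantization that preserves the distribution of model parameters across quantization, but does not spell out the procedure. The sketch below is a minimal, hypothetical Python/NumPy illustration of one way to read that idea: baseline min-max uniform quantization followed by an affine correction that restores the original weights' mean and standard deviation. The function names and the moment-matching correction are assumptions made for illustration, not the method from the paper.

```python
# Hedged sketch: one plausible reading of "distribution-aware" weight quantization.
# All names here are illustrative assumptions, not taken from the Q-LAtte paper.
import numpy as np

def quantize_minmax(w: np.ndarray, n_bits: int = 8) -> np.ndarray:
    """Baseline uniform quantization over the min-max range, then de-quantize."""
    qmin, qmax = 0, 2 ** n_bits - 1
    scale = (w.max() - w.min()) / (qmax - qmin)
    zero_point = qmin - w.min() / scale
    q = np.clip(np.round(w / scale + zero_point), qmin, qmax)
    return (q - zero_point) * scale  # de-quantized weights

def quantize_distribution_aware(w: np.ndarray, n_bits: int = 8) -> np.ndarray:
    """Rescale the de-quantized weights so their mean/std match the originals."""
    w_hat = quantize_minmax(w, n_bits)
    # Affine correction restoring the first two moments of the weight distribution.
    std_ratio = w.std() / (w_hat.std() + 1e-12)
    return (w_hat - w_hat.mean()) * std_ratio + w.mean()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(0.0, 0.05, size=(256, 256))  # LSTM-like weight matrix
    for quantize in (quantize_minmax, quantize_distribution_aware):
        w_hat = quantize(w, n_bits=4)
        print(quantize.__name__,
              "mean err:", abs(w_hat.mean() - w.mean()),
              "std err:", abs(w_hat.std() - w.std()))
```

Running the script compares, at 4-bit precision, how far each scheme's de-quantized weights drift from the original mean and standard deviation; the moment-matching variant keeps both close to zero.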