Artificial intelligence is advancing rapidly. Large language models (LLMs) have demonstrated outstanding language understanding and content generation capabilities across many AI applications, and large multimodal models (LMMs), which combine images, audio, human-machine interaction, and diverse external signals, are an even more important trend for mobile devices. However, these models contain anywhere from billions to trillions of parameters, demanding enormous computation and consuming excessive resources. This not only affects cloud-based AI computing, but also poses a severe challenge to resource-constrained edge devices. Even with the current trend toward slimming down LLMs and LMMs, they still require far more computation than ever before, creating enormous challenges for system architecture design, computing resources, memory capacity, and memory bandwidth.
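To make the memory-capacity and bandwidth pressure concrete, the following back-of-envelope sketch (illustrative assumptions only, not figures from the talk: a hypothetical 7B-parameter model and a 20 tokens/s decoding target) estimates weight storage and per-token bandwidth at several precisions:

```python
# Illustrative estimate: weight memory and decoding bandwidth for a
# hypothetical 7-billion-parameter LLM (assumed numbers, not from the talk).
params = 7e9  # assumed model size
tokens_per_s = 20  # assumed decoding-rate target

for name, bytes_per_param in [("FP16", 2), ("INT8", 1), ("INT4", 0.5)]:
    weight_gb = params * bytes_per_param / 1e9
    # Autoregressive decoding reads roughly all weights once per generated
    # token, so sustained bandwidth scales with tokens/s times weight size.
    bw_gbps = weight_gb * tokens_per_s
    print(f"{name}: ~{weight_gb:.1f} GB of weights, ~{bw_gbps:.0f} GB/s for {tokens_per_s} tok/s")
```

Even at aggressive INT4 quantization, the working set and bandwidth remain substantial for an edge device, which is why both device-side architecture innovation and device-cloud collaboration matter.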
Here, we discuss mobile processor design trends under the LLM and LMM wave, along with a variety of new AI computing architecture techniques. We also discuss collaboration between the device side and the cloud in mobile processors to deliver better performance. These LLM/LMM trends and new usage scenarios will shape the design of future computing architectures.
Dr. Bor-Sung Liang is currently Senior Director of Corporate Strategy and Emerging Technologies at MediaTek. He also serves as an adjunct visiting professor jointly appointed by the Department of Computer Science and Information Engineering and the Graduate School of Advanced Technology at National Taiwan University, and as a professor-level expert-in-practice at the Institute of Intelligent Systems, Industry Academia Innovation School, National Yang Ming Chiao Tung University. He is Chair of the IEEE CASS (Circuits and Systems Society) Taipei Chapter.
Dr. Liang received his Ph.D. from the Institute of Electronics, National Chiao Tung University, and completed the EMBA program (business administration group) at the College of Management, National Taiwan University. His major honors include the Ten Outstanding Young Persons Award of Taiwan (science and technology development category); the National Invention and Creation Award from the Intellectual Property Office, Ministry of Economic Affairs, won three times (one gold and two silver invention awards); the Outstanding Young Innovator Award of the Industrial Technology Advancement Award from the Department of Industrial Technology, Ministry of Economic Affairs; the Outstanding Information Technology Elite Award of Information Month; and the K. T. Li Young Researcher Award.
To continue Moore's Law, 3D packaging has become a widely discussed solution. It has been adopted across application fields such as smartphone chips, high-performance computing (HPC), AI PCs, and new energy vehicles. Especially with the rise of AI applications, the growth in demanded application performance far outpaces the speed of chip process evolution. This talk discusses the evolution of various 3D packaging applications.
Sam Lin is a Manager in Corporate R&D at Siliconware Precision Industries Co., Ltd., Taichung, Taiwan, R.O.C. He has 13 years of industry experience focusing on package electrical analysis, measurement, and product application analysis. In recent years, he has focused on 2.5D, 3DIC, FO-MCM, and FO-EB advanced packaging research.
Generative AI models, capable of producing diverse and creative content, hold immense potential for a wide range of applications. However, their computational complexity and resource demands pose challenges for deployment on edge devices. This presentation explores a hardware-software co-design approach to accelerating generative AI at the edge. At Google, we investigate optimized model architectures, efficient inference algorithms, specialized hardware accelerators, and software frameworks tailored for edge devices. Our approach aims to reduce latency, minimize power consumption, and improve overall performance while preserving the quality of generated content. By synergistically leveraging hardware and software advancements, we unlock the possibilities of real-time, interactive, and privacy-preserving generative AI experiences directly on edge devices.
Enhao Chang earned his B.S. (2011) and M.S. (2013) degrees in Computer Architecture from National Cheng Kung University (NCKU). He is currently a TPU Architect at Google, specializing in TPU architecture to enable on-device artificial intelligence for portable systems. Previously, he was a SoC Performance Architect at Google, responsible for the performance and power efficiency of Google's mobile SoCs. Prior to Google, he worked as a Power Architect at MediaTek.