Juhyeon's Blog

Folder: AI/Papers/Architecture

16 items

April 13, 2026
A Survey on Mixture of Experts in Large Language Models
- L
- H
April 13, 2026
Architecture
April 13, 2026
Attention Is All You Need
April 13, 2026
Attention Residuals
April 13, 2026
Attention, Learn to Solve Routing Problems!
April 13, 2026
BERT - Pre-training of Deep Bidirectional Transformers for Language Understanding
April 13, 2026
Efficiently Modeling Long Sequences with Structured State Spaces
April 13, 2026
GQA - Training Generalized Multi-Query Transformer Models
April 13, 2026
Hyena Hierarchy - Towards Larger Convolutional Language Models
April 13, 2026
Improving Language Understanding by Generative Pre-Training
- GPT1
April 13, 2026
Mamba - Linear-Time Sequence Modeling with Selective State Spaces
April 13, 2026
RoFormer - Enhanced Transformer with Rotary Position Embedding
April 13, 2026
Sentence-BERT - Sentence Embeddings using Siamese BERT-Networks
April 13, 2026
StripedHyena - Moving Beyond Transformers with Hybrid Signal Processing Models
April 13, 2026
SwiGLU - GLU Variants Improve Transformer
April 13, 2026
Transformer Attention Variants Survey
