I. Introduction
II. Background
III. Framework
III-I. Unsupervised Pre-training
III-II. Supervised Fine-tuning
III-III. Task-specific Input Transformation
IV. Experiment
IV-I. Setup
Datasets
Model Specification
IV-II. Supervised Fine-tuning
Hyper-parameters
- LR :
- lr-decay : 0.2 with warmup
- 0.5
- batch-size : 32
- dropout : 0.1
- epochs : 3
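
The settings above can be collected into a small config object with a warmup-then-decay learning-rate schedule. This is a minimal sketch under stated assumptions, not the original training code. The names `FineTuneConfig` and `lr_at_step` are hypothetical. The list gives no LR value, so the peak rate is a required argument. The decay is assumed to be linear, and the unit of the warmup value is not stated in the list, so the sketch takes it as a fraction of total steps and leaves the choice to the caller. The unlabeled 0.5 entry is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Fine-tuning settings; the defaults are the values from the list above."""
    learning_rate: float    # peak rate; not given in the list, caller must supply it
    warmup_fraction: float  # assumption: share of total steps spent warming up
    batch_size: int = 32
    dropout: float = 0.1
    epochs: int = 3


def lr_at_step(cfg: FineTuneConfig, step: int, total_steps: int) -> float:
    """Linear warmup to the peak rate, then linear decay to zero (assumed shape)."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    step = min(max(step, 0), total_steps)  # clamp into [0, total_steps]
    warmup_steps = round(cfg.warmup_fraction * total_steps)
    if step < warmup_steps:
        # Ramp from 0 up towards the peak rate.
        return cfg.learning_rate * step / warmup_steps
    # Ramp from the peak rate down to 0 at the final step.
    decay_steps = max(1, total_steps - warmup_steps)
    return cfg.learning_rate * (total_steps - step) / decay_steps


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    cfg = FineTuneConfig(learning_rate=1e-4, warmup_fraction=0.2)
    steps_per_epoch = 100
    total = cfg.epochs * steps_per_epoch
    for s in (0, 30, 60, 180, 300):
        print(s, lr_at_step(cfg, s, total))
```

Keeping the schedule as a pure function of the step makes it easy to plot or unit-test before wiring it into a training loop.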