Tony Shin
2.61K subscribers
1:16
VeRA: Vector-based Random Matrix Adaptation
Tony Shin
246 views • 7 months ago
0:50
PaLI-3 Vision Language Models: Smaller, Faster, Stronger
Tony Shin
421 views • 7 months ago
1:25
HyperAttention: Long-context Attention in Near-Linear Time
Tony Shin
364 views • 7 months ago
1:15
Fast Feedforward Networks
Tony Shin
411 views • 7 months ago
1:14
Nougat: Neural Optical Understanding for Academic Documents
Tony Shin
222 views • 7 months ago
1:05
Retentive Network: A Successor to Transformer for Large Language Models
Tony Shin
790 views • 7 months ago
1:09
LLaVA: Visual Instruction Tuning
Tony Shin
872 views • 7 months ago
1:56
BloombergGPT: A Large Language Model for Finance
Tony Shin
417 views • 1 year ago
3:02
ImageBind: One Embedding Space To Bind Them All
Tony Shin
852 views • 1 year ago
2:00
Segment Anything
Tony Shin
461 views • 1 year ago
2:17
Are Emergent Abilities of Large Language Models a Mirage?
Tony Shin
2.6K views • 1 year ago
1:12
Synthetic Data Boosts ImageNet Classification
Tony Shin
211 views • 1 year ago
0:47
Unlimiformer: Long-Range Transformers with Unlimited Length Input
Tony Shin
710 views • 1 year ago
23:34
[Tutorial] Image Super Resolution without Photoshop
Tony Shin
1.1K views • 2 years ago
10:32
YOLO9000: Better, Faster, Stronger
Tony Shin
1.1K views • 2 years ago
15:10
NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion
Tony Shin
1K views • 2 years ago
10:27
Florence: A New Foundation Model for Computer Vision
Tony Shin
1.2K views • 2 years ago
8:03
DSSD: Deconvolutional Single Shot Detector
Tony Shin
572 views • 2 years ago
8:02
MAE: Masked Autoencoders Are Scalable Vision Learners
Tony Shin
4.7K views • 2 years ago
5:01
PVANet: Deep but Lightweight Neural Networks for Real-time Object Detection
Tony Shin
371 views • 2 years ago
5:36
SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers
Tony Shin
4.1K views • 2 years ago
6:32
R-FCN: Object Detection via Region-based Fully Convolutional Networks
Tony Shin
1K views • 2 years ago
5:28
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Tony Shin
2.1K views • 2 years ago
7:33
Pix2Seq: A Language Modeling Framework for Object Detection
Tony Shin
1.6K views • 2 years ago
2:41
Improved Regularization of Convolutional Neural Networks with Cutout
Tony Shin
386 views • 2 years ago
7:13
VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning
Tony Shin
2.5K views • 3 years ago
3:23
SSD: Single Shot MultiBox Detector
Tony Shin
5.5K views • 3 years ago
4:33
Barlow Twins: Self-Supervised Learning via Redundancy Reduction
Tony Shin
2K views • 3 years ago
5:22
MLP-Mixer: An all-MLP Architecture for Vision
Tony Shin
1.6K views • 3 years ago
4:09
YOLO: Unified, Real-Time Object Detection
Tony Shin
841 views • 3 years ago