Paper Reviews (10)

EfficientNet (2020.09, v5): Rethinking Model Scaling for Convolutional Neural Networks

Paper link: https://arxiv.org/abs/1905.11946
One-line summary: CNN (convolutional neural network) ..

Paper review 2025.02.06

RetinaNet (2018.02, v2): Focal Loss for Dense Object Detection

Paper link: https://arxiv.org/abs/1708.02002
One-line summary: Applies focal loss to address the class-imbalance problem in object detection; Fea..

Paper review 2025.02.04
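The summary above notes that focal loss addresses class imbalance in dense detection. As a minimal illustrative sketch (not the RetinaNet implementation), the binary form with the paper's default values γ = 2, α = 0.25 can be written in NumPy:

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss: FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t).

    p: predicted probability of the positive class; y: label in {0, 1}.
    The (1 - p_t)^gamma factor down-weights easy, well-classified examples,
    so the many easy background anchors no longer dominate the loss.
    """
    p_t = np.where(y == 1, p, 1.0 - p)          # probability of the true class
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

# An easy positive (p = 0.95) contributes far less than a hard one (p = 0.3):
easy = focal_loss(np.array([0.95]), np.array([1]))
hard = focal_loss(np.array([0.3]), np.array([1]))
```

With γ = 0 this reduces to α-weighted cross-entropy, which is why γ is described as a modulating ("focusing") parameter.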

RoBERTa (2019.07, v1): A Robustly Optimized BERT Pretraining Approach

Paper link: https://arxiv.org/abs/1907.11692
One-line summary: Keeps the BERT architecture as-is and improves only the pretraining procedure to ..

Paper review 2025.02.04

BERT (2019.05, v5): Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper link: https://arxiv.org/abs/1810.04805

Paper review 2025.01.30

ELMo (2018.04): Deep contextualized word representations

Paper link: https://arxiv.org/abs/1802.05365
One-line summary: Deep contextualized word representations; ELMo ..

Paper review 2025.01.20

Transformer (2017.06): Attention Is All You Need

Paper link: https://arxiv.org/abs/1706.03762
One-line summary: I kept putting off reviewing the Transformer paper, telling myself I should read it properly when I did, and am only now getting to it...

Paper review 2025.01.14

MobileNets (2017.04): Efficient Convolutional Neural Networks for Mobile Vision Applications

Paper link: https://arxiv.org/abs/1704.04861

Paper review 2025.01.12

Seq2Seq (2014.10): Sequence to Sequence Learning with Neural Networks

Paper link: https://arxiv.org/abs/1409.3215
One-line summary: Variable-length inputs and outputs ..

Paper review 2025.01.10

SPPNet (2014.06): Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition

Paper link: https://arxiv.org/abs/1406.4729

Paper review 2025.01.10

InceptionV2/3 (2015.09): Rethinking the Inception Architecture for Computer Vision

Paper link: https://arxiv.org/abs/1512.00567

Paper review 2025.01.07