본문 바로가기

👩‍💻 도비는 공부중/💼 하계연구연수생(2023)

[공유 세미나] 2023 상반기 학회

CVPR - Detecting Everything in the open World: Towards universal object detection

 


Everything | Open World | Universal

 

image + text 공간의 alignment > multiple sources & heterogeneous label space image 

 

Open World

Unseen classes: Open-vocabulary object detection

Various domains: Object detection in the  Wild

 

 

partitioned label space의 성능이 가장 좋음

 

ROI Head 추가

 


AAAI - SWL -Adapt: An unsupervised domain adaptation model with sample weight learning for corss-user wearable human activity recognition

 


 

 

Feature extractor | Domain discriminator | Classifier | + Weight Allocator

automatically fit task-specific weighting functions

 


ICML 2023 Conference

 

 Expo: Talk Panel: Colossal-AI: Breakthroughs in Efficient AI

┗ 3 layer(Efficient Memory system | 2-dim parallelism system | Large-scale Optimizaion)  딥러닝 모델 학습 효과적 진행 도와줌

 

Tutorials

 Tutorial on Multimodal Machine Learning: Principles, Challenges and Open Questions (link)

 How to DP-fy ML: A Practical Tutorial to Machine Learning with Differential Privacy (link)

 

Fairness/Ethical AI

Learning Fair representations (link)

 


 

1. Learning-Rate-Free Learning by D-Adaptation (paper) (github)

- Distance-to-solution의 LB를 가지고, step size 자동 조절 알고리즘 제시

- SGD, AdaGrad and Adam 적용

 

2. A Watermark for Large Language Models (paper) (github)

- LLM 모델 | 사람의 응답을 구별하기 위한 Watermark 방법론

- 재학습 필요 X

 

3. Generative Casual Representation Learning for Out-of-Distribution Motion Forecasting (paper) (github)

- 동작 예측 > domain 변화에 강한 예측기 제안

 

4. Motion Question Answering  via Modular Motion Programs (link)

- motion feature 추출, 선후 관계에 대한 학습 > 동작에 대한  Q&A 가능

 

5. Masked Trajectory Models for Prediction, Representation, and Control (link)

- 강인제어 문제에 MTM(Maksed Trajectory Modeling)

 

MTM learns versatile networks that can take on different roles or capabilities, by simply choosing appropriate masks at inference time

 

 


CVPR - Conditional Generation of Audio from Video via Foley Analogies


Foley = sound effects

 

"generate a soundtrack for an input silent video from a user-provided conditional audio-visual example"

learn to adapt the exaemplar sound to match the timeing of the visual content of a silentt video

 

 

❍ Conditional Foley generation - Transformer

❍ Vector-quantized audio representation - VOGAN

❍ Video input representation

❍ Conditional audio representation: vector quntized embeddings

❍ Combine tokends into a single sequence

❍ Model training

❍ Autoregressive sound prediction

❍ Generating a waveform

❍ Re-ranking based on audio-visual synchronization