Commit 60aa0e2

alexrame committed: wip
1 parent a8f8b9a, commit 60aa0e2

16 files changed: 877 additions, 11 deletions

404.html

Lines changed: 4 additions & 4 deletions
@@ -260,6 +260,10 @@ <h1>Page not found</h1>
 
 <h2>Publications</h2>
 
+<ul>
+<li><a href="/publication/rlwa/">Rewarded soups: towards Pareto-optimality by interpolating weights fine-tuned on diverse rewards</a></li>
+</ul>
+
 <ul>
 <li><a href="/publication/ratatouille/">Model Ratatouille: Recycling Diverse Models for Out-of-Distribution Generalization</a></li>
 </ul>
@@ -276,10 +280,6 @@ <h2>Publications</h2>
 <li><a href="/publication/fishr/">Fishr: Invariant Gradient Variances for Out-of-Distribution Generalization</a></li>
 </ul>
 
-<ul>
-<li><a href="/publication/mixmo/">MixMo: Mixing Multiple Inputs for Multiple Outputs via Deep Subnetworks</a></li>
-</ul>
-
 
 
 
index.html

Lines changed: 69 additions & 3 deletions
@@ -491,6 +491,72 @@ <h1>Selected Publications</h1>
 
 
 
+<div class="media stream-item" itemscope itemtype="http://schema.org/Event">
+<div class="media-body">
+
+<h3 class="article-title mb-0 mt-0" itemprop="name">
+Rewarded soups: towards Pareto-optimality by interpolating weights fine-tuned on diverse rewards
+</h3>
+
+<div class="stream-meta article-metadata">
+<div itemprop="author">
+Alexandre Ramé, Guillaume Couairon, Corentin Dancette, Jean-Baptiste Gaya, Mustafa Shukor, Laure Soulier, Matthieu Cord</div>
+soon on arXiv<br>
+
+</div>
+
+<div class="ml-3">
+
+
+
+<img src="/publication/rlwa/featured_hu3f61ec8d614379be647f7d7ca55a4688_77418_400x0_resize_box_2.png" itemprop="image">
+
+</div>
+
+
+
+
+
+
+<div class="article-style" itemprop="articleBody">
+We introduce rewarded soup, a new strategy to trade-off between multiple rewards when fine-tuning foundation models with RLHF; we first learn one network for each reward, and then linearly interpolate their weights.
+</div>
+
+
+<div class="talk-links">
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+</div><br />
+</div>
+</div>
+
+
 <div class="media stream-item" itemscope itemtype="http://schema.org/Event">
 <div class="media-body">
 
@@ -501,7 +567,7 @@ <h3 class="article-title mb-0 mt-0" itemprop="name">
 <div class="stream-meta article-metadata">
 <div itemprop="author">
 Alexandre Ramé, Kartik Ahuja, Jianyu Zhang, Matthieu Cord, Léon Bottou, David Lopez-Paz</div>
-arXiv<br>
+ICML 2023<br>
 
 </div>
 
@@ -1158,8 +1224,8 @@ <h3 class="article-title mb-0 mt-0" itemprop="name">
 
 <div class="stream-meta article-metadata">
 <div itemprop="author">
-Alexandre Rame, Emilien Garreau, Hedi Ben-Younes, Charles Ollion</div>
-arXiv Preprint<br>
+Alexandre Ramé, Emilien Garreau, Hedi Ben-Younes, Charles Ollion</div>
+arXiv<br>
 
 </div>

index.json

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
1-
[{"authors":["Alexandre Ramé","Kartik Ahuja","Jianyu Zhang","Matthieu Cord","Léon Bottou","David Lopez-Paz"],"categories":null,"content":"","date":1671490800,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1671490800,"objectID":"2e3ae69c28a112205a7743774c87e419","permalink":"/publication/ratatouille/","publishdate":"2022-12-20T00:00:00+01:00","relpermalink":"/publication/ratatouille/","section":"publication","summary":"","tags":null,"title":"Model Ratatouille: Recycling Diverse Models for Out-of-Distribution Generalization","type":"publication"},{"authors":["Alexandre Ramé","Matthieu Kirchmeyer","Thibaud Rahier","Alain Rakotomamonjy","Patrick Gallinari","Matthieu Cord"],"categories":null,"content":"","date":1652997600,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1652997600,"objectID":"3468ac7cd0e597d0fd2adf7470645538","permalink":"/publication/diwa/","publishdate":"2022-05-20T00:00:00+02:00","relpermalink":"/publication/diwa/","section":"publication","summary":"","tags":null,"title":"Diverse Weight Averaging for Out-of-Distribution Generalization","type":"publication"},{"authors":["Arthur Douillard","Alexandre Ramé","Guillaume Couairon","Matthieu Cord"],"categories":null,"content":"","date":1630965600,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1630965600,"objectID":"7541e62fdd60405a9055994724ce4f05","permalink":"/publication/dytox/","publishdate":"2021-09-07T00:00:00+02:00","relpermalink":"/publication/dytox/","section":"publication","summary":"","tags":null,"title":"DyTox: Transformers for Continual Learning with DYnamic TOken eXpansion","type":"publication"},{"authors":["Alexandre Ramé","Corentin Dancette","Matthieu 
Cord"],"categories":null,"content":"","date":1630965600,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1630965600,"objectID":"ac87dc1e61d871e29bbd68d2acf9280c","permalink":"/publication/fishr/","publishdate":"2021-09-07T00:00:00+02:00","relpermalink":"/publication/fishr/","section":"publication","summary":"","tags":null,"title":"Fishr: Invariant Gradient Variances for Out-of-Distribution Generalization","type":"publication"},{"authors":[],"categories":null,"content":"Un condensé de la magie de Liu Cixin: la volonté de coller au plus près à la science malgré quelques pré-requis SF, une poésie toute chinoise sublimant ce monde qui apparaît devant les yeux du lecteur, une réflexion philosophique à tiroirs infinis poussant parfois à relire plusieurs fois les mêmes passages et à y penser sous 4 angles différents, des échos énormes aux plus grandes problématiques actuelles. Nous savons maintenant que notre Terre est finie, limitée. Mais dans cette nouvelle, c\u0026rsquo;est notre lumière dans le noir, notre point de repère universel qui est en danger: c\u0026rsquo;est bien notre Soleil qui est sur le point de mourir.\n","date":1607382000,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1607382000,"objectID":"e5cc49057d4b944e5fceab456ba8e768","permalink":"/books/wandering-earth/","publishdate":"2020-12-08T00:00:00+01:00","relpermalink":"/books/wandering-earth/","section":"books","summary":"Un condensé de la magie de Liu Cixin: la volonté de coller au plus près à la science malgré quelques pré-requis SF, une poésie toute chinoise sublimant ce monde qui apparaît devant les yeux du lecteur, une réflexion philosophique à tiroirs infinis poussant parfois à relire plusieurs fois les mêmes passages et à y penser sous 4 angles différents, des échos énormes aux plus grandes problématiques actuelles. 
Nous savons maintenant que notre Terre est finie, limitée.","tags":[],"title":"Terre Errante","type":"books"},{"authors":["Alexandre Ramé","Rémy Sun","Matthieu Cord"],"categories":null,"content":"","date":1583017200,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1583017200,"objectID":"a95f4e852f0d2d3f9ad26f51e699360a","permalink":"/publication/mixmo/","publishdate":"2020-03-01T00:00:00+01:00","relpermalink":"/publication/mixmo/","section":"publication","summary":"","tags":null,"title":"MixMo: Mixing Multiple Inputs for Multiple Outputs via Deep Subnetworks","type":"publication"},{"authors":["Alexandre Ramé","Matthieu Cord"],"categories":null,"content":"","date":1577833200,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1577833200,"objectID":"2b5b8ce4b0f374ae9c148c76097c2837","permalink":"/publication/dice/","publishdate":"2020-01-01T00:00:00+01:00","relpermalink":"/publication/dice/","section":"publication","summary":"","tags":null,"title":"DICE: Diversity in Deep Ensembles via Conditional Redundancy Adversarial Estimation","type":"publication"},{"authors":["Alexandre Ramé","Arthur Douillard","Charles Ollion"],"categories":null,"content":"https://arxiv.org/abs/2010.02849\n","date":1569880800,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1569880800,"objectID":"417506d233c5c4f087a23ed0a76dab2e","permalink":"/publication/core/","publishdate":"2019-10-01T00:00:00+02:00","relpermalink":"/publication/core/","section":"publication","summary":"https://arxiv.org/abs/2010.02849","tags":null,"title":"CORE: Color Regression for Multiple Colors Fashion Garments","type":"publication"},{"authors":["Alexandre Rame","Emilien Garreau","Hedi Ben-Younes","Charles 
Ollion"],"categories":null,"content":"","date":1543618800,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1543618800,"objectID":"b83854a9c7581f99c5d80f267f9e7b4a","permalink":"/publication/omnia/","publishdate":"2018-12-01T00:00:00+01:00","relpermalink":"/publication/omnia/","section":"publication","summary":"","tags":null,"title":"OMNIA Faster R-CNN: Detection in the Wild through Dataset Merging and Soft Distillation","type":"publication"},{"authors":[],"categories":null,"content":"Bouquin unique, belle porte d\u0026rsquo;entrée à la hard sciences, qui plaira aux adeptes de SF comme aux fans de thrillers et aux ufologues. Premier tome d\u0026rsquo;une trilogie qui aura remporté le prix Hugo.\n","date":1526162400,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1526162400,"objectID":"1c93f05b818253d22a35052bf95641b8","permalink":"/books/three-body/","publishdate":"2018-05-13T00:00:00+02:00","relpermalink":"/books/three-body/","section":"books","summary":"Bouquin unique, belle porte d\u0026rsquo;entrée à la hard sciences, qui plaira aux adeptes de SF comme aux fans de thrillers et aux ufologues. 
Premier tome d\u0026rsquo;une trilogie qui aura remporté le prix Hugo.","tags":[],"title":"The Three-Body Problem trilogy","type":"books"},{"authors":["Charles Corbiere","Hedi Ben-Younes","Alexandre Ramé","Charles Ollion"],"categories":null,"content":"https://openaccess.thecvf.com/content_ICCV_2017_workshops/papers/w32/Corbiere_Leveraging_Weakly_Annotated_ICCV_2017_paper.pdf\n","date":1470002400,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1470002400,"objectID":"a3a7e1a45c6d8c61fd6c2aaba9d068ea","permalink":"/publication/weakly/","publishdate":"2016-08-01T00:00:00+02:00","relpermalink":"/publication/weakly/","section":"publication","summary":"https://openaccess.thecvf.com/content_ICCV_2017_workshops/papers/w32/Corbiere_Leveraging_Weakly_Annotated_ICCV_2017_paper.pdf","tags":null,"title":"Leveraging Weakly Annotated Data for Fashion Image Retrieval and Label Prediction","type":"publication"}]
1+
[{"authors":["Alexandre Ramé","Guillaume Couairon","Corentin Dancette","Jean-Baptiste Gaya","Mustafa Shukor","Laure Soulier","Matthieu Cord"],"categories":null,"content":"","date":1684620000,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1684620000,"objectID":"7c9c768efb988234acbc742398cea03e","permalink":"/publication/rlwa/","publishdate":"2023-05-21T00:00:00+02:00","relpermalink":"/publication/rlwa/","section":"publication","summary":"","tags":null,"title":"Rewarded soups: towards Pareto-optimality by interpolating weights fine-tuned on diverse rewards","type":"publication"},{"authors":["Alexandre Ramé","Kartik Ahuja","Jianyu Zhang","Matthieu Cord","Léon Bottou","David Lopez-Paz"],"categories":null,"content":"","date":1671490800,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1671490800,"objectID":"2e3ae69c28a112205a7743774c87e419","permalink":"/publication/ratatouille/","publishdate":"2022-12-20T00:00:00+01:00","relpermalink":"/publication/ratatouille/","section":"publication","summary":"","tags":null,"title":"Model Ratatouille: Recycling Diverse Models for Out-of-Distribution Generalization","type":"publication"},{"authors":["Alexandre Ramé","Matthieu Kirchmeyer","Thibaud Rahier","Alain Rakotomamonjy","Patrick Gallinari","Matthieu Cord"],"categories":null,"content":"","date":1652997600,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1652997600,"objectID":"3468ac7cd0e597d0fd2adf7470645538","permalink":"/publication/diwa/","publishdate":"2022-05-20T00:00:00+02:00","relpermalink":"/publication/diwa/","section":"publication","summary":"","tags":null,"title":"Diverse Weight Averaging for Out-of-Distribution Generalization","type":"publication"},{"authors":["Arthur Douillard","Alexandre Ramé","Guillaume Couairon","Matthieu 
Cord"],"categories":null,"content":"","date":1630965600,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1630965600,"objectID":"7541e62fdd60405a9055994724ce4f05","permalink":"/publication/dytox/","publishdate":"2021-09-07T00:00:00+02:00","relpermalink":"/publication/dytox/","section":"publication","summary":"","tags":null,"title":"DyTox: Transformers for Continual Learning with DYnamic TOken eXpansion","type":"publication"},{"authors":["Alexandre Ramé","Corentin Dancette","Matthieu Cord"],"categories":null,"content":"","date":1630965600,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1630965600,"objectID":"ac87dc1e61d871e29bbd68d2acf9280c","permalink":"/publication/fishr/","publishdate":"2021-09-07T00:00:00+02:00","relpermalink":"/publication/fishr/","section":"publication","summary":"","tags":null,"title":"Fishr: Invariant Gradient Variances for Out-of-Distribution Generalization","type":"publication"},{"authors":[],"categories":null,"content":"Un condensé de la magie de Liu Cixin: la volonté de coller au plus près à la science malgré quelques pré-requis SF, une poésie toute chinoise sublimant ce monde qui apparaît devant les yeux du lecteur, une réflexion philosophique à tiroirs infinis poussant parfois à relire plusieurs fois les mêmes passages et à y penser sous 4 angles différents, des échos énormes aux plus grandes problématiques actuelles. Nous savons maintenant que notre Terre est finie, limitée. 
Mais dans cette nouvelle, c\u0026rsquo;est notre lumière dans le noir, notre point de repère universel qui est en danger: c\u0026rsquo;est bien notre Soleil qui est sur le point de mourir.\n","date":1607382000,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1607382000,"objectID":"e5cc49057d4b944e5fceab456ba8e768","permalink":"/books/wandering-earth/","publishdate":"2020-12-08T00:00:00+01:00","relpermalink":"/books/wandering-earth/","section":"books","summary":"Un condensé de la magie de Liu Cixin: la volonté de coller au plus près à la science malgré quelques pré-requis SF, une poésie toute chinoise sublimant ce monde qui apparaît devant les yeux du lecteur, une réflexion philosophique à tiroirs infinis poussant parfois à relire plusieurs fois les mêmes passages et à y penser sous 4 angles différents, des échos énormes aux plus grandes problématiques actuelles. Nous savons maintenant que notre Terre est finie, limitée.","tags":[],"title":"Terre Errante","type":"books"},{"authors":["Alexandre Ramé","Rémy Sun","Matthieu Cord"],"categories":null,"content":"","date":1583017200,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1583017200,"objectID":"a95f4e852f0d2d3f9ad26f51e699360a","permalink":"/publication/mixmo/","publishdate":"2020-03-01T00:00:00+01:00","relpermalink":"/publication/mixmo/","section":"publication","summary":"","tags":null,"title":"MixMo: Mixing Multiple Inputs for Multiple Outputs via Deep Subnetworks","type":"publication"},{"authors":["Alexandre Ramé","Matthieu Cord"],"categories":null,"content":"","date":1577833200,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1577833200,"objectID":"2b5b8ce4b0f374ae9c148c76097c2837","permalink":"/publication/dice/","publishdate":"2020-01-01T00:00:00+01:00","relpermalink":"/publication/dice/","section":"publication","summary":"","tags":null,"title":"DICE: Diversity in Deep Ensembles via Conditional Redundancy Adversarial 
Estimation","type":"publication"},{"authors":["Alexandre Ramé","Arthur Douillard","Charles Ollion"],"categories":null,"content":"https://arxiv.org/abs/2010.02849\n","date":1569880800,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1569880800,"objectID":"417506d233c5c4f087a23ed0a76dab2e","permalink":"/publication/core/","publishdate":"2019-10-01T00:00:00+02:00","relpermalink":"/publication/core/","section":"publication","summary":"https://arxiv.org/abs/2010.02849","tags":null,"title":"CORE: Color Regression for Multiple Colors Fashion Garments","type":"publication"},{"authors":["Alexandre Ramé","Emilien Garreau","Hedi Ben-Younes","Charles Ollion"],"categories":null,"content":"","date":1543618800,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1543618800,"objectID":"b83854a9c7581f99c5d80f267f9e7b4a","permalink":"/publication/omnia/","publishdate":"2018-12-01T00:00:00+01:00","relpermalink":"/publication/omnia/","section":"publication","summary":"","tags":null,"title":"OMNIA Faster R-CNN: Detection in the Wild through Dataset Merging and Soft Distillation","type":"publication"},{"authors":[],"categories":null,"content":"Bouquin unique, belle porte d\u0026rsquo;entrée à la hard sciences, qui plaira aux adeptes de SF comme aux fans de thrillers et aux ufologues. Premier tome d\u0026rsquo;une trilogie qui aura remporté le prix Hugo.\n","date":1526162400,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1526162400,"objectID":"1c93f05b818253d22a35052bf95641b8","permalink":"/books/three-body/","publishdate":"2018-05-13T00:00:00+02:00","relpermalink":"/books/three-body/","section":"books","summary":"Bouquin unique, belle porte d\u0026rsquo;entrée à la hard sciences, qui plaira aux adeptes de SF comme aux fans de thrillers et aux ufologues. 
Premier tome d\u0026rsquo;une trilogie qui aura remporté le prix Hugo.","tags":[],"title":"The Three-Body Problem trilogy","type":"books"},{"authors":["Charles Corbiere","Hedi Ben-Younes","Alexandre Ramé","Charles Ollion"],"categories":null,"content":"https://openaccess.thecvf.com/content_ICCV_2017_workshops/papers/w32/Corbiere_Leveraging_Weakly_Annotated_ICCV_2017_paper.pdf\n","date":1470002400,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1470002400,"objectID":"a3a7e1a45c6d8c61fd6c2aaba9d068ea","permalink":"/publication/weakly/","publishdate":"2016-08-01T00:00:00+02:00","relpermalink":"/publication/weakly/","section":"publication","summary":"https://openaccess.thecvf.com/content_ICCV_2017_workshops/papers/w32/Corbiere_Leveraging_Weakly_Annotated_ICCV_2017_paper.pdf","tags":null,"title":"Leveraging Weakly Annotated Data for Fashion Image Retrieval and Label Prediction","type":"publication"}]

index.xml

Lines changed: 9 additions & 0 deletions
@@ -10,6 +10,15 @@
 <lastBuildDate>Wed, 20 Apr 2016 00:00:00 +0200</lastBuildDate>
 <atom:link href="/" rel="self" type="application/rss+xml" />
 
+<item>
+<title>Rewarded soups: towards Pareto-optimality by interpolating weights fine-tuned on diverse rewards</title>
+<link>/publication/rlwa/</link>
+<pubDate>Sun, 21 May 2023 00:00:00 +0200</pubDate>
+
+<guid>/publication/rlwa/</guid>
+<description></description>
+</item>
+
 <item>
 <title>Model Ratatouille: Recycling Diverse Models for Out-of-Distribution Generalization</title>
 <link>/publication/ratatouille/</link>
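
Note on the summary text added to index.html in this commit: it describes rewarded soup as learning one network per reward and then linearly interpolating their weights. The following is a minimal sketch of that interpolation step only, not the authors' implementation. It assumes each fine-tuned network is available as a plain dictionary mapping parameter names to lists of floats; the helper name `interpolate_weights`, the toy parameters, and the coefficients are all hypothetical and do not come from the commit.

```python
from typing import Dict, List, Sequence

# Assumption: a "network" is just a mapping from parameter name to a flat list of floats.
Weights = Dict[str, List[float]]


def interpolate_weights(models: Sequence[Weights], coeffs: Sequence[float]) -> Weights:
    """Return the parameter-wise weighted sum: sum_i coeffs[i] * models[i]."""
    if not models or len(models) != len(coeffs):
        raise ValueError("need one coefficient per model and at least one model")
    if abs(sum(coeffs) - 1.0) > 1e-9:
        raise ValueError("interpolation coefficients are expected to sum to 1")

    names = models[0].keys()
    for m in models:
        # All networks must share the same architecture (same names and shapes).
        if m.keys() != names:
            raise ValueError("all models must have the same parameter names")
        for name in names:
            if len(m[name]) != len(models[0][name]):
                raise ValueError(f"shape mismatch for parameter {name!r}")

    return {
        name: [
            sum(c * m[name][j] for c, m in zip(coeffs, models))
            for j in range(len(models[0][name]))
        ]
        for name in names
    }


# Hypothetical toy example: two networks, each fine-tuned on a different reward.
theta_reward_a = {"w": [0.0, 2.0], "b": [1.0]}
theta_reward_b = {"w": [4.0, 0.0], "b": [3.0]}

# Weight reward A at 0.25 and reward B at 0.75.
soup = interpolate_weights([theta_reward_a, theta_reward_b], [0.25, 0.75])
print(soup)
```

Varying the coefficients moves the resulting network between the two fine-tuned endpoints. Whether the interpolated network actually achieves a good trade-off between rewards is an empirical question this sketch does not address.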
