Superalignment Fast Grants Needs A Question To Prioritize "Low Compute/Fast Feedback"
Here's Why
On December 14th, 2023, the OpenAI Superalignment team released their first paper, “Weak-To-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision” [LINK], and announced 10 million dollars in grants [LINK] through the “Superalignment Fast Grants” program.
Many people outside OpenAI have discussed the Superalignment paper in extensive detail, most prominently in the debates under Jan Leike’s X/Twitter post [LINK]. My conclusion from reading the paper and the Fast Grants application is that OpenAI (and any other alignment grantmaking organization) should add this question to its forms:
“How will you prioritize fast feedback loops and feasibility with small amounts of compute in your research methods?”
[Response Length: Short, 1-2 sentences, bullet points are acceptable]
I argue that using this question to prioritize fast feedback loops and feasibility with small amounts of compute would be in the best interest of the alignment community for two reasons:
The purpose of the Superalignment Fast Grants is to “rally the best researchers and engineers in the world to meet this challenge” and “to bring new people into the field.” [LINK] The pool of technically proficient software engineers and ML researchers who could contribute is quite large, but the limiting resources are compute and the healthy hours in a day available for research. Fast feedback loops and feasibility with small amounts of compute would therefore enable more technical people to contribute.
A careful reader might object, “Well, part of why the Superalignment Weak-To-Strong Generalization paper methods were so fast and scalable was that they were more of a proof-of-concept than a final alignment strategy.” My reply is that the question does not require a researcher’s method to have fast feedback loops or to be feasible with small amounts of compute. Rather, it frames these properties as something to be prioritized much more heavily than they are now, because iterative, accessible methods can, in addition to enabling more people to contribute (my first point), allow a lot of empirical progress to be made. [LINK]