OpenAI Red Teaming Network

Q: What will joining the network entail?

A: Being part of the network means you may be contacted about opportunities to test a new model, or to test an area of interest on a model that is already deployed. Work conducted as part of the network is done under a non-disclosure agreement (NDA), though we have historically published many of our red teaming findings in System Cards and blog posts. You will be compensated for time spent on red teaming projects.

Q: What is the expected time commitment for being part of the network?

A: The time that you decide to commit can be adjusted depending on your schedule. Note that not everyone in the network will be contacted for every opportunity; OpenAI will make selections based on the right fit for a particular red teaming project and will emphasize new perspectives in subsequent red teaming campaigns. Even as little as 5 hours in one year would still be valuable to us, so don’t hesitate to apply if you are interested but your time is limited.

Q: When will applicants be notified of their acceptance?

A: OpenAI will be selecting members of the network on a rolling basis, and you can apply until December 1, 2023. After this application period, we will re-evaluate opening future opportunities to apply again.

Q: Does being part of the network mean that I will be asked to red team every new model?

A: No, OpenAI will make selections based on the right fit for a particular red teaming project, and you should not expect to test every new model.

Q: What are some criteria you’re looking for in network members?

A: Some criteria we’re looking for are:

  • Demonstrated expertise or experience in a particular domain relevant to red teaming
  • Passionate about improving AI safety
  • No conflicts of interest
  • Diverse backgrounds and traditionally underrepresented groups
  • Diverse geographic representation
  • Fluency in more than one language
  • Technical skill (not required)

Q: What are other collaborative safety opportunities?

A: Beyond joining the network, there are other collaborative opportunities to contribute to AI safety. For instance, one option is to create or conduct safety evaluations on AI systems and analyze the results.

OpenAI’s open-source Evals repository (released as part of the GPT-4 launch) offers user-friendly templates and sample methods to jump-start this process.

Evaluations can range from simple Q&A tests to more complex simulations. As concrete examples, here are sample evaluations developed by OpenAI for evaluating AI behaviors from a number of angles:

Persuasion

  • MakeMeSay: How well can an AI system trick another AI system into saying a secret word?
  • MakeMePay: How well can an AI system convince another AI system to donate money?
  • Ballot Proposal: How well can an AI system influence another AI system’s support of a political proposition?

Steganography (hidden messaging)

  • Steganography: How well can an AI system pass secret messages without being caught by another AI system?
  • Text Compression: How well can an AI system compress and decompress messages, to enable hiding secret messages?
  • Schelling Point: How well can an AI system coordinate with another AI system, without direct communication?
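
To make the simple Q&A end of that range concrete, here is a minimal sketch of an exact-match evaluation in plain Python. It does not use the Evals repository's API; the function `exact_match_eval`, the stand-in `toy_model`, and the `input`/`ideal` field names are all assumptions chosen for illustration. A real evaluation would call an actual model in place of the toy function.

```python
from typing import Callable

# Hypothetical samples: each pairs a prompt with the expected ("ideal") answer.
SAMPLES = [
    {"input": "What is 2 + 2?", "ideal": "4"},
    {"input": "What is the capital of France?", "ideal": "paris"},
    {"input": "What is the chemical symbol for gold?", "ideal": "Au"},
]


def toy_model(prompt: str) -> str:
    """Stand-in for a real model call (assumption: returns a plain string)."""
    canned = {
        "What is 2 + 2?": "4",
        "What is the capital of France?": "Paris",
    }
    return canned.get(prompt, "I don't know")


def exact_match_eval(model: Callable[[str], str], samples: list[dict]) -> float:
    """Return the fraction of samples answered exactly right.

    Comparison ignores case and surrounding whitespace.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    correct = 0
    for sample in samples:
        answer = model(sample["input"]).strip().lower()
        if answer == sample["ideal"].strip().lower():
            correct += 1
    return correct / len(samples)


accuracy = exact_match_eval(toy_model, SAMPLES)
print(f"accuracy: {accuracy:.2f}")  # prints "accuracy: 0.67"
```

Exact matching is the crudest possible grader. The persuasion and steganography evaluations listed above involve multi-turn interaction between systems and need considerably more machinery than this.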

We encourage creativity and experimentation in evaluating AI systems. Once completed, we welcome you to contribute your evaluation to the open-source Evals repo for use by the broader AI community.

You can also apply to our Researcher Access Program, which provides credits to support researchers using our products to study areas related to the responsible deployment of AI and mitigating associated risks.

Date: 2023-09-19 03:00:00

Alina A, Toronto
Alina A, a UofT graduate and Google Certified Cyber Security analyst, is currently based in Toronto, Canada. She is passionate about researching and writing on cybersecurity issues, trends, and concerns in an emerging digital world.

