Arcs-ur
🍀
Orange!Pumpkin!


This includes the original implementation of "AdaSearch: Balancing Parametric Knowledge and Search in Large Language Models via Reinforcement Learning".

5 Updated Dec 20, 2025

PromptInject is a framework that assembles prompts in a modular fashion to provide a quantitative analysis of the robustness of LLMs to adversarial prompt attacks. 🏆 Best Paper Awards @ NeurIPS ML …

Python 445 42 Updated Feb 26, 2024

A fast + lightweight implementation of the GCG algorithm in PyTorch

Python 306 73 Updated May 13, 2025

Closed-form Continuous-time Neural Networks

Python 996 158 Updated Jul 5, 2024

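An AI-native PPT generator based on nano banana pro🍌, moving toward a true "Vibe PPT"; supports uploading any template image, uploading any source material with intelligent parsing, automatic PPT generation from a single sentence, an outline, or page descriptions, spoken edits to a specified region, and one-click export.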

Python 5,905 655 Updated Dec 23, 2025

Code and results accompanying the paper "Refusal in Language Models Is Mediated by a Single Direction".

Python 321 87 Updated Jun 13, 2025

Aligning pretrained language models with instruction data generated by themselves.

Python 4,550 524 Updated Mar 27, 2023

BELLE: Be Everyone's Large Language model Engine (an open-source Chinese conversational large language model)

HTML 8,282 768 Updated Oct 16, 2024

LongBench v2 and LongBench (ACL '25 & '24)

Python 1,047 112 Updated Jan 15, 2025
Python 1,738 77 Updated Dec 16, 2025

[New Preprint] PISanitizer: Preventing Prompt Injection to Long-Context LLMs via Prompt Sanitization

Python 8 Updated Dec 10, 2025

Universal and Transferable Attacks on Aligned Language Models

Python 4,409 587 Updated Aug 2, 2024

This repository is the official implementation of "Look-Back: Implicit Visual Re-focusing in MLLM Reasoning".

Python 76 4 Updated Jul 10, 2025

[NAACL 2025] SIUO: Cross-Modality Safety Alignment

HTML 123 11 Updated Jan 31, 2025

A multi-platform proxy client based on ClashMeta, simple and easy to use, open-source and ad-free.

Dart 27,892 1,683 Updated Dec 23, 2025

HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal

Jupyter Notebook 814 122 Updated Aug 16, 2024

An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.

Jupyter Notebook 1,929 295 Updated Aug 9, 2025

Code and documentation to train Stanford's Alpaca models, and generate the data.

Python 30,262 4,029 Updated Jul 17, 2024

Official implementation of [USENIX Sec'25] StruQ: Defending Against Prompt Injection with Structured Queries

Python 56 3 Updated Nov 10, 2025

The official repository for guided jailbreak benchmark

Python 26 Updated Jul 28, 2025

Finetune Llama-3-8b on the MathInstruct dataset

Python 115 26 Updated Oct 17, 2024

BeaverTails is a collection of datasets designed to facilitate research on safety alignment in large language models (LLMs).

Makefile 171 6 Updated Oct 27, 2023

Official codebase for "STAIR: Improving Safety Alignment with Introspective Reasoning"

Python 86 6 Updated Feb 26, 2025

The Oyster series is a set of safety models developed in-house by Alibaba-AAIG, devoted to building a responsible AI ecosystem.

Python 57 3 Updated Sep 11, 2025

A novel approach to improve the safety of large language models, enabling them to transition effectively from an unsafe to a safe state.

Python 73 2 Updated May 22, 2025

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 81,526 12,192 Updated Dec 21, 2025
Python 100 14 Updated Oct 21, 2025

[PKU EPIC Lab] A beginner-friendly introductory guide to embodied AI

616 21 Updated Dec 3, 2025

Prompts that impress at first sight

GCC Machine Description 946 186 Updated Nov 27, 2025

God-tier Cursor Rule

2,218 312 Updated Oct 5, 2025