📚 Bibliography

This page contains an organized list of all papers used by this course, grouped by topic.

To cite this course, use the citation provided in the GitHub repository.

🔵 = Paper directly cited in this course. Other papers have informed my understanding of the topic.

Note: since neither the GPT-3 nor the GPT-3 Instruct paper corresponds to the davinci models, I attempt not to cite them as such.

Prompt Engineering Strategies

Chain of Thought(@wei2022chain) 🔵

Zero Shot Chain of Thought(@kojima2022large) 🔵

Self Consistency(@wang2022selfconsistency) 🔵

What Makes Good In-Context Examples for GPT-3?(@liu2021makes) 🔵

Ask-Me-Anything Prompting(@arora2022ama) 🔵

Generated Knowledge(@liu2021generated) 🔵

Recitation-Augmented Language Models(@sun2022recitationaugmented) 🔵

Rethinking the role of demonstrations(@min2022rethinking) 🔵

Scratchpads(@nye2021work)

Maieutic Prompting(@jung2022maieutic)

STaR(@zelikman2022star)

Least to Most(@zhou2022leasttomost) 🔵

Reliability

MathPrompter(@imani2023mathprompter) 🔵

The Unreliability of Explanations in Few-shot Prompting for Textual Reasoning(@ye2022unreliability) 🔵

Prompting GPT-3 to be reliable(@si2022prompting)

Diverse Prompts(@li2022advance) 🔵

Calibrate Before Use: Improving Few-Shot Performance of Language Models(@zhao2021calibrate) 🔵

Enhanced Self Consistency(@mitchell2022enhancing)

Bias and Toxicity in Zero-Shot CoT(@shaikh2022second) 🔵

Constitutional AI: Harmlessness from AI Feedback(@bai2022constitutional) 🔵

Compositional Generalization - SCAN(@lake2018scan)

Automated Prompt Engineering

AutoPrompt(@shin2020autoprompt) 🔵

Automatic Prompt Engineer(@zhou2022large)

Models

Language Models

GPT-3(@brown2020language) 🔵

GPT-3 Instruct(@ouyang2022training) 🔵

PaLM(@chowdhery2022palm) 🔵

BLOOM(@scao2022bloom) 🔵

BLOOM+1 (more languages / zero-shot improvements)(@yong2022bloom1)

Jurassic 1(@lieberjurassic) 🔵

GPT-J-6B(@wange2021gptj)

RoBERTa(@liu2019roberta)

Image Models

Stable Diffusion(@rombach2021highresolution) 🔵

DALLE(@ramesh2022hierarchical) 🔵

Soft Prompting

Soft Prompting(@lester2021power) 🔵

Interpretable Discretized Soft Prompts(@khashabi2021prompt) 🔵

Datasets

MultiArith(@roy-roth-2015-solving) 🔵

GSM8K(@cobbe2021training) 🔵

HotPotQA(@yang2018hotpotqa) 🔵

FEVER(@thorne2018fever) 🔵

BBQ: A Hand-Built Bias Benchmark for Question Answering(@parrish2021bbq) 🔵

Image Prompt Engineering

Taxonomy of prompt modifiers(@oppenlaender2022taxonomy)

DiffusionDB(@wang2022diffusiondb)

The DALLE 2 Prompt Book(@parsons2022dalleprompt) 🔵

Prompt Engineering for Text-Based Generative Art(@oppenlaender2022prompt) 🔵

With the right prompt, Stable Diffusion 2.0 can do hands.(@blake2022with) 🔵

Optimizing Prompts for Text-to-Image Generation(@hao2022optimizing)

Prompt Engineering IDEs

Prompt IDE(@strobelt2022promptide) 🔵

Prompt Source(@bach2022promptsource) 🔵

PromptChainer(@wu2022promptchainer) 🔵

PromptMaker(@jiang2022promptmaker) 🔵

Tooling

LangChain(@Chase_LangChain_2022) 🔵

TextBox 2.0: A Text Generation Library with Pre-trained Language Models(@tang2022textbox) 🔵

OpenPrompt: An Open-source Framework for Prompt-learning(@ding2021openprompt) 🔵

GPT Index(@Liu_GPT_Index_2022) 🔵

Applied Prompt Engineering

Language Model Cascades(@dohan2022language)

MRKL(@karpas2022mrkl) 🔵

ReAct(@yao2022react) 🔵

PAL: Program-aided Language Models(@gao2022pal) 🔵

User Interface Design

Design Guidelines for Prompt Engineering Text-to-Image Generative Models(@liu2022design)

Prompt Injection

Machine Generated Text: A Comprehensive Survey of Threat Models and Detection Methods(@crothers2022machine) 🔵

Evaluating the Susceptibility of Pre-Trained Language Models via Handcrafted Adversarial Examples(@branch2022evaluating) 🔵

Prompt injection attacks against GPT-3(@simon2022inject) 🔵

Exploiting GPT-3 prompts with malicious inputs that order the model to ignore its previous directions(@goodside2022inject) 🔵

adversarial-prompts(@chase2021adversarial) 🔵

GPT-3 Prompt Injection Defenses(@goodside2021gpt) 🔵

Talking to machines: prompt engineering & injection(@christoph2022talking)

Exploring Prompt Injection Attacks(@selvi2022exploring) 🔵

Using GPT-Eliezer against ChatGPT Jailbreaking(@armstrong2022using) 🔵

Microsoft Bing Chat Prompt(@kevinbing)

Jailbreaking

Ignore Previous Prompt: Attack Techniques For Language Models(@perez2022jailbreak)

Lessons learned on Language Model Safety and misuse(@brundage_2022)

Toxicity Detection with Generative Prompt-based Inference(@wang2022jailbreak)

New and improved content moderation tooling(@markov_2022)

OpenAI API(@openai_api) 🔵

OpenAI ChatGPT(@openai_chatgpt) 🔵

ChatGPT 4 Tweet(@alice2022jailbreak) 🔵

Acting Tweet(@miguel2022jailbreak) 🔵

Research Tweet(@derek2022jailbreak) 🔵

Pretend Ability Tweet(@nero2022jailbreak) 🔵

Responsibility Tweet(@nick2022jailbreak) 🔵

Lynx Mode Tweet(@jonas2022jailbreak) 🔵

Sudo Mode Tweet(@sudo2022jailbreak) 🔵

Ignore Previous Prompt(@ignore_previous_prompt) 🔵

Updated Jailbreaking Prompts(@AI_jailbreak) 🔵

Surveys

Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing(@liu2021pretrain)

PromptPapers(@ning2022papers)

Dataset Generation

Discovering Language Model Behaviors with Model-Written Evaluations(@perez2022discovering)

Selective Annotation Makes Language Models Better Few-Shot Learners(@su2022selective)

Applications

Atlas: Few-shot Learning with Retrieval Augmented Language Models(@izacard2022atlas)

STRUDEL: Structured Dialogue Summarization for Dialogue Comprehension(@wang2022strudel)

Miscellaneous

Prompting Is Programming: A Query Language For Large Language Models(@beurerkellner2022prompting)

Parallel Context Windows Improve In-Context Learning of Large Language Models(@ratner2022parallel)

Learning to Perform Complex Tasks through Compositional Fine-Tuning of Language Models(@bursztyn2022learning)

Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks(@wang2022supernaturalinstructions)

Making Pre-trained Language Models Better Few-shot Learners(@gao2021making)

Grounding with search results(@livin2022large)

How to Prompt? Opportunities and Challenges of Zero- and Few-Shot Learning for Human-AI Interaction in Creative Applications of Generative Models(@dang2022prompt)

On Measuring Social Biases in Prompt-Based Multi-Task Learning(@akyrek2022measuring)

Plot Writing From Pre-Trained Language Models(@jin2022plot) 🔵

StereoSet: Measuring stereotypical bias in pretrained language models(@nadeem-etal-2021-stereoset)

Survey of Hallucination in Natural Language Generation(@Ji_2022)

Examples(@2022examples)

Wordcraft(@yuan2022wordcraft)

PainPoints(@fadnavis2022pain)

Self-Instruct: Aligning Language Model with Self Generated Instructions(@wang2022selfinstruct)

From Images to Textual Prompts: Zero-shot VQA with Frozen Large Language Models(@guo2022images)

Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference(@schick2020exploiting)

A Watermark for Large Language Models(@kirchenbauer2023watermarking)