This tutorial walks you through using the OpenAI GPT API from R to code and summarize free-text responses, such as user reviews. You will load a dataset of reviews, send each one to GPT for summarization, and collect the results programmatically.
🧰 What You Need
- R installed on your system
- An OpenAI API key – [see this guide]
- A CSV file with at least one column of data to code – [sample of nine reviews here]
- (Optional) The entire code in an R Markdown file – [download here]
1. Setting Up Your R Environment
Install Required Packages
install.packages(c("dplyr", "stringr", "httr", "jsonlite", "readr"))
Load the Libraries
library(dplyr)
library(stringr)
library(httr)
library(jsonlite)
library(readr)
2. Load Your Dataset
In this example, we assume you have a file called Reviews_Small.csv containing a column named review.
data <- read.csv("Reviews_Small.csv")
str(data)
Make sure your dataset has at least one column with text data to summarize.
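If you don't have the sample file yet, a small hand-written stand-in like the sketch below (the reviews are invented for illustration) is enough to test the rest of the pipeline:
# Hypothetical stand-in data: three invented reviews for testing the pipeline
data <- data.frame(
  review = c(
    "The blender is powerful and surprisingly quiet. Highly recommend.",
    "Shipping took three weeks and the box arrived damaged.",
    "Decent value for the price, though the manual is confusing."
  ),
  stringsAsFactors = FALSE
)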
3. Set Up the GPT API Call
Store Your API Key
Important: Replace the placeholder string below with your actual API key from OpenAI.
api_key <- "sk-REPLACE_WITH_YOUR_OWN_KEY"
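Hardcoding a key risks leaking it when you share the script. A common, safer alternative is to store the key in the OPENAI_API_KEY environment variable (for example in your .Renviron file) and read it with Sys.getenv():
# Safer alternative (sketch): keep the key out of the script entirely.
# Add OPENAI_API_KEY=sk-... to your .Renviron file, restart R, then:
api_key <- Sys.getenv("OPENAI_API_KEY")
if (api_key == "") stop("OPENAI_API_KEY is not set")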
Define a Function to Send Prompts
send_to_chatgpt <- function(text, temp, api_key, max_tokens) {
  url <- "https://api.openai.com/v1/chat/completions"
  # Authentication header plus the JSON content type the API expects
  headers <- add_headers(
    `Content-Type` = "application/json",
    `Authorization` = paste("Bearer", api_key)
  )
  # Request body: the model, the user prompt, and the sampling settings
  body <- list(
    model = "gpt-4o",
    messages = list(list(role = "user", content = text)),
    temperature = temp,
    max_tokens = max_tokens
  )
  response <- POST(url, headers, body = toJSON(body, auto_unbox = TRUE))
  parsed <- content(response, as = "parsed")
  # The generated reply lives in the first (and here only) choice
  parsed$choices[[1]]$message$content
}
This function sends a prompt to GPT-4o and returns the generated output. You can control the “creativity” of the response using temperature (lower values = more deterministic).
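Before looping over a whole dataset, it is cheap insurance to smoke-test the function once; the prompt here is an arbitrary example:
# One short, inexpensive request to confirm the key and function work
test_reply <- send_to_chatgpt("Reply with the single word: pong",
                              temp = 0.2, api_key = api_key, max_tokens = 5)
print(test_reply)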
4. Create the Summarization Prompt
instructions_template <- "
You are a helpful assistant. Summarize the following review in one sentence:
Review:
"
5. Loop Through the Reviews
The function below loops over every combination of temperature and "judge" (a repeated run at the same settings), sends each review to GPT, and records the raw output in a new column per combination.
process_reviews <- function(num_judges, temperatures, data, instructions_template, api_key, max_tokens) {
  for (temp in temperatures) {
    for (judge in 1:num_judges) {
      # One new column per temperature/judge combination, e.g. GPT_Temp0.2_Judge1
      temp_col <- sprintf("GPT_Temp%.1f_Judge%d", temp, judge)
      data[[temp_col]] <- sapply(data$review, function(review) {
        prompt <- paste(instructions_template, review, sep = "\n")
        send_to_chatgpt(prompt, temp, api_key, max_tokens)
      })
    }
  }
  return(data)
}
Run the Function
data <- process_reviews(
  num_judges = 2,
  temperatures = c(0.2, 1.5),
  data = data,
  instructions_template = instructions_template,
  api_key = api_key,
  max_tokens = 60
)
This sends every review to GPT four times in total: twice (one call per judge) at temperature 0.2 (more deterministic) and twice at 1.5 (more creative).
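With four calls per review, rate limits and transient network errors become more likely. A minimal mitigation, sketched below, wraps the call in tryCatch(), pauses, and retries once; it assumes send_to_chatgpt() raises an R error when the API returns no choices, and safe_send and pause are names invented for this example:
# Illustrative wrapper: back off briefly and retry once if a call fails
safe_send <- function(text, temp, api_key, max_tokens, pause = 2) {
  tryCatch(
    send_to_chatgpt(text, temp, api_key, max_tokens),
    error = function(e) {
      Sys.sleep(pause)  # wait before the single retry
      send_to_chatgpt(text, temp, api_key, max_tokens)
    }
  )
}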
6. Save Your Results
write.csv(data, "Reviews_coded.csv", row.names = FALSE)
This exports a new CSV file with the original reviews and GPT-coded summaries.
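To spot-check the output before opening the file, peek at one of the new columns; the column name below follows the sprintf() pattern from step 5 with the parameters used above:
# First judge at temperature 0.2 (name assumes the defaults used in this tutorial)
head(data[["GPT_Temp0.2_Judge1"]])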
✅ Conclusion
Using GPT to code text data offers flexibility and scale, making it easier to process large sets of qualitative input. You can expand this framework to assign sentiment, categorize comments, or extract themes—just by changing the prompt.
Be sure to monitor your token usage if you’re doing large-scale work with the OpenAI API.
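If you want to track spend precisely, the chat completions response includes a usage object with token counts. The variant below is a sketch (send_with_usage is a name invented here; the request mirrors the function from step 3) that returns the reply together with the combined token count:
# Sketch: fetch the reply along with token counts from the usage field
send_with_usage <- function(text, temp, api_key, max_tokens) {
  url <- "https://api.openai.com/v1/chat/completions"
  headers <- add_headers(
    `Content-Type` = "application/json",
    `Authorization` = paste("Bearer", api_key)
  )
  body <- list(
    model = "gpt-4o",
    messages = list(list(role = "user", content = text)),
    temperature = temp,
    max_tokens = max_tokens
  )
  response <- POST(url, headers, body = toJSON(body, auto_unbox = TRUE))
  parsed <- content(response, as = "parsed")
  list(
    text = parsed$choices[[1]]$message$content,
    total_tokens = parsed$usage$total_tokens  # prompt + completion tokens
  )
}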