library(ellmer)
claude_api_key <- "[your-key-goes-here]"
google_api_key <- "[your-key-goes-here]"
openai_api_key <- "[your-key-goes-here]"
Introduction
In this tutorial we cover some basic functions of the ellmer package, which lets users call LLMs via their APIs, including ChatGPT, Gemini, and others.1 This means you can chat with an LLM directly from your console, making some programming tasks easier. The package also makes working with structured data easier for R users, without requiring knowledge of other languages and formats such as Python or JSON.
The first step is to obtain an API key. An API key is a secret password that lets your R session talk to an LLM provider such as OpenAI or Google Gemini. When you call a model through a package like ellmer, your key tells the provider who you are so it can check that you have permission and charge your account correctly.
Your key is linked to your account and payment method. If someone else gets it, they could run models using your quota or spend money on your behalf. Never paste it into shared code, publish it on GitHub, or hard-code it in a file you distribute.2
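As footnote 2 describes, a safer pattern is to keep the key in your .Renviron file and read it into your session with Sys.getenv(). A minimal sketch (the variable name CLAUDE_API_KEY matches the footnote and is otherwise arbitrary):

```r
# In your .Renviron file (kept out of version control):
# CLAUDE_API_KEY=[your-key-goes-here]

# In your R script, read the key without ever writing it down:
claude_api_key <- Sys.getenv("CLAUDE_API_KEY")
```

After editing .Renviron, restart your R session so the new variable is picked up.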
Chatting with LLMs
Once your API key is set up, you can start talking to a model just like you would in ChatGPT — but from inside R. In the code below we are creating a chat connection to Anthropic’s model Claude.
The function chat_anthropic() comes from the ellmer package and uses your stored API key to authenticate you.
claude <- chat_anthropic(
  api_key = claude_api_key
)
Once we have defined our chat object claude, we call claude$chat() to send a message to Anthropic and receive a reply in the console.
claude$chat(
  "Hello Claude! Write a haiku on why R is better than Python."
)
Here's a haiku on why R is better than Python:
Data flows like streams—
ggplot paints statistical dreams,
vectors dance with ease.
This captures R's strengths in data manipulation (vectorization), statistical
analysis, and data visualization (particularly with ggplot2). Though I should
note that both R and Python are excellent tools with their own strengths
depending on the use case!
We can create several chat objects to call on later and adjust parameters such as the system prompt and the specific model we want to use from the provider.
In addition to calling chat objects directly, you can open a live console session with an LLM using the syntax live_console(gemini). But remember that each call costs you a modest amount.
gemini <- chat_google_gemini(
  api_key = google_api_key,
  system_prompt = "Finish all outputs by adding '-- from Gemini' at the end"
)
chatgpt <- chat_openai(
  api_key = openai_api_key,
  model = "gpt-4o"
)
Integrating LLMs into R workflows
One of the best things about the ellmer package is that it treats models like any other R function. You can loop, map, or pipe through them just like you would with any data wrangling task.
In the example below, R goes through each word in the list and asks ChatGPT whether it’s an animal. You can later store the outputs in a vector or tibble and combine them with your data frame, making the model’s answers part of your standard analysis pipeline.
example <- list("cat",
                "screw",
                "hyena",
                "tomahawk")

for (i in example) {
  chatgpt$chat(
    paste("Is a",
          i,
          "an animal?",
          "Answer only yes or no.",
          sep = " "))
}
Yes.
No.
Yes.
No.
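As a sketch of the vector-and-tibble idea mentioned above (assuming the chatgpt object is configured, and that chat() returns the reply as a string; every call is billed), the loop can be rewritten so the answers are captured for downstream analysis:

```r
# Capture each yes/no answer instead of only printing it.
words <- c("cat", "screw", "hyena", "tomahawk")

answers <- vapply(words, function(w) {
  chatgpt$chat(paste("Is a", w, "an animal? Answer only yes or no."))
}, character(1))

# Combine words and answers into a data frame for your pipeline.
results <- data.frame(word = words, is_animal = answers)
print(results)
```

From here the results can be merged with an existing data frame like any other column of labels.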
Structured Outputs
One of the main advantages of the ellmer package is its ability to return structured outputs. This allows you to supply a type specification (schema) that defines the object structure you want back from an LLM. Then the model returns output in that exact structure rather than free-text formatting.
The code below uses ellmer's structured data interface to tell the model exactly what kind of answer to return when analyzing an image. The method chat_structured() sends the image (here a local photo from Unsplash) to Claude together with a schema built using type_object(). The schema defines three fields: description (text), people_count (integer), and confidence (number between 0 and 1).
claude$chat_structured(
  content_image_file("cory-schadt-Hhcn6yy3Uo8-unsplash.jpg"),
  type = type_object(
    description = type_string(),
    people_count = type_integer("Number of people clearly visible"),
    confidence = type_number("Model confidence between 0 and 1")
  )
)
$description
[1] "A busy street scene showing a large crowd of people crossing at what appears to be the famous Shibuya crossing in Tokyo, Japan. The image captures the iconic urban landscape with tall buildings covered in colorful neon signs and advertisements. Many pedestrians are crossing the wide intersection simultaneously, creating the characteristic organized chaos that Shibuya crossing is known for. The scene appears to be taken during daytime with overcast lighting."
$people_count
[1] 45
$confidence
[1] 0.85
The model’s reply is returned as a regular R list, which means you can immediately store it, analyse it, or combine it with other data. In this way, ellmer turns unstructured data like text and images into something you can quantify, summarise, and visualise just like any other dataset.
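As a brief sketch of that idea (the variable name result is my own; the original code prints the list without storing it), you could save the structured reply and use its fields directly:

```r
# Store the structured reply, then treat it like any R list.
result <- claude$chat_structured(
  content_image_file("cory-schadt-Hhcn6yy3Uo8-unsplash.jpg"),
  type = type_object(
    description = type_string(),
    people_count = type_integer("Number of people clearly visible"),
    confidence = type_number("Model confidence between 0 and 1")
  )
)

# Use individual fields, e.g. only trust high-confidence counts.
if (result$confidence >= 0.8) {
  message("People counted: ", result$people_count)
}
```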
This tutorial is syndicated via R-Bloggers.
Footnotes
For a comprehensive list of all supported LLM providers visit the Ellmer website.↩︎
In R it is common to put sensitive variables such as API keys into an .Renviron file, which is excluded from any uploads (e.g. via .gitignore). For this tutorial I called on my environment variables with code like this: Sys.getenv("CLAUDE_API_KEY").↩︎
