library(ellmer)
claude_api_key <- "[your-key-goes-here]"
google_api_key <- "[your-key-goes-here]"
openai_api_key <- "[your-key-goes-here]"
Introduction
In this tutorial we cover some basic functions of the ellmer package, which lets users call LLMs via their APIs, including ChatGPT, Gemini, and others.1 This means you can chat with an LLM directly from your console, making some programming tasks easier. The package also makes working with structured data easier for R users, without requiring knowledge of other languages and formats such as Python or JSON.
The first step is to obtain an API key. An API key is a secret password that lets your R session talk to an LLM provider such as OpenAI or Google Gemini. When you call a model through a package like ellmer, your key tells the provider who you are so it can check that you have permission and charge your account correctly.
Your key is linked to your account and payment method. If someone else gets it, they could run models using your quota or spend money on your behalf. Never paste it into shared code, publish it on GitHub, or hard-code it in a file you distribute.2
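As footnote 2 describes, a safer pattern is to keep the key in your .Renviron file and read it into your session with Sys.getenv(). A minimal sketch (the variable name CLAUDE_API_KEY matches the footnote and is otherwise arbitrary):

```r
# In your .Renviron file (kept out of version control):
# CLAUDE_API_KEY=[your-key-goes-here]

# In your R script, read the key without ever writing it down:
claude_api_key <- Sys.getenv("CLAUDE_API_KEY")
```

After editing .Renviron, restart your R session so the new variable is picked up.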
Chatting with LLMs
Once your API key is set up, you can start talking to a model just like you would in ChatGPT — but from inside R. In the code below we are creating a chat connection to Anthropic’s model Claude.
The function chat_anthropic() comes from the ellmer package and uses your stored API key to authenticate you.
claude <- chat_anthropic(
  api_key = claude_api_key
)
Once we have defined our chat object claude, we call claude$chat() to send a message to Anthropic and receive a reply in the console.
claude$chat(
  "Hello Claude! Write a haiku on why R is better than Python."
)
Here's a haiku on why R is better than Python:
Data flows like streams—
ggplot paints statistical dreams,
vectors dance with ease.
This captures R's strengths in data manipulation (vectorization), statistical
analysis, and data visualization (particularly with ggplot2). Though I should
note that both R and Python are excellent tools with their own strengths
depending on the use case!
We can create several chat objects to call on later and adjust parameters such as the system prompt and the specific model we want to use from the provider.
In addition to calling chat objects directly, you can open a live console session with an LLM using the syntax live_console(gemini). But remember that each call costs you a modest amount.
gemini <- chat_google_gemini(
  api_key = google_api_key,
  system_prompt = "Finish all outputs by adding '-- from Gemini' at the end"
)
chatgpt <- chat_openai(
  api_key = openai_api_key,
  model = "gpt-4o"
)
Integrating LLMs into R workflows
One of the best things about the ellmer package is that it treats models like any other R function. You can loop, map, or pipe through them just like you would with any data wrangling task.
In the example below, R goes through each word in the list and asks ChatGPT whether it’s an animal. You can later store the outputs in a vector or tibble and combine them with your data frame, making the model’s answers part of your standard analysis pipeline.
example <- list("cat",
                "screw",
                "hyena",
                "tomahawk")

for (i in example) {
  chatgpt$chat(
    paste("Is a",
          i,
          "an animal?",
          "Answer only yes or no.",
          sep = " "))
}
Yes.
No.
Yes.
No.
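As a sketch of the vector-and-tibble idea mentioned above (assuming the chatgpt object is configured, and that chat() returns the reply as a string; every call is billed), the loop can be rewritten so the answers are captured for downstream analysis:

```r
# Capture each yes/no answer instead of only printing it.
words <- c("cat", "screw", "hyena", "tomahawk")

answers <- vapply(words, function(w) {
  chatgpt$chat(paste("Is a", w, "an animal? Answer only yes or no."))
}, character(1))

# Combine words and answers into a data frame for your pipeline.
results <- data.frame(word = words, is_animal = answers)
print(results)
```

From here the results can be merged with an existing data frame like any other column of labels.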
Structured Outputs
One of the main advantages of the ellmer package is its ability to return structured outputs. This allows you to supply a type specification (schema) that defines the object structure you want back from an LLM. Then the model returns output in that exact structure rather than free-text formatting.
The code below uses ellmer's structured data interface to tell the model exactly what kind of answer to return when analyzing an image. The method chat_structured() sends the image (here a local photo from Unsplash) to Claude together with a schema built using type_object(). The schema defines three fields: description (text), people_count (integer), and confidence (number between 0 and 1).
claude$chat_structured(
  content_image_file("cory-schadt-Hhcn6yy3Uo8-unsplash.jpg"),
  type = type_object(
    description = type_string(),
    people_count = type_integer("Number of people clearly visible"),
    confidence = type_number("Model confidence between 0 and 1")
  )
)
$description
[1] "A busy street scene showing a large crowd of people crossing at what appears to be the famous Shibuya crossing in Tokyo, Japan. The image captures the iconic urban landscape with tall buildings covered in colorful neon signs and advertisements. Many pedestrians are crossing the wide intersection simultaneously, creating the characteristic organized chaos that Shibuya crossing is known for. The scene appears to be taken during daytime with overcast lighting."
$people_count
[1] 45
$confidence
[1] 0.85
The model’s reply is returned as a regular R list, which means you can immediately store it, analyse it, or combine it with other data. In this way, ellmer turns unstructured data like text and images into something you can quantify, summarise, and visualise just like any other dataset.
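As a brief sketch of that idea (the variable name result is my own; the original code prints the list without storing it), you could save the structured reply and use its fields directly:

```r
# Store the structured reply, then treat it like any R list.
result <- claude$chat_structured(
  content_image_file("cory-schadt-Hhcn6yy3Uo8-unsplash.jpg"),
  type = type_object(
    description = type_string(),
    people_count = type_integer("Number of people clearly visible"),
    confidence = type_number("Model confidence between 0 and 1")
  )
)

# Use individual fields, e.g. only trust high-confidence counts.
if (result$confidence >= 0.8) {
  message("People counted: ", result$people_count)
}
```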
This tutorial is syndicated via R-Bloggers.
Footnotes
For a comprehensive list of all supported LLM providers visit the Ellmer website.↩︎
In R it is common to put sensitive variables such as API keys into an .Renviron file, which is excluded from any uploads (e.g. via .gitignore). For this tutorial I called on my environment variables with code like this: Sys.getenv("CLAUDE_API_KEY").↩︎
