#' Obtains text embeddings for a single character (string) from the GPT-3 API
#'
#' @description
#' `gpt_single_embedding()` sends a single [embedding request](https://beta.openai.com/docs/guides/embeddings) to the OpenAI GPT-3 API.
#' @details The function supports the text similarity embeddings of the four GPT-3 models listed under the `model` parameter. The main difference between the four models is the size of the embedding vector, which reflects how rich the embedding representation is:
#'   - Ada (1024 dimensions)
#'   - Babbage (2048 dimensions)
#'   - Curie (4096 dimensions)
#'   - Davinci (12288 dimensions)
#'
#' Note that the dimension size (= vector length), speed and [associated costs](https://openai.com/api/pricing/) differ considerably between the models.
#'
#' The returned vectors can be used for downstream tasks such as (vector) similarity calculations.
#' @param input character string containing the text for which you want to obtain text embeddings from the GPT-3 model
#' @param model character string naming the [similarity embedding model](https://beta.openai.com/docs/guides/embeddings/similarity-embeddings); one of "text-similarity-ada-001" (default), "text-similarity-babbage-001", "text-similarity-curie-001", "text-similarity-davinci-001"
#' @return A numeric vector (= the embedding vector)
#' @examples
#' # First authenticate with your API key via `gpt3_authenticate('pathtokey')`
#'
#' # Once authenticated:
#'
#' ## Simple request with defaults:
#' sample_string = "London is one of the most liveable cities in the world. The city is always full of energy and people. It's always a great place to explore and have fun."
#' gpt_single_embedding(input = sample_string)
#'
#' ## Change the model:
#' gpt_single_embedding(input = sample_string
#'     , model = 'text-similarity-curie-001')
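#'
#' ## Downstream use (a minimal sketch, not part of the package):
#' ## cosine similarity between two embeddings. `cosine_similarity` is a
#' ## hypothetical helper defined here for illustration only, and both
#' ## calls assume you are already authenticated.
#' cosine_similarity = function(a, b) {
#'   sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
#' }
#' embedding_1 = gpt_single_embedding(input = "London is a lively city.")
#' embedding_2 = gpt_single_embedding(input = "Paris is a lively city.")
#' cosine_similarity(embedding_1, embedding_2)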
#' @export
gpt_single_embedding = function(input
                              , model = 'text-similarity-ada-001'
                              ){

  # Request body: the model name and the text to embed
  parameter_list = list(model = model
                        , input = input)

  # `url.embeddings` and `api_key` are not defined in this function; they are
  # expected to be set elsewhere in the package (e.g. after authentication)
  request_base = httr::POST(url = url.embeddings
                            , body = parameter_list
                            , httr::add_headers(Authorization = paste("Bearer", api_key))
                            , encode = "json")

  # Parse the JSON response into an R list
  output_base = httr::content(request_base)

  # Flatten the embedding of the first (and only) input into a numeric vector
  embedding_raw = to_numeric(unlist(output_base$data[[1]]$embedding))

  return(embedding_raw)

}