% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/chatgpt.R
% Blame: ben-aaron188 | ad8b3f3 | 2023-03-05 20:22:57 +0100
\name{chatgpt}
\alias{chatgpt}
\title{Makes batch chat completion requests to the ChatGPT API}
\usage{
chatgpt(
  prompt_role_var,
  prompt_content_var,
  id_var,
  param_output_type = "complete",
  param_model = "gpt-3.5-turbo",
  param_max_tokens = 100,
  param_temperature = 1,
  param_top_p = 1,
  param_n = 1,
  param_stop = NULL,
  param_presence_penalty = 0,
  param_frequency_penalty = 0
)
}
\arguments{
\item{prompt_role_var}{character vector that contains the role of each prompt in the ChatGPT request. Each element must be one of 'system', 'assistant' or 'user' (default); see \url{https://platform.openai.com/docs/guides/chat}}

\item{prompt_content_var}{character vector that contains the content of each prompt in the ChatGPT request. This is the key instruction that ChatGPT receives.}

\item{id_var}{(optional) character vector that contains the user-defined ids of the prompts. See details.}

\item{param_output_type}{character determining the output provided: "complete" (default), "text" or "meta"}

\item{param_model}{a character vector that indicates the \href{https://platform.openai.com/docs/api-reference/chat/create#chat/create-model}{ChatGPT model} to use; one of "gpt-3.5-turbo" (default), "gpt-3.5-turbo-0301"}

\item{param_max_tokens}{numeric (default: 100) indicating the maximum number of tokens that the completion request should return (from the official API documentation: \emph{The maximum number of tokens allowed for the generated answer. By default, the number of tokens the model can return will be (4096 - prompt tokens).})}

\item{param_temperature}{numeric (default: 1.0) specifying the sampling strategy of the possible completions (from the official API documentation: \emph{What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or \code{top_p} but not both.})}

\item{param_top_p}{numeric (default: 1) specifying the sampling strategy as an alternative to temperature sampling (from the official API documentation: \emph{An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10\% probability mass are considered. We generally recommend altering this or \code{temperature} but not both.})}

\item{param_n}{numeric (default: 1) specifying the number of completions per request (from the official API documentation: \emph{How many chat completion choices to generate for each input message. \strong{Note: Because this parameter generates many completions, it can quickly consume your token quota.} Use carefully and ensure that you have reasonable settings for max_tokens and stop.})}

\item{param_stop}{character or character vector (default: NULL) that specifies the character sequence(s) at which the completion should end (from the official API documentation: \emph{Up to 4 sequences where the API will stop generating further tokens.})}

\item{param_presence_penalty}{numeric (default: 0) between -2.00 and +2.00 that determines how strongly a token is penalised if it already appears in the text (from the official API documentation: \emph{Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.}). See also: \url{https://beta.openai.com/docs/api-reference/parameter-details}}

\item{param_frequency_penalty}{numeric (default: 0) between -2.00 and +2.00 that determines how strongly a token is penalised based on how often it already appears in the text (from the official API documentation: \emph{Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.}). See also: \url{https://beta.openai.com/docs/api-reference/parameter-details}}
}
\value{
A list with two data tables (if \code{param_output_type} is the default "complete"): \code{[[1]]} contains the data table with the columns \code{n} (= the number of responses requested via \code{param_n}), \code{prompt_role} (= the role that was set for the prompt), \code{prompt_content} (= the content that was set for the prompt), \code{chatgpt_role} (= the role that ChatGPT assumed in the chat completion) and \code{chatgpt_content} (= the content that ChatGPT provided with its assumed role in the chat completion). \code{[[2]]} contains the meta information of the request, including the request id, the parameters of the request and the token usage of the prompt (\code{tok_usage_prompt}), the completion (\code{tok_usage_completion}), the total usage (\code{tok_usage_total}) and the \code{id} (= the provided \code{id_var} or its default alternative).

If \code{param_output_type} is "text", only the data table in slot \code{[[1]]} is returned.

If \code{param_output_type} is "meta", only the data table in slot \code{[[2]]} is returned.
}
\description{
\code{chatgpt()} is the package's main function for the ChatGPT functionality. It takes a vector of prompts as input and processes each prompt according to the defined parameters. It extends the \code{chatgpt_single()} function to allow for batch processing of requests to the OpenAI GPT API.
}
\details{
The easiest (and intended) use case for this function is to create a data.frame or data.table with variables that contain the prompts to be sent to ChatGPT and a prompt id (see the Examples section).
For a general guide on chat completion requests, see \url{https://platform.openai.com/docs/guides/chat/chat-completions-beta}. This function provides you with an R wrapper to send requests with the full range of request parameters as detailed on \url{https://platform.openai.com/docs/api-reference/chat/create} and documented in the Arguments section.

If \code{id_var} is not provided, the function will use \code{prompt_1} ... \code{prompt_n} as id variable.

Parameters not included/supported:
\itemize{
\item \code{logit_bias}: \url{https://platform.openai.com/docs/api-reference/chat/create#chat/create-logit_bias}
\item \code{stream}: \url{https://platform.openai.com/docs/api-reference/chat/create#chat/create-stream}
}
}
\examples{
# First authenticate with your API key via `gpt3_authenticate('pathtokey')`

# Once authenticated:
# Assuming you have a data.table with 3 different prompts:
dt_prompts = data.table::data.table('prompts_content' = c('What is the meaning of life?', 'Write a tweet about London:', 'Write a research proposal for using AI to fight fake news:')
    , 'prompts_role' = rep('user', 3)
    , 'prompt_id' = c(LETTERS[1:3]))
chatgpt(prompt_role_var = dt_prompts$prompts_role
    , prompt_content_var = dt_prompts$prompts_content
    , id_var = dt_prompts$prompt_id)

## With more controls
chatgpt(prompt_role_var = dt_prompts$prompts_role
    , prompt_content_var = dt_prompts$prompts_content
    , id_var = dt_prompts$prompt_id
    , param_max_tokens = 50
    , param_temperature = 0.5
    , param_n = 5)

## More reproducible output (a temperature of 0 makes sampling close to deterministic)
chatgpt(prompt_role_var = dt_prompts$prompts_role
    , prompt_content_var = dt_prompts$prompts_content
    , id_var = dt_prompts$prompt_id
    , param_max_tokens = 50
    , param_temperature = 0
    , param_n = 3)

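## Working with the return value: a minimal sketch based on the Value section.
## Assumptions: `res`, `dt_text`, `dt_meta` and `dt_text_only` are illustrative
## names (not part of the package), and `tok_usage_total` is assumed to be numeric.
res = chatgpt(prompt_role_var = dt_prompts$prompts_role
    , prompt_content_var = dt_prompts$prompts_content
    , id_var = dt_prompts$prompt_id
    , param_max_tokens = 50)
dt_text = res[[1]] # prompts and the completions returned by ChatGPT
dt_meta = res[[2]] # request meta information and token usage
dt_text$chatgpt_content # the generated text, one row per completion
sum(dt_meta$tok_usage_total) # total tokens consumed across all requests

## Requesting only the text table and relying on the default ids.
## With `id_var` omitted, the ids should default to prompt_1 ... prompt_n (see Details).
dt_text_only = chatgpt(prompt_role_var = dt_prompts$prompts_role
    , prompt_content_var = dt_prompts$prompts_content
    , param_output_type = 'text')
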
}