added new second gen embeddings as default
diff --git a/R/gpt3_embeddings.R b/R/gpt3_embeddings.R
index 2e7c167..2a7b7c7 100644
--- a/R/gpt3_embeddings.R
+++ b/R/gpt3_embeddings.R
@@ -2,9 +2,10 @@
#'
#' @description
#' `gpt3_embeddings()` extends the single embeddings function `gpt3_single_embedding()` to allow for the processing of a whole vector
-#' @details The returned data.table contains the column `id` which indicates the text id (or its generic alternative if not specified) and the columns `dim_1` ... `dim_{max}`, where `max` is the length of the text embeddings vector that the four different models return. For the default "Ada" model, these are 1024 dimensions (i.e., `dim_1`... `dim_1024`).
+#' @details The returned data.table contains the column `id` which indicates the text id (or its generic alternative if not specified) and the columns `dim_1` ... `dim_{max}`, where `max` is the length of the text embeddings vector that the different models (see below) return. For the default "Ada 2nd gen." model, these are 1536 dimensions (i.e., `dim_1`... `dim_1536`).
#'
-#' The function supports the text similarity embeddings for the four GPT-3 models as specified in the parameter list. The main difference between the four models is the sophistication of the embedding representation as indicated by the vector embedding size.
+#' The function supports the text similarity embeddings for the [five GPT-3 embeddings models](https://beta.openai.com/docs/guides/embeddings/embedding-models) as specified in the parameter list. It is strongly advised to use the second generation model "text-embedding-ada-002". The main differences between the five models are the size of the embedding representation (i.e., the length of the embedding vector) and the pricing. The newest model (default) is the fastest, cheapest and highest quality one.
+#' - Ada 2nd generation `text-embedding-ada-002` (1536 dimensions)
#' - Ada (1024 dimensions)
#' - Babbage (2048 dimensions)
#' - Curie (4096 dimensions)
@@ -15,7 +16,7 @@
#' These vectors can be used for downstream tasks such as (vector) similarity calculations.
#' @param input_var character vector that contains the texts for which you want to obtain text embeddings from the GPT-3 model
#' #' @param id_var (optional) character vector that contains the user-defined ids of the prompts. See details.
-#' @param param_model a character vector that indicates the [similarity embedding model](https://beta.openai.com/docs/guides/embeddings/similarity-embeddings); one of "text-similarity-ada-001" (default), "text-similarity-curie-001", "text-similarity-babbage-001", "text-similarity-davinci-001"
+#' @param param_model a character vector that indicates the [embedding model](https://beta.openai.com/docs/guides/embeddings/embedding-models); one of "text-embedding-ada-002" (default), "text-similarity-ada-001", "text-similarity-curie-001", "text-similarity-babbage-001", "text-similarity-davinci-001"
#' @return A data.table with the embeddings as separate columns; one row represents one input text. See details.
#' @examples
#' # First authenticate with your API key via `gpt3_authenticate('pathtokey')`
@@ -35,7 +36,7 @@
#' @export
gpt3_embeddings = function(input_var
, id_var
- , param_model = 'text-similarity-ada-001'){
+ , param_model = 'text-embedding-ada-002'){
data_length = length(input_var)
if(missing(id_var)){
diff --git a/R/gpt3_single_embedding.R b/R/gpt3_single_embedding.R
index dc6c2ea..ac0cee5 100644
--- a/R/gpt3_single_embedding.R
+++ b/R/gpt3_single_embedding.R
@@ -3,6 +3,7 @@
#' @description
#' `gpt3_single_embedding()` sends a single [embedding request](https://beta.openai.com/docs/guides/embeddings) to the Open AI GPT-3 API.
#' @details The function supports the text similarity embeddings for the four GPT-3 models as specified in the parameter list. The main difference between the four models is the sophistication of the embedding representation as indicated by the vector embedding size.
+#' - Second-generation embeddings model `text-embedding-ada-002` (1536 dimensions)
#' - Ada (1024 dimensions)
#' - Babbage (2048 dimensions)
#' - Curie (4096 dimensions)
@@ -12,7 +13,7 @@
#'
#' These vectors can be used for downstream tasks such as (vector) similarity calculations.
#' @param input character that contains the text for which you want to obtain text embeddings from the GPT-3 model
-#' @param model a character vector that indicates the [similarity embedding model](https://beta.openai.com/docs/guides/embeddings/similarity-embeddings); one of "text-similarity-ada-001" (default), "text-similarity-curie-001", "text-similarity-babbage-001", "text-similarity-davinci-001"
+#' @param model a character vector that indicates the [similarity embedding model](https://beta.openai.com/docs/guides/embeddings/similarity-embeddings); one of "text-embedding-ada-002" (default), "text-similarity-ada-001", "text-similarity-curie-001", "text-similarity-babbage-001", "text-similarity-davinci-001". Note: it is strongly recommended to use the faster, cheaper and higher quality second generation embeddings model "text-embedding-ada-002".
#' @return A numeric vector (= the embedding vector)
#' @examples
#' # First authenticate with your API key via `gpt3_authenticate('pathtokey')`
@@ -28,7 +29,7 @@
#' , model = 'text-similarity-curie-001')
#' @export
gpt3_single_embedding = function(input
- , model = 'text-similarity-ada-001'
+ , model = 'text-embedding-ada-002'
){
parameter_list = list(model = model
diff --git a/R/request_prices.R b/R/request_prices.R
new file mode 100644
index 0000000..9e1ec54
--- /dev/null
+++ b/R/request_prices.R
@@ -0,0 +1,9 @@
+#' Contains the pricing for completion requests (see: [https://openai.com/api/pricing/#faq-completions-pricing](https://openai.com/api/pricing/#faq-completions-pricing))
+#'
+#' @description
+#' These are the prices listed for 1k tokens of requests for the various models. These are needed for the `rgpt3_cost_estimate(...)` function.
+#' @export
+price_base_davinci = 0.02
+price_base_curie = 0.002
+price_base_babbage = 0.0005
+price_base_ada = 0.0004
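
A minimal sketch of how the per-1k-token constants above might feed a cost estimate. The helper `estimate_cost_sketch()` is hypothetical and is not the package's `rgpt3_cost_estimate(...)`, whose signature is not shown in this commit. The token count in the usage line is made up for illustration.

```r
# Per-1k-token prices, copied from R/request_prices.R as added in this commit
price_base_davinci = 0.02
price_base_curie = 0.002
price_base_babbage = 0.0005
price_base_ada = 0.0004

# Hypothetical helper (an assumption, not part of the package):
# scales a token count by a price quoted per 1,000 tokens
estimate_cost_sketch = function(n_tokens, price_per_1k){
  stopifnot(is.numeric(n_tokens), all(n_tokens >= 0))
  (n_tokens / 1000) * price_per_1k
}

# Illustrative usage: estimated price of 250,000 tokens at the Ada rate
estimate_cost_sketch(n_tokens = 250000, price_per_1k = price_base_ada)
```

The helper is vectorised over `n_tokens`, so a vector of per-request token counts can be passed in and summed afterwards.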