
KorAP web service client package for R

Project status: Active (the project has reached a stable, usable state and is being actively developed). Lifecycle: stable.

Description

R client package to access the web service API of the KorAP Corpus Analysis Platform developed at IDS Mannheim.

Installation

System Dependencies on Linux

RKorAPClient uses some R packages with system dependencies you might need to install first:

#### Debian / Ubuntu
sudo apt install r-base-dev libcurl4-gnutls-dev libxml2-dev libsodium-dev

#### Fedora / CentOS >= 8 / RHEL >= 8
sudo dnf install R-devel libcurl-devel openssl-devel libxml2-devel libsodium-devel

#### CentOS < 8 / RHEL < 8
sudo yum install R-devel libcurl-devel openssl-devel libxml2-devel libsodium-devel

#### Arch Linux
sudo pacman -S base-devel gcc-fortran libsodium curl

Package installation

CRAN version:

install.packages("RKorAPClient")

Development version (alternatives):

devtools::install_github("KorAP/RKorAPClient")
remotes::install_github("KorAP/RKorAPClient")
devtools::install_git("https://korap.ids-mannheim.de/gerrit/KorAP/RKorAPClient")
remotes::install_git("https://korap.ids-mannheim.de/gerrit/KorAP/RKorAPClient")

Examples

Hello world

library(RKorAPClient)
new("KorAPConnection", verbose=TRUE) %>% corpusQuery("Hello world") %>% fetchAll()

Frequencies over time and domains using ggplot2

library(RKorAPClient)
library(ggplot2)
kco <- new("KorAPConnection", verbose=TRUE)
expand_grid(condition = c("textDomain = /Wirtschaft.*/", "textDomain != /Wirtschaft.*/"), 
            year = (2002:2018)) %>%
    cbind(frequencyQuery(kco, "[tt/l=Heuschrecke]", paste0(.$condition," & pubDate in ", .$year)))  %>%
    ipm() %>%
    ggplot(aes(x = year, y = ipm, fill = condition, colour = condition)) +
    geom_freq_by_year_ci()
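
The `ipm()` step in the pipeline above expresses frequencies as instances per million tokens, so that years with different corpus sizes become comparable. The base R sketch below illustrates only that arithmetic. The column names `totalResults` and `total`, the helper `to_ipm()`, and all numbers are assumptions invented for illustration; they are not output of a KorAP query, and the package's own `ipm()` may do more (for example, rescale confidence intervals).

```r
# Toy data: all counts are invented for illustration, not KorAP results.
freq <- data.frame(
  year         = c(2002, 2003),
  totalResults = c(50, 120),       # hypothetical number of hits
  total        = c(2.5e8, 3.0e8)   # hypothetical corpus size in tokens
)

# Instances per million: relative frequency scaled by 10^6.
# to_ipm() is a hypothetical helper, not part of RKorAPClient.
to_ipm <- function(hits, tokens) hits / tokens * 1e6

freq$ipm <- to_ipm(freq$totalResults, freq$total)
print(freq)
```

With these toy numbers the two rows come out at 0.2 and 0.4 instances per million.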

Percentages over time using highcharter

See the Highcharts license notes below.

library(RKorAPClient)
query <- c("macht []{0,3} Sinn", "ergibt []{0,3} Sinn")
years <- 1980:2010
as.alternatives <- TRUE
vc <- "textType = /Zeit.*/ & pubDate in"
new("KorAPConnection", verbose = TRUE) %>%
  frequencyQuery(query, paste(vc, years), as.alternatives = as.alternatives) %>%
  hc_freq_by_year_ci(as.alternatives)

Proportion of "ergibt … Sinn" versus "macht … Sinn" between 1980 and 2010 in newspapers and magazines
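
The example above passes `paste(vc, years)` as the virtual corpus argument. As a small sketch in plain base R (no connection to KorAP needed), this is what that expression expands to: `paste()` recycles the single prefix string over the vector of years, yielding one virtual corpus definition per year.

```r
vc <- "textType = /Zeit.*/ & pubDate in"
years <- 1980:2010

# paste() recycles the scalar prefix over all years and joins
# each pair with a single space.
vcs <- paste(vc, years)

length(vcs)   # one definition per year, 31 in total
head(vcs, 2)
# [1] "textType = /Zeit.*/ & pubDate in 1980"
# [2] "textType = /Zeit.*/ & pubDate in 1981"
```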

Identify in … setzen light verb constructions by using the new collocationAnalysis function

Lifecycle: experimental

library(RKorAPClient)
library(knitr)
new("KorAPConnection", verbose = TRUE) %>%
  collocationAnalysis(
    "focus(in [tt/p=NN] {[tt/l=setzen]})",
    leftContextSize = 1,
    rightContextSize = 0,
    exactFrequencies = FALSE,
    searchHitsSampleLimit = 1000,
    topCollocatesLimit = 20
  ) %>%
  mutate(LVC = sprintf("[in %s setzen](%s)", collocate, webUIRequestUrl)) %>%
  select(LVC, logDice, pmi, ll) %>%
  head(10) %>%
  kable(format="pipe", digits=2)
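| LVC                         | logDice |   pmi |        ll |
|:----------------------------|--------:|------:|----------:|
| in Szene setzen             |    9.66 | 10.86 | 465467.52 |
| in Gang setzen              |    9.21 | 10.57 | 256146.92 |
| in Verbindung setzen        |    8.46 |  9.62 | 189682.19 |
| in Kenntnis setzen          |    8.28 |  9.81 | 101112.02 |
| in Bewegung setzen          |    8.11 |  9.24 | 149397.91 |
| in Brand setzen             |    8.10 |  9.33 | 122427.05 |
| in Anführungszeichen setzen |    7.50 | 11.96 |  33959.99 |
| in Kraft setzen             |    6.88 |  7.88 |  77796.85 |
| in Marsch setzen            |    6.87 |  9.27 |  22041.63 |
| in Klammern setzen          |    6.55 | 10.08 |  15643.27 |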
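
For readers unfamiliar with the `logDice` and `pmi` columns, the following sketch shows the common textbook definitions of these two association measures in base R. It is an illustration under stated assumptions, not the package's implementation: the function names `log_dice_sketch()` and `pmi_sketch()` are hypothetical, the counts are invented, and `collocationAnalysis` may apply further adjustments (for example, for the context window size) that are not reproduced here.

```r
# Textbook association measures computed from co-occurrence counts.
#   f_xy: co-occurrence frequency of node and collocate
#   f_x, f_y: individual frequencies of node and collocate
#   n: corpus size in tokens
# Both helpers are hypothetical and not part of RKorAPClient.
log_dice_sketch <- function(f_xy, f_x, f_y) {
  14 + log2(2 * f_xy / (f_x + f_y))
}

pmi_sketch <- function(f_xy, f_x, f_y, n) {
  log2(f_xy * n / (f_x * f_y))
}

# Invented counts, for illustration only.
log_dice_sketch(f_xy = 1000, f_x = 20000, f_y = 12000)
pmi_sketch(f_xy = 1000, f_x = 20000, f_y = 12000, n = 1e9)
```

Under these definitions logDice is bounded above by 14, which it reaches only when node and collocate always occur together, whereas pmi also depends on the corpus size.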

Demos

More elaborate R scripts demonstrating the use of the package can be found in the demo folder.

Development and License

RKorAPClient

Authors: Marc Kupietz, Nils Diewald

Copyright (c) 2021, Leibniz Institute for the German Language, Mannheim, Germany

This package is developed as part of the KorAP Corpus Analysis Platform at the Leibniz Institute for the German Language (IDS).

It is published under the BSD-2 License.

Further Affected Licenses and Terms of Services

Bundled Assets

The KorAP logo was designed by Norbert Cußler-Volz and is released under the terms of the Creative Commons License BY-NC-ND 4.0.

Highcharts

RKorAPClient imports parts of the highcharter package, which depends on Highcharts, a commercial JavaScript charting library. Highcharts offers both a commercial license and a free non-commercial license. Please review the licensing options and terms before using the highcharter plot options, as the RKorAPClient license neither provides nor implies a license for Highcharts.

Highcharts is a Highsoft product which is not free for commercial and governmental use.

Accessed API Services

By using RKorAPClient you agree to the respective terms of use of the accessed KorAP API services, which are printed when a connection is opened (new("KorAPConnection", ...)).

Contributions

Contributions are very welcome!

Your contributions should ideally be committed via our Gerrit server to facilitate reviewing (see Gerrit Code Review - A Quick Introduction if you are not familiar with Gerrit). However, we are also happy to accept comments and pull requests via GitHub.

Please note that, unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in this software shall, like the software itself, be under the BSD-2 License.

References