commit | 12a9fe8ababb12ef7299a11a7cb8ebc49045a952 | [log] [tgz] |
---|---|---|
author | Marc Kupietz <kupietz@ids-mannheim.de> | Tue Jun 16 14:27:57 2020 +0200 |
committer | Marc Kupietz <kupietz@ids-mannheim.de> | Fri Jun 19 13:05:45 2020 +0200 |
tree | e58fd825c18cb343e94c786ca427f9ef9f1ee919 |
Initial import

Change-Id: I91c66e3ceb8d17e547f2e96bf273beb7a4b76762
There is currently no native KorAP client library for Python. With rpy2, however, you can already use the KorAP client library for R (RKorAPClient) from within Python.
#### Debian / Ubuntu

```shell
sudo apt install r-base r-base-dev libcurl4-gnutls-dev libssl-dev libxml2-dev libsodium-dev python3-pip python3-rpy2 python3-pandas
echo 'install.packages("RKorAPClient", repos="http://cran.rstudio.com/")' | R --vanilla
pip3 install plotly-express
```

#### Fedora / CentOS / RHEL

```shell
sudo yum install R R-devel libcurl-devel openssl-devel libxml2-devel libsodium-devel python3-pandas
echo 'install.packages("RKorAPClient", repos="http://cran.rstudio.com/")' | R --vanilla
pip3 install rpy2 plotly-express
```
```shell
pip install rpy2
pip install plotly.express
```
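After installing, it can help to confirm that the pieces are visible from Python before running a query. The following is a minimal sketch, not part of this package: the function name `missing_prerequisites` and the two default lists are hypothetical, and the check only looks for importable top-level modules and an `R` executable on the `PATH`. It does not verify that the RKorAPClient R package itself is installed.

```python
import importlib.util
import shutil

# Top-level Python modules used by the example in this README (assumed list).
PYTHON_MODULES = ["rpy2", "pandas", "plotly"]
# Executables expected on PATH; "R" is the interpreter that rpy2 embeds.
EXECUTABLES = ["R"]


def missing_prerequisites(modules=PYTHON_MODULES, executables=EXECUTABLES):
    """Return the names of modules and executables that cannot be found."""
    # find_spec returns None for a top-level module that is not installed.
    missing = [m for m in modules if importlib.util.find_spec(m) is None]
    # shutil.which returns None when the executable is not on PATH.
    missing += [e for e in executables if shutil.which(e) is None]
    return missing


if __name__ == "__main__":
    problems = missing_prerequisites()
    if problems:
        print("missing: " + ", ".join(problems))
    else:
        print("all prerequisites found")
```

An empty result only means the prerequisites were found, not that their versions are compatible with each other.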
```python
import rpy2.robjects.packages as packages
import rpy2.robjects.pandas2ri as pandas2ri
import plotly.express as px

# Convert R data frames to pandas DataFrames automatically
pandas2ri.activate()

QUERY = "Hello World"
YEARS = range(2010, 2019)
COUNTRIES = ["DE", "CH"]

# Load the R package and open a connection to the KorAP API
RKorAPClient = packages.importr('RKorAPClient')
kcon = RKorAPClient.KorAPConnection(verbose=True)

# One virtual corpus per combination of country and year
vcs = ["textType=/Zeit.*/ & pubPlaceKey=" + c + " & pubDate in " + str(y)
       for c in COUNTRIES for y in YEARS]
df = RKorAPClient.ipm(RKorAPClient.frequencyQuery(kcon, QUERY, vcs))

# Label the rows in the same order in which the virtual corpora were built
df['Year'] = [y for c in COUNTRIES for y in YEARS]
df['Country'] = [c for c in COUNTRIES for y in YEARS]

# Plot instances per million (ipm) per year, with confidence intervals
fig = px.line(df, title=QUERY, x="Year", y="ipm", color="Country",
              error_y="conf.high", error_y_minus="conf.low")
fig.show()
```
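The example above labels the result rows by position, so the `Year` and `Country` columns must be generated in exactly the same order as the virtual corpus definitions. The sketch below shows one way to keep the three lists aligned by deriving them from a single list of pairs. The helper name `build_grid` is hypothetical, and the sketch assumes, as the example does, that the query result contains one row per virtual corpus in the order given. It runs without R or a KorAP connection.

```python
from itertools import product


def build_grid(countries, years):
    """Return (vcs, country_column, year_column) in matching row order."""
    # product() varies the country slowest, like the nested comprehension above.
    pairs = list(product(countries, years))
    vcs = [f"textType=/Zeit.*/ & pubPlaceKey={c} & pubDate in {y}"
           for c, y in pairs]
    country_column = [c for c, _ in pairs]
    year_column = [y for _, y in pairs]
    return vcs, country_column, year_column


vcs, country_col, year_col = build_grid(["DE", "CH"], range(2010, 2019))
print(len(vcs))  # 18 (2 countries x 9 years)
print(vcs[0])    # textType=/Zeit.*/ & pubPlaceKey=DE & pubDate in 2010
```

Because all three lists come from the same `pairs` list, changing the countries or the year range cannot leave the labels out of step with the queries.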
By using the KorAPClient, you agree to the respective terms of use of the accessed KorAP API services, which are printed when a connection is opened.
Author: Marc Kupietz
Copyright (c) 2020, Leibniz Institute for the German Language, Mannheim, Germany
This package is developed as part of the KorAP Corpus Analysis Platform at the Leibniz Institute for the German Language (IDS).
It is published under the BSD-2 License.
Contributions are very welcome!
Your contributions should ideally be committed via our Gerrit server to facilitate reviewing (see Gerrit Code Review - A Quick Introduction if you are not familiar with Gerrit). However, we are also happy to accept comments and pull requests via GitHub.
Please note that, unless you explicitly state otherwise, any contribution intentionally submitted for inclusion into this software shall – as this software itself – be under the BSD-2 License.
Kupietz, Marc / Margaretha, Eliza / Diewald, Nils / Lüngen, Harald / Fankhauser, Peter (2019): What’s New in EuReCo? Interoperability, Comparable Corpora, Licensing. In: Bański, Piotr / Barbaresi, Adrien / Biber, Hanno / Breiteneder, Evelyn / Clematide, Simon / Kupietz, Marc / Lüngen, Harald / Iliadi, Caroline (eds.): Proceedings of the International Corpus Linguistics Conference 2019 Workshop "Challenges in the Management of Large Corpora (CMLC-7)", 22nd of July. Mannheim: Leibniz-Institut für Deutsche Sprache, 33-39.
Kupietz, Marc / Diewald, Nils / Margaretha, Eliza (forthcoming): RKorAPClient: An R package for accessing the German Reference Corpus DeReKo via KorAP. In: Proceedings of the Twelfth International Conference on Language Resources and Evaluation (LREC 2020). Marseille/Paris: European Language Resources Association (ELRA).