Add more information to Readme including License

Change-Id: I62395f3490d39a87105943fcc82e96d988205dd5
diff --git a/LICENSE b/LICENSE
new file mode 100644
index 0000000..b462bd9
--- /dev/null
+++ b/LICENSE
@@ -0,0 +1,24 @@
+Copyright (c) 2022, IDS Mannheim
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+1. Redistributions of source code must retain the above copyright notice, 
+   this list of conditions and the following disclaimer.
+
+2. Redistributions in binary form must reproduce the above copyright notice, 
+   this list of conditions and the following disclaimer in the documentation 
+   and/or other materials provided with the distribution.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 
+ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE 
+LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR 
+CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
+GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) 
+HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT 
+LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY 
+OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH 
+DAMAGE.
\ No newline at end of file
diff --git a/Readme.md b/Readme.md
index dd220ce..21683e4 100644
--- a/Readme.md
+++ b/Readme.md
@@ -1,6 +1,7 @@
 # KorAP-Docker
 
-KorAP consists of several components,
+The [KorAP Corpus Analysis Platform](http://korap.ids-mannheim.de/)
+consists of several independent components,
 but they can easily be installed together using
 [Docker](https://www.docker.com/).
 This repository contains a recipe to install all
@@ -9,11 +10,12 @@
 
 In addition, all relevant tools are installed and
 made available that are necessary for data conversion
-and indexing of corpora in the widely used TEI-P5 (I5)
-format for KorAP.
+and indexing of corpora in the widely used TEI-P5
+([I5](https://www.ids-mannheim.de/en/digspra/corpus-linguistics/projects/corpus-development/ids-text-model/)) format for KorAP.
 For different options of the tools we refer to the
 respective repositories.
 
+
 ## Requirements
 
 Install [docker](https://www.docker.com/) and
@@ -22,7 +24,7 @@
 
 ## Starting
 
-To download, intialize and run KorAP pointing to a certain directory index
+To download, initialize and run KorAP pointing to an existing index
 (in this example `index` in the local directory), run
 
 ```shell
@@ -35,14 +37,17 @@
 
 ## Corpus Conversion
 
-Depending on the corpus data to be indexed, it must first be converted.
-In the case of a conversion from TEI P5 (I5) format,
+In order to create an index based on existing
+corpus data, some conversion steps are usually
+necessary.
+In the case of a conversion from TEI P5
+([I5](https://www.ids-mannheim.de/en/digspra/corpus-linguistics/projects/corpus-development/ids-text-model/)) format,
 the tools required for this have already been installed
 with the command above.
 
-In the following we take the
+In the following we take the open part of the
 [Dortmunder Chatkorpus 2.2](https://www.uni-due.de/germanistik/chatkorpus/)
-as an example to build an index.
+(Beißwenger & Storrer 2008) as an example to build an index.
 
 The file is located at `example/dck-part1.i5.xml`.
 
@@ -56,11 +61,19 @@
   --input /data/dck-part1.i5.xml > dck.zip
 ```
 
-... will convert the i5 file into a KorAP-XML file using
+... will convert the I5 file into a
+[KorAP-XML](https://github.com/KorAP/KorAP-XML-Krill#about-korap-xml)
+file using
 [tei2korapxml](https://github.com/KorAP/KorAP-XML-TEI).
 
+This format is designed to add further arbitrary annotations
+to the primary data. In this example, however, we will stick
+with the inline annotations that the example corpus already
+contains, which will later be made available under the label `cmc`.
+
 To convert the KorAP-XML archive in a second step
-into individual Krill JSON files, the following command ...
+into individual [Krill](https://github.com/KorAP/Krill)-compatible
+JSON files, the following command ...
 
 ```shell
 $ mkdir json
@@ -81,11 +94,15 @@
 Depending on how the source data is designed,
 different parameters must be specified for the conversion.
 
+Here, the inline token annotation is used as the basis for
+word tokenization, and the included document structure is 
+used for default annotation of sentence and paragraph boundaries.
+
 
 ## Index Creation
 
 [Krill](https://github.com/KorAP/Krill)'s indexer tool can now
-be used to index the json files:
+be used to index the JSON files:
 
 ```shell
 $ mkdir index
@@ -96,3 +113,27 @@
 
 After that, the index can be loaded with the aforementioned
 call and is searchable via the browser.
+
+## Development and License
+
+**Authors**: [Nils Diewald](https://www.nils-diewald.de/), Harald Lüngen, Marc Kupietz
+
+Copyright (c) 2022, [IDS Mannheim](https://www.ids-mannheim.de/), Germany
+
+KorAP-Docker is published under the BSD 2-Clause License.
+
+The example corpus corresponds to the *release part* of the
+[Dortmunder Chatkorpus 2.2](https://www.uni-due.de/germanistik/chatkorpus/)
+as prepared by
+[DeReKo](https://www.ids-mannheim.de/digspra/kl/projekte/korpora/).
+The corpus is released under the [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/) License.
+Legal restrictions may arise from data protection legislation.
+
+
+## Bibliography
+
+Beißwenger, Michael / Storrer, Angelika (2008):
+Corpora of Computer-Mediated Communication.
+In: Anke Lüdeling & Merja Kytö (Eds): *Corpus Linguistics. An International Handbook.*
+Volume 1. Berlin. New York (Handbooks of Linguistics and Communication Science 29.1),
+pp. 292–308.