Updated readme and removed KorAP URI configuration via kustvakt.conf.
Change-Id: I908f02ed1ed8d71acd6e497bc292885a915dfdcc
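
Review note: with `kustvakt.conf` removed, the service URI has to reach `KorapClient` through its constructor. The following is a minimal, self-contained sketch of that idea, not the code from this change. `ServiceUriSketch`, its nested `Client` class and the `fromContextParams` helper are hypothetical names, and a plain `Map` stands in for the servlet context so that the example runs without a servlet container. In a real web application the value of `korap.service.uri` would presumably be read with `ServletContext.getInitParameter`.

```java
import java.util.HashMap;
import java.util.Map;

public class ServiceUriSketch {

    // Hypothetical stand-in for KorapClient: the service URI is an
    // instance field set by the constructor, not a static field
    // loaded from a properties file.
    public static final class Client {
        private final String serviceUri;
        private final int numOfRecords;
        private final int maxRecords;

        public Client (String serviceUri, int numOfRecords, int maxRecords) {
            // Fail early when the context parameter is missing or empty.
            if (serviceUri == null || serviceUri.isEmpty()) {
                throw new IllegalArgumentException(
                        "korap.service.uri is not set.");
            }
            this.serviceUri = serviceUri;
            this.numOfRecords = numOfRecords;
            this.maxRecords = maxRecords;
        }

        public String getServiceUri () {
            return serviceUri;
        }
    }

    // Hypothetical helper: the map plays the role of the <context-param>
    // entries in web.xml. The record limits 25 and 50 are example values.
    public static Client fromContextParams (Map<String, String> params) {
        return new Client(params.get("korap.service.uri"), 25, 50);
    }

    public static void main (String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("korap.service.uri", "http://localhost:8089/api/");
        Client client = fromContextParams(params);
        System.out.println(client.getServiceUri());
    }
}
```

Keeping the URI per instance means that tests or several endpoints can construct clients against different Kustvakt servers, which a static field would not allow.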
diff --git a/readme.md b/readme.md
index 18561ef..1068e8c 100644
--- a/readme.md
+++ b/readme.md
@@ -7,25 +7,39 @@
CLARIN defines FCS specifications to allow distributed search across multiple heterogenous search engines in a uniform way. FCS specifications are built on the [SRU/CQL protocol](http://www.loc.gov/standards/sru/) for communications between its client and endpoint. FCS 1.0 specification supports SRU (Search Retrieve via URL) 1.2 and FCS 2.0 specification supports SRU 2.0.
-[KorapSRU 1.0.1 release](https://github.com/KorAP/KorapSRU/releases/tag/release-1.0.1) implements FCS 1.0 specification and supports basic search using simple CQL (Contextual Query Language) for term query, phrase query and boolean query. FCS 2.0 specification is implemented in the newest version of KorapSRU, but it has not been released yet. It supports extended search (e.g. annotation search) that can be formulated using FCS Query Language (FCSQL) developed based on Corpus Query Processor ([CQP](http://cwb.sourceforge.net/files/CQP_Tutorial/)).
+[KorapSRU 1.0.1 release](https://github.com/KorAP/KorapSRU/releases/tag/release-1.0.1) implements FCS 1.0 specification and supports basic search using simple CQL (Contextual Query Language) for term query, phrase query and boolean query. FCS 2.0 specification is implemented in the newest version of KorapSRU, but it has not been released yet. It supports extended search (e.g. annotation search) that can be formulated using FCS Query Language (FCSQL) developed based on Corpus Query Processor ([CQP](http://cwb.sourceforge.net/files/CQP_Tutorial/)). FCSQL is only available with SRU version 2.0, whereas CQL is available with SRU versions 1.1, 1.2 and 2.0.
Usually CQL and FCSQL queries are translated into the native language of a search engine in an FCS endpoint. Since KorAP supports multiple query languages and has its own query translator [Koral](https://github.com/KorAP/Koral), the translation is implemented in Koral, not in KorapSRU. Therefore, KorAP users will also be able to use CQL and FCSQL.
## Supported SRU requests
-* SRU explain request
+### SRU explain request
- gives general information about KorapSRU and some default search settings, for instance the number of records it retrieves per page (see [http://clarin.ids-mannheim.de/korapsru?operation=explain](http://clarin.ids-mannheim.de/korapsru?operation=explain)). To obtain more information such as supported annotation layers needed for requesting an extended search,
+An explain request gives general information about KorapSRU and some default search settings, for instance the number of records it retrieves per page. See:
+> [http://clarin.ids-mannheim.de/korapsru?operation=explain](http://clarin.ids-mannheim.de/korapsru?operation=explain)
- ```
- x-fcs-endpoint-description=true
- ```
- must be added as an extra request parameter (see [http://clarin.ids-mannheim.de/korapsru?operation=explain&x-fcs-endpoint-description=true](http://clarin.ids-mannheim.de/korapsru?operation=explain&x-fcs-endpoint-description=true)).
+To obtain more information such as supported annotation layers needed for requesting an extended search,
-* SRU search retrieve request
+```
+x-fcs-endpoint-description=true
+```
- contains a CQL or FCSQL query, for example [http://clarin.ids-mannheim.de/korapsru?operation=searchRetrieve&query=das%20Buch&version=1.2](http://clarin.ids-mannheim.de/korapsru?operation=searchRetrieve&query=das%20Buch&version=1.2). KorapSRU forwards the CQL or FCSQL query in an SRU search retrieve request URL to [Kustvakt](https://github.com/KorAP/Kustvakt), the API provider of KorAP managing the communications among all KorAP components. Moreover, KorapSRU transforms the query results from Kustvakt into an SRU response.
+must be added as an extra request parameter. See:
+> [http://clarin.ids-mannheim.de/korapsru?operation=explain&x-fcs-endpoint-description=true](http://clarin.ids-mannheim.de/korapsru?operation=explain&x-fcs-endpoint-description=true)
+
+### SRU search retrieve request
+
+A search retrieve request contains a CQL or FCSQL query. KorapSRU forwards the CQL or FCSQL query in an SRU search retrieve request URL to [Kustvakt](https://github.com/KorAP/Kustvakt), the API provider of KorAP managing the communications among all KorAP components. Moreover, KorapSRU transforms the query results from Kustvakt into an SRU response.
+
+Examples:
+* Basic search using CQL
+> [http://clarin.ids-mannheim.de/korapsru?operation=searchRetrieve&query=Buch&version=1.2](http://clarin.ids-mannheim.de/korapsru?operation=searchRetrieve&query=Buch&version=1.2)
+
+* Annotation search using FCSQL
+> http://clarin.ids-mannheim.de/korapsru?operation=searchRetrieve&query=\[tt:lemma=".*bar"\]&queryType=fcs
+
+ The query must not be URL-encoded.
## Software Requirements
@@ -37,10 +51,13 @@
## Installation
-Configure the service URI in the ```src/main/resource/kustvakt.conf``` file to a Kustvakt server URI, for example:
+Configure the service URI in ```/src/main/webapp/WEB-INF/web.xml``` to a Kustvakt server URI, for example:
```
-korapsru.client.service.uri=http://localhost:8089/api/v0.1/
+<context-param>
+ <param-name>korap.service.uri</param-name>
+ <param-value>http://localhost:8089/api/</param-value>
+</context-param>
```
KorapSRU is built based on the FCSSimpleEndpoint library provided by CLARIN. KorapSRU 1.0.2-SNAPSHOT uses FCSSimpleEndpoint version 1.3.0 available from CLARIN Nexus repository. To allow Maven to download the library using JDK 1.7, an additional Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files 7 is needed.
diff --git a/src/main/java/de/mannheim/ids/korap/sru/KorapClient.java b/src/main/java/de/mannheim/ids/korap/sru/KorapClient.java
index a131e9d..359c7d7 100644
--- a/src/main/java/de/mannheim/ids/korap/sru/KorapClient.java
+++ b/src/main/java/de/mannheim/ids/korap/sru/KorapClient.java
@@ -37,9 +37,8 @@
*/
public class KorapClient {
- private static String serviceUri;
- private static final String CONFIGURATION_FILE =
- "kustvakt.conf";
+ private String serviceUri;
+ private static final String CONFIGURATION_FILE = "kustvakt.conf";
private static final String SERVICE_URI_PROPERTY =
"korapsru.client.service.uri";
private static final String DEFAULT_CONTEXT_TYPE = "sentence";
@@ -52,44 +51,19 @@
private static Logger logger =
(Logger) LoggerFactory.getLogger(KorapClient.class);
-
/**
* Constructs a KorapClient with the given number of records per
* page and the maximum number of records.
*
+ * @param serviceUri
+ * KorAP service URI
* @param numOfRecords
* the number of records per page
* @param maxRecords
* the number of maximum records/matches to retrieve
* @throws FileNotFoundException
*/
- public KorapClient (int numOfRecords, int maxRecords)
- throws FileNotFoundException {
- this.defaultNumOfRecords = numOfRecords;
- this.defaultMaxRecords = maxRecords;
-
- Properties properties = new Properties();
- String path = System.getenv("$HOME")+"/"+CONFIGURATION_FILE;
- InputStream is = getClass().getClassLoader()
- .getResourceAsStream(path);
- try {
- properties.load(is);
- }
- catch (IOException e) {
- throw new FileNotFoundException("Configuration file "
- + CONFIGURATION_FILE + " is not found.");
- }
- if (properties.containsKey(SERVICE_URI_PROPERTY)) {
- serviceUri = properties.getProperty("korapsru.client.service.uri");
- logger.info(serviceUri);
- }
- else {
- throw new NullPointerException("Please specify korapsru.client."
- + "service.uri in the configuration file.");
- }
- }
-
- public KorapClient (String serviceUri, int numOfRecords, int maxRecords){
+ public KorapClient (String serviceUri, int numOfRecords, int maxRecords) {
this.defaultNumOfRecords = numOfRecords;
this.defaultMaxRecords = maxRecords;
this.serviceUri = serviceUri;
@@ -335,7 +309,7 @@
* @throws IOException
* @throws URISyntaxException
*/
- public static String retrieveAnnotations (String resourceId,
+ public String retrieveAnnotations (String resourceId,
String documentId, String textId, String matchId, String foundry)
throws IOException, URISyntaxException {
@@ -420,7 +394,7 @@
* @return a HttpGet request
* @throws URISyntaxException
*/
- private static HttpGet createMatchInfoRequest (String resourceId,
+ private HttpGet createMatchInfoRequest (String resourceId,
String documentId, String textId, String matchId, String foundry)
throws URISyntaxException {
diff --git a/src/main/java/de/mannheim/ids/korap/sru/KorapEndpointDescription.java b/src/main/java/de/mannheim/ids/korap/sru/KorapEndpointDescription.java
index 0e29f1f..ee6a04a 100644
--- a/src/main/java/de/mannheim/ids/korap/sru/KorapEndpointDescription.java
+++ b/src/main/java/de/mannheim/ids/korap/sru/KorapEndpointDescription.java
@@ -4,7 +4,6 @@
import java.io.InputStream;
import java.net.MalformedURLException;
import java.net.URI;
-import java.net.URISyntaxException;
import java.net.URL;
import java.util.ArrayList;
import java.util.HashMap;
@@ -20,11 +19,11 @@
import eu.clarin.sru.server.SRUConstants;
import eu.clarin.sru.server.SRUException;
import eu.clarin.sru.server.fcs.DataView;
-import eu.clarin.sru.server.fcs.utils.SimpleEndpointDescriptionParser;
import eu.clarin.sru.server.fcs.DataView.DeliveryPolicy;
import eu.clarin.sru.server.fcs.EndpointDescription;
import eu.clarin.sru.server.fcs.Layer;
import eu.clarin.sru.server.fcs.ResourceInfo;
+import eu.clarin.sru.server.fcs.utils.SimpleEndpointDescriptionParser;
/**
* Contains information for generating a response of SRU explain
diff --git a/src/main/java/de/mannheim/ids/korap/sru/KorapSRU.java b/src/main/java/de/mannheim/ids/korap/sru/KorapSRU.java
index 14bd9e1..616ddeb 100644
--- a/src/main/java/de/mannheim/ids/korap/sru/KorapSRU.java
+++ b/src/main/java/de/mannheim/ids/korap/sru/KorapSRU.java
@@ -1,6 +1,5 @@
package de.mannheim.ids.korap.sru;
-import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.List;
import java.util.Map;
@@ -85,28 +84,56 @@
QueryLanguage queryLanguage = parseQueryLanguage(request);
String queryType = request.getQueryType();
+ if (!queryType.equals("fcs") && !queryType.equals("cql")){
+ throw new SRUException(SRUConstants.SRU_UNSUPPORTED_PARAMETER_VALUE,
+ "Query type "+ queryType+ " is not supported.");
+ }
logger.info("Query language: " + queryType);
-
+
+ SRUVersion sruVersion = request.getVersion();
+ // EM: actually not necessary because query type is only available in SRU 2.0
+// if (!isVersionCorrect(queryType, sruVersion)){
+// throw new SRUException(SRUConstants.SRU_GENERAL_SYSTEM_ERROR,
+// "Query type "+queryType+" "+ "and version "+
+// sruVersion.toString() +" do not match.");
+// }
+ String version = parseVersion(sruVersion);
+
String queryStr = request.getQuery().getRawQuery();
if ((queryStr == null) || queryStr.isEmpty()) {
throw new SRUException(SRUConstants.SRU_EMPTY_TERM_UNSUPPORTED,
- "An empty term is not supported.");
+ "Empty term is not supported.");
}
logger.info("korapsru query: " + queryStr);
- String version = parseVersion(request.getVersion());
-
KorapResult korapResult = sendQuery(queryStr, request, version,
queryLanguage);
checkKorapResultError(korapResult, queryLanguage,
isRewitesAllowed(request), diagnostics);
logger.info("Number of records: "+korapResult.getTotalResults());
- return new KorapSRUSearchResultSet(diagnostics, korapResult, dataviews,
+ return new KorapSRUSearchResultSet(korapClient, diagnostics, korapResult, dataviews,
korapEndpointDescription.getTextLayer(),
korapEndpointDescription.getAnnotationLayers());
}
+ private boolean isVersionCorrect (String queryType, SRUVersion version) {
+ if (queryType.equals("fcs")){
+ if (version.equals(SRUVersion.VERSION_2_0)){
+ return true;
+ }
+ }
+ else if(queryType.equals("cql")){
+ if (version.equals(SRUVersion.VERSION_1_1) ||
+ version.equals(SRUVersion.VERSION_1_2) ||
+ version.equals(SRUVersion.VERSION_2_0) ){
+ return true;
+ }
+ }
+
+ return false;
+ }
+
private String parseVersion(SRUVersion version) throws SRUException {
if (version == SRUVersion.VERSION_1_1) {
return "1.1";
diff --git a/src/main/java/de/mannheim/ids/korap/sru/KorapSRUSearchResultSet.java b/src/main/java/de/mannheim/ids/korap/sru/KorapSRUSearchResultSet.java
index 95fed85..db78ca2 100644
--- a/src/main/java/de/mannheim/ids/korap/sru/KorapSRUSearchResultSet.java
+++ b/src/main/java/de/mannheim/ids/korap/sru/KorapSRUSearchResultSet.java
@@ -45,9 +45,11 @@
private SAXParser saxParser;
private Layer textLayer;
private AnnotationHandler annotationHandler;
+ private KorapClient korapClient;
/**
* Constructs a KorapSRUSearchResultSet for the given KorapResult.
+ * @param korapClient the KorapClient used to retrieve annotations
*
* @param diagnostics
* a list of SRU diagnostics
@@ -61,7 +63,7 @@
* the list of annotation layers
* @throws SRUException
*/
- public KorapSRUSearchResultSet (SRUDiagnosticList diagnostics,
+ public KorapSRUSearchResultSet (KorapClient korapClient, SRUDiagnosticList diagnostics,
KorapResult korapResult, List<String> dataviews, Layer textlayer,
List<AnnotationLayer> annotationLayers) throws SRUException {
super(diagnostics);
@@ -74,6 +76,7 @@
throw new SRUException(SRUConstants.SRU_GENERAL_SYSTEM_ERROR, e);
}
+ this.korapClient = korapClient;
this.korapResult = korapResult;
this.dataviews = dataviews;
this.textLayer = textlayer;
@@ -160,7 +163,7 @@
}
try {
- String annotationSnippet = KorapClient.retrieveAnnotations(
+ String annotationSnippet = korapClient.retrieveAnnotations(
match.getCorpusId(), match.getDocId(), match.getTextId(),
match.getPositionId(), "*");
InputStream is = new ByteArrayInputStream(
diff --git a/src/main/resources/kustvakt.conf b/src/main/resources/kustvakt.conf
deleted file mode 100644
index 8060d0f..0000000
--- a/src/main/resources/kustvakt.conf
+++ /dev/null
@@ -1,4 +0,0 @@
-# KorAP configuration file that is also used in Krill and Kustvakt.
-
-# Configuration for KorapSRU
-korapsru.client.service.uri = [KUSTVAKT SERVER SERVICE URI]
\ No newline at end of file
diff --git a/src/main/webapp/WEB-INF/web.xml b/src/main/webapp/WEB-INF/web.xml
index 8574a4b..1a06230 100644
--- a/src/main/webapp/WEB-INF/web.xml
+++ b/src/main/webapp/WEB-INF/web.xml
@@ -7,7 +7,7 @@
<context-param>
<param-name>korap.service.uri</param-name>
- <param-value>http://10.0.10.52:9000/api/</param-value>
+ <param-value>http://localhost:8089/api/</param-value>
</context-param>
<servlet>
@@ -55,6 +55,10 @@
<param-value>loc</param-value>
</init-param>
<init-param>
+ <param-name>eu.clarin.sru.server.sruSupportedVersionDefault</param-name>
+ <param-value>2.0</param-value>
+ </init-param>
+ <init-param>
<param-name>eu.clarin.sru.server.sruSupportedVersionMax</param-name>
<param-value>2.0</param-value>
</init-param>
diff --git a/src/test/java/de/mannheim/ids/korap/test/KorapClientTest.java b/src/test/java/de/mannheim/ids/korap/test/KorapClientTest.java
index a6cbdd7..4c518d0 100644
--- a/src/test/java/de/mannheim/ids/korap/test/KorapClientTest.java
+++ b/src/test/java/de/mannheim/ids/korap/test/KorapClientTest.java
@@ -35,11 +35,6 @@
private KorapResult result;
private KorapMatch match;
- public KorapClientTest () throws FileNotFoundException {
- c = new KorapClient(25, 50);
- }
-
-
@Test
public void testCQLQuery () throws HttpResponseException, IOException {
result = c.query("der", QueryLanguage.CQL, "1.2", 1, 25,
@@ -76,7 +71,7 @@
@Test
public void testRetrieveAnnotations() throws IOException, URISyntaxException {
- String annotationSnippet = KorapClient.retrieveAnnotations(
+ String annotationSnippet = c.retrieveAnnotations(
"GOE", "AGF", "00000",
"p7667-7668", "*");
@@ -86,7 +81,7 @@
@Test
public void testRetrieveNonexistingAnnotation() throws IOException, URISyntaxException {
- String annotationSnippet = KorapClient.retrieveAnnotations(
+ String annotationSnippet = c.retrieveAnnotations(
"WPD15", "D18", "06488",
"p588-589", "*");