KorAP-SRU

KorAP has been integrated into the CLARIN technology and infrastructure, in particular CLARIN-FCS (Federated Content Search). CLARIN-FCS is an interface specification based on Search/Retrieve via URL and the Contextual Query Language (SRU/CQL). SRU is a standard XML-based client-server protocol in which a search is performed by sending a CQL query as part of a URL. CLARIN-FCS allows searching within the content of resources stored in CLARIN repositories.

KorAP-SRU, an endpoint implementation of CLARIN-FCS, has been released. It allows searching in the IDS Mannheim repository via KorAP. KorAP-SRU currently provides the basic search capability defined by CLARIN-FCS, supporting term-only queries (e.g. Hund) and boolean queries (AND and OR). It interprets queries as case-sensitive.
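As an illustration of what such a request might look like, the following minimal sketch builds an SRU searchRetrieve URL for a term-only and a boolean query. The parameter names (operation, version, query, maximumRecords) come from the SRU standard; the endpoint address and the helper name build_sru_url are hypothetical and only serve the example.

```python
from urllib.parse import urlencode

# Hypothetical endpoint address, used for illustration only.
ENDPOINT = "https://example.org/korapsru"


def build_sru_url(cql_query, max_records=10, endpoint=ENDPOINT):
    """Build an SRU searchRetrieve URL carrying a CQL query."""
    params = {
        "operation": "searchRetrieve",  # standard SRU operation name
        "version": "1.2",               # SRU protocol version
        "query": cql_query,             # the CQL query, URL-encoded below
        "maximumRecords": max_records,  # upper bound on returned records
    }
    return endpoint + "?" + urlencode(params)


# Term-only query
term_url = build_sru_url("Hund")

# Boolean query; the query string is matched case-sensitively by KorAP-SRU
bool_url = build_sru_url("Hund AND Katze")
```

Printing term_url or bool_url shows the complete request URL, with spaces in the boolean query percent- or plus-encoded by urlencode.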

Typically, an FCS endpoint needs to translate the query in an SRU searchRetrieve request into the query language of the underlying search engine. Since KorAP accepts various query languages, including CQL, the KorAP-SRU endpoint does not need to alter the CQL query. It simply includes the query in an HTTP request and sends it to the KorAP public search service. The KorAP service sends back the query results serialized as JSON, and KorAP-SRU translates them into the CLARIN-FCS result format.
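The translation step described above can be sketched roughly as follows. This is not the KorAP-SRU implementation: the JSON shape (a "matches" list with "left", "match" and "right" fields) is an assumption made for the example, the function name is hypothetical, and the output is a simplified rendering in the style of the CLARIN-FCS hits data view rather than a complete SRU response.

```python
import json
import xml.etree.ElementTree as ET

# Namespace of the CLARIN-FCS hits data view.
HITS_NS = "http://clarin.eu/fcs/dataview/hits"
ET.register_namespace("hits", HITS_NS)


def json_to_hits_xml(json_text):
    """Turn a (hypothetical) JSON search result into hits-style XML strings."""
    data = json.loads(json_text)
    results = []
    for m in data.get("matches", []):
        result = ET.Element(f"{{{HITS_NS}}}Result")
        result.text = m.get("left", "")      # context before the match
        hit = ET.SubElement(result, f"{{{HITS_NS}}}Hit")
        hit.text = m.get("match", "")        # the matched text itself
        hit.tail = m.get("right", "")        # context after the match
        results.append(ET.tostring(result, encoding="unicode"))
    return results


# Assumed JSON shape, for illustration only.
sample = '{"matches": [{"left": "Der ", "match": "Hund", "right": " bellt."}]}'
fragments = json_to_hits_xml(sample)
```

Each element of fragments is one XML fragment in which the matched text is wrapped in a Hit element and the surrounding context is kept as plain text. A real endpoint would embed such fragments in a full SRU searchRetrieve response.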

The KorAP-SRU endpoint has been registered in the CLARIN center registry, specifically in the IDS center information. It is connected to the Aggregator, a CLARIN-FCS client that sends search requests to multiple CLARIN repositories and collects and displays the results. In the near future, it will be integrated into WebLicht, where it can be used as a tool in building a linguistic processing tool chain or pipeline.