The Apache Tika™ toolkit detects and extracts metadata and structured text content from various documents using existing parser libraries.

Things to take care of

1) Set the contentType on the JSP with the page directive:

<%@ page contentType="text/html;charset=UTF-8" %>

2) Add this meta tag inside the JSP:
<META HTTP-EQUIV="content-type" CONTENT="text/html; charset=UTF-8" />

3) Add the filter below to web.xml:

<!-- Ensure UTF-8 encoded pages so that certain characters are displayed and submitted correctly -->


4) Make sure the database fields use an appropriate Unicode character set and a matching collation, for example UTF-8 or UTF-16.
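
The reason every layer above has to agree on UTF-8 can be seen in a minimal sketch that uses only the Java standard library. The class and method names (CharsetMismatchDemo, roundTrip) are hypothetical and exist only for this illustration; the sketch assumes the typical failure mode, where text is written as UTF-8 but read back as ISO-8859-1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {

    // Hypothetical helper: encodes text with one charset and decodes the bytes with another.
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        // "caf\u00e9" is "cafe" with an acute accent on the e (escaped so the source file encoding does not matter).
        String original = "caf\u00e9";

        // Same charset on both sides: the text survives unchanged.
        String consistent = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);

        // Mismatched charsets: the two UTF-8 bytes of the accented letter
        // are decoded as two separate ISO-8859-1 characters (U+00C3 and U+00A9).
        String garbled = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);

        System.out.println("consistent equals original: " + consistent.equals(original));
        System.out.println("garbled equals original: " + garbled.equals(original));
        System.out.println("original length: " + original.length() + ", garbled length: " + garbled.length());
    }
}
```

If any one of the page directive, the meta tag, the request filter, or the database column uses a different charset than the others, the application performs the mismatched round trip shown here and the user sees garbled characters.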


Sometimes this issue may be related to the protocol used to connect to AWS DynamoDB. By default, the client connects to AWS over HTTPS, so try switching to HTTP.

To resolve this issue, try the code below:

// Point the client at an HTTP endpoint instead of the default HTTPS one
amazonDynamoDBClient = new AmazonDynamoDBClient(credentials);
amazonDynamoDBClient.setEndpoint("http://<endpoint>/");

Note: replace <endpoint> with the appropriate endpoint.
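
If the endpoint is already available as an HTTPS URL, the HTTP variant can be derived instead of typed by hand. The following is a rough sketch using only java.net.URI; the class EndpointScheme and the helper toHttp are hypothetical names, and the us-east-1 hostname is only an example value, not something the notes above prescribe.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class EndpointScheme {

    // Hypothetical helper: returns the same endpoint URL with its scheme replaced by "http".
    static String toHttp(String endpoint) {
        URI uri = URI.create(endpoint);
        try {
            // Rebuild the URI component by component, changing only the scheme.
            URI http = new URI("http", uri.getUserInfo(), uri.getHost(), uri.getPort(),
                    uri.getPath(), uri.getQuery(), uri.getFragment());
            return http.toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot rewrite endpoint: " + endpoint, e);
        }
    }

    public static void main(String[] args) {
        // Example endpoint only; substitute the endpoint for your own region or host.
        String httpsEndpoint = "https://dynamodb.us-east-1.amazonaws.com";
        System.out.println(toHttp(httpsEndpoint));
    }
}
```

The resulting string could then be passed to setEndpoint in place of a hard-coded value.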