2025-01-03
- Trying to get search results for a large boolean query given to me by some researchers
- When searching via the Angular frontend I see an error in the Tomcat logs:
Jan 03 09:08:40 dspace tomcat9[876]: Jan 03, 2025 9:08:40 AM org.apache.coyote.http11.Http11Processor service
Jan 03 09:08:40 dspace tomcat9[876]: INFO: Error parsing HTTP request header
Jan 03 09:08:40 dspace tomcat9[876]: Note: further occurrences of HTTP request parsing errors will be logged at DEBUG level.
Jan 03 09:08:40 dspace tomcat9[876]: java.lang.IllegalArgumentException: Request header is too large
Jan 03 09:08:40 dspace tomcat9[876]: at org.apache.coyote.http11.Http11InputBuffer.fill(Http11InputBuffer.java:778)
Jan 03 09:08:40 dspace tomcat9[876]: at org.apache.coyote.http11.Http11InputBuffer.parseHeader(Http11InputBuffer.java:892)
Jan 03 09:08:40 dspace tomcat9[876]: at org.apache.coyote.http11.Http11InputBuffer.parseHeaders(Http11InputBuffer.java:593)
Jan 03 09:08:40 dspace tomcat9[876]: at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:279)
Jan 03 09:08:40 dspace tomcat9[876]: at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:63)
Jan 03 09:08:40 dspace tomcat9[876]: at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:937)
Jan 03 09:08:40 dspace tomcat9[876]: at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1791)
Jan 03 09:08:40 dspace tomcat9[876]: at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:52)
Jan 03 09:08:40 dspace tomcat9[876]: at org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1190)
Jan 03 09:08:40 dspace tomcat9[876]: at org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:659)
Jan 03 09:08:40 dspace tomcat9[876]: at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:63)
Jan 03 09:08:40 dspace tomcat9[876]: at java.base/java.lang.Thread.run(Thread.java:840)
- The size of the query itself is 5362 bytes
- Increasing the maxHttpHeaderSize from the default of 8192 bytes to 16384 allows the search to complete successfully
- I notice that we had previously increased the maxHttpHeaderSize on the HTTP connector in Tomcat 7, but that setting is no longer in place now that we are on Tomcat 9, so this is an overdue change
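A rough sketch of why a 5362-byte query might still overflow the 8192-byte default: as far as I understand, Tomcat counts the request line and all headers against maxHttpHeaderSize, and a query sent in the URL grows when it is percent-encoded. The function name, the `query` parameter name, the path, and the headers below are assumptions for illustration, not what the Angular frontend actually sends.

```python
from urllib.parse import quote

# Tomcat 9 default for maxHttpHeaderSize, as stated in the note above
DEFAULT_MAX_HTTP_HEADER_SIZE = 8192


def estimate_request_head_size(path, query, headers):
    """Estimate the bytes in the request line plus headers of a GET request.

    Hypothetical helper: assumes the search query travels in a URL
    parameter named "query" and is fully percent-encoded.
    """
    # Quotes, spaces, and parentheses each become three bytes (%XX)
    encoded_query = quote(query, safe="")
    request_line = f"GET {path}?query={encoded_query} HTTP/1.1\r\n"
    header_lines = "".join(f"{name}: {value}\r\n" for name, value in headers.items())
    # A final CRLF terminates the header section
    return len((request_line + header_lines + "\r\n").encode("utf-8"))


if __name__ == "__main__":
    # Made-up boolean query and headers, only to show the size inflation
    query = " OR ".join(f'"term {i}"' for i in range(300))
    headers = {"Host": "dspace.example.org", "Accept": "application/json"}
    size = estimate_request_head_size("/search", query, headers)
    print(f"raw query: {len(query)} bytes, estimated request head: {size} bytes")
    print("over default limit:", size > DEFAULT_MAX_HTTP_HEADER_SIZE)
```

Boolean queries are heavy on quotes and spaces, so the encoded form can be considerably larger than the raw text, before any cookies or authorization headers are added on top.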
2025-01-27
- Export publishers, affiliations, and donors from CGSpace to do batch corrections:
localhost/dspace= ☘ \COPY (SELECT DISTINCT text_value AS "dcterms.publisher", count(*) FROM metadatavalue WHERE dspace_object_id in (SELECT dspace_object_id FROM item) AND metadata_field_id = 178 GROUP BY "dcterms.publisher" ORDER BY count DESC) to /tmp/2025-01-publishers.csv WITH CSV HEADER;
COPY 4925
localhost/dspace= ☘ \COPY (SELECT DISTINCT text_value AS "cg.contributor.affiliation", count(*) FROM metadatavalue WHERE dspace_object_id in (SELECT dspace_object_id FROM item) AND metadata_field_id = 211 GROUP BY "cg.contributor.affiliation" ORDER BY count DESC) to /tmp/2025-01-affiliations.csv WITH CSV HEADER;
COPY 15264
localhost/dspace= ☘ \COPY (SELECT DISTINCT text_value AS "cg.contributor.donor", count(*) FROM metadatavalue WHERE dspace_object_id in (SELECT dspace_object_id FROM item) AND metadata_field_id = 248 GROUP BY "cg.contributor.donor" ORDER BY count DESC) to /tmp/2025-01-donors.csv WITH CSV HEADER;
COPY 3531
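One possible first pass over these exports before doing batch corrections, sketched under assumptions: it groups values that differ only in letter case or whitespace, which are likely candidates for merging. The function name is hypothetical; the column names match the CSV headers produced by the queries above (the metadata field name plus PostgreSQL's default `count` column).

```python
import csv
from collections import defaultdict


def find_near_duplicates(csv_path, column):
    """Group values that are identical after whitespace and case normalization.

    Hypothetical helper: returns {normalized_value: [(original, count), ...]}
    for every normalized value that has more than one original spelling.
    """
    groups = defaultdict(list)
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            value = row[column]
            # Collapse runs of whitespace, trim the ends, and ignore case
            key = " ".join(value.split()).casefold()
            groups[key].append((value, int(row["count"])))
    return {key: variants for key, variants in groups.items() if len(variants) > 1}


if __name__ == "__main__":
    # Path and column taken from the publishers export above
    duplicates = find_near_duplicates("/tmp/2025-01-publishers.csv", "dcterms.publisher")
    for key, variants in duplicates.items():
        print(key, variants)
```

This only catches trivial variants; spelling differences and abbreviations would still need manual review or a clustering tool.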